snippetsqlMajor
How to recursively find gaps where 90 days passed, between rows
Viewed 0 times
rowshowwherepassedrecursivelybetweenfinddaysgaps
Problem
This is a kind of trivial task in my C# homeworld, but I don't yet make it in SQL and would prefer to solve it set-based (without cursors). A resultset should come from a query like this.
How should it work
I send those three params into a UDF.
The UDF internally use params to fetch related <= 90 days older rows, from a view.
The UDF traverse 'MyDate' and return 1 if it should be included in a total calculation.
If it should not, then it return 0.
Named here as "qualifying".
What the udf will do
List the rows in date order.
Calculate the days between rows.
First row in resultset defaults to Hit = 1.
If the difference is up to 90,
- then pass to next row until the sum of gaps is 90 days (90th day must pass)
When reached, set Hit to 1 and reset gap to 0.
It would also work to instead omit the row from result.
In the table above, MaxDiff column is the gap from date in previous line. The problem with my attempts so far is that I can't ignore second last row in the sample above.
[EDIT]
As per comment I add a tag and also paste the udf I have compiled just now. Though, is just a placeholder and won't give useful result.
```
;WITH cte (someid, otherkey, mydate, cost) AS
(
SELECT someid, otherkey, mydate, cost
FROM dbo.vGetVisits
WHERE someid = @someid AND VisitCode = 3 AND otherkey = @otherkey
AND CONVERT(Date,mydate) = @VisitDate
UNION ALL
SELECT top 1 e.someid, e.otherkey, e.mydate, e.cost
FROM dbo.vGetVisits AS E
WHERE CONVERT(date, e.mydate)
BETWEEN D
SELECT SomeId, MyDate,
dbo.udfLastHitRecursive(param1, param2, MyDate) as 'Qualifying'
FROM THow should it work
I send those three params into a UDF.
The UDF internally use params to fetch related <= 90 days older rows, from a view.
The UDF traverse 'MyDate' and return 1 if it should be included in a total calculation.
If it should not, then it return 0.
Named here as "qualifying".
What the udf will do
List the rows in date order.
Calculate the days between rows.
First row in resultset defaults to Hit = 1.
If the difference is up to 90,
- then pass to next row until the sum of gaps is 90 days (90th day must pass)
When reached, set Hit to 1 and reset gap to 0.
It would also work to instead omit the row from result.
|(column by udf, which not work yet)
Date Calc_date MaxDiff | Qualifying
2014-01-01 11:00 2014-01-01 0 | 1
2014-01-03 10:00 2014-01-01 2 | 0
2014-01-04 09:30 2014-01-03 1 | 0
2014-04-01 10:00 2014-01-04 87 | 0
2014-05-01 11:00 2014-04-01 30 | 1In the table above, MaxDiff column is the gap from date in previous line. The problem with my attempts so far is that I can't ignore second last row in the sample above.
[EDIT]
As per comment I add a tag and also paste the udf I have compiled just now. Though, is just a placeholder and won't give useful result.
```
;WITH cte (someid, otherkey, mydate, cost) AS
(
SELECT someid, otherkey, mydate, cost
FROM dbo.vGetVisits
WHERE someid = @someid AND VisitCode = 3 AND otherkey = @otherkey
AND CONVERT(Date,mydate) = @VisitDate
UNION ALL
SELECT top 1 e.someid, e.otherkey, e.mydate, e.cost
FROM dbo.vGetVisits AS E
WHERE CONVERT(date, e.mydate)
BETWEEN D
Solution
As I read the question, the basic recursive algorithm required is:
This is relatively easy to implement with a recursive common table expression.
For example, using the following sample data (based on the question):
The recursive code is:
The results are:
With an index having
You could choose to wrap this in a function and execute it directly against the view mentioned in the question, but my instincts are against it. Usually, performance is better when you select rows from a view into a temporary table, provide the appropriate index on the temporary table, then apply the logic above. The details depend on the details of the view, but this is my general experience.
For completeness (and prompted by ypercube's answer) I should mention that my other go-to solution for this type of problem (until T-SQL gets proper ordered set functions) is a SQLCLR cursor (see my answer here for an example of the technique). This performs much better than a T-SQL cursor, and is convenient for those with skills in .NET languages and the ability to run SQLCLR in their production environment. It may not offer much in this scenario over the recursive solution because the majority of the cost is the sort, but it is worth mentioning.
- Return the row with the earliest date in the set
- Set that date as "current"
- Find the row with the earliest date more than 90 days after the current date
- Repeat from step 2 until no more rows are found
This is relatively easy to implement with a recursive common table expression.
For example, using the following sample data (based on the question):
DECLARE @T AS table (TheDate datetime PRIMARY KEY);
INSERT @T (TheDate)
VALUES
('2014-01-01 11:00'),
('2014-01-03 10:00'),
('2014-01-04 09:30'),
('2014-04-01 10:00'),
('2014-05-01 11:00'),
('2014-07-01 09:00'),
('2014-07-31 08:00');The recursive code is:
WITH CTE AS
(
-- Anchor:
-- Start with the earliest date in the table
SELECT TOP (1)
T.TheDate
FROM @T AS T
ORDER BY
T.TheDate
UNION ALL
-- Recursive part
SELECT
SQ1.TheDate
FROM
(
-- Recursively find the earliest date that is
-- more than 90 days after the "current" date
-- and set the new date as "current".
-- ROW_NUMBER + rn = 1 is a trick to get
-- TOP in the recursive part of the CTE
SELECT
T.TheDate,
rn = ROW_NUMBER() OVER (
ORDER BY T.TheDate)
FROM CTE
JOIN @T AS T
ON T.TheDate > DATEADD(DAY, 90, CTE.TheDate)
) AS SQ1
WHERE
SQ1.rn = 1
)
SELECT
CTE.TheDate
FROM CTE
OPTION (MAXRECURSION 0);The results are:
╔═════════════════════════╗
║ TheDate ║
╠═════════════════════════╣
║ 2014-01-01 11:00:00.000 ║
║ 2014-05-01 11:00:00.000 ║
║ 2014-07-31 08:00:00.000 ║
╚═════════════════════════╝With an index having
TheDate as a leading key, the execution plan is very efficient:You could choose to wrap this in a function and execute it directly against the view mentioned in the question, but my instincts are against it. Usually, performance is better when you select rows from a view into a temporary table, provide the appropriate index on the temporary table, then apply the logic above. The details depend on the details of the view, but this is my general experience.
For completeness (and prompted by ypercube's answer) I should mention that my other go-to solution for this type of problem (until T-SQL gets proper ordered set functions) is a SQLCLR cursor (see my answer here for an example of the technique). This performs much better than a T-SQL cursor, and is convenient for those with skills in .NET languages and the ability to run SQLCLR in their production environment. It may not offer much in this scenario over the recursive solution because the majority of the cost is the sort, but it is worth mentioning.
Code Snippets
DECLARE @T AS table (TheDate datetime PRIMARY KEY);
INSERT @T (TheDate)
VALUES
('2014-01-01 11:00'),
('2014-01-03 10:00'),
('2014-01-04 09:30'),
('2014-04-01 10:00'),
('2014-05-01 11:00'),
('2014-07-01 09:00'),
('2014-07-31 08:00');WITH CTE AS
(
-- Anchor:
-- Start with the earliest date in the table
SELECT TOP (1)
T.TheDate
FROM @T AS T
ORDER BY
T.TheDate
UNION ALL
-- Recursive part
SELECT
SQ1.TheDate
FROM
(
-- Recursively find the earliest date that is
-- more than 90 days after the "current" date
-- and set the new date as "current".
-- ROW_NUMBER + rn = 1 is a trick to get
-- TOP in the recursive part of the CTE
SELECT
T.TheDate,
rn = ROW_NUMBER() OVER (
ORDER BY T.TheDate)
FROM CTE
JOIN @T AS T
ON T.TheDate > DATEADD(DAY, 90, CTE.TheDate)
) AS SQ1
WHERE
SQ1.rn = 1
)
SELECT
CTE.TheDate
FROM CTE
OPTION (MAXRECURSION 0);╔═════════════════════════╗
║ TheDate ║
╠═════════════════════════╣
║ 2014-01-01 11:00:00.000 ║
║ 2014-05-01 11:00:00.000 ║
║ 2014-07-31 08:00:00.000 ║
╚═════════════════════════╝Context
StackExchange Database Administrators Q#83421, answer score: 25
Revisions (0)
No revisions yet.