patternMinor
Selecting 'edge' records in a sequence of data
Viewed 0 times
recordsedgesequenceselectingdata
Problem
I have a table of items which run sequentially by date with both a start and end for each entry:
Note that there are gaps in the data (for a weekend and a bank holiday). Is it possible to write a query which selects the records at the edges of the gaps? What I want is a general purpose way to end up with something like this:
Note that the start time of 1 and 6 has been updated, as has the end time of 5 and 8.
In the real database I am forced to used this day by day data layout as it has to retain compatibility with a legacy client, so I can't simply insert the ranges directly. Also in the real data I could be dealing with several hundred of these records at a time with multiple gaps.
The target platform is SQL2005/2008, though bonus points for solutions which work in SQL2000 compatibility mode.
ID | Start | End
----------------------------------------
1 | 2012-08-20 00:00 | 2012-08-20 23:59
2 | 2012-08-21 00:00 | 2012-08-21 23:59
3 | 2012-08-22 00:00 | 2012-08-22 23:59
4 | 2012-08-23 00:00 | 2012-08-23 23:59
5 | 2012-08-24 00:00 | 2012-08-24 23:59
6 | 2012-08-28 00:00 | 2012-08-28 23:59
7 | 2012-08-29 00:00 | 2012-08-29 23:59
8 | 2012-08-30 00:00 | 2012-08-30 23:59Note that there are gaps in the data (for a weekend and a bank holiday). Is it possible to write a query which selects the records at the edges of the gaps? What I want is a general purpose way to end up with something like this:
ID | Start | End
----------------------------------------
1 | 2012-08-20 09:00 | 2012-08-20 23:59
2 | 2012-08-21 00:00 | 2012-08-21 23:59
3 | 2012-08-22 00:00 | 2012-08-22 23:59
4 | 2012-08-23 00:00 | 2012-08-23 23:59
5 | 2012-08-24 00:00 | 2012-08-24 17:00
6 | 2012-08-28 09:00 | 2012-08-28 23:59
7 | 2012-08-29 00:00 | 2012-08-29 23:59
8 | 2012-08-30 00:00 | 2012-08-30 17:00Note that the start time of 1 and 6 has been updated, as has the end time of 5 and 8.
In the real database I am forced to used this day by day data layout as it has to retain compatibility with a legacy client, so I can't simply insert the ranges directly. Also in the real data I could be dealing with several hundred of these records at a time with multiple gaps.
The target platform is SQL2005/2008, though bonus points for solutions which work in SQL2000 compatibility mode.
Solution
You can use
The query above works from SQL Server 2005 even with compatibility SQL Server 2000 (80).
If you find your way to SQL Server 2012 you can use lead and lag instead of simulating with an
SQL Fiddle
outer apply the get the lead and lag values and in the column list you can use datediff to see if the gap is something other than one day.select T1.ID,
case
when datediff(day, Lag.Start, T1.Start) = 1 then T1.Start
else dateadd(hour, 9, T1.Start)
end as Start,
case
when datediff(day, T1.Start, Lead.Start) = 1 then T1.[End]
else dateadd(hour, 17, T1.Start)
end as [End]
from YourTable as T1
outer apply
(
select top(1) T2.Start
from YourTable as T2
where T2.Start T1.Start
order by T2.Start
) as Lead(Start)The query above works from SQL Server 2005 even with compatibility SQL Server 2000 (80).
If you find your way to SQL Server 2012 you can use lead and lag instead of simulating with an
outer apply.select T1.ID,
case
when datediff(day, lag(T1.Start) over(order by T1.Start), T1.Start) = 1
then T1.Start
else dateadd(hour, 9, T1.Start)
end as Start,
case
when datediff(day, T1.Start, lead(T1.Start) over(order by T1.Start)) = 1
then T1.[End]
else dateadd(hour, 17, T1.Start)
end as [End]
from YourTable as T1SQL Fiddle
Code Snippets
select T1.ID,
case
when datediff(day, Lag.Start, T1.Start) = 1 then T1.Start
else dateadd(hour, 9, T1.Start)
end as Start,
case
when datediff(day, T1.Start, Lead.Start) = 1 then T1.[End]
else dateadd(hour, 17, T1.Start)
end as [End]
from YourTable as T1
outer apply
(
select top(1) T2.Start
from YourTable as T2
where T2.Start < T1.Start
order by T2.Start desc
) as Lag(Start)
outer apply
(
select top(1) T2.Start
from YourTable as T2
where T2.Start > T1.Start
order by T2.Start
) as Lead(Start)select T1.ID,
case
when datediff(day, lag(T1.Start) over(order by T1.Start), T1.Start) = 1
then T1.Start
else dateadd(hour, 9, T1.Start)
end as Start,
case
when datediff(day, T1.Start, lead(T1.Start) over(order by T1.Start)) = 1
then T1.[End]
else dateadd(hour, 17, T1.Start)
end as [End]
from YourTable as T1Context
StackExchange Database Administrators Q#22882, answer score: 9
Revisions (0)
No revisions yet.