snippetsqlMinor
How to sum up the distinct Total Time of an Event ignoring duplicate overlaps in Times?
Viewed 0 times
totaldistinctthehowduplicateoverlapstimesumignoringtimes
Problem
I have the following
With the following data:
Notice:
End Goal:
I'm trying to calculate the
The
Expected final results for the provided examples in the scripts above:
dbfiddle.uk for reference.
Other Info:
-
If it's easier to work with datetimes, feel free to alter the data types from
EventTimes table:DROP TABLE IF EXISTS dbo.EventTimes;
CREATE TABLE dbo.EventTimes
(
EventTimeKeyId INT IDENTITY(1,1) PRIMARY KEY,
EventId INT NOT NULL,
StartTime TIME NOT NULL,
EndTime TIME NOT NULL
);With the following data:
-- Event 1
INSERT INTO dbo.EventTimes (EventId, StartTime, EndTime)
VALUES
(1, '04:00:00', '14:00:00'),
(1, '06:00:00', '11:00:00'),
(1, '09:00:00', '12:00:00'),
(1, '13:00:00', '14:00:00'), -- Gap between this row and the next row
(1, '02:30:00', '04:00:00'); -- Notice the half-hour on this one
-- Event 2
INSERT INTO dbo.EventTimes (EventId, StartTime, EndTime)
VALUES
(2, '00:00:00', '06:00:00'), -- Gap between this row and the next row
(2, '09:00:00', '13:00:00'),
(2, '11:00:00', '15:00:00');Notice:
- The same
Eventcan have two time ranges that overlap each other. E.g. sameEventfrom 4 AM to 2 PM and and also from 6 AM to 11 AM.
- There can also be gaps between the two time ranges. E.g. same
Eventfrom 1 PM to 2 PM and also from 2:30 PM to 4 PM.
End Goal:
I'm trying to calculate the
TotalTime of a given Event ignoring the duplicate overlapping time. E.g. for the set of ranges of 9 AM to 1 PM and 11 AM to 3 PM, the distinct TotalTime should be 6 hours (9 AM to 3 PM). Conversely, I also don't want to count the time in the gaps between two time ranges. So for the set of ranges of 1 PM to 2 PM and 2:30 PM to 4 PM the TotalTime should be 2.5 hours. (Note these are just subsets of the full example above, and the final result should be the sum of all of these unique times per Event.)The
TotalTime should never exceed 24 hours, these times are all within a single day (as is the TIME data type).Expected final results for the provided examples in the scripts above:
dbfiddle.uk for reference.
Other Info:
-
If it's easier to work with datetimes, feel free to alter the data types from
TIME to DATETIME. I can convert the results back, no probleSolution
All the complexity results from the overlapping intervals.
If we pack the intervals so there are no overlaps, the solution is a straightforward group and sum.
The following code uses Itzik Ben-Gan's interval packing solution:
Packing intervals is a classic T-SQL problem that involves packing groups of intersecting intervals into their respective continuous intervals. I set a challenge to myself to try and find an elegant solution that can achieve the task by using only one supporting index and a single scan of the data, and I found one.
EventId
TotalTime
1
11.500000
2
12.000000
The initial sort can be avoided with an index on
If we pack the intervals so there are no overlaps, the solution is a straightforward group and sum.
The following code uses Itzik Ben-Gan's interval packing solution:
Packing intervals is a classic T-SQL problem that involves packing groups of intersecting intervals into their respective continuous intervals. I set a challenge to myself to try and find an elegant solution that can achieve the task by using only one supporting index and a single scan of the data, and I found one.
-- Itzik Ben-Gan's interval packing solution
-- https://www.itprotoday.com/sql-server/new-solution-packing-intervals-problem
WITH
C1 AS
(
SELECT
*,
prvend = MAX(ET.EndTime) OVER (
PARTITION BY ET.EventId
ORDER BY ET.StartTime, ET.EndTime
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
FROM dbo.EventTimes AS ET
),
C2 AS
(
SELECT
*,
grp = SUM(isstart) OVER (
PARTITION BY C1.EventId
ORDER BY C1.StartTime, C1.EndTime
ROWS UNBOUNDED PRECEDING)
FROM C1
CROSS APPLY
(
VALUES
(
CASE
WHEN C1.StartTime <= C1.prvend
THEN NULL
ELSE 1
END
)
) AS A (isstart)
),
Packed AS
(
SELECT
C2.EventId,
StartTime = MIN(C2.StartTime),
EndTime = MAX(C2.EndTime)
FROM C2
GROUP BY
C2.EventId,
C2.grp
)
-- Sum the packed intervals by EventId
SELECT
P.EventId,
TotalTime = SUM(DATEDIFF(MINUTE, P.StartTime, P.EndTime)) / 60.0
FROM Packed AS P
GROUP BY
P.EventId;EventId
TotalTime
1
11.500000
2
12.000000
The initial sort can be avoided with an index on
(EventId, StartTime, EndTime).Code Snippets
-- Itzik Ben-Gan's interval packing solution
-- https://www.itprotoday.com/sql-server/new-solution-packing-intervals-problem
WITH
C1 AS
(
SELECT
*,
prvend = MAX(ET.EndTime) OVER (
PARTITION BY ET.EventId
ORDER BY ET.StartTime, ET.EndTime
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
FROM dbo.EventTimes AS ET
),
C2 AS
(
SELECT
*,
grp = SUM(isstart) OVER (
PARTITION BY C1.EventId
ORDER BY C1.StartTime, C1.EndTime
ROWS UNBOUNDED PRECEDING)
FROM C1
CROSS APPLY
(
VALUES
(
CASE
WHEN C1.StartTime <= C1.prvend
THEN NULL
ELSE 1
END
)
) AS A (isstart)
),
Packed AS
(
SELECT
C2.EventId,
StartTime = MIN(C2.StartTime),
EndTime = MAX(C2.EndTime)
FROM C2
GROUP BY
C2.EventId,
C2.grp
)
-- Sum the packed intervals by EventId
SELECT
P.EventId,
TotalTime = SUM(DATEDIFF(MINUTE, P.StartTime, P.EndTime)) / 60.0
FROM Packed AS P
GROUP BY
P.EventId;Context
StackExchange Database Administrators Q#333229, answer score: 3
Revisions (0)
No revisions yet.