snippetsqlMinor
How to sample records by time
Viewed 0 times
recordssampletimehow
Problem
So I've got a table in SQL Server 2014 that stores events. Events are generated no faster than once per 10 seconds, but can be much slower (minutes or hours between events). The time between events is not uniform.
I'd like to sample the records by time.
For instance, I'd like to ask for a 10 minute sampling - so it would grab record 1, and the next record should be one that is at least 10 minutes later, and then the next is at least 10 minutes later than row 2, and so on.
I can achieve this with an iterative approach, looping over each record, but I'd like to let the database do the work. The problem is, I don't know how to do this using T-SQL set/window statements.
Can anyone help me out?
Example data:
Desired result:
I'd like to sample the records by time.
For instance, I'd like to ask for a 10 minute sampling - so it would grab record 1, and the next record should be one that is at least 10 minutes later, and then the next is at least 10 minutes later than row 2, and so on.
I can achieve this with an iterative approach, looping over each record, but I'd like to let the database do the work. The problem is, I don't know how to do this using T-SQL set/window statements.
Can anyone help me out?
Example data:
EventID | RecordTime
--------+--------------------
1 | 2017-04-01 12:00:00
2 | 2017-04-01 12:00:10
3 | 2017-04-01 12:00:20
4 | 2017-04-01 12:00:32
5 | 2017-04-01 12:05:42
6 | 2017-04-01 12:09:00
7 | 2017-04-01 12:24:12
8 | 2017-04-01 12:36:46
9 | 2017-04-01 12:36:57
10 | 2017-04-01 15:00:00Desired result:
EventID | RecordTime
--------+--------------------
1 | 2017-04-01 12:00:00
7 | 2017-04-01 12:24:12
8 | 2017-04-01 12:36:46
10 | 2017-04-01 15:00:00- EventIDs 2-6 are all within 10 minutes of record 1, so I don't want them.
- EventID 7 is 24 minutes 12 seconds later than the last record in my result set (EventID 1), so I do want it.
- EventID 8 is 12 minutes 34 seconds later than the last record in my result set (EventID 7), so I do want it.
- EventID 9 is 11 seconds later than the last record in my result set (EventID 8), so I do not want it.
- EventID 10 is apx. 2.5 hours later than the last record in my result set (EventID 9), so I do want it.
Solution
If you don't want event at
In other words, there has to be gap of more than 10 minutes between two events to add a row to the result set.
Sample data
Query
Result
12:10:01, then answer is very simple - use LEAD or LAG to compare timestamps of the two consecutive rows. Just be aware that if you have a long sequence of events in which each pair of events is less than 10 minutes apart, then the query will return only the first event of this sequence.In other words, there has to be gap of more than 10 minutes between two events to add a row to the result set.
Sample data
DECLARE @T TABLE(EventID int, RecordTime datetime2(0));
INSERT INTO @T (EventID, RecordTime) VALUES
( 1, '2017-04-01 12:00:00'),
( 2, '2017-04-01 12:00:10'),
( 3, '2017-04-01 12:00:20'),
( 4, '2017-04-01 12:00:32'),
( 5, '2017-04-01 12:05:42'),
( 6, '2017-04-01 12:09:00'),
( 7, '2017-04-01 12:24:12'),
( 8, '2017-04-01 12:36:46'),
( 9, '2017-04-01 12:36:57'),
(10, '2017-04-01 15:00:00');Query
WITH
CTE
AS
(
SELECT
EventID
,RecordTime
,LAG(RecordTime) OVER (ORDER BY EventID) AS PrevRecordTime
FROM @T
)
SELECT
EventID
,RecordTime
FROM
CTE
CROSS APPLY
(
SELECT DATEDIFF(second, PrevRecordTime, RecordTime) AS RecordDiffSeconds
) AS CA
WHERE
RecordDiffSeconds IS NULL
OR RecordDiffSeconds > 600
ORDER BY EventID;Result
+---------+---------------------+
| EventID | RecordTime |
+---------+---------------------+
| 1 | 2017-04-01 12:00:00 |
| 7 | 2017-04-01 12:24:12 |
| 8 | 2017-04-01 12:36:46 |
| 10 | 2017-04-01 15:00:00 |
+---------+---------------------+Code Snippets
DECLARE @T TABLE(EventID int, RecordTime datetime2(0));
INSERT INTO @T (EventID, RecordTime) VALUES
( 1, '2017-04-01 12:00:00'),
( 2, '2017-04-01 12:00:10'),
( 3, '2017-04-01 12:00:20'),
( 4, '2017-04-01 12:00:32'),
( 5, '2017-04-01 12:05:42'),
( 6, '2017-04-01 12:09:00'),
( 7, '2017-04-01 12:24:12'),
( 8, '2017-04-01 12:36:46'),
( 9, '2017-04-01 12:36:57'),
(10, '2017-04-01 15:00:00');WITH
CTE
AS
(
SELECT
EventID
,RecordTime
,LAG(RecordTime) OVER (ORDER BY EventID) AS PrevRecordTime
FROM @T
)
SELECT
EventID
,RecordTime
FROM
CTE
CROSS APPLY
(
SELECT DATEDIFF(second, PrevRecordTime, RecordTime) AS RecordDiffSeconds
) AS CA
WHERE
RecordDiffSeconds IS NULL
OR RecordDiffSeconds > 600
ORDER BY EventID;+---------+---------------------+
| EventID | RecordTime |
+---------+---------------------+
| 1 | 2017-04-01 12:00:00 |
| 7 | 2017-04-01 12:24:12 |
| 8 | 2017-04-01 12:36:46 |
| 10 | 2017-04-01 15:00:00 |
+---------+---------------------+Context
StackExchange Database Administrators Q#168881, answer score: 4
Revisions (0)
No revisions yet.