snippetMinor
How to group data within a time span
Viewed 0 times
groupspantimewithinhowdata
Problem
I have a group of rows sorted by date as below:
DDL & DML for those willing to help:
I am trying to group the rows and add an additional column to mark the group. The criterion is that any two rows should be in one group if the time gap between them is less than 3 days, something like below:
How can this be achieved?
2017-01-01
2017-01-01
2017-01-02
2017-01-03
2017-01-04
2017-01-17
2017-01-18
2017-01-18
2017-01-18
2017-01-19
2017-01-19
2017-01-26
2017-01-27
2017-01-27DDL & DML for those willing to help:
CREATE TABLE foo ( DateCol date );
INSERT INTO foo ( DateCol )
VALUES
( '2017-01-01' ),
( '2017-01-01' ),
( '2017-01-02' ),
( '2017-01-03' ),
( '2017-01-04' ),
( '2017-01-17' ),
( '2017-01-18' ),
( '2017-01-18' ),
( '2017-01-18' ),
( '2017-01-19' ),
( '2017-01-19' ),
( '2017-01-26' ),
( '2017-01-27' ),
( '2017-01-27' )
);I am trying to group the rows and add an additional column to mark the group. The criterion is that any two rows should be in one group if the time gap between them is less than 3 days, something like below:
2017-01-01 A
2017-01-01 A
2017-01-02 A
2017-01-03 A
2017-01-04 A
2017-01-17 B
2017-01-18 B
2017-01-18 B
2017-01-18 B
2017-01-19 B
2017-01-19 B
2017-01-26 C
2017-01-27 C
2017-01-27 CHow can this be achieved?
Solution
Window Function Method
First we generate resets with a window function
Then we generate numeric groups
Then we use
SELECT date,
chr(
65 + CAST(count(reset) OVER (ORDER BY date) AS int)
) AS dategrp
FROM (
SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
FROM foo
ORDER BY date
) AS tFirst we generate resets with a window function
SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
FROM foo
ORDER BY date
date | reset
------------+-------
2017-01-01 |
2017-01-01 |
2017-01-02 |
2017-01-03 |
2017-01-04 |
2017-01-17 | 1
2017-01-18 |
2017-01-18 |
2017-01-18 |
2017-01-19 |
2017-01-19 |
2017-01-26 | 1
2017-01-27 |
2017-01-27 |Then we generate numeric groups
SELECT date, count(reset) OVER (ORDER BY date)
FROM (
SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
FROM foo
ORDER BY date
) AS t
date | count
------------+-------
2017-01-01 | 0
2017-01-01 | 0
2017-01-02 | 0
2017-01-03 | 0
2017-01-04 | 0
2017-01-17 | 1
2017-01-18 | 1
2017-01-18 | 1
2017-01-18 | 1
2017-01-19 | 1
2017-01-19 | 1
2017-01-26 | 2
2017-01-27 | 2
2017-01-27 | 2Then we use
chr(65+) to get the alpha group names you want. Original query at the top..date | dategrp
------------+---------
2017-01-01 | A
2017-01-01 | A
2017-01-02 | A
2017-01-03 | A
2017-01-04 | A
2017-01-17 | B
2017-01-18 | B
2017-01-18 | B
2017-01-18 | B
2017-01-19 | B
2017-01-19 | B
2017-01-26 | C
2017-01-27 | C
2017-01-27 | C
(14 rows)Code Snippets
SELECT date,
chr(
65 + CAST(count(reset) OVER (ORDER BY date) AS int)
) AS dategrp
FROM (
SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
FROM foo
ORDER BY date
) AS tSELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
FROM foo
ORDER BY date
date | reset
------------+-------
2017-01-01 |
2017-01-01 |
2017-01-02 |
2017-01-03 |
2017-01-04 |
2017-01-17 | 1
2017-01-18 |
2017-01-18 |
2017-01-18 |
2017-01-19 |
2017-01-19 |
2017-01-26 | 1
2017-01-27 |
2017-01-27 |SELECT date, count(reset) OVER (ORDER BY date)
FROM (
SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
FROM foo
ORDER BY date
) AS t
date | count
------------+-------
2017-01-01 | 0
2017-01-01 | 0
2017-01-02 | 0
2017-01-03 | 0
2017-01-04 | 0
2017-01-17 | 1
2017-01-18 | 1
2017-01-18 | 1
2017-01-18 | 1
2017-01-19 | 1
2017-01-19 | 1
2017-01-26 | 2
2017-01-27 | 2
2017-01-27 | 2date | dategrp
------------+---------
2017-01-01 | A
2017-01-01 | A
2017-01-02 | A
2017-01-03 | A
2017-01-04 | A
2017-01-17 | B
2017-01-18 | B
2017-01-18 | B
2017-01-18 | B
2017-01-19 | B
2017-01-19 | B
2017-01-26 | C
2017-01-27 | C
2017-01-27 | C
(14 rows)Context
StackExchange Database Administrators Q#163819, answer score: 4
Revisions (0)
No revisions yet.