HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

How to group data within a time span

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
groupspantimewithinhowdata

Problem

I have a group of rows sorted by date as below:

2017-01-01 
2017-01-01
2017-01-02
2017-01-03
2017-01-04
2017-01-17
2017-01-18
2017-01-18
2017-01-18
2017-01-19
2017-01-19
2017-01-26
2017-01-27
2017-01-27


DDL & DML for those willing to help:

CREATE TABLE foo ( DateCol date );
INSERT INTO foo ( DateCol )
VALUES
  ( '2017-01-01' ),
  ( '2017-01-01' ),
  ( '2017-01-02' ),
  ( '2017-01-03' ),
  ( '2017-01-04' ),
  ( '2017-01-17' ),
  ( '2017-01-18' ),
  ( '2017-01-18' ),
  ( '2017-01-18' ),
  ( '2017-01-19' ),
  ( '2017-01-19' ),
  ( '2017-01-26' ),
  ( '2017-01-27' ),
  ( '2017-01-27' )
);


I am trying to group the rows and add an additional column to mark the group. The criterion is that any two rows should be in one group if the time gap between them is less than 3 days, something like below:

2017-01-01  A
2017-01-01  A
2017-01-02  A
2017-01-03  A
2017-01-04  A
2017-01-17  B
2017-01-18  B
2017-01-18  B
2017-01-18  B
2017-01-19  B
2017-01-19  B
2017-01-26  C
2017-01-27  C
2017-01-27  C


How can this be achieved?

Solution

Window Function Method

SELECT date,
  chr(
    65 + CAST(count(reset) OVER (ORDER BY date) AS int)
  ) AS dategrp
FROM (
  SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
  FROM foo
  ORDER BY date
) AS t


First we generate resets with a window function

SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
FROM foo
ORDER BY date
    date    | reset 
------------+-------
 2017-01-01 |      
 2017-01-01 |      
 2017-01-02 |      
 2017-01-03 |      
 2017-01-04 |      
 2017-01-17 |     1
 2017-01-18 |      
 2017-01-18 |      
 2017-01-18 |      
 2017-01-19 |      
 2017-01-19 |      
 2017-01-26 |     1
 2017-01-27 |      
 2017-01-27 |


Then we generate numeric groups

SELECT date, count(reset) OVER (ORDER BY date)
FROM (
  SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
  FROM foo
  ORDER BY date
) AS t
    date    | count 
------------+-------
 2017-01-01 |     0
 2017-01-01 |     0
 2017-01-02 |     0
 2017-01-03 |     0
 2017-01-04 |     0
 2017-01-17 |     1
 2017-01-18 |     1
 2017-01-18 |     1
 2017-01-18 |     1
 2017-01-19 |     1
 2017-01-19 |     1
 2017-01-26 |     2
 2017-01-27 |     2
 2017-01-27 |     2


Then we use chr(65+) to get the alpha group names you want. Original query at the top..

date    | dategrp 
------------+---------
 2017-01-01 | A
 2017-01-01 | A
 2017-01-02 | A
 2017-01-03 | A
 2017-01-04 | A
 2017-01-17 | B
 2017-01-18 | B
 2017-01-18 | B
 2017-01-18 | B
 2017-01-19 | B
 2017-01-19 | B
 2017-01-26 | C
 2017-01-27 | C
 2017-01-27 | C
(14 rows)

Code Snippets

SELECT date,
  chr(
    65 + CAST(count(reset) OVER (ORDER BY date) AS int)
  ) AS dategrp
FROM (
  SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
  FROM foo
  ORDER BY date
) AS t
SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
FROM foo
ORDER BY date
    date    | reset 
------------+-------
 2017-01-01 |      
 2017-01-01 |      
 2017-01-02 |      
 2017-01-03 |      
 2017-01-04 |      
 2017-01-17 |     1
 2017-01-18 |      
 2017-01-18 |      
 2017-01-18 |      
 2017-01-19 |      
 2017-01-19 |      
 2017-01-26 |     1
 2017-01-27 |      
 2017-01-27 |
SELECT date, count(reset) OVER (ORDER BY date)
FROM (
  SELECT date, CASE WHEN date - lag(date) OVER () > 3 THEN 1 END AS reset
  FROM foo
  ORDER BY date
) AS t
    date    | count 
------------+-------
 2017-01-01 |     0
 2017-01-01 |     0
 2017-01-02 |     0
 2017-01-03 |     0
 2017-01-04 |     0
 2017-01-17 |     1
 2017-01-18 |     1
 2017-01-18 |     1
 2017-01-18 |     1
 2017-01-19 |     1
 2017-01-19 |     1
 2017-01-26 |     2
 2017-01-27 |     2
 2017-01-27 |     2
date    | dategrp 
------------+---------
 2017-01-01 | A
 2017-01-01 | A
 2017-01-02 | A
 2017-01-03 | A
 2017-01-04 | A
 2017-01-17 | B
 2017-01-18 | B
 2017-01-18 | B
 2017-01-18 | B
 2017-01-19 | B
 2017-01-19 | B
 2017-01-26 | C
 2017-01-27 | C
 2017-01-27 | C
(14 rows)

Context

StackExchange Database Administrators Q#163819, answer score: 4

Revisions (0)

No revisions yet.