HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Group Database Entries by time difference

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
grouptimedatabasedifferenceentries

Problem

I have a large postgreSQL database with log data. All this data has timestamps and I want to group consecutive rows where the difference between the timestamps is less then 1500 miliseconds for example.

For example:

1349427083272
1349427083669
1349427083707
1349427084277
1349427084787
1349427093471
1349427094031
1349427094307
1349427094980
1349427095879
1349427097211
1349437622947
1349437623813
1349437624316
1349437624815
1349437624938


Should result in the following groups:

1349427083272
1349427083669
1349427083707
1349427084277
1349427084787

1349427093471
1349427094031
1349427094307
1349427094980
1349427095879
1349427097211

1349437622947
1349437623813
1349437624316
1349437624815
1349437624938


The group identifier can just be a unique integer.

In MySQL I found a similar example which used lots of variables, but I have no clue how to to this in postgreSQL.
Could someone help me out?

Solution

With the comment from FrustratedWithFormsDesigner, I came to the following solution:

SELECT subq2.*, sum(new_group) OVER (ORDER BY t ASC) AS group_id
FROM (
  SELECT subq.*, CASE WHEN delta > 1500 THEN 1 ELSE 0 END AS new_group
  FROM (
    SELECT t, lag(t) over (ORDER BY t ASC),
      t - lag(t) over (ORDER BY t ASC) AS delta
    FROM time_points
  ) AS subq
) AS subq2


I am using the sum of all new_group values to create different groups.
Thank you very much!

Code Snippets

SELECT subq2.*, sum(new_group) OVER (ORDER BY t ASC) AS group_id
FROM (
  SELECT subq.*, CASE WHEN delta > 1500 THEN 1 ELSE 0 END AS new_group
  FROM (
    SELECT t, lag(t) over (ORDER BY t ASC),
      t - lag(t) over (ORDER BY t ASC) AS delta
    FROM time_points
  ) AS subq
) AS subq2

Context

StackExchange Database Administrators Q#41686, answer score: 5

Revisions (0)

No revisions yet.