HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Mysql select rows with 7 continuous dates

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
rowswithdatesmysqlselectcontinuous

Problem

Imagine a simple table with these columns:
item_id, date

And values:

CREATE TABLE foo (item_id int, date date);

INSERT INTO foo(item_id, date)
VALUES
    ( 1, '2017-02-10' ),
    ( 2, '2017-02-10' ),
    ( 1, '2017-02-11' ),
    ( 1, '2017-02-12' ),
    ( 1, '2017-02-13' ),
    ( 2, '2017-02-13' ),
    ( 1, '2017-02-14' );


How to select item_ids that have 7 continuous day records in table?

Start and end dates are unknown. It should just be available from any starting date up to 7 continuous days after.

Solution

MySQL 8

MySQL 8 provides window functions...

SELECT item_id
FROM (
  SELECT
    item_id,
    date,
    count(coalesce(diff, 1)=1 OR null) OVER (PARTITION BY item_id ORDER BY date) seq
  FROM (
    SELECT
      item_id,
      date,
      date - lag(date) OVER (PARTITION BY item_id ORDER BY date) AS diff
    FROM foo
  ) AS t
) AS t2
GROUP BY item_id
HAVING max(seq) > 7;


Explanation

This is what we're doing on the inner.

SELECT
  item_id,
  date,
  date - lag(date) OVER (PARTITION BY item_id ORDER BY date) AS diff
FROM foo

 item_id |    date    | diff 
---------+------------+------
       1 | 2017-02-10 |     
       1 | 2017-02-11 |    1
       1 | 2017-02-12 |    1
       1 | 2017-02-13 |    1
       1 | 2017-02-14 |    1
       2 | 2017-02-10 |     
       2 | 2017-02-13 |    3
(7 rows)


Here we return the differences. What we need to now do is isolate the ones where the date difference is 1. We assume here if the result of the difference is null it is because there was no previous date to subtract, so we set it to 1. Then we if we don't have a 1, we set the value to null so count() skips it.

SELECT
  item_id,
  date,
  count(coalesce(diff, 1)=1 OR null) OVER (PARTITION BY item_id ORDER BY date) seq
FROM (
  SELECT
    item_id,
    date,
    date - lag(date) OVER (PARTITION BY item_id ORDER BY date) AS diff
  FROM foo
) AS t; 

 item_id |    date    | seq 
---------+------------+-----
       1 | 2017-02-10 |   1
       1 | 2017-02-11 |   2
       1 | 2017-02-12 |   3
       1 | 2017-02-13 |   4
       1 | 2017-02-14 |   5
       2 | 2017-02-10 |   1
       2 | 2017-02-13 |   1
(7 rows)


From this point, it's just a GROUP BY and HAVING.

This was tested in PostgreSQL because MySQL 8 wasn't out yet. If you haven't used PostgreSQL, download it for free and check it out. It's like MySQL but better in every single way.

Code Snippets

SELECT item_id
FROM (
  SELECT
    item_id,
    date,
    count(coalesce(diff, 1)=1 OR null) OVER (PARTITION BY item_id ORDER BY date) seq
  FROM (
    SELECT
      item_id,
      date,
      date - lag(date) OVER (PARTITION BY item_id ORDER BY date) AS diff
    FROM foo
  ) AS t
) AS t2
GROUP BY item_id
HAVING max(seq) > 7;
SELECT
  item_id,
  date,
  date - lag(date) OVER (PARTITION BY item_id ORDER BY date) AS diff
FROM foo

 item_id |    date    | diff 
---------+------------+------
       1 | 2017-02-10 |     
       1 | 2017-02-11 |    1
       1 | 2017-02-12 |    1
       1 | 2017-02-13 |    1
       1 | 2017-02-14 |    1
       2 | 2017-02-10 |     
       2 | 2017-02-13 |    3
(7 rows)
SELECT
  item_id,
  date,
  count(coalesce(diff, 1)=1 OR null) OVER (PARTITION BY item_id ORDER BY date) seq
FROM (
  SELECT
    item_id,
    date,
    date - lag(date) OVER (PARTITION BY item_id ORDER BY date) AS diff
  FROM foo
) AS t; 

 item_id |    date    | seq 
---------+------------+-----
       1 | 2017-02-10 |   1
       1 | 2017-02-11 |   2
       1 | 2017-02-12 |   3
       1 | 2017-02-13 |   4
       1 | 2017-02-14 |   5
       2 | 2017-02-10 |   1
       2 | 2017-02-13 |   1
(7 rows)

Context

StackExchange Database Administrators Q#182973, answer score: 5

Revisions (0)

No revisions yet.