HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

MySQL - Merge or split datetime intervals (start date to end date)

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
mergedatemysqlintervalssplitstartenddatetime

Problem

I have a table which stores a list of activities with an interval of time delimited by 2 dates.

Sample:

+------+---------------------+---------------------+-------------+
| name |        start        |         end         | time (calc) |
+------+---------------------+---------------------+-------------+
|  me  | 2017-04-03 11:00:00 | 2017-04-03 11:30:00 |          30 |
|  me  | 2017-04-03 23:45:00 | 2017-04-04 00:15:00 |          30 |
|  me  | 2017-04-04 10:00:00 | 2017-04-04 11:00:00 |          60 |
|  me  | 2017-04-04 10:30:00 | 2017-04-04 11:30:00 |          60 |
|  me  | 2017-04-05 23:00:00 | 2017-04-05 23:30:00 |          30 |
|  me  | 2017-04-05 23:15:00 | 2017-04-07 00:45:00 |        1530 |
+------+---------------------+---------------------+-------------+


I would like to know how many minutes are occupied each day per user (and then per week), so I need to transform current table into where intervals which share a partial space time are merged in a single one, and intervals which are in several days are split, like the next one:

+------+---------------------+---------------------+-------------+
| name |        start        |         end         | time (calc) |
+------+---------------------+---------------------+-------------+
|  me  | 2017-04-03 11:00:00 | 2017-04-03 11:30:00 |          30 |
|  me  | 2017-04-03 23:45:00 | 2017-04-03 23:59:59 |          15 |
|  me  | 2017-04-04 00:00:00 | 2017-04-04 00:15:00 |          15 |
|  me  | 2017-04-04 10:00:00 | 2017-04-04 11:30:00 |          90 |
|  me  | 2017-04-05 23:00:00 | 2017-04-05 23:59:59 |          60 |
|  me  | 2017-04-06 00:00:00 | 2017-04-06 23:59:59 |        1440 |
|  me  | 2017-04-07 00:00:00 | 2017-04-07 00:45:00 |          45 |
+------+---------------------+---------------------+-------------+


To then query it easily in order to obtain minutes per day:

```
+------+------------+------+
| name | day | time |
+------+------------+------+
| me | 2017-04-03 | 45 |
| me

Solution

First due you need to generate a series of dates, I'd suggest to use a calendar table.

CREATE TABLE if not exists calendar (
    mdate date PRIMARY KEY NOT NULL
);

INSERT INTO calendar values
('20170403'),('20170404'),('20170405'),('20170406'),('20170407'),('20170408');


How do they do it

Just to get overlapped activities I've used the query that you provided on your question.

create view overlaped_activities
as
SELECT name, min(start) AS start, end, TIMESTAMPDIFF(MINUTE, MIN(start), end) AS time
FROM (
    SELECT x.name, x.start, min(y.end) AS end 
    FROM activities AS x 
    JOIN activities AS y 
        ON x.name = y.name 
       AND x.start = z.start 
             AND y.end  u.start 
          AND x.start <= u.start
    ) 
    GROUP BY x.name, x.start
) AS v GROUP BY name, end;


First I calculate minutes from start date till midnight:

if(date(start) = date(end), 
  time_to_sec(timediff(end, start)) / 60, 
  (1440 - time_to_sec(time(start)) / 60)) mstart


Then, if start <> end, I calculate minutes from midnight till end date:

if(date(start) = date(end), 0, time_to_sec(time(end)) / 60) mend


This returns a table like this:

| start               | end                 |     mdiff |  mstart |    mend |
|---------------------|---------------------|----------:|--------:|--------:|
| 03.04.2017 11:00:00 | 03.04.2017 11:30:00 |   30,0000 | 30,0000 |       0 |
| 03.04.2017 23:45:00 | 04.04.2017 00:15:00 |   30,0000 | 15,0000 | 15,0000 |
| 04.04.2017 10:00:00 | 04.04.2017 11:30:00 |   90,0000 | 90,0000 |       0 |
| 05.04.2017 23:00:00 | 07.04.2017 00:45:00 | 1545,0000 | 60,0000 | 45,0000 |


That's nice, but there is another problem here:

| 05.04.2017 23:00:00 | 07.04.2017 00:45:00 | 1545,0000 | 60,0000 | 45,0000 |


Of course: 1545 <> 60 + 45

We need to generate a series of dates between start and end date, and add 1440 minutes to each day.

We can get it using the calendar table:

select   name,
            mdate date_activity,
            sum(1440) minutes
   from     calendar
   join     overlaped_activities
   on       calendar.mdate > date(start)
   and      calendar.mdate  1
   group by name, mdate


Ok, we got all ingredients, it's time to cook the recipe:

select name, date_activity, sum(minutes) min_activity
from (
       select name, 
              date(start) date_activity,
              if(date(start) = date(end), time_to_sec(timediff(end, start)) / 60, (1440 - time_to_sec(time(start)) / 60)) minutes
       from overlaped_activities

       UNION ALL

       select name, 
              date(end) date_activity,
              if(date(start) = date(end), 0, time_to_sec(time(end)) / 60) minutes
       from overlaped_activities

       UNION ALL

       select   name,
                mdate date_activity,
                sum(1440) minutes
       from     calendar
       join     overlaped_activities
       on       calendar.mdate > date(start)
       and      calendar.mdate  1
       group by name, mdate
    ) act
group by name, date_activity;


Final result:

| name |       date_activity | min_activity |
|------|--------------------:|-------------:|
| me   | 03.04.2017 00:00:00 |      45,0000 |
| me   | 04.04.2017 00:00:00 |     105,0000 |
| me   | 05.04.2017 00:00:00 |      60,0000 |
| me   | 06.04.2017 00:00:00 |    1440,0000 |
| me   | 07.04.2017 00:00:00 |      45,0000 |


Almost forget it, the recipe: http://rextester.com/EIJOI20983

Code Snippets

CREATE TABLE if not exists calendar (
    mdate date PRIMARY KEY NOT NULL
);

INSERT INTO calendar values
('20170403'),('20170404'),('20170405'),('20170406'),('20170407'),('20170408');
create view overlaped_activities
as
SELECT name, min(start) AS start, end, TIMESTAMPDIFF(MINUTE, MIN(start), end) AS time
FROM (
    SELECT x.name, x.start, min(y.end) AS end 
    FROM activities AS x 
    JOIN activities AS y 
        ON x.name = y.name 
       AND x.start <= y.end 
       AND NOT EXISTS (
           SELECT 1 
           FROM activities AS z 
           WHERE y.name = z.name 
             AND y.end >= z.start 
             AND y.end < z.end
       ) 
    WHERE NOT EXISTS (
        SELECT 1 
        FROM activities AS u 
        WHERE x.name = u.name 
          AND x.start > u.start 
          AND x.start <= u.start
    ) 
    GROUP BY x.name, x.start
) AS v GROUP BY name, end;
if(date(start) = date(end), 
  time_to_sec(timediff(end, start)) / 60, 
  (1440 - time_to_sec(time(start)) / 60)) mstart
if(date(start) = date(end), 0, time_to_sec(time(end)) / 60) mend
| start               | end                 |     mdiff |  mstart |    mend |
|---------------------|---------------------|----------:|--------:|--------:|
| 03.04.2017 11:00:00 | 03.04.2017 11:30:00 |   30,0000 | 30,0000 |       0 |
| 03.04.2017 23:45:00 | 04.04.2017 00:15:00 |   30,0000 | 15,0000 | 15,0000 |
| 04.04.2017 10:00:00 | 04.04.2017 11:30:00 |   90,0000 | 90,0000 |       0 |
| 05.04.2017 23:00:00 | 07.04.2017 00:45:00 | 1545,0000 | 60,0000 | 45,0000 |

Context

StackExchange Database Administrators Q#169186, answer score: 4

Revisions (0)

No revisions yet.