patternsqlMinor
identify gaps and islands
Viewed 0 times
andidentifyislandsgaps
Problem
I have a database that has the following structure
I would like to generate a report like the following:
I should add that it is not critical that the stop record is the first one where the value = 0, it could be the last record where the value = 1
How might I easily create a well performing query that would give me the information I need? I am using Python for the app, so I would not need the SQL to output the report in this exact form, just get me close with the guts of the information.
+---------------------+------------+---------------+
| timestamp | EngOilP_sd | CompOilLVL_sd |
+---------------------+------------+---------------+
| 2015-06-24 20:28:07 | 0 | 0 |
| 2015-06-24 20:30:20 | 1 | 0 |
| 2015-06-24 20:36:47 | 1 | 0 |
| 2015-06-24 20:41:11 | 1 | 0 |
| 2015-06-24 20:43:29 | 1 | 0 |
| 2015-06-24 20:45:42 | 0 | 0 |
| 2015-06-24 20:47:51 | 0 | 0 |
| 2015-06-24 20:49:59 | 0 | 1 |
| 2015-06-24 20:52:01 | 0 | 1 |
| 2015-06-24 20:54:17 | 0 | 0 |
+---------------------+------------+---------------+I would like to generate a report like the following:
sd duration start stop
EngOilP_sd 15min 22s 2015-06-24 20:30:20 2015-06-24 20:45:42
CompOilLVL_sd 4min 18s 2015-06-24 20:49:59 2015-06-24 20:54:17I should add that it is not critical that the stop record is the first one where the value = 0, it could be the last record where the value = 1
How might I easily create a well performing query that would give me the information I need? I am using Python for the app, so I would not need the SQL to output the report in this exact form, just get me close with the guts of the information.
Solution
There's probably something more elegant than this, but since you are doing a kind of pivot (transforming from columns to rows, i.e. metadata to data) it is never going to be pretty. Assuming a table like:
Since we are going to reference the "pivot" several times we create a view for it:
The start time (using @Pauls comment as the definition) is the earliest time where each attribute is 1:
The stop time is the earliest time after that where each attribute is 0:
Adding another level of nesting makes it convenient to determine the duration:
create table t
( ts timestamp not null
, EngOilP_sd smallint not null
, CompOilLVL_sd smallint not null
) engine = innodb;
insert into t (ts, EngOilP_sd, CompOilLVL_sd)
values
('2015-06-24 20:28:07',0 ,0 )
,('2015-06-24 20:30:20',1 ,0 )
,('2015-06-24 20:36:47',1 ,0 )
,('2015-06-24 20:41:11',1 ,0 )
,('2015-06-24 20:43:29',1 ,0 )
,('2015-06-24 20:45:42',0 ,0 )
,('2015-06-24 20:47:51',0 ,0 )
,('2015-06-24 20:49:59',0 ,1 )
,('2015-06-24 20:52:01',0 ,1 )
,('2015-06-24 20:54:17',0 ,0 );Since we are going to reference the "pivot" several times we create a view for it:
create view t_pivot as
select 'EngOilP_sd' as sd, ts, EngOilP_sd as val from t
union all
select 'CompOilLVL_sd', ts, CompOilLVL_sd from t;The start time (using @Pauls comment as the definition) is the earliest time where each attribute is 1:
select sd, min(case val when 1 then ts end) as start
from t_pivot
group by sdThe stop time is the earliest time after that where each attribute is 0:
select x.sd, x.start, min(case y.val when 0 then y.ts end) as stop
from (
select sd, min(case val when 1 then ts end) as start
from t_pivot
group by sd
) as x
join t_pivot as y
on x.sd = y.sd
and y.ts > x.start
group by x.sd, x.startAdding another level of nesting makes it convenient to determine the duration:
select sd, timediff(stop, start) as duration, start, stop
from (
select x.sd, x.start, min(case y.val when 0 then y.ts end) as stop
from (
select sd, min(case val when 1 then ts end) as start
from t_pivot
group by sd
) as x
join t_pivot as y
on x.sd = y.sd
and y.ts > x.start
group by x.sd, x.start
) as z
order by sd desc;
+---------------+----------+---------------------+---------------------+
| sd | duration | start | stop |
+---------------+----------+---------------------+---------------------+
| EngOilP_sd | 00:15:22 | 2015-06-24 20:30:20 | 2015-06-24 20:45:42 |
| CompOilLVL_sd | 00:04:18 | 2015-06-24 20:49:59 | 2015-06-24 20:54:17 |
+---------------+----------+---------------------+---------------------+
2 rows in set (0.00 sec)Code Snippets
create table t
( ts timestamp not null
, EngOilP_sd smallint not null
, CompOilLVL_sd smallint not null
) engine = innodb;
insert into t (ts, EngOilP_sd, CompOilLVL_sd)
values
('2015-06-24 20:28:07',0 ,0 )
,('2015-06-24 20:30:20',1 ,0 )
,('2015-06-24 20:36:47',1 ,0 )
,('2015-06-24 20:41:11',1 ,0 )
,('2015-06-24 20:43:29',1 ,0 )
,('2015-06-24 20:45:42',0 ,0 )
,('2015-06-24 20:47:51',0 ,0 )
,('2015-06-24 20:49:59',0 ,1 )
,('2015-06-24 20:52:01',0 ,1 )
,('2015-06-24 20:54:17',0 ,0 );create view t_pivot as
select 'EngOilP_sd' as sd, ts, EngOilP_sd as val from t
union all
select 'CompOilLVL_sd', ts, CompOilLVL_sd from t;select sd, min(case val when 1 then ts end) as start
from t_pivot
group by sdselect x.sd, x.start, min(case y.val when 0 then y.ts end) as stop
from (
select sd, min(case val when 1 then ts end) as start
from t_pivot
group by sd
) as x
join t_pivot as y
on x.sd = y.sd
and y.ts > x.start
group by x.sd, x.startselect sd, timediff(stop, start) as duration, start, stop
from (
select x.sd, x.start, min(case y.val when 0 then y.ts end) as stop
from (
select sd, min(case val when 1 then ts end) as start
from t_pivot
group by sd
) as x
join t_pivot as y
on x.sd = y.sd
and y.ts > x.start
group by x.sd, x.start
) as z
order by sd desc;
+---------------+----------+---------------------+---------------------+
| sd | duration | start | stop |
+---------------+----------+---------------------+---------------------+
| EngOilP_sd | 00:15:22 | 2015-06-24 20:30:20 | 2015-06-24 20:45:42 |
| CompOilLVL_sd | 00:04:18 | 2015-06-24 20:49:59 | 2015-06-24 20:54:17 |
+---------------+----------+---------------------+---------------------+
2 rows in set (0.00 sec)Context
StackExchange Database Administrators Q#112178, answer score: 3
Revisions (0)
No revisions yet.