HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Exclude rows with consecutive data that may repeat

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
rowsmayrepeatwiththatdataexcludeconsecutive

Problem

I have a table showing statuses (0 - broken, 1 - live) for a particular device at a given timestamp:

device_owner   device_id    timestamp          status
owner1         device_1     2001-01-01 09:00   0    -- leave this
owner1         device_1     2001-01-01 09:05   0    -- exclude
owner1         device_1     2001-01-01 09:10   0    -- exclude
owner1         device_2     2001-01-01 09:15   0    -- leave this
owner1         device_1     2001-01-01 09:30   1    
owner1         device_2     2001-01-01 09:35   0    -- exclude
owner1         device_2     2001-01-01 09:45   1
owner1         device_1     2001-01-01 09:55   0    --leave this
owner1         device_1     2001-01-01 10:10   0    --exclude
owner1         device_1     2001-01-01 10:11   1


and so on.

I need to exclude consecutive zero statuses, and leave the one with earliest datetime. (So that I could define datetime difference between the device broken and live.)

The result:

device_owner   device_id    timestamp          status
owner1         device_1     2001-01-01 09:00   0    
owner1         device_2     2001-01-01 09:15   0    
owner1         device_1     2001-01-01 09:30   1
owner1         device_2     2001-01-01 09:45   1
owner1         device_1     2001-01-01 09:55   0    
owner1         device_1     2001-01-01 10:11   1


and so on.

Then in my further calculations, total breakdown for device_1 would be between 09:00 and 09:30, 09:55 and 10:11, for device_2 – 09:15—09:45.

DDL:

```
DECLARE @t TABLE
(
device_owner VARCHAR(10),
device_id VARCHAR(10),
timestamp DATETIME,
status BIT
);

INSERT @t ( device_owner, device_id, timestamp, status )
SELECT *
FROM
(
VALUES ('owner1', 'device_1', '2001-01-01 09:00', 0), -- leave this
('owner1', 'device_1', '2001-01-01 09:05', 0), -- exclude
('owner1', 'device_1', '2001-01-01 09:10', 0), -- exclude
('owner1', 'device_2', '2001-01-01 09:15', 0),

Solution

This seems to be similar to a solution from Itzik Ben-Gan to gaps and islands

SELECT [device_owner], [device_id],MIN(timestamp) as timestamp,status

FROM
(
SELECT *
   ,diff = ROW_NUMBER()OVER(PARTITION BY device_owner,device_id ORDER BY timestamp ASC) 
         - ROW_NUMBER()OVER(PARTITION BY device_owner,device_id , status ORDER BY timestamp ASC) 
FROM @t
)A
--WhERE status = 0
GROUP BY [device_owner], [device_id],status , diff
ORDER BY timestamp


output of it:

device_owner    device_id   timestamp           status
owner1          device_1    01/01/2001 09:00:00 False
owner1          device_2    01/01/2001 09:15:00 False
owner1          device_1    01/01/2001 09:30:00 True
owner1          device_2    01/01/2001 09:45:00 True
owner1          device_1    01/01/2001 09:55:00 False
owner1          device_1    01/01/2001 10:11:00 True


dbfiddle

Code Snippets

SELECT [device_owner], [device_id],MIN(timestamp) as timestamp,status

FROM
(
SELECT *
   ,diff = ROW_NUMBER()OVER(PARTITION BY device_owner,device_id ORDER BY timestamp ASC) 
         - ROW_NUMBER()OVER(PARTITION BY device_owner,device_id , status ORDER BY timestamp ASC) 
FROM @t
)A
--WhERE status = 0
GROUP BY [device_owner], [device_id],status , diff
ORDER BY timestamp
device_owner    device_id   timestamp           status
owner1          device_1    01/01/2001 09:00:00 False
owner1          device_2    01/01/2001 09:15:00 False
owner1          device_1    01/01/2001 09:30:00 True
owner1          device_2    01/01/2001 09:45:00 True
owner1          device_1    01/01/2001 09:55:00 False
owner1          device_1    01/01/2001 10:11:00 True

Context

StackExchange Database Administrators Q#211066, answer score: 5

Revisions (0)

No revisions yet.