HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Select telemetry data based on relational data in PostgreSQL/TimescaleDB

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqltimescaledbtelemetryrelationalbasedselectdata

Problem

I'm trying to solve a particularly difficult problem. I am storing some telemetry data from some sensors in an SQL table (PostgreSQL) and I want to know how I can I write a query that will group the telemetry data using relational information from two other tables.

I have one table which stores the telemetry data from the sensors. This table contains three fields, one for the timestamp, one for the sensor ID, one for the value of the sensor at that time. The value column is an incrementing count (it only increases and never resets)

Sensor_Telemetry table

timestamp
sensor_id
value

2022-01-01 00:00:00
5
3

2022-01-01 00:00:01
5
5

2022-01-01 00:00:02
5
6

...
...
...

2022-01-01 01:00:00
5
675

I have another table which stores the state of the sensor, whether it was stationary or in motion and the start/end dates of that particular state for that sensor:

Status table

start_date
end_date
status
sensor_id

2022-01-01 00:00:00
2022-01-01 00:20:00
in_motion
5

2022-01-01 00:20:00
2022-01-01 00:40:00
stationary
5

2022-01-01 00:40:00
2022-01-01 01:00:00
in_motion
5

...
...
...
...

The sensor is located at a particular location. The Sensor table stores this metadata:

Sensor table

sensor_id
location_id

5
16

In the final table, I have the shifts that occur in each location. The shift table is a list of occurrences of all shifts, ie in this case Shift A is defined to occur every day between 00:00:00 and 00:30:00, Shift B is defined to occur every day between 00:30:00 and 01:00:00. So then the Shift table would have records like so:

Shift table

shift
location_id
occurrence_id
start_date
end_date

A Shift
16
123
2022-01-01 00:00:00
2022-01-01 00:30:00

B Shift
16
124
2022-01-01 00:30:00
2022-01-01 01:00:00

A Shift
16
123
2022-01-02 00:00:00
2022-01-02 00:30:00

B Shift
16
124
2022-01-02 00:30:00
2022-01-02 01:00:00

...
...
...
...
...

I want to write a query so that I can retrieve telemetry data that is grouped both by the shifts at the location of the sensor a

Solution

Tested in dbfiddle.uk:

select distinct
    sensor . sensor_id,
    shift  . shift,
    shift  . occurence_id,
    status . status,
    status . status_id,
    MIN(st.datetime) OVER w as start_date,
    MAX(st.datetime) OVER w as end_date,
    FIRST_VALUE(st.reading) OVER w as reading_start,
    LAST_VALUE(st.reading) OVER w as reading_end,
    MIN(st.reading) OVER w as reading_min,
    MAX(st.reading) OVER w as reading_max
from 
    status
  join sensor
           using (sensor_id)
  join shift
           on sensor.location_id = shift.location_id
          and (status.start_date, status.end_date) OVERLAPS 
              (shift.start_date, shift.end_date)
  join sensor_telemetry as st
           on st.sensor_id = sensor.sensor_id
          and tstzrange( 
                  GREATEST(status.start_date, shift.start_date), 
                  LEAST(status.end_date,   shift.end_date) 
              ) @> st.datetime
where 
    status.sensor_id = 5 
window w as
( partition by 
    sensor . sensor_id, 
    shift  . shift, 
    shift  . occurence_id, 
    status . status, 
    status . status_id
  order by 
    st.datetime
  rows between unbounded preceding 
           and unbounded following
)
order by 
    start_date ;


And a bit simpler answer if we only require the MIN/MAX of the readings, not the first/last (or as the OP says we know that they always coincide): https://dbfiddle.uk/hbIcLpNn

Code Snippets

select distinct
    sensor . sensor_id,
    shift  . shift,
    shift  . occurence_id,
    status . status,
    status . status_id,
    MIN(st.datetime) OVER w as start_date,
    MAX(st.datetime) OVER w as end_date,
    FIRST_VALUE(st.reading) OVER w as reading_start,
    LAST_VALUE(st.reading) OVER w as reading_end,
    MIN(st.reading) OVER w as reading_min,
    MAX(st.reading) OVER w as reading_max
from 
    status
  join sensor
           using (sensor_id)
  join shift
           on sensor.location_id = shift.location_id
          and (status.start_date, status.end_date) OVERLAPS 
              (shift.start_date, shift.end_date)
  join sensor_telemetry as st
           on st.sensor_id = sensor.sensor_id
          and tstzrange( 
                  GREATEST(status.start_date, shift.start_date), 
                  LEAST(status.end_date,   shift.end_date) 
              ) @> st.datetime
where 
    status.sensor_id = 5 
window w as
( partition by 
    sensor . sensor_id, 
    shift  . shift, 
    shift  . occurence_id, 
    status . status, 
    status . status_id
  order by 
    st.datetime
  rows between unbounded preceding 
           and unbounded following
)
order by 
    start_date ;

Context

StackExchange Database Administrators Q#317176, answer score: 4

Revisions (0)

No revisions yet.