patternsqlMinor
PostgreSQL window functions very slow on big table
Viewed 0 times
postgresqlfunctionsslowbigverywindowtable
Problem
I have a PostgreSQL table constructed as
I need to select the last 2 records ordered by
Any ideas?
device_id | point | dt_edit
----------+-----------+----------
100 | geometry | timestamp
101 | geometry | timestamp
100 | geometry | timestamp
102 | geometry | timestamp
102 | geometry | timestamp
101 | geometry | timestampI need to select the last 2 records ordered by
dt_edit from unique device_id. This query works very slow, on 1 billion records - 500 seconds:SELECT dt.device_id,
dt.point,
dt.dt_edit
FROM ( SELECT gps_data.device_id,
gps_data.point,
gps_data.dt_edit,
rank() OVER (PARTITION BY gps_data.device_id
ORDER BY gps_data.dt_edit DESC) AS rank
FROM gps_data) dt
WHERE dt.rank <= 2Any ideas?
Solution
If we can assume a
The query can be very simple:
Assuming a much smaller cardinality in table
If
All you need for this to be fast is a multicolumn index;
The
If you don't have a device table, I would suggest to create one (and keep it current).
But there are ways to make this fast even without
Details:
device table holding all devices of interest. Example:CREATE TABLE device (device_id int, device text);
INSERT INTO device (device_id, device) VALUES
(100, 'a')
, (101, 'b')
, (102, 'c');The query can be very simple:
SELECT d.device_id, g.point, g.dt_edit
FROM device d
, LATERAL (
SELECT point, dt_edit
FROM gps_data
WHERE device_id = d.device_id
ORDER BY dt_edit DESC NULLS LAST -- more items?
LIMIT 2
) g;Assuming a much smaller cardinality in table
device than in table gps_data, i.e. many rows per device in the latter.If
dt_edit is not UNIQUE, add more items to ORDER BY to disambiguate the sort order.All you need for this to be fast is a multicolumn index;
CREATE INDEX gps_data_foo_idx
ON gps_data (device_id, dt_edit DESC NULLS LAST); -- more items?The
NULLS LAST modifier is useful if the column can be NULL, but does not hurt in any case. Either way, the query must match the index.- PostgreSQL sort by datetime asc, null first?
If you don't have a device table, I would suggest to create one (and keep it current).
But there are ways to make this fast even without
device table ...Details:
- Optimize GROUP BY query to retrieve latest record per user
- How do I efficiently get "the most recent corresponding row"?
- SELECT DISTINCT on multiple columns
Code Snippets
CREATE TABLE device (device_id int, device text);
INSERT INTO device (device_id, device) VALUES
(100, 'a')
, (101, 'b')
, (102, 'c');SELECT d.device_id, g.point, g.dt_edit
FROM device d
, LATERAL (
SELECT point, dt_edit
FROM gps_data
WHERE device_id = d.device_id
ORDER BY dt_edit DESC NULLS LAST -- more items?
LIMIT 2
) g;CREATE INDEX gps_data_foo_idx
ON gps_data (device_id, dt_edit DESC NULLS LAST); -- more items?Context
StackExchange Database Administrators Q#106987, answer score: 4
Revisions (0)
No revisions yet.