snippetsqlMinor
Postgres choosing a filter instead of index cond when OR is involved
Viewed 0 times
condpostgresinsteadfilterwheninvolvedindexchoosing
Problem
I have a table that daily gets about 20 million records added to it, and I'm trying to paginate through it to let people get access to all the data in it, but the query time has to be "decent" (in my case defined as less than 30s/query).
To do this I've been using keyset pagination in the past but for this particular query and table I've gotten some really slow query times, and it seems to be because the query planner decides to filter out a day's worth of data and then run a filter on it instead of an index condition scan.
The table looks like this:
The column I'm trying to paginate by is the
The table is also partitioned by quarter, but at the moment I'm only operating on the data for a single partition.
Fast query
Which gives the following query plan (not
To do this I've been using keyset pagination in the past but for this particular query and table I've gotten some really slow query times, and it seems to be because the query planner decides to filter out a day's worth of data and then run a filter on it instead of an index condition scan.
The table looks like this:
create table mmsi_positions_archive
(
id bigserial not null
constraint mmsi_positions_archive_pkey
primary key,
position_id uuid,
previous_id uuid,
mmsi bigint not null,
collection_type varchar not null,
accuracy numeric,
maneuver numeric,
rate_of_turn numeric,
status integer,
speed numeric,
course numeric,
heading numeric,
position geometry(Point,4326),
timestamp timestamp with time zone not null,
updated_at timestamp with time zone default now(),
created_at timestamp with time zone default now()
);
create index ix_mmsi_positions_archive_mmsi
on mmsi_positions_archive (mmsi);
create index ix_mmsi_positions_archive_position_id
on mmsi_positions_archive (position_id);
create index ix_mmsi_positions_archive_timestamp_mmsi_id_asc
on mmsi_positions_archive (timestamp, id);The column I'm trying to paginate by is the
timestamp and id, to help out I've also updated the table statistics target for timestamp and set it to the max value of 10 000 and analyzed the table.The table is also partitioned by quarter, but at the moment I'm only operating on the data for a single partition.
Fast query
SELECT id
FROM mmsi_positions_archive
WHERE timestamp > '2019-03-10 00:00:00.000000+00:00'
AND timestamp <= '2019-03-11 00:00:00+00:00'
ORDER BY timestamp, id
LIMIT 100Which gives the following query plan (not
Solution
Fortunately this is very simple in PostgreSQL as it supports comparisons between “row values” (or compound values) that can make use of an index.
So you can write:
The comparison of such row values is lexicographical, that is exactly like you want it.
Here is a documentation link for that feature.
So you can write:
WHERE (timestamp, id) > ('2019-03-10 00:00:00+00:00', 1032749689)
AND timestamp <= '2019-03-11 00:00:00+00:00'
ORDER BY timestamp, id
LIMIT 100The comparison of such row values is lexicographical, that is exactly like you want it.
Here is a documentation link for that feature.
Code Snippets
WHERE (timestamp, id) > ('2019-03-10 00:00:00+00:00', 1032749689)
AND timestamp <= '2019-03-11 00:00:00+00:00'
ORDER BY timestamp, id
LIMIT 100Context
StackExchange Database Administrators Q#241591, answer score: 6
Revisions (0)
No revisions yet.