snippetsqlMinor
How to reduce query size with many repeated UNION subqueries?
Viewed 0 times
repeatedwithsizequeryunionreducemanyhowsubqueries
Problem
I use Postgres 13 and have a table defined with the following DDL:
I use the following query:
The code is generated dynamically and the number of subqueries with
Naively rewriting the query to, what I think, is equivalent
makes it many times slower and unacceptable.
Two main questions:
-
Is it possible to rewrite the original query in a more concise way without duplicating subqueries for each
-
Why can Postrgres not optimize the second query? Am I missing something and it's not an equivalent?
Query plan of the original query with
```
Limit (cost=1.12..7.33 rows=100 width=41)
-> Merge Append (cost=1.12..13.53 rows=200 width=41)
Sort Key: btc_tx_addresses.tx_time, btc_tx_addresses.tx_id"
-> Limit (cost=0.56..4.76 rows=100 width=41)
-> Index Only Scan using btc_tx_addr
CREATE TABLE item_codes (
code bytea NOT NULL,
item_id bytea NOT NULL,
time TIMESTAMP WITH TIME ZONE NOT NULL,
PRIMARY KEY (item_id, code)
);
CREATE INDEX ON item_codes (code, time, item_id);I use the following query:
SELECT DISTINCT time, item_id
FROM (
(SELECT time, item_id
FROM item_codes
WHERE code = '\x3965623166306238383033393437613338373162313934383034366139653239'
ORDER BY time, item_id
LIMIT 100)
UNION ALL
(SELECT time, item_id
FROM item_codes
WHERE code = '\x3836653432356638366638636338393364373935343938303233343363373561'
ORDER BY time, item_id
LIMIT 100)
) AS items
ORDER BY time, item_id
LIMIT 100;The code is generated dynamically and the number of subqueries with
UNION ALL depends on how many different code values are needed. It can get pretty long.Naively rewriting the query to, what I think, is equivalent
SELECT DISTINCT time, item_id
FROM item_codes
WHERE code IN ('\x3965623166306238383033393437613338373162313934383034366139653239',
'\x3836653432356638366638636338393364373935343938303233343363373561')
ORDER BY time, item_id
LIMIT 100makes it many times slower and unacceptable.
Two main questions:
-
Is it possible to rewrite the original query in a more concise way without duplicating subqueries for each
code value while still keeping a fast execution plan?-
Why can Postrgres not optimize the second query? Am I missing something and it's not an equivalent?
Query plan of the original query with
UNIONs:```
Limit (cost=1.12..7.33 rows=100 width=41)
-> Merge Append (cost=1.12..13.53 rows=200 width=41)
Sort Key: btc_tx_addresses.tx_time, btc_tx_addresses.tx_id"
-> Limit (cost=0.56..4.76 rows=100 width=41)
-> Index Only Scan using btc_tx_addr
Solution
Q1. Alternative without repeating subqueries
Possible. Provide a set of input values, and then attach a
I am unnesting an input array to provide the set. As an aside: escape
Or:
Alternatively, a
Q2. Why?
Because Postgres just doesn't have this kind of index skip scan implemented as query plan, yet. So we have to shoehorn it in.
Related:
Possible. Provide a set of input values, and then attach a
LATERAL subquery:SELECT DISTINCT time, item_id
FROM unnest('{\\x3965623166306238383033393437613338373162313934383034366139653239
, \\x3836653432356638366638636338393364373935343938303233343363373561}'::bytea[]) c(code)
CROSS JOIN LATERAL (
SELECT time, item_id
FROM item_codes ic
WHERE ic.code = c.code
ORDER BY 1, 2
LIMIT 100
) ic
ORDER BY 1, 2
LIMIT 100;I am unnesting an input array to provide the set. As an aside: escape
\ with \ inside the array literal, or use an ARRAY constructor instead:ARRAY['\x3965623166306238383033393437613338373162313934383034366139653239'
, '\x3836653432356638366638636338393364373935343938303233343363373561']::bytea[]Or:
ARRAY['\x3965623166306238383033393437613338373162313934383034366139653239'::bytea
, '\x3836653432356638366638636338393364373935343938303233343363373561']Alternatively, a
VALUES expression does the trick, too:SELECT DISTINCT time, item_id
FROM (
VALUES
('\x3965623166306238383033393437613338373162313934383034366139653239'::bytea)
, ('\x3836653432356638366638636338393364373935343938303233343363373561')
) c(code)
CROSS JOIN LATERAL ( ...Q2. Why?
Because Postgres just doesn't have this kind of index skip scan implemented as query plan, yet. So we have to shoehorn it in.
Related:
- Can spatial index help a "range - order by - limit" query
- Optimize a query with small LIMIT, predicate on one column and order by another
- Query last N related rows per row
Code Snippets
SELECT DISTINCT time, item_id
FROM unnest('{\\x3965623166306238383033393437613338373162313934383034366139653239
, \\x3836653432356638366638636338393364373935343938303233343363373561}'::bytea[]) c(code)
CROSS JOIN LATERAL (
SELECT time, item_id
FROM item_codes ic
WHERE ic.code = c.code
ORDER BY 1, 2
LIMIT 100
) ic
ORDER BY 1, 2
LIMIT 100;ARRAY['\x3965623166306238383033393437613338373162313934383034366139653239'
, '\x3836653432356638366638636338393364373935343938303233343363373561']::bytea[]ARRAY['\x3965623166306238383033393437613338373162313934383034366139653239'::bytea
, '\x3836653432356638366638636338393364373935343938303233343363373561']SELECT DISTINCT time, item_id
FROM (
VALUES
('\x3965623166306238383033393437613338373162313934383034366139653239'::bytea)
, ('\x3836653432356638366638636338393364373935343938303233343363373561')
) c(code)
CROSS JOIN LATERAL ( ...Context
StackExchange Database Administrators Q#323884, answer score: 9
Revisions (0)
No revisions yet.