snippetsqlMinor
How to optimize a recursive query in Postgresql?
Viewed 0 times
postgresqlqueryrecursiveoptimizehow
Problem
I have a recursive query that takes too long - 30+ ms where doing the individual queries to extract the same data manually takes
Data might look like this:
subject_group_id
subject_id
1
2
1
3
1
4
2
5
3
5
I want to know all the groups that 5 is a member of (1 by inheritance, 2 & 3 directly, not 4 or any other subject ids).
This query works as expected:
But run with real data I get this query plan:
```
CTE Scan on flat_members (cost=36759729.47..59962757.76 rows=5156229 width=16) (actual time=26.526..55.166 rows=3 loops=1)
Filter: (subject_id = 30459)
Rows Removed by Filter: 48984
CTE flat_members
-> Recursive Union (cost=0.00..36759729.47 rows=1031245702 width=16) (actual ti
create table subjects
(
subject_id bigint not null
constraint pk_subjects
primary key
);
create table subject_group_members
(
subject_group_id bigint not null
constraint fk_subject_group_members_subject_group_id_subjects_subject_id
references subjects(subject_id)
on delete cascade,
subject_id bigint not null
constraint fk_subject_group_members_subject_id_subjects_subject_id
references subjects(subject_id)
on delete cascade,
constraint pk_subject_group_members
primary key (subject_group_id, subject_id)
);
create index idx_subject_group_members_subject_id
on subject_group_members (subject_id);
create index idx_subject_group_members_subject_group_id
on subject_group_members (subject_group_id);
Data might look like this:
subject_group_id
subject_id
1
2
1
3
1
4
2
5
3
5
I want to know all the groups that 5 is a member of (1 by inheritance, 2 & 3 directly, not 4 or any other subject ids).
This query works as expected:
with recursive flat_members(subject_group_id, subject_id) as (
select subject_group_id, subject_id
from subject_group_members gm
union
select
flat_members.subject_group_id as subject_group_id,
subject_group_members.subject_id as subject_id
from subject_group_members
join flat_members on flat_members.subject_id = subject_group_members.subject_group_id
)
select * from flat_members where subject_id = 5
But run with real data I get this query plan:
```
CTE Scan on flat_members (cost=36759729.47..59962757.76 rows=5156229 width=16) (actual time=26.526..55.166 rows=3 loops=1)
Filter: (subject_id = 30459)
Rows Removed by Filter: 48984
CTE flat_members
-> Recursive Union (cost=0.00..36759729.47 rows=1031245702 width=16) (actual ti
Solution
Looks like you inverted the join condition by accident.
fiddle
Plus, move the filter
(Might as well be
About index-only scans:
It's typically best to just have a PK on
WITH RECURSIVE flat_members AS (
SELECT subject_group_id
FROM subject_group_members gm
WHERE subject_id = 5
UNION
SELECT gm.subject_group_id
FROM flat_members fm
JOIN subject_group_members gm ON gm.subject_id = fm.subject_group_id
)
TABLE flat_members;fiddle
Plus, move the filter
WHERE subject_id = 5 up to the initial SELECT to filter irrelevant rows early - and allow for an optimized query plan, typically using an index. Speaking of which, this multicolumn index would serve much better, allowing index-only scans:CREATE INDEX subject_group_members_subject_id_subject_group_id
ON subject_group_members (subject_id, subject_group_id);(Might as well be
UNIQUE.) In addition to your PK on (subject_group_id, subject_id). Or invert the columns in the PK definition, either might be useful.About index-only scans:
- Can Postgres use an index-only scan for this query with joined tables?
It's typically best to just have a PK on
(subject_id, subject_group_id), another multicolumn index on (subject_group_id, subject_id), and drop the two indexes on only (subject_id) and (subject_group_id). See:- Is a composite index also good for queries on the first field?
Code Snippets
WITH RECURSIVE flat_members AS (
SELECT subject_group_id
FROM subject_group_members gm
WHERE subject_id = 5
UNION
SELECT gm.subject_group_id
FROM flat_members fm
JOIN subject_group_members gm ON gm.subject_id = fm.subject_group_id
)
TABLE flat_members;CREATE INDEX subject_group_members_subject_id_subject_group_id
ON subject_group_members (subject_id, subject_group_id);Context
StackExchange Database Administrators Q#317027, answer score: 4
Revisions (0)
No revisions yet.