patternsqlMinor
Query performance degraded after upgrading Postgres
Viewed 0 times
afterpostgresupgradingqueryperformancedegraded
Problem
I have a table with several million records in a PostgreSQL 12 database, and after an upgrade from 11 to 12, a few queries started performing horribly. They went from taking ~1 second to ~5 minutes. I tried rebuilding all indexes, vacuuming, and all usual Postgres low-hanging fruit, but performance is still terrible.
This is the query:
In other words, find all active users matching the given account number, and sort from most active to least.
This query takes about 5 minutes to return just 2 records. Something's not right.
The column
I have a GIN index created for this column with:
I also have an index for the
And I have an orderd index for the
Yet when I run
Why is it only using the user_activity_count index and not the more efficient GIN index? How do I fix this?
This is the query:
SELECT id, activity_count
FROM user
WHERE (search_index) @@ (to_tsquery('pg_catalog.english', '''1234567890'':*') AND active = true
ORDER BY activity_count DESC LIMIT 101In other words, find all active users matching the given account number, and sort from most active to least.
This query takes about 5 minutes to return just 2 records. Something's not right.
The column
search_index is a tsvector storing all the keywords from the table's various text fields (just things like account_number, name, etc).I have a GIN index created for this column with:
CREATE INDEX user_search_index_gin
ON public.user USING gin
(search_index)
TABLESPACE pg_default;I also have an index for the
active column with:CREATE INDEX user_active
ON public.user USING btree
(active ASC NULLS LAST)
TABLESPACE pg_default;And I have an orderd index for the
activity_count with:CREATE INDEX user_activity_count
ON public.user USING btree
(activity_count ASC NULLS LAST)
TABLESPACE pg_default;Yet when I run
EXPLAIN, I get:"Limit (cost=0.56..11443.66 rows=101 width=1552)"
" -> Index Scan Backward using user_activity_count on user (cost=0.56..36010185.91 rows=317836 width=1552)"
" Filter: (active AND (search_index @@ '''1234567890'':*'::tsquery))"Why is it only using the user_activity_count index and not the more efficient GIN index? How do I fix this?
Solution
Which index to use?
While your
However, your predicates strike me as pretty selective:
That's unless your example
See:
Statistics
So it seems your statistics are not up to date.
after an upgrade from 11 to 12, a few queries starting performing
horribly. They went from taking ~1 second to ~5 minutes. I tried
rebuilding all indexes, vacuuming, and all usual Postgres low-hanging
fruit, but performance is still terrible.
Why is it only using the user_activity_count index and not the more efficient GIN index?
Did you also add
-
Statistics
Because optimizer statistics are not transferred by
will be instructed to run a command to regenerate that information at
the end of the upgrade. You might need to set connection parameters to
match your new cluster.
If your DB is big, you may be interested in
This option is useful to analyze a database that was newly populated
from a restored dump or by
some statistics as fast as possible, to make the database usable, and
then produce full statistics in the subsequent stages.
Related:
While your
WHERE conditions are not very selective, the current query plan makes a lot of sense with ORDER BY activity_count DESC LIMIT 101. See:- How to search a table with 80 million records faster?
However, your predicates strike me as pretty selective:
WHERE (search_index) @@ (to_tsquery('pg_catalog.english', '''1234567890'':*') AND active = trueThat's unless your example
'1234567890' is misleading. (Are you sure you want the single quotes in the search term?) It's obviously not the actual query you used. Unmatched parentheses. In any case, you probably want the "simple" text search configuration (plus some improvements):WHERE search_index @@ to_tsquery('pg_catalog.simple', '''1234567890'':*')
AND activeSee:
- Get partial match from GIN indexed TSVECTOR column
Statistics
So it seems your statistics are not up to date.
after an upgrade from 11 to 12, a few queries starting performing
horribly. They went from taking ~1 second to ~5 minutes. I tried
rebuilding all indexes, vacuuming, and all usual Postgres low-hanging
fruit, but performance is still terrible.
Why is it only using the user_activity_count index and not the more efficient GIN index?
Did you also add
ANALYZE to your basket of all usual Postgres low-hanging fruit? You certainly didn't mention it. Consider instructions in the manual:-
Statistics
Because optimizer statistics are not transferred by
pg_upgrade, youwill be instructed to run a command to regenerate that information at
the end of the upgrade. You might need to set connection parameters to
match your new cluster.
If your DB is big, you may be interested in
vacuumdb with the option --analyze-in-stages. (That ship may have sailed for the case at hand.) The manual:This option is useful to analyze a database that was newly populated
from a restored dump or by
pg_upgrade. This option will try to createsome statistics as fast as possible, to make the database usable, and
then produce full statistics in the subsequent stages.
Related:
- Optimizing slow performance of simple SELECT query
Code Snippets
WHERE (search_index) @@ (to_tsquery('pg_catalog.english', '''1234567890'':*') AND active = trueWHERE search_index @@ to_tsquery('pg_catalog.simple', '''1234567890'':*')
AND activeContext
StackExchange Database Administrators Q#269834, answer score: 5
Revisions (0)
No revisions yet.