HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Query performance degraded after upgrading Postgres

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
afterpostgresupgradingqueryperformancedegraded

Problem

I have a table with several million records in a PostgreSQL 12 database, and after an upgrade from 11 to 12, a few queries started performing horribly. They went from taking ~1 second to ~5 minutes. I tried rebuilding all indexes, vacuuming, and all usual Postgres low-hanging fruit, but performance is still terrible.

This is the query:

SELECT id, activity_count
FROM user
WHERE (search_index) @@ (to_tsquery('pg_catalog.english', '''1234567890'':*') AND active = true
ORDER BY activity_count DESC LIMIT 101


In other words, find all active users matching the given account number, and sort from most active to least.

This query takes about 5 minutes to return just 2 records. Something's not right.

The column search_index is a tsvector storing all the keywords from the table's various text fields (just things like account_number, name, etc).

I have a GIN index created for this column with:

CREATE INDEX user_search_index_gin
    ON public.user USING gin
    (search_index)
    TABLESPACE pg_default;


I also have an index for the active column with:

CREATE INDEX user_active
    ON public.user USING btree
    (active ASC NULLS LAST)
    TABLESPACE pg_default;


And I have an orderd index for the activity_count with:

CREATE INDEX user_activity_count
    ON public.user USING btree
    (activity_count ASC NULLS LAST)
    TABLESPACE pg_default;


Yet when I run EXPLAIN, I get:

"Limit  (cost=0.56..11443.66 rows=101 width=1552)"
"  ->  Index Scan Backward using user_activity_count on user  (cost=0.56..36010185.91 rows=317836 width=1552)"
"        Filter: (active AND (search_index @@ '''1234567890'':*'::tsquery))"


Why is it only using the user_activity_count index and not the more efficient GIN index? How do I fix this?

Solution

Which index to use?

While your WHERE conditions are not very selective, the current query plan makes a lot of sense with ORDER BY activity_count DESC LIMIT 101. See:

  • How to search a table with 80 million records faster?



However, your predicates strike me as pretty selective:

WHERE (search_index) @@ (to_tsquery('pg_catalog.english', '''1234567890'':*') AND active = true


That's unless your example '1234567890' is misleading. (Are you sure you want the single quotes in the search term?) It's obviously not the actual query you used. Unmatched parentheses. In any case, you probably want the "simple" text search configuration (plus some improvements):

WHERE  search_index @@ to_tsquery('pg_catalog.simple', '''1234567890'':*')
AND    active


See:

  • Get partial match from GIN indexed TSVECTOR column



Statistics

So it seems your statistics are not up to date.

after an upgrade from 11 to 12, a few queries starting performing
horribly. They went from taking ~1 second to ~5 minutes. I tried
rebuilding all indexes, vacuuming, and all usual Postgres low-hanging
fruit, but performance is still terrible.

Why is it only using the user_activity_count index and not the more efficient GIN index?

Did you also add ANALYZE to your basket of all usual Postgres low-hanging fruit? You certainly didn't mention it. Consider instructions in the manual:

-
Statistics

Because optimizer statistics are not transferred by pg_upgrade, you
will be instructed to run a command to regenerate that information at
the end of the upgrade. You might need to set connection parameters to
match your new cluster.

If your DB is big, you may be interested in vacuumdb with the option --analyze-in-stages. (That ship may have sailed for the case at hand.) The manual:

This option is useful to analyze a database that was newly populated
from a restored dump or by pg_upgrade. This option will try to create
some statistics as fast as possible, to make the database usable, and
then produce full statistics in the subsequent stages.

Related:

  • Optimizing slow performance of simple SELECT query

Code Snippets

WHERE (search_index) @@ (to_tsquery('pg_catalog.english', '''1234567890'':*') AND active = true
WHERE  search_index @@ to_tsquery('pg_catalog.simple', '''1234567890'':*')
AND    active

Context

StackExchange Database Administrators Q#269834, answer score: 5

Revisions (0)

No revisions yet.