HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

PostgreSql changes same query plan at runtime

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlsamequeryruntimeplanchanges

Problem

im having a really weird issue here. We have search query that receives multiple parameters and searches by multiple joins in different tables. The query works fine first few time we run it and then sometimes it would slow down terribly for example: From 200ms to 15000/20000ms. On the same exact query.

Initially i thought it was a backend ORM issue but after checking it thoroughly and turning on slow query log, turning on auto_explain feature i realized the query plan changes after few calls. We put a loop in backend that executes query on the same thread mulitple times(for testing) and i discovered that almost always after 8 executions db starts logging slow query. And finally it turned out query plan changes.
Most important change being(i think) on TimeSlots table, from seq scan to Bitmap index scan which estimates much fewer rows than there is in a table. estimate: 566 actual: 100 000 ish.

Things i tried beforehand:

  • Upgrading Hardware from 2core 6GB to 8 core 26gb ram (100GB SSD)



  • Disabling any other application(thus stopping any other load) from reaching it.



  • performing manual(there are also automated jobs of them running) VACUUM, ANALYZE, REINDEX-ing on the tables.



  • Setting statistics for various columns, such as SlotDateTime (to 10000)



  • Creating ndistinct statistics.



  • Increased work_mem, maintenance_work_mem, shared_buffers size.



  • Decreased seq_page_cost to 1.



Here is the select statement:

```
SELECT DISTINCT u."UserId" AS UserId,
cast(cuj.company_id as bigint) AS CompanyId,
cast(ts.SlotsCount as bigint) as SlotsCount,
case when cuj.vip is null then false else cuj.vip end vip,
u.is_vip as userVip,

case
when ts2.EarliestDate isnull then '7777-12-01 21:00:00.000000'

Solution

The problem is that in your case, PostgreSQL switches to a generic plan after the first 5 executions, and the generic plan performs worse. In particular, the estimate on the index scan on slotdatetime is terribly off in the generic plan.

PostgreSQL has a built-in heuristics form statements whose plans can be cached: it plans them with the actual parameter values for the first 5 times, and then it switches to a generic plan if that plan promises not to perform worse. Unfortunately the generic plan doesn't keep the promise in this case.

Since there is no good way to fix the generic plan, the best you can do is avoid it. You should upgrade to PostgreSQL v12 or better and set the parameter plan_cache_mode to force_custom_plan when you run this query. If the query is executed in a PL/pgSQL function, you can simply

ALTER FUNCTION myfunc(argtype1,argtype2) SET plan_cache_mode = force_custom_plan;


An alternative is to use dynamic SQL with EXECUTE, that will always use a custom plan.

Code Snippets

ALTER FUNCTION myfunc(argtype1,argtype2) SET plan_cache_mode = force_custom_plan;

Context

StackExchange Database Administrators Q#289130, answer score: 3

Revisions (0)

No revisions yet.