Why does this fast postgres query show up in the slow query log?
Problem
On investigating an entry in our Postgres log, I noticed a large discrepancy between the reported duration and the speed of the query.
The log entry:
```
2015-07-28 17:27:26 UTC [dms_segment_3] LOG: duration: 2924.763 ms bind : SELECT prospect.id AS prospect_id, 0, am.id, am.customer_id, 0 FROM xxxxx.audience_member am LEFT OUTER JOIN xxxxx.campaign campaign ON campaign.id = $1 LEFT OUTER JOIN xxxxx.end_user prospect ON campaign.id=prospect.campaign_id AND prospect.email_id=am.customer_id AND prospect.end_user_type != 1 WHERE am.audience_id = $2 ORDER BY am.id limit $3
2015-07-28 17:27:26 UTC [dms_segment_3] DETAIL: parameters: $1 = '4000013578869', $2 = '4000013744916', $3 = '500'
```
Running an explain on that query yielded much faster speeds:
```
explain analyze
SELECT prospect.id AS prospect_id
     , 0, am.id, am.customer_id, 0
FROM xxxxx.audience_member am
LEFT OUTER JOIN xxxxx.campaign campaign
  ON campaign.id = 4000013578869
LEFT OUTER JOIN xxxxx.end_user prospect
  ON campaign.id = prospect.campaign_id
 AND prospect.email_id = am.customer_id
 AND prospect.end_user_type != 1
WHERE am.audience_id = 4000013744916
ORDER BY am.id
limit 500;
                                          QUERY PLAN
----------------------------------------------------------------------------------------------------
 Limit  (cost=326.61..326.67 rows=26 width=24) (actual time=0.500..0.517 rows=4 loops=1)
   ->  Sort  (cost=326.61..326.67 rows=26 width=24) (actual time=0.491..0.498 rows=4 loops=1)
         Sort Key: am.id
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop Left Join  (cost=0.00..326.00 rows=26 width=24) (actual time=0.260..0.402 rows=4 loops=1)
               Join Filter: (campaign.id = prospect.campaign_id)
               ->  Nested Loop Left Join  (cost=0.00..107.25 rows=26 width=24) (actual time=0.2
```
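The logged timing can be replayed by hand in psql by preparing the same statement the way the application does. This is a sketch: the statement name `seg` is arbitrary, and the parameter types are assumptions read off the logged values. If the server chooses a generic plan for the prepared statement, its timing may be closer to the logged duration than to the sub-millisecond explain above.
```
PREPARE seg (bigint, bigint, int) AS
SELECT prospect.id AS prospect_id, 0, am.id, am.customer_id, 0
FROM xxxxx.audience_member am
LEFT OUTER JOIN xxxxx.campaign campaign
  ON campaign.id = $1
LEFT OUTER JOIN xxxxx.end_user prospect
  ON campaign.id = prospect.campaign_id
 AND prospect.email_id = am.customer_id
 AND prospect.end_user_type != 1
WHERE am.audience_id = $2
ORDER BY am.id
LIMIT $3;

-- shows the plan actually chosen for the prepared statement
EXPLAIN ANALYZE EXECUTE seg (4000013578869, 4000013744916, 500);
```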
Solution
What you see in the logs is the execution of a prepared statement. Prepared statements are the usual way for an application layer to interact with the database.
For example, using Hibernate, one could write something like this (hopefully the code snippet below is valid; I did not test it):
```
String sql = "SELECT first_name, last_name FROM customer WHERE email = :customer_email";
...
query.setParameter("customer_email", Customer.email);
```
Among other things, this is a good way to avoid SQL injection - unlike building the full query text, parameter values included, by string concatenation.
One can do this from a PostgreSQL client, too:
```
PREPARE fetch_customer (text) AS
SELECT first_name, last_name FROM customer WHERE email = $1;

[EXPLAIN ANALYZE] EXECUTE fetch_customer ('john@gmail.com');
```
On the other hand - and this is what you are experiencing - a prepared query can result in inferior performance. A prepared statement cannot take advantage of knowing the values that will be passed to it, simply because they are unknown at preparation time. Thus, it has to choose a query plan that is generic enough to produce any possible result in an acceptable time.
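As a side note on server behaviour (not specific to this query): since PostgreSQL 9.2 the planner replans the first few executions of a prepared statement with the actual parameter values, and only switches to a cached generic plan when that appears no more expensive; PostgreSQL 12 added `plan_cache_mode` to override the heuristic. A sketch:
```
PREPARE fetch_customer (text) AS
SELECT first_name, last_name FROM customer WHERE email = $1;

-- inspect which plan (custom or generic) the server actually picked
EXPLAIN ANALYZE EXECUTE fetch_customer ('john@gmail.com');

-- PostgreSQL 12+: force per-execution custom plans if the
-- cached generic plan turns out to be a bad fit
SET plan_cache_mode = force_custom_plan;
```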
For example, imagine you had a partial index like
```
CREATE INDEX ON xxxx.campaign (campaign_id) WHERE campaign.id > 4000000000000;
```
(which does not make much sense in real life, but never mind).
When the planner knows it has 4000013578869 as campaign_id, it can opt to use this index, possibly significantly reducing the number of data pages to fetch. However, a prepared statement cannot do this, as some (most?) of the possible campaign values would be excluded by the index predicate.
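The difference can be made visible with the hypothetical index above (a sketch; `c` is an arbitrary statement name, and whether the second explain actually shows a generic plan depends on the server version and the custom/generic plan heuristic):
```
-- with a literal, the planner can prove 4000013578869 > 4000000000000,
-- so the partial index qualifies
EXPLAIN SELECT * FROM xxxx.campaign WHERE id = 4000013578869;

-- in a generic plan $1 could be any value, so the partial index
-- cannot be assumed to apply and another access path is chosen
PREPARE c (bigint) AS SELECT * FROM xxxx.campaign WHERE id = $1;
EXPLAIN EXECUTE c (4000013578869);
```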
Context
StackExchange Database Administrators Q#108482, answer score: 3