HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Why would adding LIMIT 200 cause a query to slow down?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
whylimit200queryaddingslowwoulddowncause

Problem

I am trying to debug a slow query on a PostgreSQL 9.1.13 database, and I am a bit at loss. The exact query generated by the ORM framework is:

SELECT "core_product"."sales_price", "core_product"."recommended_price", "core_productgroup"."name", "core_product"."number", "core_product"."name", "core_product"."description", "core_product"."cost_price", "core_product"."bar_code", "core_product"."accessible"
FROM "core_product" INNER JOIN "core_productgroup" ON ( "core_product"."product_group_id" = "core_productgroup"."id" )
WHERE "core_productgroup"."company_id" = 1056
ORDER BY "core_product"."id" ASC
LIMIT 200;


This query takes 28 seconds to return 200 rows, which is too slow for our use case.

In a first attempt to understand where the performance bottleneck might be. I tried first removing the LIMIT 200 expecting it to be even slower. However without LIMIT 200 the query takes just 2 seconds to return approximately 293000 rows.

How can it be faster to return all 293000 matching rows rather than only the first 200?

I have tried using EXPLAIN to see how the query plans for the two queries differ. It turns out these two almost identical queries have quite different query plans. With LIMIT:

```
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Limit (cost=10.69..52229.70 rows=200 width=76)
-> Nested Loop (cost=10.69..17054740.55 rows=65320 width=76)
Join Filter: (core_product.product_group_id = core_productgroup.id)
-> Index Scan using core_product_pkey on core_product (cost=0.00..3124799.28 rows=2957497 width=71)
-> Materialize (cost=10.69..131.18 rows=314 width=13)
-> Bitmap Heap Scan on core_productgroup (cost=10.69..129.61 rows=314 width=13)
Recheck Cond: (company_id = 1056)
-> Bitmap Index Scan on core_productgroup_company_id (c

Solution

The planner thinks that it can run through in core_product.id order, and rapidly find 200 matches where company_id=1056, at which point it is done.

But that doesn't work out, because all the things with a small core_product.id are things which don't have company_id=1056. (For example, company_id=1056 is a recently-joined client of yours, so all of their data falls on the upper end of the id sequence. But PostgreSQL doesn't understand that.)

You can probably force the plan you want by using a CTE and writing it like this:

with t as (
   
)
select * from t limit 200;

Code Snippets

with t as (
   <your query, without the limit, goes here>
)
select * from t limit 200;

Context

StackExchange Database Administrators Q#130233, answer score: 9

Revisions (0)

No revisions yet.