HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Why so many loops in pkey index scan?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
whyscanloopspkeymanyindex

Problem

My query executes very slowly despite all indexes in place:

SELECT * FROM "entry"
    INNER JOIN "entrytag" ON ("entry"."id" = "entrytag"."entry_id")
    WHERE "entrytag"."tag_id" = 323456
    ORDER BY "entry"."date"
    DESC LIMIT 10'


the explain shows too many loops, why? how to fix this?

Limit  (cost=1241.85..1241.87 rows=10 width=666) (actual time=23576.449..23576.454 rows=10 loops=1)
  ->  Sort  (cost=1241.85..1242.10 rows=99 width=666) (actual time=23576.446..23576.447 rows=10 loops=1)
        Sort Key: entry.date DESC
        Sort Method: top-N heapsort  Memory: 31kB
        ->  Nested Loop  (cost=0.87..1239.71 rows=99 width=666) (actual time=0.168..22494.187 rows=989105 loops=1)
              ->  Index Scan using entrytag_tag_id_row_idx on entrytag  (cost=0.44..402.17 rows=99 width=4)
                  (actual time=0.093..535.664 **rows=989105** loops=1)
                    Index Cond: (tag_id = 323456)
              ->  Index Scan using entry_pkey on entry  (cost=0.43..8.45 rows=1 width=666) 
l time=0.020..0.021 rows=1 **loops=989105**)
                    Index Cond: (id = entrytag.entry_id)
Planning time: 0.829 ms
Execution time: 23576.504 ms


Indexes i have on table entry:

('id', 'date', ...other irrelevant cols)
('date', ...other irrelevant cols)


On association table entrytag:

(tag_id, entry_id)
(tag_id, row)  -- this index is used according to the explain


PostgreSQL v 9.5. There are many rows, the db is quite big. The same queries for other tags (with the same number of entries) take fractions of seconds and no such huge row and loop counts.

Solution

The issue is here

Index Scan using entrytag_tag_id_row_idx on entrytag
    (cost=0.44..402.17 rows=99 width=4)
    (actual time=0.093..535.664 rows=989105 loops=1)


Your statistics are off for

WHERE "entrytag"."tag_id" = 323456


Your planner thinks there are far fewer tag_id=323456 then there are. You may want to try ANALYZE entrytag;, and trying again. Or upping statistics.

ALTER TABLE entrytag
  ALTER COLUMN tag_id
  SET STATISTICS 1000;


Then try ANALYZE entrytag; and trying again. Sounds like a typical case of bad statistics.

You may want to improve your schema, or denormallize. You're joining, selecting, and ordering million rows. It's not going to be instant.

Code Snippets

Index Scan using entrytag_tag_id_row_idx on entrytag
    (cost=0.44..402.17 rows=99 width=4)
    (actual time=0.093..535.664 rows=989105 loops=1)
WHERE "entrytag"."tag_id" = 323456
ALTER TABLE entrytag
  ALTER COLUMN tag_id
  SET STATISTICS 1000;

Context

StackExchange Database Administrators Q#184014, answer score: 3

Revisions (0)

No revisions yet.