HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Postgres 9.4.4 query takes forever

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
querytakesforeverpostgres

Problem

We're running Postgres 9.4.4 on CentOS 6.5 and have a SELECT query that has worked for years, but stopped working and hangs after we upgraded from 9.2 (it took a while to notice it, so I don't know if it was immediately after we upgraded or not).

SELECT id || ':' || group_number AS uniq_id
FROM   table_one
WHERE  id || ':' || group_number NOT IN (
   SELECT id || ':' || group_number
   FROM table_two
   )
AND    id NOT IN (
   SELECT id
   FROM table_three
   WHERE timestamp > NOW() - INTERVAL '30 days' 
   AND client_id > 0
   );


In all tables id is an integer, but is stored as character varying (15) (legacy system). group_number is stored as a smallint.

The sub-query for table_two returns about 2.5 million records. The sub-query for table_three returns about 2,500 records. Both return in about 1 second if run separately. But adding in either query (or both) as sub-queries causes the query to just hang indefinitely (for days, if we let it run).

I've seen others online with the same problem (query not returning when using NOT IN). NOT IN seems like such a straight forward sub-query.

We have plenty of hardware (384 GB RAM, Xeon 64 cores, 16 disk 15k RPM RAID 10).

  • Why is this happening? (ie. is this a major ongoing bug in Postgres?)



  • How can I fix / debug it in the meantime?



Here are the results of EXPLAIN:

```
QUERY PLAN
Index Only Scan using table_one_id_pk on table_one (cost=19690.90..64045129699.10 rows=370064 width=9)
Filter: ((NOT (hashed SubPlan 2)) AND (NOT (SubPlan 1)))
SubPlan 2
-> Bitmap Heap Scan on table_three (cost=2446.92..19686.74 rows=8159 width=7)
Recheck Cond: (("timestamp" > (now() - '30 days'::interval)) AND (client_id > 0))
-> BitmapAnd (cost=2446.92..2446.92 rows=8159 width=0)
-> Bitmap Index Scan on table_one_timestamp_idx (cost=0.00..1040.00 rows=79941 width=0)
Index Cond: ("timestamp" > (now() - '30 days'::interval))

Solution

Assuming you checked off usual suspects in the wiki page as commented by @a_horse.

Also see the spin-off addressing the discussion of bitmap index scans and the size of work_mem.

  • "Recheck Cond:" line in query plans with a bitmap index scan



Query

This rewritten query should be substantially faster:

SELECT id || ':' || group_number AS uniq_id
    -- id::text || ':' || group_number AS uniq_id  -- with integer
FROM   table_one t1
WHERE  NOT EXISTS ( 
   SELECT 1
   FROM   table_two t2
   WHERE  t2.id = t1.id
   AND    t2.group_number = t1.group_number
   ) 
AND NOT EXISTS (
   SELECT 1
   FROM   table_three t3
   WHERE  t3.timestamp > NOW() - interval '30 days' 
   AND    t3.client_id > 0
   AND    t3.id = t1.id
   );


-
The most important problem is comparing a concatenated string between table_one and table_two, which is generally more expensive than necessary and specifically not sargable.

-
Storing integer numbers as strings is expensive nonsense. You seem to be aware of that. Convert to integer if at all possible. If you only have valid integer numbers in the varchar column id, all you need to do is:

ALTER TABLE table_one ALTER COLUMN id TYPE integer USING id::int;


And possibly the same for table_two.

-
NOT IN carries a trap for NULL values on either side. That's why NOT EXISTS is almost always better. (Typically performs better on top of that.)

  • Select rows which are not present in other table



Indexes

Either way, key to performance are matching indexes.

Be sure to have multicolumn index on table_one and table_two:

CREATE INDEX t1_foo_idx ON table_one (id, group_number)
CREATE INDEX t2_foo_idx ON table_two (id, group_number)


Possibly allows index-only scans.

With integer instead of varchar, these would be smaller and more efficient, yet:

  • Is a composite index also good for queries on the first field?



And I suggest a partial multicolumn index on table_three:

CREATE INDEX t3_foo_idx ON table_three (timestamp, id)
WHERE  client_id > 0
AND    timestamp > '2015-06-07 0:0';


Usefulness deteriorates over time. Recreate the index with increased lower bound at opportune moments - which takes an exclusive lock on the table, so consider CREATE INDEX CONCURRENTLY. Detailed explanation:

  • Index optimization with dates



You need to match the (updated) index condition in your queries. Add the condition even if that seems redundant. Like:

...
AND NOT EXISTS (
   SELECT 1
   FROM   table_three t3
   WHERE  t3.timestamp > NOW() - interval '30 days' 
   AND    t3 timestamp > '2015-06-07 0:0'  -- match index condition
   AND    t3.client_id > 0
   AND    t3.id = t1.id
   );


You can use a function as pseudo-constant in partial index and query and automate the process. Last chapter in this related answer:

  • Get latest child per parent from big table - query is too slow



SET LOCAL

Like you found yourself, increasing work_mem locally for the query helps if the query needs that much RAM. Consider SET LOCAL:

  • work_mem in postgresql on Linux



  • Speed up creation of Postgres partial index



With all suggested improvements in place you may not need to increase work_mem any more.

Code Snippets

SELECT id || ':' || group_number AS uniq_id
    -- id::text || ':' || group_number AS uniq_id  -- with integer
FROM   table_one t1
WHERE  NOT EXISTS ( 
   SELECT 1
   FROM   table_two t2
   WHERE  t2.id = t1.id
   AND    t2.group_number = t1.group_number
   ) 
AND NOT EXISTS (
   SELECT 1
   FROM   table_three t3
   WHERE  t3.timestamp > NOW() - interval '30 days' 
   AND    t3.client_id > 0
   AND    t3.id = t1.id
   );
ALTER TABLE table_one ALTER COLUMN id TYPE integer USING id::int;
CREATE INDEX t1_foo_idx ON table_one (id, group_number)
CREATE INDEX t2_foo_idx ON table_two (id, group_number)
CREATE INDEX t3_foo_idx ON table_three (timestamp, id)
WHERE  client_id > 0
AND    timestamp > '2015-06-07 0:0';
...
AND NOT EXISTS (
   SELECT 1
   FROM   table_three t3
   WHERE  t3.timestamp > NOW() - interval '30 days' 
   AND    t3 timestamp > '2015-06-07 0:0'  -- match index condition
   AND    t3.client_id > 0
   AND    t3.id = t1.id
   );

Context

StackExchange Database Administrators Q#106126, answer score: 8

Revisions (0)

No revisions yet.