patternsqlMinor
speeding up a deletion in postgresql
Viewed 0 times
speedingdeletionpostgresql
Problem
I came up with:
This is very slow when there are 120k records in taskflag and I'm keeping 10k.
Taskflag looks like:
Explain says:
Should I just arrange for the temp table to contain the ones I am keeping?
drop table if exists idtemp;
create temp table idtemp as
select documentid from taskflag where taskid='coref' and state = 2
order by statechanged asc limit howmany;
create unique index on idtemp(documentid);
-- trim taskflag to the first N docs ordered by coref.
delete from taskflag where documentid not in (select documentid from idtemp) ;This is very slow when there are 120k records in taskflag and I'm keeping 10k.
Taskflag looks like:
\d taskflag
Table "public.taskflag"
Column | Type | Modifiers
--------------+-----------------------------+-----------
documentid | character varying(64) | not null
taskid | character varying(64) | not null
state | smallint |
statechanged | timestamp without time zone |
Indexes:
"taskflag_pkey" PRIMARY KEY, btree (documentid, taskid)
"task_index2" btree (documentid)
"task_index4" btree (taskid, state, statechanged)Explain says:
QUERY PLAN
----------------------------------------------------------------------------------
Delete on taskflag (cost=0.00..105811822.25 rows=223210 width=6)
-> Seq Scan on taskflag (cost=0.00..105811822.25 rows=223210 width=6)
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..449.00 rows=10000 width=146)
-> Seq Scan on idtemp (cost=0.00..184.00 rows=10000 width=146)
(6 rows)Should I just arrange for the temp table to contain the ones I am keeping?
Solution
The easiest optimization would probably be to let the planner use a hash anti join, by rewriting the query as:
also you may need to ANALYZE the temp table immediately after populating it.
delete from taskflag where not exists
(select 1 from idtemp where documentid=taskflag.documentid);also you may need to ANALYZE the temp table immediately after populating it.
Code Snippets
delete from taskflag where not exists
(select 1 from idtemp where documentid=taskflag.documentid);Context
StackExchange Database Administrators Q#18808, answer score: 5
Revisions (0)
No revisions yet.