Is the system column "ctid" legitimate for identifying rows to delete?
Problem
I have a table with hundreds of millions of rows that I need to delete data from.
The existing indexes are the most efficient.
I can, however, use the existing indexes to find the rows to delete by using the ctid values:
DELETE FROM calendar_event WHERE ctid IN
(SELECT ctid FROM calendar_event WHERE user_id = 5 LIMIT 100 FOR UPDATE)
What are the risks of relying on the ctid in this case? My worst-case scenario is deleting the wrong row.
Solution
The lock taken by FOR UPDATE (a row-level lock, plus a table-level ROW SHARE lock) prevents concurrent write access that would change the physical location of the row. The manual:
"This prevents them from being locked, modified or deleted by other transactions until the current transaction ends. That is, other transactions that attempt UPDATE, DELETE, SELECT FOR UPDATE, SELECT FOR NO KEY UPDATE, SELECT FOR SHARE or SELECT FOR KEY SHARE of these rows will be blocked until the current transaction ends;"
So the ctid should be stable for the duration of the command (or the transaction, even) unless you alter the row within the same transaction yourself. ctid is still a system column for internal use, and the project will not offer any guarantees. If you have any unique (combination of) column(s) (including the PK), use that instead of the ctid.
However, I would use a CTE to materialize the selection and avoid unexpected results.
And without ORDER BY you select arbitrary rows for deletion. You might as well add SKIP LOCKED to minimize lock contention with concurrent transactions:
WITH cte AS (
SELECT ctid
FROM calendar_event
WHERE user_id = 5
LIMIT 100
FOR UPDATE SKIP LOCKED
)
DELETE FROM calendar_event WHERE ctid IN (TABLE cte);
Related, with explanation for both considerations:
- Postgres UPDATE ... LIMIT 1
About ctid:
- How do I decompose ctid into page and row numbers?
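For comparison, here is a minimal sketch of the same batch delete keyed on a unique column instead of the ctid, as suggested above. It assumes calendar_event has a primary key column named id; that name is hypothetical and does not appear in the question. The ORDER BY makes the choice of rows deterministic rather than arbitrary, though whether it can be served efficiently depends on the indexes actually available.

```sql
-- Assumption: calendar_event has a primary key column "id" (hypothetical name).
WITH cte AS (
   SELECT id
   FROM   calendar_event
   WHERE  user_id = 5
   ORDER  BY id                 -- deterministic pick instead of arbitrary rows
   LIMIT  100
   FOR    UPDATE SKIP LOCKED    -- skip rows currently locked by other transactions
   )
DELETE FROM calendar_event c
USING  cte
WHERE  c.id = cte.id;           -- match on the key, not on the physical location
```

Repeating the statement until it reports DELETE 0 would work through the remaining rows in batches of up to 100. Because of SKIP LOCKED, a zero count only means no unlocked rows were left at that moment.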
Context
StackExchange Database Administrators Q#214422, answer score: 14