patternsqlModerate
Deadlock with multi-row INSERTs despite ON CONFLICT DO NOTHING
Viewed 0 times
multidespitedeadlockinsertswithnothingconflictrow
Problem
Setup
I have a bulk insert function
(I wrap it that way because I need to get a count of actual inserts, without the "fake row updates" trick.)
The table
The trigger does this:
Both
Problem
Occasionally, I get an error
I have a bulk insert function
set_interactions(arg_rows text) that looks like this:with inserts as (
insert into interaction (
thing_id,
associate_id, created_time)
select t->>'thing_id', t->>'associate_id', now() from
json_array_elements(arg_rows::json) t
ON CONFLICT (thing_id, associate_id) DO NOTHING
RETURNING thing_id, associate_id
) select into insert_count count(*) from inserts;
-- Followed by an insert in an unrelated table that has two triggers, neither of which touch any of the tables here (also not by any of their triggers, etc.)(I wrap it that way because I need to get a count of actual inserts, without the "fake row updates" trick.)
The table
interaction has:- Just one constraint: A multi-column primary key (thing_id, associate_id)
- No indices
- Just one trigger: After insert, for each row.
The trigger does this:
DECLARE associateId text;
BEGIN
-- Go out and get the associate_id for this thing_id
BEGIN
SELECT thing.associate_id INTO STRICT associateId FROM thing WHERE thing.id = NEW.thing_id;
EXCEPTION
WHEN NO_DATA_FOUND THEN
RAISE EXCEPTION 'Could not map the thing to an associate!';
WHEN TOO_MANY_ROWS THEN
RAISE EXCEPTION 'Could not map the thing to a SINGLE associate!'; -- thing PK should prevent this
END;
-- We don't want to add an association between an associate interacting with their own things
IF associateId != NEW.associate_id THEN
-- Insert the new association, if it doesn't yet exist
INSERT INTO associations ("thing_owner", "associate")
VALUES (associateId, NEW.associate_id)
ON CONFLICT DO NOTHING;
END IF;
RETURN NULL;
END;Both
interactions and associations have no more columns than you see in the above statements.Problem
Occasionally, I get an error
deadlock detected from PostgreSQL 9.6.5 when the application calls set_interactions(). It may call it with 1-100 rows' worth of data, unsorted; "conflictSolution
The
Most importantly, add a consistent order to input rows with
The best defense against deadlocks is generally to avoid them by being
certain that all applications using a database acquire locks on
multiple objects in a consistent order.
Also, since
No need for another CTE,
The trigger function looks bloated, too. No need for a nested block, since you are not catching errors, only raising exceptions that roll back the whole transaction either way. And the exceptions are pointless, too.
-
The 1st
-
How to implement a many-to-many relationship in PostgreSQL?
-
The 2nd is pointless as well - you are suspecting as much:
-- PK should prevent this
The trigger function boils down to:
You might remove the trigger and the function
Now, I don't see any more chance for deadlocks. Of course, all other transactions possibly also writing to the same table concurrently must adhere to the same order of rows.
If that's not possible and you are still considering
ON CONFLICT clause can prevent duplicate key errors. There can still be friction with concurrent transactions trying to enter the same keys or update the same rows. It's no insurance against deadlocks.Most importantly, add a consistent order to input rows with
ORDER BY. To make sure the order is enforced I use a CTE, which materializes the result. (I think it should work with a subquery, too; just to be sure.) Else, mutually entangled inserts trying to enter identical index tuples in the unique index can lead to the deadlock you observed. The manual:The best defense against deadlocks is generally to avoid them by being
certain that all applications using a database acquire locks on
multiple objects in a consistent order.
Also, since
set_interactions() is a PL/pgSQL function this is simpler & cheaper:WITH data AS (
SELECT t->>'thing_id' AS t_id, t->>'associate_id' AS a_id
-- Or, if not type text, cast right away:
-- SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id
FROM json_array_elements(arg_rows::json) t
ORDER BY 1, 2 -- deterministic, stable order (!!)
)
INSERT INTO interaction (thing_id, associate_id, created_time)
SELECT t_id, a_id, now()
FROM data
ON CONFLICT (thing_id, associate_id) DO NOTHING;
GET DIAGNOSTICS insert_count = ROW_COUNT;No need for another CTE,
RETURNING and another count(*). More:- Best way to get result count before LIMIT was applied
The trigger function looks bloated, too. No need for a nested block, since you are not catching errors, only raising exceptions that roll back the whole transaction either way. And the exceptions are pointless, too.
-
The 1st
EXCEPTION on NO_DATA_FOUND can never occur in a proper many-to-many design with FK constraints enforcing referential integrity.-
How to implement a many-to-many relationship in PostgreSQL?
-
The 2nd is pointless as well - you are suspecting as much:
-- PK should prevent this
The trigger function boils down to:
BEGIN
-- Insert the new association, if it doesn't yet exist
INSERT INTO associations (thing_owner, associate)
SELECT t.associate_id, NEW.associate_id
FROM thing t
WHERE t.id = NEW.thing_id -- PK guarantees 0 or 1 result
AND t.associate_id <> NEW.associate_id -- exclude association to self
ON CONFLICT DO NOTHING;
RETURN NULL;
ENDYou might remove the trigger and the function
set_interactions() altogether and just run this query, doing everything useful I can see in the question:WITH data AS (
SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id -- asuming int
FROM json_array_elements(arg_rows::json) t
ORDER BY 1, 2 -- (!!)
)
, ins_inter AS (
INSERT INTO interaction (thing_id, associate_id, created_time)
SELECT t_id, a_id, now()
FROM data
ON CONFLICT (thing_id, associate_id) DO NOTHING
RETURNING thing_id, associate_id
)
, ins_ass AS (
INSERT INTO associations (thing_owner, associate)
SELECT t.associate_id, i.associate_id
FROM ins_inter i
JOIN thing t ON t.id = i.thing_id
AND t.associate_id <> i.associate_id -- exclude association to self
ON CONFLICT DO NOTHING
)
SELECT count(*) FROM ins_inter;Now, I don't see any more chance for deadlocks. Of course, all other transactions possibly also writing to the same table concurrently must adhere to the same order of rows.
If that's not possible and you are still considering
SKIP LOCKED, see:- Postgres UPDATE ... LIMIT 1
Code Snippets
WITH data AS (
SELECT t->>'thing_id' AS t_id, t->>'associate_id' AS a_id
-- Or, if not type text, cast right away:
-- SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id
FROM json_array_elements(arg_rows::json) t
ORDER BY 1, 2 -- deterministic, stable order (!!)
)
INSERT INTO interaction (thing_id, associate_id, created_time)
SELECT t_id, a_id, now()
FROM data
ON CONFLICT (thing_id, associate_id) DO NOTHING;
GET DIAGNOSTICS insert_count = ROW_COUNT;BEGIN
-- Insert the new association, if it doesn't yet exist
INSERT INTO associations (thing_owner, associate)
SELECT t.associate_id, NEW.associate_id
FROM thing t
WHERE t.id = NEW.thing_id -- PK guarantees 0 or 1 result
AND t.associate_id <> NEW.associate_id -- exclude association to self
ON CONFLICT DO NOTHING;
RETURN NULL;
ENDWITH data AS (
SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id -- asuming int
FROM json_array_elements(arg_rows::json) t
ORDER BY 1, 2 -- (!!)
)
, ins_inter AS (
INSERT INTO interaction (thing_id, associate_id, created_time)
SELECT t_id, a_id, now()
FROM data
ON CONFLICT (thing_id, associate_id) DO NOTHING
RETURNING thing_id, associate_id
)
, ins_ass AS (
INSERT INTO associations (thing_owner, associate)
SELECT t.associate_id, i.associate_id
FROM ins_inter i
JOIN thing t ON t.id = i.thing_id
AND t.associate_id <> i.associate_id -- exclude association to self
ON CONFLICT DO NOTHING
)
SELECT count(*) FROM ins_inter;Context
StackExchange Database Administrators Q#194756, answer score: 17
Revisions (0)
No revisions yet.