HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

INSERT ON CONFLICT DO UPDATE SET (an UPSERT) statement with a unique constraint is generating constraint violations when run concurrently

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
uniqueinsertupdatestatementwithconcurrentlyupsertconstraintviolationsgenerating

Problem

I am attempting to run an Postgres UPSERT in fairly concurrent code. I have been seeing a duplicate key value violates unique constraint error when my service is under load.

By way of a reduced example, I have the following table, which has an id primary key and an id_secondary unique key (meant to function as a second ID)
CREATE TABLE "test_table" (
"id" text NOT NULL,
"id_secondary" text NOT NULL,
"value" bool NOT NULL DEFAULT 'FALSE',
PRIMARY KEY ("id")
);
CREATE UNIQUE INDEX "test_table_uc" ON "public"."test_table" USING BTREE ("id_secondary");


I am running the following two statements simultaneously:
INSERT INTO "test_table" ("id", "id_secondary", "value") VALUES('1', '1', 'false')
ON CONFLICT ("id")
DO UPDATE
SET "value" = "excluded"."value";

INSERT INTO "test_table" ("id", "id_secondary", "value") VALUES('1', '1', 'true')
ON CONFLICT ("id")
DO UPDATE
SET "value" = "excluded"."value";


and it is my expectation that whichever statement executes last will be the one to set the value of "value".

Occasionally I am receiving the error duplicate key value violates unique constraint "test_table_uc" from one of these statements. My full test setup in Go (using testify) is:
suite.Run("Concurrency", func() {
db, err := sql.Open("postgres", "host=localhost port=35432 dbname=test sslmode=disable user=postgres")
db.SetMaxOpenConns(10)
suite.Require().NoError(err)

for ii := 0; ii

Where I am allowing concurrent SQL connections using
db.SetMaxOpenConns(10) and the go func() syntax is running both SQL statements in different goroutines (threads). Only two of the statements are run at a time, and the value of "id" and "id_secondary" are always set to the loop iteration. I wait for both statements to return before iterating.

It generally fails with the
duplicate key value violates unique constraint error within a few loops.

I have not modified my isolation settings and I am running in
SHOW default_t

Solution

After reading through the Postgres source code I believe I have found a satisfying answer to this.

In the ON CONFLICT statements given, only id is mentioned as a column. This will cause Postgres to use test_table_pkey as an arbiter index.

Normally, when evaluating an ON CONFLICT Postgres will speculatively insert the new row, and check only the arbiter indexes for conflicts (against currently committed rows) using ExecCheckIndexConstraints. If there is a conflict on test_table_pkey and test_table_uc it will only be checking for the arbiter index test_table_pkey, which will have a conflict and cause Postgres to revert to the ON CONFLICT clause.

If two of these INSERT expressions are occurring at the same time, neither will see each other (as they are both speculative and both pass ExecCheckIndexConstraints). This will cause Postgres to attempt to commit both speculative rows. The first row will commit properly. The second row will cause two constraint violations during ExecInsertIndexTuples. The function ExecInsertIndexTuples will suppress errors on any arbiter indexes only, meaning that the constraint violation on test_table_pkey will get caught (and cause Postgres to retry the insert from the top), but the test_table_uc violation will become an uncaught error which causes the statement to error.

Essentially, this means if your upsert is potentially causing conflicts on constraints that are not named in the ON CONFLICT clause, these statements will fail, but only during concurrent inserts.

Context

StackExchange Database Administrators Q#315039, answer score: 2

Revisions (0)

No revisions yet.