HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Deadlock in INSERT ... ON DUPLICATE KEY UPDATE

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
insertupdatedeadlockduplicatekey

Problem

We have a table defined like this:

CREATE TABLE ipsum ( 
    id char(40) NOT NULL, 
    source char(40) NOT NULL, 
    ip_address varbinary(16) NOT NULL, 
    port smallint(5) unsigned NOT NULL, 
    percentage decimal(5,2) NOT NULL DEFAULT '0.00', 
    first_date datetime NOT NULL, 
    last_date datetime NOT NULL, 
    p0 tinyint(1) NOT NULL DEFAULT '0',
    p1 tinyint(1) NOT NULL DEFAULT '0', 
    blocked tinyint(1) NOT NULL, 
    created datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, 
    country varchar(100) DEFAULT NULL, 
    continent varchar(100) DEFAULT NULL, 
    note varchar(100) DEFAULT NULL, 
    PRIMARY KEY (id), 
    KEY idx_first_seen_peer (first_date), 
    KEY idx_last_seen_peer (last_date), 
    KEY ipsum (source,ip_address,port) 
) ENGINE=InnoDB DEFAULT CHARSET=latin1


We are doing bulk inserts into this table in batches of up to 15,000 records at a time. Inserts are structured like this:

INSERT INTO ipsum (id, source, ip_address, port, percentage, first_date, last_date, p0, p1, country, continent, note, blocked) 
VALUES {{ 15000 legitimate values here sorted alphabetically by the id field }} 
ON DUPLICATE KEY UPDATE 
p0 = p0 OR VALUES(p0), 
p1 = p1 OR VALUES(p1), 
last_date = GREATEST(last_date, VALUES(last_date)), 
first_date = LEAST(first_date, VALUES(first_date)), 
percentage = GREATEST(percentage, VALUES(percentage));


The records in the VALUES set have been ordered by id alphabetically.

When we reach a very high rate of input with multiple instances of the query running we start to see deadlocks:


Error: ER_LOCK_DEADLOCK: Deadlock found when trying to get lock; try restarting transaction]

No other statements are being executed against the table -- only the inserts as described above.

How can I modify the insert query to prevent the deadlocks?

Solution

Following the MySQL recommendations, some options I'd consider:

  • Check the SHOW ENGINE INNODB STATUS to see what the deadlock is on.



  • Convert the INSERT ON DUPLICATE KEY UPDATE into a SELECT then INSERT or UPDATE.



  • Reduce the size of the batches so they are quicker.



  • Reduce the concurrency.



What is the relation between the ids in the concurrent batches? Do the ranges of ids overlap?

Context

StackExchange Database Administrators Q#164772, answer score: 4

Revisions (0)

No revisions yet.