HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMajor

Merge statement deadlocking itself

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
itselfdeadlockingstatementmerge

Problem

I have the following procedure (SQL Server 2008 R2):

create procedure usp_SaveCompanyUserData
    @companyId bigint,
    @userId bigint,
    @dataTable tt_CoUserdata readonly
as
begin

    set nocount, xact_abort on;

    merge CompanyUser with (holdlock) as r
    using (
        select 
            @companyId as CompanyId, 
            @userId as UserId, 
            MyKey, 
            MyValue
        from @dataTable) as newData
    on r.CompanyId = newData.CompanyId
        and r.UserId = newData.UserId
        and r.MyKey = newData.MyKey
    when not matched then
        insert (CompanyId, UserId, MyKey, MyValue) values
        (@companyId, @userId, newData.MyKey, newData.MyValue);

end;


CompanyId, UserId, MyKey form the composite key for the target table. CompanyId is a foreign key to a parent table. Also, there is a non-clustered index on CompanyId asc, UserId asc.

It is called from many different threads, and I am consistently getting deadlocks between different processes calling this same statement. My understanding was that the "with (holdlock)" was necessary to prevent insert/update race condition errors.

I assume that two different threads are locking rows (or pages) in different orders when they are validating the constraints, and thus are deadlocking.

Is this a correct assumption?

What is the best way to resolve this situation (i.e. no deadlocks, minimum impact on multi-threaded performance)?

(If you view the image in a new tab, it is readable. Sorry for the small size.)

  • There are at most 28 rows are in the @datatable.



  • I have traced back through the code, and I cannot see anywhere that we start a transaction here.



  • The foreign key is set up to cascade only on delete, and there were no deletions from the parent table.

Solution

There would not be a problem if the table variable only ever held one value. With multiple rows, there is a new possibility for deadlock. Suppose two concurrent processes (A & B) run with table variables containing (1, 2) and (2, 1) for the same company.

Process A reads the destination, finds no row, and inserts the value '1'. It holds an exclusive row lock on value '1'. Process B reads the destination, finds no row, and inserts the value '2'. It holds an exclusive row lock on value '2'.

Now process A needs to process row 2, and process B needs to process row 1. Neither process can make progress because it requires a lock that is incompatible with the exclusive lock held by the other process.

To avoid deadlocks with multiple rows, the rows need to be processed (and tables accessed) in the same order every time. The table variable in the execution plan shown in the question is a heap, so the rows have no intrinsic order (they are quite likely to be read in insertion order, though this is not guaranteed):

The lack of consistent row processing order leads directly to the deadlock opportunity. A second consideration is that the lack of a key uniqueness guarantee means that a Table Spool is necessary to provide correct Halloween Protection. The spool is an eager spool, meaning all rows are written to a tempdb worktable before being read back and replayed for the Insert operator.

Redefining the TYPE of the table variable to include a clustered PRIMARY KEY:

DROP TYPE dbo.CoUserData;

CREATE TYPE dbo.CoUserData
AS TABLE
(
    MyKey   integer NOT NULL PRIMARY KEY CLUSTERED,
    MyValue integer NOT NULL
);


The execution plan now shows a scan of the clustered index and the uniqueness guarantee means the optimizer is able to safely remove the Table Spool:

In tests with 5000 iterations of the MERGE statement on 128 threads, no deadlocks occurred with the clustered table variable. I should stress that this is only on the basis of observation; the clustered table variable could also (technically) produce its rows in a variety of orders, but the chances of a consistent order are very greatly enhanced. The observed behaviour would need to be retested for every new cumulative update, service pack, or new version of SQL Server, of course.

In case the table variable definition cannot be changed, there is another alternative:

MERGE dbo.CompanyUser AS R
USING 
    (SELECT DISTINCT MyKey, MyValue FROM @DataTable) AS NewData ON
    R.CompanyId = @CompanyID
    AND R.UserID = @UserID
    AND R.MyKey = NewData.MyKey
WHEN NOT MATCHED THEN 
    INSERT 
        (CompanyID, UserID, MyKey, MyValue) 
    VALUES
        (@CompanyID, @UserID, NewData.MyKey, NewData.MyValue)
OPTION (ORDER GROUP);


This also achieves the elimination of the spool (and row-order consistency) at the cost of introducing an explicit sort:

This plan also produced no deadlocks using the same test. Reproduction script below:

CREATE TYPE dbo.CoUserData
AS TABLE
(
    MyKey   integer NOT NULL /* PRIMARY KEY */,
    MyValue integer NOT NULL
);
GO
CREATE TABLE dbo.Company
(
    CompanyID   integer NOT NULL

    CONSTRAINT PK_Company
        PRIMARY KEY (CompanyID)
);
GO
CREATE TABLE dbo.CompanyUser
(
    CompanyID   integer NOT NULL,
    UserID      integer NOT NULL,
    MyKey       integer NOT NULL,
    MyValue     integer NOT NULL

    CONSTRAINT PK_CompanyUser
        PRIMARY KEY CLUSTERED
            (CompanyID, UserID, MyKey),

    FOREIGN KEY (CompanyID)
        REFERENCES dbo.Company (CompanyID),
);
GO
CREATE NONCLUSTERED INDEX nc1
ON dbo.CompanyUser (CompanyID, UserID);
GO
INSERT dbo.Company (CompanyID) VALUES (1);
GO
DECLARE 
    @DataTable AS dbo.CoUserData,
    @CompanyID integer = 1,
    @UserID integer = 1;

INSERT @DataTable
SELECT TOP (10)
    V.MyKey,
    V.MyValue
FROM
(
    VALUES
        (1, 1),
        (2, 2),
        (3, 3),
        (4, 4),
        (5, 5),
        (6, 6),
        (7, 7),
        (8, 8),
        (9, 9)
) AS V (MyKey, MyValue)
ORDER BY NEWID();

BEGIN TRANSACTION;

    -- Test MERGE statement here

ROLLBACK TRANSACTION;

Code Snippets

DROP TYPE dbo.CoUserData;

CREATE TYPE dbo.CoUserData
AS TABLE
(
    MyKey   integer NOT NULL PRIMARY KEY CLUSTERED,
    MyValue integer NOT NULL
);
MERGE dbo.CompanyUser AS R
USING 
    (SELECT DISTINCT MyKey, MyValue FROM @DataTable) AS NewData ON
    R.CompanyId = @CompanyID
    AND R.UserID = @UserID
    AND R.MyKey = NewData.MyKey
WHEN NOT MATCHED THEN 
    INSERT 
        (CompanyID, UserID, MyKey, MyValue) 
    VALUES
        (@CompanyID, @UserID, NewData.MyKey, NewData.MyValue)
OPTION (ORDER GROUP);
CREATE TYPE dbo.CoUserData
AS TABLE
(
    MyKey   integer NOT NULL /* PRIMARY KEY */,
    MyValue integer NOT NULL
);
GO
CREATE TABLE dbo.Company
(
    CompanyID   integer NOT NULL

    CONSTRAINT PK_Company
        PRIMARY KEY (CompanyID)
);
GO
CREATE TABLE dbo.CompanyUser
(
    CompanyID   integer NOT NULL,
    UserID      integer NOT NULL,
    MyKey       integer NOT NULL,
    MyValue     integer NOT NULL

    CONSTRAINT PK_CompanyUser
        PRIMARY KEY CLUSTERED
            (CompanyID, UserID, MyKey),

    FOREIGN KEY (CompanyID)
        REFERENCES dbo.Company (CompanyID),
);
GO
CREATE NONCLUSTERED INDEX nc1
ON dbo.CompanyUser (CompanyID, UserID);
GO
INSERT dbo.Company (CompanyID) VALUES (1);
GO
DECLARE 
    @DataTable AS dbo.CoUserData,
    @CompanyID integer = 1,
    @UserID integer = 1;

INSERT @DataTable
SELECT TOP (10)
    V.MyKey,
    V.MyValue
FROM
(
    VALUES
        (1, 1),
        (2, 2),
        (3, 3),
        (4, 4),
        (5, 5),
        (6, 6),
        (7, 7),
        (8, 8),
        (9, 9)
) AS V (MyKey, MyValue)
ORDER BY NEWID();

BEGIN TRANSACTION;

    -- Test MERGE statement here

ROLLBACK TRANSACTION;

Context

StackExchange Database Administrators Q#23467, answer score: 35

Revisions (0)

No revisions yet.