patternsqlModerate
MERGE deadlocking prevention
Viewed 0 times
deadlockingpreventionmerge
Problem
In one of our databases we have a table that is intensively concurrently accessed by multiple threads. Threads do update or insert rows via
Simplified setup is following. The table is heap with two unique nonclustered indexes over
and the typical query is
i.e., matching happens by unique index key. Hint
I did small investigation and the following is what I found.
In most of the cases query execution plan is
with the following locking pattern
i.e.
Sometimes, however, query execution plan is different
(this plan shape can be forced by adding
notice
Sinc
MERGE. There are also threads that delete rows on occasion, so table data is very volatile. Threads doing upserts suffer from deadlocking sometimes. The issue looks similar to one described in this question. The difference, though, is that in our case each thread do update or insert exactly one row.Simplified setup is following. The table is heap with two unique nonclustered indexes over
CREATE TABLE [Cache]
(
[UID] uniqueidentifier NOT NULL CONSTRAINT DF_Cache_UID DEFAULT (newid()),
[ItemKey] varchar(200) NOT NULL,
[FileName] nvarchar(255) NOT NULL,
[Expires] datetime2(2) NOT NULL,
CONSTRAINT [PK_Cache] PRIMARY KEY NONCLUSTERED ([UID])
)
GO
CREATE UNIQUE INDEX IX_Cache ON [Cache] ([ItemKey]);
GOand the typical query is
DECLARE
@itemKey varchar(200) = 'Item_0F3C43A6A6A14255B2EA977EA730EDF2',
@fileName nvarchar(255) = 'File_0F3C43A6A6A14255B2EA977EA730EDF2.dat';
MERGE INTO [Cache] WITH (HOLDLOCK) T
USING (
VALUES (@itemKey, @fileName, dateadd(minute, 10, sysdatetime()))
) S(ItemKey, FileName, Expires)
ON T.ItemKey = S.ItemKey
WHEN MATCHED THEN
UPDATE
SET
T.FileName = S.FileName,
T.Expires = S.Expires
WHEN NOT MATCHED THEN
INSERT (ItemKey, FileName, Expires)
VALUES (S.ItemKey, S.FileName, S.Expires)
OUTPUT deleted.FileName;i.e., matching happens by unique index key. Hint
HOLDLOCK is here, because of concurrency (as advised here).I did small investigation and the following is what I found.
In most of the cases query execution plan is
with the following locking pattern
i.e.
IX lock on the object followed by more granular locks.Sometimes, however, query execution plan is different
(this plan shape can be forced by adding
INDEX(0) hint) and its locking pattern isnotice
X lock placed on object after IX is placed already.Sinc
Solution
Is placing
It looks a bit odd, but it is valid. At the time the
More specifically for the current scenario, it is true that serializable key-range locks are not available for a heap, so the only alternative is an
On the other hand, locking is complex, and intent locks may sometimes be taken for internal reasons not necessarily related to the intention to take lower level locks. Taking
So, the current situation is unfortunate for your deadlock scenario, and it may be avoidable in principle, but that's not necessarily the same as being a 'bug'. You could report the issue via your normal support channel, or on Microsoft Connect, if you need a definitive answer on this.
Could it happen that
No.
Forcing the index is a way of ensuring that key-range locks can be taken. Together with the update locks naturally taken when processing an access method for rows to change, this provides enough of a guarantee to avoid concurrency issues in your scenario.
If the schema of the table does not change (e.g. adding a new index), the hint is also enough to avoid this query deadlocking with itself. There is still a possibility of a cyclic deadlock with other queries that might access the heap before the nonclustered index (such as an update to the key of the nonclustered index).
...variable declaration from
This breaks the guarantee that a single row will be affected, so an Eager Table Spool is introduced for Halloween protection. As a further workaround for this, make the guarantee explicit with
My understanding [...] is that locking is situational in a great degree, and certain execution plan shape does not implies certain locking pattern.
There is certainly much more going on that is visible in an execution plan. You can force a certain plan shape with e.g. a plan guide, but the engine may still decide to take different locks at runtime. The chances are fairly low if you incorporate the
General remarks
It is somewhat unusual to see a heap table being used in this way. You should consider the merits of converting it to a clustered table, perhaps using the index Dan Guzman suggested in a comment:
This may have important space reuse advantages, as well as providing a good workaround for the current deadlocking issue.
Note how the RID lookup is no longer necessary:
If you can guarantee the existence of a unique index on
Both
See the related Q & A SQL Server 2014 Concurrent input issue for the correct pattern to use, and more information about
Deadlocks cannot always be prevented. They can be reduced to a minimum with careful coding and design, but the application should always be prepared to handle the odd deadlock gracefully (e.g. recheck conditions then retry).
If you have
IX followed by X on the object eligible? Is it bug or not?It looks a bit odd, but it is valid. At the time the
IX is taken, the intention may well be to take X locks at a lower level. There's nothing to say that such locks must actually be taken. After all, there might not be anything to lock at the lower level; the engine cannot know that ahead of time. In addition, there may be optimizations such that lower level locks can be skipped (an example for IS and S locks can be seen here).More specifically for the current scenario, it is true that serializable key-range locks are not available for a heap, so the only alternative is an
X lock at the object level. In that sense, the engine might be able to detect early that an X lock will inevitably be required if the access method is a heap scan, and so avoid taking the IX lock.On the other hand, locking is complex, and intent locks may sometimes be taken for internal reasons not necessarily related to the intention to take lower level locks. Taking
IX may be the least invasive way of providing a required protection for some obscure edge case. For a similar sort of consideration, see Shared Lock issued on IsolationLevel.ReadUncommitted.So, the current situation is unfortunate for your deadlock scenario, and it may be avoidable in principle, but that's not necessarily the same as being a 'bug'. You could report the issue via your normal support channel, or on Microsoft Connect, if you need a definitive answer on this.
Could it happen that
FORCESEEK will be ignored and bad locking pattern will be used?No.
FORCESEEK is less of a hint and more of a directive. If the optimizer cannot find a plan that honours the 'hint', it will produce an error.Forcing the index is a way of ensuring that key-range locks can be taken. Together with the update locks naturally taken when processing an access method for rows to change, this provides enough of a guarantee to avoid concurrency issues in your scenario.
If the schema of the table does not change (e.g. adding a new index), the hint is also enough to avoid this query deadlocking with itself. There is still a possibility of a cyclic deadlock with other queries that might access the heap before the nonclustered index (such as an update to the key of the nonclustered index).
...variable declaration from
varchar(200) to nvarchar(200)...This breaks the guarantee that a single row will be affected, so an Eager Table Spool is introduced for Halloween protection. As a further workaround for this, make the guarantee explicit with
MERGE TOP (1) INTO [Cache]....My understanding [...] is that locking is situational in a great degree, and certain execution plan shape does not implies certain locking pattern.
There is certainly much more going on that is visible in an execution plan. You can force a certain plan shape with e.g. a plan guide, but the engine may still decide to take different locks at runtime. The chances are fairly low if you incorporate the
TOP (1) element above.General remarks
It is somewhat unusual to see a heap table being used in this way. You should consider the merits of converting it to a clustered table, perhaps using the index Dan Guzman suggested in a comment:
CREATE UNIQUE CLUSTERED INDEX IX_Cache ON [Cache] ([ItemKey]);This may have important space reuse advantages, as well as providing a good workaround for the current deadlocking issue.
MERGE is also slightly unusual to see in a high-concurrency environment. Somewhat counter-intuitively, it is often more efficient to perform separate INSERT and UPDATE statements, for example:DECLARE
@itemKey varchar(200) = 'Item_0F3C43A6A6A14255B2EA977EA730EDF2',
@fileName nvarchar(255) = 'File_0F3C43A6A6A14255B2EA977EA730EDF2.dat';
BEGIN TRANSACTION;
DECLARE @expires datetime2(2) = DATEADD(MINUTE, 10, SYSDATETIME());
UPDATE TOP (1) dbo.Cache WITH (SERIALIZABLE, UPDLOCK)
SET [FileName] = @fileName,
Expires = @expires
OUTPUT Deleted.[FileName]
WHERE
ItemKey = @itemKey;
IF @@ROWCOUNT = 0
INSERT dbo.Cache
(ItemKey, [FileName], Expires)
VALUES
(@itemKey, @fileName, @expires);
COMMIT TRANSACTION;Note how the RID lookup is no longer necessary:
If you can guarantee the existence of a unique index on
ItemKey (as in the question) the redundant TOP (1) in the UPDATE can be removed, giving the simpler plan:Both
INSERT and UPDATE plans qualify for a trivial plan in either case. MERGE always requires full cost-based optimization.See the related Q & A SQL Server 2014 Concurrent input issue for the correct pattern to use, and more information about
MERGE.Deadlocks cannot always be prevented. They can be reduced to a minimum with careful coding and design, but the application should always be prepared to handle the odd deadlock gracefully (e.g. recheck conditions then retry).
If you have
Code Snippets
CREATE UNIQUE CLUSTERED INDEX IX_Cache ON [Cache] ([ItemKey]);DECLARE
@itemKey varchar(200) = 'Item_0F3C43A6A6A14255B2EA977EA730EDF2',
@fileName nvarchar(255) = 'File_0F3C43A6A6A14255B2EA977EA730EDF2.dat';
BEGIN TRANSACTION;
DECLARE @expires datetime2(2) = DATEADD(MINUTE, 10, SYSDATETIME());
UPDATE TOP (1) dbo.Cache WITH (SERIALIZABLE, UPDLOCK)
SET [FileName] = @fileName,
Expires = @expires
OUTPUT Deleted.[FileName]
WHERE
ItemKey = @itemKey;
IF @@ROWCOUNT = 0
INSERT dbo.Cache
(ItemKey, [FileName], Expires)
VALUES
(@itemKey, @fileName, @expires);
COMMIT TRANSACTION;Context
StackExchange Database Administrators Q#187776, answer score: 15
Revisions (0)
No revisions yet.