patternsqlMinor
Is a MERGE with OUTPUT better practice than a conditional INSERT and SELECT?
Viewed 0 times
insertmergewithpracticethanoutputbetterconditionalandselect
Problem
We often encounter the "If not exists, insert" situation. Dan Guzman's blog has an excellent investigation of how to make this process threadsafe.
I have a basic table that simply catalogs a string to an integer from a
It's not an
In my situation I only have to deal with
I could do this witout using
I think that this second approach is clearer to the reader, but I'm not convinced it's "better" practice
Or perhaps there is another better way that I haven't considered
I did search and reference other questions. This one: https://stackoverflow.com/questions/5288283/sql-server-insert-if-not-exists-best-pra
I have a basic table that simply catalogs a string to an integer from a
SEQUENCE. In a stored procedure I need to either get the integer key for the value if it exists, or INSERT it and then get the resulting value. There is a uniqueness constraint on the dbo.NameLookup.ItemName column so data integrity isn't at risk but I don't want to encounter the exceptions.It's not an
IDENTITY so I can't get SCOPE_IDENTITY and the value could be NULL in certain cases.In my situation I only have to deal with
INSERT safety on the table so I'm trying to decide if it's better practice to use MERGE like this:SET NOCOUNT, XACT_ABORT ON;
DECLARE @vValueId INT
DECLARE @inserted AS TABLE (Id INT NOT NULL)
MERGE
dbo.NameLookup WITH (HOLDLOCK) AS f
USING
(SELECT @vName AS val WHERE @vName IS NOT NULL AND LEN(@vName) > 0) AS new_item
ON f.ItemName= new_item.val
WHEN MATCHED THEN
UPDATE SET @vValueId = f.Id
WHEN NOT MATCHED BY TARGET THEN
INSERT
(ItemName)
VALUES
(@vName)
OUTPUT inserted.Id AS Id INTO @inserted;
SELECT @vValueId = s.Id FROM @inserted AS sI could do this witout using
MERGE with just a conditional INSERT followed by a SELECTI think that this second approach is clearer to the reader, but I'm not convinced it's "better" practice
SET NOCOUNT, XACT_ABORT ON;
INSERT INTO
dbo.NameLookup (ItemName)
SELECT
@vName
WHERE
NOT EXISTS (SELECT * FROM dbo.NameLookup AS t WHERE @vName IS NOT NULL AND LEN(@vName) > 0 AND t.ItemName = @vName)
DECLARE @vValueId int;
SELECT @vValueId = i.Id FROM dbo.NameLookup AS i WHERE i.ItemName = @vNameOr perhaps there is another better way that I haven't considered
I did search and reference other questions. This one: https://stackoverflow.com/questions/5288283/sql-server-insert-if-not-exists-best-pra
Solution
Because you are using a Sequence, you can use the same NEXT VALUE FOR function -- that you already have in a Default Constraint on the
So that takes care of part of the overall situation. The other part is handling the concurrency issue of two processes, at the exact same time, not finding an existing row for the exact same string, and proceeding with the
One way to handle these types of concurrency issues is to force this particular operation to be single threaded. The way to do that is by using application locks (which work across sessions). While effective, they can be a bit heavy-handed for a situation like this where the frequency of collisions is probably fairly low.
The other way to deal with the collisions is to accept that they will sometimes occur and handle them rather than try to avoid them. Using the
Test Setup: Sequence, Table, and Unique Index
Test Setup: Stored Procedure
The Test
Question from O.P.
Why is this better than the
The reasoning behind this approach is:
Comment from @SqlZim's answer (emphasis added)
I personally prefer to try and tailor a solution to avoid doing that when possible. In this case, I don't feel that using the locks from
I would agree with this first sentence if it were amended to state "and _when prudent". Just because something is technically possible does
Id Primary Key field -- to generate a new Id value ahead of time. Generating the value first means that you don't need to worry about not having SCOPE_IDENTITY, which then means that you don't need either the OUTPUT clause or doing an additional SELECT to get the new value; you will have the value before you do the INSERT, and you don't even need to mess with SET IDENTITY INSERT ON / OFF :-)So that takes care of part of the overall situation. The other part is handling the concurrency issue of two processes, at the exact same time, not finding an existing row for the exact same string, and proceeding with the
INSERT. The concern is about avoiding the Unique Constraint violation that would occur.One way to handle these types of concurrency issues is to force this particular operation to be single threaded. The way to do that is by using application locks (which work across sessions). While effective, they can be a bit heavy-handed for a situation like this where the frequency of collisions is probably fairly low.
The other way to deal with the collisions is to accept that they will sometimes occur and handle them rather than try to avoid them. Using the
TRY...CATCH construct, you can effectively trap a specific error (in this case: "unique constraint violation", Msg 2601) and re-execute the SELECT to get the Id value since we know that it now exists due to being in the CATCH block with that particular error. Other errors can be handled in the typical RAISERROR / RETURN or THROW manner.Test Setup: Sequence, Table, and Unique Index
USE [tempdb];
CREATE SEQUENCE dbo.MagicNumber
AS INT
START WITH 1
INCREMENT BY 1;
CREATE TABLE dbo.NameLookup
(
[Id] INT NOT NULL
CONSTRAINT [PK_NameLookup] PRIMARY KEY CLUSTERED
CONSTRAINT [DF_NameLookup_Id] DEFAULT (NEXT VALUE FOR dbo.MagicNumber),
[ItemName] NVARCHAR(50) NOT NULL
);
CREATE UNIQUE NONCLUSTERED INDEX [UIX_NameLookup_ItemName]
ON dbo.NameLookup ([ItemName]);
GOTest Setup: Stored Procedure
CREATE PROCEDURE dbo.GetOrInsertName
(
@SomeName NVARCHAR(50),
@ID INT OUTPUT,
@TestRaceCondition BIT = 0
)
AS
SET NOCOUNT ON;
BEGIN TRY
SELECT @ID = nl.[Id]
FROM dbo.NameLookup nl
WHERE nl.[ItemName] = @SomeName
AND @TestRaceCondition = 0;
IF (@ID IS NULL)
BEGIN
SET @ID = NEXT VALUE FOR dbo.MagicNumber;
INSERT INTO dbo.NameLookup ([Id], [ItemName])
VALUES (@ID, @SomeName);
END;
END TRY
BEGIN CATCH
IF (ERROR_NUMBER() = 2601) -- "Cannot insert duplicate key row in object"
BEGIN
SELECT @ID = nl.[Id]
FROM dbo.NameLookup nl
WHERE nl.[ItemName] = @SomeName;
END;
ELSE
BEGIN
;THROW; -- SQL Server 2012 or newer
/*
DECLARE @ErrorNumber INT = ERROR_NUMBER(),
@ErrorMessage NVARCHAR(4000) = ERROR_MESSAGE();
RAISERROR(N'Msg %d: %s', 16, 1, @ErrorNumber, @ErrorMessage);
RETURN;
*/
END;
END CATCH;
GOThe Test
DECLARE @ItemID INT;
EXEC dbo.GetOrInsertName
@SomeName = N'test1',
@ID = @ItemID OUTPUT;
SELECT @ItemID AS [ItemID];
GO
DECLARE @ItemID INT;
EXEC dbo.GetOrInsertName
@SomeName = N'test1',
@ID = @ItemID OUTPUT,
@TestRaceCondition = 1;
SELECT @ItemID AS [ItemID];
GOQuestion from O.P.
Why is this better than the
MERGE? Won't I get the same functionality without the TRY by using the WHERE NOT EXISTS clause?MERGE has various "issues" (several references are linked in @SqlZim's answer so no need to duplicate that info here). And, there is no additional locking in this approach (less contention), so it should be better on concurrency. In this approach you'll never get a Unique Constraint violation, all without any HOLDLOCK, etc. It's pretty much guaranteed to work.The reasoning behind this approach is:
- If you have enough executions of this procedure such that you need to worry about collisions, then you don't want to:
- take any more steps than are necessary
- hold locks on any resources for longer than necessary
- Since collisions can only happen upon new entries (new entries submitted at the exact same time), the frequency of falling into the
CATCHblock in the first place will be pretty low. It makes more sense to optimize the code that will run 99% of the time instead of the code that will run 1% of the time (unless there is no cost to optimizing both, but that is not the case here).
Comment from @SqlZim's answer (emphasis added)
I personally prefer to try and tailor a solution to avoid doing that when possible. In this case, I don't feel that using the locks from
serializable is a heavy handed approach, and I would be confident it would handle high concurrency well.I would agree with this first sentence if it were amended to state "and _when prudent". Just because something is technically possible does
Code Snippets
USE [tempdb];
CREATE SEQUENCE dbo.MagicNumber
AS INT
START WITH 1
INCREMENT BY 1;
CREATE TABLE dbo.NameLookup
(
[Id] INT NOT NULL
CONSTRAINT [PK_NameLookup] PRIMARY KEY CLUSTERED
CONSTRAINT [DF_NameLookup_Id] DEFAULT (NEXT VALUE FOR dbo.MagicNumber),
[ItemName] NVARCHAR(50) NOT NULL
);
CREATE UNIQUE NONCLUSTERED INDEX [UIX_NameLookup_ItemName]
ON dbo.NameLookup ([ItemName]);
GOCREATE PROCEDURE dbo.GetOrInsertName
(
@SomeName NVARCHAR(50),
@ID INT OUTPUT,
@TestRaceCondition BIT = 0
)
AS
SET NOCOUNT ON;
BEGIN TRY
SELECT @ID = nl.[Id]
FROM dbo.NameLookup nl
WHERE nl.[ItemName] = @SomeName
AND @TestRaceCondition = 0;
IF (@ID IS NULL)
BEGIN
SET @ID = NEXT VALUE FOR dbo.MagicNumber;
INSERT INTO dbo.NameLookup ([Id], [ItemName])
VALUES (@ID, @SomeName);
END;
END TRY
BEGIN CATCH
IF (ERROR_NUMBER() = 2601) -- "Cannot insert duplicate key row in object"
BEGIN
SELECT @ID = nl.[Id]
FROM dbo.NameLookup nl
WHERE nl.[ItemName] = @SomeName;
END;
ELSE
BEGIN
;THROW; -- SQL Server 2012 or newer
/*
DECLARE @ErrorNumber INT = ERROR_NUMBER(),
@ErrorMessage NVARCHAR(4000) = ERROR_MESSAGE();
RAISERROR(N'Msg %d: %s', 16, 1, @ErrorNumber, @ErrorMessage);
RETURN;
*/
END;
END CATCH;
GODECLARE @ItemID INT;
EXEC dbo.GetOrInsertName
@SomeName = N'test1',
@ID = @ItemID OUTPUT;
SELECT @ItemID AS [ItemID];
GO
DECLARE @ItemID INT;
EXEC dbo.GetOrInsertName
@SomeName = N'test1',
@ID = @ItemID OUTPUT,
@TestRaceCondition = 1;
SELECT @ItemID AS [ItemID];
GOContext
StackExchange Database Administrators Q#158092, answer score: 9
Revisions (0)
No revisions yet.