debugsqlModerate
Cannot insert duplicate key row on a non-unique index?
Viewed 0 times
cannotuniqueinsertnonduplicateindexrowkey
Problem
We have encountered this strange error three times over the past few days, after being error free for 8 weeks, and I'm stumped.
This is the error message:
The index we have is not unique. If you notice, the duplicate key value in the error message doesn’t even line up with the index. Strange thing is if I rerun the proc, it succeeds.
This is the most recent link I could find that has my issues but I don't see a solution.
https://www.sqlservercentral.com/forums/topic/error-cannot-insert-duplicate-key-row-in-a-non-unique-index
A couple things about my scenario:
There are 45 fields for each table, I mainly listed the ones used in indexes. I'm updating the TransactionID (clustered key) in the update statement (unnecessarily). Strange that we haven't had any issues for months until last week. And it's only happening sporadically via SSIS.
Table
`USE [DB]
GO
/ Object: Table [sales].[Transactions] Script Date: 5/29/2019 1:37:49 PM /
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND type in (N'U'))
BEGIN
CREATE TABLE [sales].[Transactions]
(
[TransactionID] [bigint] NOT NULL,
[ClientID] [int] NOT NULL,
[TransactionDate] datetime2 NOT NULL,
/ snip/
[BusinessUserID] varchar NOT NULL,
[BusinessTransactionID] varchar NOT NULL,
[InsertDate] datetime2 NOT NULL,
[UpdateDate] [da
This is the error message:
Executing the query "EXEC dbo.MergeTransactions" failed with the following error:
"Cannot insert duplicate key row in object 'sales.Transactions' with unique index
'NCI_Transactions_ClientID_TransactionDate'.
The duplicate key value is (1001, 2018-12-14 19:16:29.00, 304050920).".
The index we have is not unique. If you notice, the duplicate key value in the error message doesn’t even line up with the index. Strange thing is if I rerun the proc, it succeeds.
This is the most recent link I could find that has my issues but I don't see a solution.
https://www.sqlservercentral.com/forums/topic/error-cannot-insert-duplicate-key-row-in-a-non-unique-index
A couple things about my scenario:
- The proc is updating the TransactionID (part of the primary key) - I think this is what is causing the error but don't know why? We'll be removing that logic.
- Change tracking is enabled on the table
- Doing transaction read uncommitted
There are 45 fields for each table, I mainly listed the ones used in indexes. I'm updating the TransactionID (clustered key) in the update statement (unnecessarily). Strange that we haven't had any issues for months until last week. And it's only happening sporadically via SSIS.
Table
`USE [DB]
GO
/ Object: Table [sales].[Transactions] Script Date: 5/29/2019 1:37:49 PM /
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND type in (N'U'))
BEGIN
CREATE TABLE [sales].[Transactions]
(
[TransactionID] [bigint] NOT NULL,
[ClientID] [int] NOT NULL,
[TransactionDate] datetime2 NOT NULL,
/ snip/
[BusinessUserID] varchar NOT NULL,
[BusinessTransactionID] varchar NOT NULL,
[InsertDate] datetime2 NOT NULL,
[UpdateDate] [da
Solution
My question is, what is going on under the hood? And what is the solution?
It is a bug. The problem is that it only happens occasionally, and will be tough to reproduce. Still, your best chance is to engage Microsoft support. Update processing is mind-bendingly complex, so this will require a very detailed investigation.
For an example of the sort of complexities involved, have a look at my posts MERGE Bug with Filtered Indexes and Incorrect Results with Indexed Views. Neither of those relate directly to your issue, but they do give a flavour.
Write a deterministic update
That's all rather generic of course. Perhaps more usefully, I can say that you should look to rewrite your current
Use caution when specifying the FROM clause to provide the criteria for the update operation. The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic.
Your
Example
Let me show you an example, using tables loosely modelled on those given in the question:
To keep things simple, put one row in the target table, and four rows in the source:
All four source rows match the target on
(Updating the
The first surprise is that the
The important point is that the result is undefined, and in this case produces an outcome that matches none of the source rows:
db<>fiddle demo
More details: The ANY Aggregate is Broken
The update should be written such that it would succeed if written as the equivalent
As a bonus, you may find that rewriting your current update to be deterministic will result in your occasional bug problem also going away. The product bug will still exist for people that write non-determinstic updates of course.
It is a bug. The problem is that it only happens occasionally, and will be tough to reproduce. Still, your best chance is to engage Microsoft support. Update processing is mind-bendingly complex, so this will require a very detailed investigation.
For an example of the sort of complexities involved, have a look at my posts MERGE Bug with Filtered Indexes and Incorrect Results with Indexed Views. Neither of those relate directly to your issue, but they do give a flavour.
Write a deterministic update
That's all rather generic of course. Perhaps more usefully, I can say that you should look to rewrite your current
UPDATE statement. As the documentation says:Use caution when specifying the FROM clause to provide the criteria for the update operation. The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic.
Your
UPDATE is not deterministic, and the results are therefore undefined. You should change it so that at most one source row is identified for each target row. Without that change, the result of the update may not reflect any individual source row.Example
Let me show you an example, using tables loosely modelled on those given in the question:
CREATE TABLE dbo.Transactions
(
TransactionID bigint NOT NULL,
ClientID integer NOT NULL,
TransactionDate datetime2(2) NOT NULL,
CONSTRAINT PK_dbo_Transactions
PRIMARY KEY CLUSTERED (TransactionID),
INDEX dbo_Transactions_ClientID_TranDate
(ClientID, TransactionDate)
);
CREATE TABLE #Working
(
TransactionID bigint NULL,
ClientID integer NULL,
TransactionDate datetime2(2) NULL,
INDEX cx CLUSTERED (TransactionID)
);To keep things simple, put one row in the target table, and four rows in the source:
INSERT dbo.Transactions
(TransactionID, ClientID, TransactionDate)
VALUES
(1, 1, '2019-01-01');
INSERT #Working
(TransactionID, ClientID, TransactionDate)
VALUES
(1, 2, NULL),
(1, NULL, '2019-03-03'),
(1, 3, NULL),
(1, NULL, '2019-02-02');All four source rows match the target on
TransactionID, so which one will be used if we run an update (like the one in the question) that joins on TransactionID alone?UPDATE T
SET T.TransactionID = W.TransactionID,
T.ClientID = W.ClientID,
T.TransactionDate = W.TransactionDate
FROM #Working AS W
JOIN dbo.Transactions AS T
ON T.TransactionID = W.TransactionID;(Updating the
TransactionID column is not important for the demo, you can comment it out if you like.)The first surprise is that the
UPDATE completes without an error, despite the target table not allowing nulls in any column (all the candidate rows contain a null).The important point is that the result is undefined, and in this case produces an outcome that matches none of the source rows:
SELECT
T.TransactionID,
T.ClientID,
T.TransactionDate
FROM dbo.Transactions AS T;╔═══════════════╦══════════╦════════════════════════╗
║ TransactionID ║ ClientID ║ TransactionDate ║
╠═══════════════╬══════════╬════════════════════════╣
║ 1 ║ 2 ║ 2019-03-03 00:00:00.00 ║
╚═══════════════╩══════════╩════════════════════════╝
db<>fiddle demo
More details: The ANY Aggregate is Broken
The update should be written such that it would succeed if written as the equivalent
MERGE statement, which does check for attempts to update the same target row more than once. I do not generally recommend using MERGE directly, because it has been subject to so many implementation bugs, and normally has worse performance.As a bonus, you may find that rewriting your current update to be deterministic will result in your occasional bug problem also going away. The product bug will still exist for people that write non-determinstic updates of course.
Code Snippets
CREATE TABLE dbo.Transactions
(
TransactionID bigint NOT NULL,
ClientID integer NOT NULL,
TransactionDate datetime2(2) NOT NULL,
CONSTRAINT PK_dbo_Transactions
PRIMARY KEY CLUSTERED (TransactionID),
INDEX dbo_Transactions_ClientID_TranDate
(ClientID, TransactionDate)
);
CREATE TABLE #Working
(
TransactionID bigint NULL,
ClientID integer NULL,
TransactionDate datetime2(2) NULL,
INDEX cx CLUSTERED (TransactionID)
);INSERT dbo.Transactions
(TransactionID, ClientID, TransactionDate)
VALUES
(1, 1, '2019-01-01');
INSERT #Working
(TransactionID, ClientID, TransactionDate)
VALUES
(1, 2, NULL),
(1, NULL, '2019-03-03'),
(1, 3, NULL),
(1, NULL, '2019-02-02');UPDATE T
SET T.TransactionID = W.TransactionID,
T.ClientID = W.ClientID,
T.TransactionDate = W.TransactionDate
FROM #Working AS W
JOIN dbo.Transactions AS T
ON T.TransactionID = W.TransactionID;SELECT
T.TransactionID,
T.ClientID,
T.TransactionDate
FROM dbo.Transactions AS T;Context
StackExchange Database Administrators Q#240822, answer score: 13
Revisions (0)
No revisions yet.