patternsqlMinor
"Upsert" trigger writing mass duplicates from SSIS
Viewed 0 times
triggermassssiswritingupsertfromduplicates
Problem
I have a problem with an "upsert" trigger, being triggered by a SSIS package with fastload. (FIRE_TRIGGERS is enabled)
Inserting new rows works fine, but all existing rows get updated with one set of values.
```
SELECT
text_vchr, count(*) as dups
FROM MP_VOL
group by text
CREATE TRIGGER [dbo].[MP_VOL_UPSERT]
ON [dbo].[vwMP_VOL]
INSTEAD OF INSERT
AS
BEGIN
SET NOCOUNT ON;
UPDATE [dbo].[vwMP_VOL]
SET
[vwMP_VOL].[id_batch] = inserted.id_batch
,[vwMP_VOL].[id_invoice] = inserted.id_invoice
,[vwMP_VOL].[text_vchr] = inserted.text_vchr
,[vwMP_VOL].[amt_frt_tcur] = inserted.amt_frt_tcur
,[vwMP_VOL].[amt_line_hcur] = inserted.amt_line_hcur
,[vwMP_VOL].[line_total_hcur] = inserted.line_total_hcur
,[vwMP_VOL].[id_tax_1] = inserted.id_tax_1
,[vwMP_VOL].[id_tax_2] = inserted.id_tax_2
,[vwMP_VOL].[id_tax_3] = inserted.id_tax_3
,[vwMP_VOL].[id_tax_4] = inserted.id_tax_4
,[vwMP_VOL].[line_total_rem_hcu] = inserted.line_total_rem_hcu
,[vwMP_VOL].[id_po] = inserted.id_po
,[vwMP_VOL].[cnt_rlse] = inserted.cnt_rlse
,[vwMP_VOL].[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted
INNER JOIN (SELECT [Org]
,[id_vend]
,[id_org_orig]
,[id_vchr_nbr]
,[id_vchr_line]
FROM [vwMP_VOL]) AS old
ON old.[Org] = inserted.Org
AND old.[id_vend] = inserted.id_vend
AND old.[id_org_orig] = inserted.id_org_orig
AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
AND old.[id_vchr_line] = inserted.id_vchr_line;
INSERT INTO [dbo].[vwMP_VOL]
SELECT inserted.* FROM inserted
LEFT JOIN (SELECT [Org]
,[id_vend]
,[id_org_orig]
,[id_vchr_nbr]
,[id_vchr_line]
FROM [vwMP_VOL]) AS old
ON old.[Org] = inserted.Org
AND old.[id_vend] = inserted.id_vend
AND old.[id_org_orig] = inserted.id_org_orig
AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
AND old.[id_vchr_line] = inserted.id_vchr_line
WHERE old.Org IS NULL;
ENDInserting new rows works fine, but all existing rows get updated with one set of values.
```
SELECT
text_vchr, count(*) as dups
FROM MP_VOL
group by text
Solution
I suspect the issue appears because you are referencing the view twice in the
The result is that the the two references are unrelated and the rows are updated with unrelated data (and possibly many times as well!)
See the the official MSDN documentation about
Best Practices
...
Use caution when specifying the FROM clause to provide the criteria for the update operation. The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic.
and a more detailed analysis of this "feature" and other issues it can cause, in Hugo Kornelis's blog post: Let's deprecate
Correctness? Bah, who cares?
Well, most do. That’s why we test.
If I mess up the join criteria in a
I think the best way to solve this is to use the alias in the
This would work, too, but I don't like it as much as the previous because we can't alias the table:
UPDATE statement, once in UPDATE [dbo].[vwMP_VOL] clause and then in FROM clause and the second time you are giving it an alias. But you are not using the alias in the SET clause, only the first reference.The result is that the the two references are unrelated and the rows are updated with unrelated data (and possibly many times as well!)
See the the official MSDN documentation about
UPDATE and the caution:Best Practices
...
Use caution when specifying the FROM clause to provide the criteria for the update operation. The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic.
and a more detailed analysis of this "feature" and other issues it can cause, in Hugo Kornelis's blog post: Let's deprecate
UPDATE FROM! and especially this paragraph:Correctness? Bah, who cares?
Well, most do. That’s why we test.
If I mess up the join criteria in a
SELECT query so that too many rows from the second table match, I’ll see it as soon as I test, because I get more rows back then expected. If I mess up the subquery criteria in an ANSI standard UPDATE query in a similar way, I see it even sooner, because SQL Server will return an error if the subquery returns more than a single value. But with the proprietary UPDATE FROM syntax, I can mess up the join and never notice – SQL Server will happily update the same row over and over again if it matches more than one row in the joined table, with only the result of the last of those updates sticking. And there is no way of knowing which row that will be, since that depends in the query execution plan that happens to be chosen. A worst case scenario would be one where the execution plan just happens to result in the expected outcome during all tests on the single-processor development server – and then, after deployment to the four-way dual-core production server, our precious data suddenly hits the fan…I think the best way to solve this is to use the alias in the
UPDATE clause and change accordingly the SET clause:UPDATE old
SET
[id_batch] = inserted.id_batch
,[id_invoice] = inserted.id_invoice
,[text_vchr] = inserted.text_vchr
,[amt_frt_tcur] = inserted.amt_frt_tcur
,[amt_line_hcur] = inserted.amt_line_hcur
,[line_total_hcur] = inserted.line_total_hcur
,[id_tax_1] = inserted.id_tax_1
,[id_tax_2] = inserted.id_tax_2
,[id_tax_3] = inserted.id_tax_3
,[id_tax_4] = inserted.id_tax_4
,[line_total_rem_hcu] = inserted.line_total_rem_hcu
,[id_po] = inserted.id_po
,[cnt_rlse] = inserted.cnt_rlse
,[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted
-- INNER JOIN (SELECT * FROM [dbo].[vwMP_VOL]) AS old
-- or the simpler
INNER JOIN [dbo].[vwMP_VOL] AS old
ON old.[Org] = inserted.Org
AND old.[id_vend] = inserted.id_vend
AND old.[id_org_orig] = inserted.id_org_orig
AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
AND old.[id_vchr_line] = inserted.id_vchr_line;This would work, too, but I don't like it as much as the previous because we can't alias the table:
UPDATE [dbo].[vwMP_VOL]
SET
[id_batch] = inserted.id_batch
---
,[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted
WHERE [dbo].[vwMP_VOL].[Org] = inserted.Org
AND [dbo].[vwMP_VOL].[id_vend] = inserted.id_vend
AND [dbo].[vwMP_VOL].[id_org_orig] = inserted.id_org_orig
AND [dbo].[vwMP_VOL].[id_vchr_nbr] = inserted.id_vchr_nbr
AND [dbo].[vwMP_VOL].[id_vchr_line] = inserted.id_vchr_line;Code Snippets
UPDATE old
SET
[id_batch] = inserted.id_batch
,[id_invoice] = inserted.id_invoice
,[text_vchr] = inserted.text_vchr
,[amt_frt_tcur] = inserted.amt_frt_tcur
,[amt_line_hcur] = inserted.amt_line_hcur
,[line_total_hcur] = inserted.line_total_hcur
,[id_tax_1] = inserted.id_tax_1
,[id_tax_2] = inserted.id_tax_2
,[id_tax_3] = inserted.id_tax_3
,[id_tax_4] = inserted.id_tax_4
,[line_total_rem_hcu] = inserted.line_total_rem_hcu
,[id_po] = inserted.id_po
,[cnt_rlse] = inserted.cnt_rlse
,[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted
-- INNER JOIN (SELECT * FROM [dbo].[vwMP_VOL]) AS old
-- or the simpler
INNER JOIN [dbo].[vwMP_VOL] AS old
ON old.[Org] = inserted.Org
AND old.[id_vend] = inserted.id_vend
AND old.[id_org_orig] = inserted.id_org_orig
AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
AND old.[id_vchr_line] = inserted.id_vchr_line;UPDATE [dbo].[vwMP_VOL]
SET
[id_batch] = inserted.id_batch
---
,[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted
WHERE [dbo].[vwMP_VOL].[Org] = inserted.Org
AND [dbo].[vwMP_VOL].[id_vend] = inserted.id_vend
AND [dbo].[vwMP_VOL].[id_org_orig] = inserted.id_org_orig
AND [dbo].[vwMP_VOL].[id_vchr_nbr] = inserted.id_vchr_nbr
AND [dbo].[vwMP_VOL].[id_vchr_line] = inserted.id_vchr_line;Context
StackExchange Database Administrators Q#140692, answer score: 4
Revisions (0)
No revisions yet.