HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

"Upsert" trigger writing mass duplicates from SSIS

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
triggermassssiswritingupsertfromduplicates

Problem

I have a problem with an "upsert" trigger, being triggered by a SSIS package with fastload. (FIRE_TRIGGERS is enabled)

CREATE TRIGGER [dbo].[MP_VOL_UPSERT] 
   ON  [dbo].[vwMP_VOL] 
   INSTEAD OF INSERT
AS 
BEGIN
    SET NOCOUNT ON;

UPDATE [dbo].[vwMP_VOL]
   SET
    [vwMP_VOL].[id_batch] = inserted.id_batch
      ,[vwMP_VOL].[id_invoice] = inserted.id_invoice
      ,[vwMP_VOL].[text_vchr] = inserted.text_vchr
      ,[vwMP_VOL].[amt_frt_tcur] = inserted.amt_frt_tcur
      ,[vwMP_VOL].[amt_line_hcur] = inserted.amt_line_hcur
      ,[vwMP_VOL].[line_total_hcur] = inserted.line_total_hcur
      ,[vwMP_VOL].[id_tax_1] = inserted.id_tax_1
      ,[vwMP_VOL].[id_tax_2] = inserted.id_tax_2
      ,[vwMP_VOL].[id_tax_3] = inserted.id_tax_3
      ,[vwMP_VOL].[id_tax_4] = inserted.id_tax_4
      ,[vwMP_VOL].[line_total_rem_hcu] = inserted.line_total_rem_hcu
      ,[vwMP_VOL].[id_po] = inserted.id_po
      ,[vwMP_VOL].[cnt_rlse] = inserted.cnt_rlse
      ,[vwMP_VOL].[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted 
INNER JOIN (SELECT [Org]
      ,[id_vend]
      ,[id_org_orig]
      ,[id_vchr_nbr]
      ,[id_vchr_line]
  FROM [vwMP_VOL]) AS old
ON old.[Org] = inserted.Org
      AND old.[id_vend] = inserted.id_vend
      AND old.[id_org_orig] = inserted.id_org_orig
      AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
      AND old.[id_vchr_line] = inserted.id_vchr_line;

INSERT INTO [dbo].[vwMP_VOL] 
SELECT inserted.* FROM inserted 
LEFT JOIN (SELECT [Org]
      ,[id_vend]
      ,[id_org_orig]
      ,[id_vchr_nbr]
      ,[id_vchr_line]
  FROM [vwMP_VOL]) AS old
ON old.[Org] = inserted.Org
      AND old.[id_vend] = inserted.id_vend
      AND old.[id_org_orig] = inserted.id_org_orig
      AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
      AND old.[id_vchr_line] = inserted.id_vchr_line
WHERE old.Org IS NULL;

END


Inserting new rows works fine, but all existing rows get updated with one set of values.

```
SELECT
text_vchr, count(*) as dups
FROM MP_VOL
group by text

Solution

I suspect the issue appears because you are referencing the view twice in the UPDATE statement, once in UPDATE [dbo].[vwMP_VOL] clause and then in FROM clause and the second time you are giving it an alias. But you are not using the alias in the SET clause, only the first reference.

The result is that the the two references are unrelated and the rows are updated with unrelated data (and possibly many times as well!)

See the the official MSDN documentation about UPDATE and the caution:


Best Practices


...


Use caution when specifying the FROM clause to provide the criteria for the update operation. The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic.

and a more detailed analysis of this "feature" and other issues it can cause, in Hugo Kornelis's blog post: Let's deprecate UPDATE FROM! and especially this paragraph:


Correctness? Bah, who cares?


Well, most do. That’s why we test.


If I mess up the join criteria in a SELECT query so that too many rows from the second table match, I’ll see it as soon as I test, because I get more rows back then expected. If I mess up the subquery criteria in an ANSI standard UPDATE query in a similar way, I see it even sooner, because SQL Server will return an error if the subquery returns more than a single value. But with the proprietary UPDATE FROM syntax, I can mess up the join and never notice – SQL Server will happily update the same row over and over again if it matches more than one row in the joined table, with only the result of the last of those updates sticking. And there is no way of knowing which row that will be, since that depends in the query execution plan that happens to be chosen. A worst case scenario would be one where the execution plan just happens to result in the expected outcome during all tests on the single-processor development server – and then, after deployment to the four-way dual-core production server, our precious data suddenly hits the fan…

I think the best way to solve this is to use the alias in the UPDATE clause and change accordingly the SET clause:

UPDATE old 
   SET
       [id_batch] = inserted.id_batch
      ,[id_invoice] = inserted.id_invoice
      ,[text_vchr] = inserted.text_vchr
      ,[amt_frt_tcur] = inserted.amt_frt_tcur
      ,[amt_line_hcur] = inserted.amt_line_hcur
      ,[line_total_hcur] = inserted.line_total_hcur
      ,[id_tax_1] = inserted.id_tax_1
      ,[id_tax_2] = inserted.id_tax_2
      ,[id_tax_3] = inserted.id_tax_3
      ,[id_tax_4] = inserted.id_tax_4
      ,[line_total_rem_hcu] = inserted.line_total_rem_hcu
      ,[id_po] = inserted.id_po
      ,[cnt_rlse] = inserted.cnt_rlse
      ,[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted 
          --   INNER JOIN (SELECT * FROM [dbo].[vwMP_VOL]) AS old
-- or the simpler
INNER JOIN [dbo].[vwMP_VOL] AS old
ON old.[Org] = inserted.Org
      AND old.[id_vend] = inserted.id_vend
      AND old.[id_org_orig] = inserted.id_org_orig
      AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
      AND old.[id_vchr_line] = inserted.id_vchr_line;


This would work, too, but I don't like it as much as the previous because we can't alias the table:

UPDATE [dbo].[vwMP_VOL] 
   SET
       [id_batch] = inserted.id_batch
      ---
      ,[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted 
WHERE [dbo].[vwMP_VOL].[Org] = inserted.Org
      AND [dbo].[vwMP_VOL].[id_vend] = inserted.id_vend
      AND [dbo].[vwMP_VOL].[id_org_orig] = inserted.id_org_orig
      AND [dbo].[vwMP_VOL].[id_vchr_nbr] = inserted.id_vchr_nbr
      AND [dbo].[vwMP_VOL].[id_vchr_line] = inserted.id_vchr_line;

Code Snippets

UPDATE old 
   SET
       [id_batch] = inserted.id_batch
      ,[id_invoice] = inserted.id_invoice
      ,[text_vchr] = inserted.text_vchr
      ,[amt_frt_tcur] = inserted.amt_frt_tcur
      ,[amt_line_hcur] = inserted.amt_line_hcur
      ,[line_total_hcur] = inserted.line_total_hcur
      ,[id_tax_1] = inserted.id_tax_1
      ,[id_tax_2] = inserted.id_tax_2
      ,[id_tax_3] = inserted.id_tax_3
      ,[id_tax_4] = inserted.id_tax_4
      ,[line_total_rem_hcu] = inserted.line_total_rem_hcu
      ,[id_po] = inserted.id_po
      ,[cnt_rlse] = inserted.cnt_rlse
      ,[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted 
          --   INNER JOIN (SELECT * FROM [dbo].[vwMP_VOL]) AS old
-- or the simpler
INNER JOIN [dbo].[vwMP_VOL] AS old
ON old.[Org] = inserted.Org
      AND old.[id_vend] = inserted.id_vend
      AND old.[id_org_orig] = inserted.id_org_orig
      AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
      AND old.[id_vchr_line] = inserted.id_vchr_line;
UPDATE [dbo].[vwMP_VOL] 
   SET
       [id_batch] = inserted.id_batch
      ---
      ,[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted 
WHERE [dbo].[vwMP_VOL].[Org] = inserted.Org
      AND [dbo].[vwMP_VOL].[id_vend] = inserted.id_vend
      AND [dbo].[vwMP_VOL].[id_org_orig] = inserted.id_org_orig
      AND [dbo].[vwMP_VOL].[id_vchr_nbr] = inserted.id_vchr_nbr
      AND [dbo].[vwMP_VOL].[id_vchr_line] = inserted.id_vchr_line;

Context

StackExchange Database Administrators Q#140692, answer score: 4

Revisions (0)

No revisions yet.