HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

A clarification of MERGE best practices

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
clarificationbestpracticesmerge

Problem

I just wanted to ask a clarification about something on the MSDN from the "Optimizing MERGE Statement Performance" page.

I am working with a data warehouse that takes in records from many different databases and stores the data. All of the tables in my warehouse database basically follow this same pattern:

CREATE TABLE Foo (
    database_guid UNIQUEIDENTIFIER
    ,FooPk BIGINT
    ,Bar NVARCHAR(20)
    ,Qix NCHAR(10)
    ,CONSTRAINT [PK_Foo] PRIMARY KEY CLUSTERED (
        database_guid ASC
        ,FooPk ASC
        )
    )
GO

CREATE PROCEDURE [iv].[LoadSomeTable] 
    @databaseGUID UNIQUEIDENTIFIER
AS
BEGIN
    SET NOCOUNT ON

    MERGE Foo
    USING #FooStaging AS Source
    ON Foo.FooPk = Source.FooPk AND Foo.database_guid = @databaseGUID
    WHEN MATCHED THEN
        UPDATE SET Bar = Source.Bar
                  ,Qix = Source.Qix
    WHEN NOT MATCHED THEN
        INSERT (database_guid, FooPk, Bar, Qix)
            VALUES (@databaseGUID, FooPk, Bar, Qix);
END
GO

CREATE TABLE #FooStaging (
    FooPk BIGINT
    ,Bar NVARCHAR(20)
    ,Qix NCHAR(10)
    )

--Data gets loaded in to #FooStaging from a C# call to SqlBulkCopy then calls iv.LoadSomeTable


The thing I am now concerned about is I just read this statement from that MSDN page


Specify only search conditions in the ON ``
clause that determine the criteria for matching data in the source and
target tables. That is, specify only columns from the target table
that are compared to the corresponding columns of the source table. Do
not include comparisons to other values such as a constant.

After reading that I am thinking I did my query wrong and my merge statement should be

MERGE Foo
USING #FooStaging AS Source
ON Foo.FooPk = Source.FooPk
WHEN MATCHED AND Foo.database_guid = @databaseGUID THEN
    UPDATE SET Bar = Source.Bar
              ,Qix = Source.Qix
WHEN NOT MATCHED THEN
    INSERT (database_guid, FooPk, Bar, Qix)
        VALUES (@databaseGUID, FooPk, Bar, Qix);


But that d

Solution

I would venture that your best approach is to use separate statements for each of the potential actions, and put them in a serializable transaction. You get to use tried and true statements with no funny semantics or "best practices" violations, and you get to avoid all of the problems I outline in this post, including wrong results bugs and potential index corruption:

  • MERGE with Caution

Context

StackExchange Database Administrators Q#61186, answer score: 9

Revisions (0)

No revisions yet.