HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Deleting millions of rows from a MSSQL server table

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
rowsdeletingmillionsmssqlserverfromtable

Problem

This SQL query took 38 minutes to delete just 10K of rows. How can I optimize it?

Index already exists for CREATEDATE from table2.

declare @tmpcount int
declare @counter int

SET @counter = 0
SET @tmpcount = 1
WHILE @counter <> @tmpcount
BEGIN
    SET ROWCOUNT 10000

    SET @counter = @counter + 1

    DELETE table1
    FROM table1
    JOIN table2
        ON table2.DOC_NUMBER = table1.DOC_NUMBER
        AND table2.DOC_YEAR = table1.DOC_YEAR
    WHERE YEAR(CREATEDDATE) BETWEEN 2007 and 2009

    IF @@ROWCOUNT = 0 
        BREAK
END

Solution

The performance problem is due to the non-sargable expression YEAR(CREATEDATE). Applying a function to a column in a WHERE clause prevents efficient use of the index on the column. Below is an alternative technique, which uses an inclusive start and exclusive end for the datetime range.

Note the use of SET ROWCOUNT is deprecated for DELETE, INSERT and UPDATE statements. Use TOP instead.

DELETE TOP(10000) table1
FROM dbo.table1
JOIN dbo.table2
    ON table2.DOC_NUMBER = table1.DOC_NUMBER
    AND table2.DOC_YEAR = table1.DOC_YEAR
WHERE 
    CREATEDDATE >= '20070101' 
    AND CREATEDATE < '20100101';


I'm not sure the purpose of @tmpcount in your script other than perhaps controlling the number of iterations for testing.

Code Snippets

DELETE TOP(10000) table1
FROM dbo.table1
JOIN dbo.table2
    ON table2.DOC_NUMBER = table1.DOC_NUMBER
    AND table2.DOC_YEAR = table1.DOC_YEAR
WHERE 
    CREATEDDATE >= '20070101' 
    AND CREATEDATE < '20100101';

Context

StackExchange Code Review Q#113864, answer score: 11

Revisions (0)

No revisions yet.