patternsqlMajor
Why are multiple COUNT faster than one SUM with CASE?
Viewed 0 times
casewhyarewiththanfasteronemultiplesumcount
Problem
I wanted to know which of the following two approaches is faster:
1) Three
2)
I was suprised that the difference is so large. The first query with three subqueries returns the result immediately whereas the second
Why does it make such a great difference whether i use
Execution-plans:
There are 12 statuses in total. Those three statuses belong to 7% of all rows.
This is the actual view, i'm not sure if it's relevant:
```
CREATE VIEW [dbo].[Claims]
AS
SELECT
mu.Marketunitname AS MarketUnit,
c.Countryname AS Country,
gsp.Gspname AS GSP,
gsp.Wcmskeynumber AS GspNumber,
sl.Slname AS SL,
sl.Wcmskeynumber AS SlNumber,
m.Modelname AS Model,
m.Salesname AS [Model-Salesname],
s.Claimstatusname AS [Status],
d.Work_order AS [Work Order],
d.Ssn_number AS IMEI,
d.Ssn_out,
Remarks,
d.Claimnumber AS [Claim-Number],
d.Rma_number AS [RMA-Number],
dbo.ToShortDateString(d.Received_Date, 1) AS [Received Date],
Iddata,
Fisl,
Fimodel,
Ficlaimstatus
FROM Tabdata AS d
INNER JOIN Locsl AS sl
ON d.Fisl = sl.Idsl
INNE
1) Three
COUNT:SELECT Approved = (SELECT COUNT(*) FROM dbo.Claims d
WHERE d.Status = 'Approved'),
Valid = (SELECT COUNT(*) FROM dbo.Claims d
WHERE d.Status = 'Valid'),
Reject = (SELECT COUNT(*) FROM dbo.Claims d
WHERE d.Status = 'Reject')2)
SUM with FROM-clause:SELECT Approved = SUM(CASE WHEN Status = 'Approved' THEN 1 ELSE 0 END),
Valid = SUM(CASE WHEN Status = 'Valid' THEN 1 ELSE 0 END),
Reject = SUM(CASE WHEN Status = 'Reject' THEN 1 ELSE 0 END)
FROM dbo.Claims c;I was suprised that the difference is so large. The first query with three subqueries returns the result immediately whereas the second
SUM approach needs 18 seconds. Claims is a view which selects from a table containing ~18 million rows. There is an index on the FK-Column to the ClaimStatus table which contains the status-name.Why does it make such a great difference whether i use
COUNT or SUM?Execution-plans:
COUNT
SUM
There are 12 statuses in total. Those three statuses belong to 7% of all rows.
This is the actual view, i'm not sure if it's relevant:
```
CREATE VIEW [dbo].[Claims]
AS
SELECT
mu.Marketunitname AS MarketUnit,
c.Countryname AS Country,
gsp.Gspname AS GSP,
gsp.Wcmskeynumber AS GspNumber,
sl.Slname AS SL,
sl.Wcmskeynumber AS SlNumber,
m.Modelname AS Model,
m.Salesname AS [Model-Salesname],
s.Claimstatusname AS [Status],
d.Work_order AS [Work Order],
d.Ssn_number AS IMEI,
d.Ssn_out,
Remarks,
d.Claimnumber AS [Claim-Number],
d.Rma_number AS [RMA-Number],
dbo.ToShortDateString(d.Received_Date, 1) AS [Received Date],
Iddata,
Fisl,
Fimodel,
Ficlaimstatus
FROM Tabdata AS d
INNER JOIN Locsl AS sl
ON d.Fisl = sl.Idsl
INNE
Solution
The
Clearly seeking an index three times is going to be faster than seeking it 12 times.
The first plan requires a memory grant of 238MB, whereas the second plan requires a memory grant of 650MB. It may be that the larger memory grant could not be immediately filled, making the query that much slower.
Alter the second query to be:
This will allow the query optimizer to eliminate 75% of the index seeks, and should result in both a lower required memory grant, lower I/O requirements, and faster time-to-result.
The
COUNT(*) version is able to simply seek into the index you have on the status column once for each status you are selecting, whereas the SUM(...) version needs to seek the index twelve times (the total number of unique status types).Clearly seeking an index three times is going to be faster than seeking it 12 times.
The first plan requires a memory grant of 238MB, whereas the second plan requires a memory grant of 650MB. It may be that the larger memory grant could not be immediately filled, making the query that much slower.
Alter the second query to be:
SELECT Approved = SUM(CASE WHEN Status = 'Approved' THEN 1 ELSE 0 END),
Valid = SUM(CASE WHEN Status = 'Valid' THEN 1 ELSE 0 END),
Reject = SUM(CASE WHEN Status = 'Reject' THEN 1 ELSE 0 END)
FROM dbo.Claims c
WHERE c.Status = 'Approved'
OR c.Status = 'Valid'
OR c.Status = 'Reject';This will allow the query optimizer to eliminate 75% of the index seeks, and should result in both a lower required memory grant, lower I/O requirements, and faster time-to-result.
The
SUM(CASE WHEN ...) construct essentially prevents the query optimizer from pushing the Status predicates down into the index seek portion of the plan.Code Snippets
SELECT Approved = SUM(CASE WHEN Status = 'Approved' THEN 1 ELSE 0 END),
Valid = SUM(CASE WHEN Status = 'Valid' THEN 1 ELSE 0 END),
Reject = SUM(CASE WHEN Status = 'Reject' THEN 1 ELSE 0 END)
FROM dbo.Claims c
WHERE c.Status = 'Approved'
OR c.Status = 'Valid'
OR c.Status = 'Reject';Context
StackExchange Database Administrators Q#119945, answer score: 20
Revisions (0)
No revisions yet.