HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Is there an option or hint possible to improve performance of query with multiple values in the "in" clause

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thehintwithqueryimproveoptionpossibleperformancemultiplevalues

Problem

We have a table CustomerNote with 4 columns ID, CustomerID, Note, Date

There is an index on CustomerID asc, Date desc

When the following query is executed

select top 30 
    Date 
from CustomerNote
where CustomerID in (1,5)
order by Date desc


The index is used, but it's still fetching ALL CustomerNotes for the customerID's 1 & 5, to then sort/top, causing a lot of CPU usage.

This is due to the multiple values in the "in" clause. I know that the "in" clause will never have more values than 10, so it would be a much better approach if sql server iterates over the 10, fetches at least 30 per customerID and the merge, sorts & tops. Is there a query hint or option to achieve this?

Solution

You're probably better off writing the transformation you want manually, but in the spirit of finding something the optimizer can just about do for you with minimal changes:

SELECT TOP (30) 
    CN.[Date] 
FROM dbo.CustomerNote AS CN
WHERE 
    CN.CustomerID IN (SELECT 1 UNION SELECT 5)
ORDER BY 
    CN.[Date] DESC;


Yes, it would be nice if the optimizer explored this sort of option for you without changing syntax.
Example table and data

CREATE TABLE dbo.CustomerNote
(
    CustomerID integer NOT NULL,
    [Date] datetime NOT NULL
);


WITH 
    N (n) AS
    (
        SELECT 
            SV.number 
        FROM master.dbo.spt_values AS SV
        WHERE
            SV.[type] = N'P'
            AND SV.number >= 1
    )
INSERT dbo.CustomerNote
    WITH (TABLOCKX)
(
    CustomerID, 
    [Date]
)
SELECT 
    C.n, 
    DATEADD
    (
        MINUTE, 
        -D.n * RAND(CHECKSUM(NEWID())) * 1000,
        GETDATE()
    )
FROM N AS C
CROSS JOIN N AS D
WHERE
    C.n BETWEEN 1 AND 9;

CREATE NONCLUSTERED INDEX 
    [IX dbo.CustomerNote CustomerID, Date-] 
ON dbo.CustomerNote
    (CustomerID, [Date] DESC);


That loads 2047 randomish date values for ten different customers:

SELECT CN.CustomerID, NumRows = COUNT_BIG(*) 
FROM dbo.CustomerNote AS CN
GROUP BY CN.CustomerID
ORDER BY CN.CustomerID;


CustomerID
NumRows

1
2047

2
2047

3
2047

4
2047

5
2047

6
2047

7
2047

8
2047

9
2047

Running the solution code produces a post-execution plan where 14 rows are read from an ordered index seek for customer 1 and 17 rows from a similar seek for customer 5:

db<>fiddle demo

The plan does not read all 2047 rows for each customer.

Note also this solution does not require a sort operator.

The solution is quite general, though there are some sorting conditions to meet for the Merge Concatenation.

If you want to project a different column, that column would need to be part of the index key (not as an include) to meet the sorting requirement; or you could fetch only a key for the table and look up the extra column(s) as a separate step after finding the required small number of keys.

Additional demos with an extra column:

  • https://dbfiddle.uk/nNKi4m0P (added to index key)



  • https://dbfiddle.uk/D4fNdMf2 (key plus lookup)

Code Snippets

SELECT TOP (30) 
    CN.[Date] 
FROM dbo.CustomerNote AS CN
WHERE 
    CN.CustomerID IN (SELECT 1 UNION SELECT 5)
ORDER BY 
    CN.[Date] DESC;
CREATE TABLE dbo.CustomerNote
(
    CustomerID integer NOT NULL,
    [Date] datetime NOT NULL
);
WITH 
    N (n) AS
    (
        SELECT 
            SV.number 
        FROM master.dbo.spt_values AS SV
        WHERE
            SV.[type] = N'P'
            AND SV.number >= 1
    )
INSERT dbo.CustomerNote
    WITH (TABLOCKX)
(
    CustomerID, 
    [Date]
)
SELECT 
    C.n, 
    DATEADD
    (
        MINUTE, 
        -D.n * RAND(CHECKSUM(NEWID())) * 1000,
        GETDATE()
    )
FROM N AS C
CROSS JOIN N AS D
WHERE
    C.n BETWEEN 1 AND 9;

CREATE NONCLUSTERED INDEX 
    [IX dbo.CustomerNote CustomerID, Date-] 
ON dbo.CustomerNote
    (CustomerID, [Date] DESC);
SELECT CN.CustomerID, NumRows = COUNT_BIG(*) 
FROM dbo.CustomerNote AS CN
GROUP BY CN.CustomerID
ORDER BY CN.CustomerID;

Context

StackExchange Database Administrators Q#326661, answer score: 9

Revisions (0)

No revisions yet.