patternsqlMinor
Is there an option or hint possible to improve performance of query with multiple values in the "in" clause
Viewed 0 times
thehintwithqueryimproveoptionpossibleperformancemultiplevalues
Problem
We have a table CustomerNote with 4 columns ID, CustomerID, Note, Date
There is an index on CustomerID asc, Date desc
When the following query is executed
The index is used, but it's still fetching ALL CustomerNotes for the customerID's 1 & 5, to then sort/top, causing a lot of CPU usage.
This is due to the multiple values in the "in" clause. I know that the "in" clause will never have more values than 10, so it would be a much better approach if sql server iterates over the 10, fetches at least 30 per customerID and the merge, sorts & tops. Is there a query hint or option to achieve this?
There is an index on CustomerID asc, Date desc
When the following query is executed
select top 30
Date
from CustomerNote
where CustomerID in (1,5)
order by Date descThe index is used, but it's still fetching ALL CustomerNotes for the customerID's 1 & 5, to then sort/top, causing a lot of CPU usage.
This is due to the multiple values in the "in" clause. I know that the "in" clause will never have more values than 10, so it would be a much better approach if sql server iterates over the 10, fetches at least 30 per customerID and the merge, sorts & tops. Is there a query hint or option to achieve this?
Solution
You're probably better off writing the transformation you want manually, but in the spirit of finding something the optimizer can just about do for you with minimal changes:
Yes, it would be nice if the optimizer explored this sort of option for you without changing syntax.
Example table and data
That loads 2047 randomish date values for ten different customers:
CustomerID
NumRows
1
2047
2
2047
3
2047
4
2047
5
2047
6
2047
7
2047
8
2047
9
2047
Running the solution code produces a post-execution plan where 14 rows are read from an ordered index seek for customer 1 and 17 rows from a similar seek for customer 5:
db<>fiddle demo
The plan does not read all 2047 rows for each customer.
Note also this solution does not require a sort operator.
The solution is quite general, though there are some sorting conditions to meet for the Merge Concatenation.
If you want to project a different column, that column would need to be part of the index key (not as an include) to meet the sorting requirement; or you could fetch only a key for the table and look up the extra column(s) as a separate step after finding the required small number of keys.
Additional demos with an extra column:
SELECT TOP (30)
CN.[Date]
FROM dbo.CustomerNote AS CN
WHERE
CN.CustomerID IN (SELECT 1 UNION SELECT 5)
ORDER BY
CN.[Date] DESC;Yes, it would be nice if the optimizer explored this sort of option for you without changing syntax.
Example table and data
CREATE TABLE dbo.CustomerNote
(
CustomerID integer NOT NULL,
[Date] datetime NOT NULL
);WITH
N (n) AS
(
SELECT
SV.number
FROM master.dbo.spt_values AS SV
WHERE
SV.[type] = N'P'
AND SV.number >= 1
)
INSERT dbo.CustomerNote
WITH (TABLOCKX)
(
CustomerID,
[Date]
)
SELECT
C.n,
DATEADD
(
MINUTE,
-D.n * RAND(CHECKSUM(NEWID())) * 1000,
GETDATE()
)
FROM N AS C
CROSS JOIN N AS D
WHERE
C.n BETWEEN 1 AND 9;
CREATE NONCLUSTERED INDEX
[IX dbo.CustomerNote CustomerID, Date-]
ON dbo.CustomerNote
(CustomerID, [Date] DESC);That loads 2047 randomish date values for ten different customers:
SELECT CN.CustomerID, NumRows = COUNT_BIG(*)
FROM dbo.CustomerNote AS CN
GROUP BY CN.CustomerID
ORDER BY CN.CustomerID;CustomerID
NumRows
1
2047
2
2047
3
2047
4
2047
5
2047
6
2047
7
2047
8
2047
9
2047
Running the solution code produces a post-execution plan where 14 rows are read from an ordered index seek for customer 1 and 17 rows from a similar seek for customer 5:
db<>fiddle demo
The plan does not read all 2047 rows for each customer.
Note also this solution does not require a sort operator.
The solution is quite general, though there are some sorting conditions to meet for the Merge Concatenation.
If you want to project a different column, that column would need to be part of the index key (not as an include) to meet the sorting requirement; or you could fetch only a key for the table and look up the extra column(s) as a separate step after finding the required small number of keys.
Additional demos with an extra column:
- https://dbfiddle.uk/nNKi4m0P (added to index key)
- https://dbfiddle.uk/D4fNdMf2 (key plus lookup)
Code Snippets
SELECT TOP (30)
CN.[Date]
FROM dbo.CustomerNote AS CN
WHERE
CN.CustomerID IN (SELECT 1 UNION SELECT 5)
ORDER BY
CN.[Date] DESC;CREATE TABLE dbo.CustomerNote
(
CustomerID integer NOT NULL,
[Date] datetime NOT NULL
);WITH
N (n) AS
(
SELECT
SV.number
FROM master.dbo.spt_values AS SV
WHERE
SV.[type] = N'P'
AND SV.number >= 1
)
INSERT dbo.CustomerNote
WITH (TABLOCKX)
(
CustomerID,
[Date]
)
SELECT
C.n,
DATEADD
(
MINUTE,
-D.n * RAND(CHECKSUM(NEWID())) * 1000,
GETDATE()
)
FROM N AS C
CROSS JOIN N AS D
WHERE
C.n BETWEEN 1 AND 9;
CREATE NONCLUSTERED INDEX
[IX dbo.CustomerNote CustomerID, Date-]
ON dbo.CustomerNote
(CustomerID, [Date] DESC);SELECT CN.CustomerID, NumRows = COUNT_BIG(*)
FROM dbo.CustomerNote AS CN
GROUP BY CN.CustomerID
ORDER BY CN.CustomerID;Context
StackExchange Database Administrators Q#326661, answer score: 9
Revisions (0)
No revisions yet.