patternsqlMinor
TSQL speedup partition by query
Viewed 0 times
speeduptsqlpartitionquery
Problem
CREATE TABLE entries(
uuid BINARY(16) PRIMARY KEY,
status VARCHAR(20),
client_id VARCHAR(32),
creation_date DATETIME,
);status can be NEW, IN_PROGRESS, DONE
I need one entry with status != DONE for each client_id. I want maximum 500 of such entries and with oldest creation date.
I ended with query:
SELECT TOP (500) *
FROM (SELECT
*,
ROW_NUMBER() OVER (PARTITION BY client_id
ORDER BY creation_date) rn
FROM entries
WHERE status <> 'DONE') v
WHERE rn = 1We have around 1000 clients and around 1 milion entries.
Currently query executes ~90 seconds on our Azure SQL database.
Azure SQL Server suggested creating following index:
CREATE NONCLUSTERED INDEX [nci_wi_entries_0A6FBEBB67742F72F54C5F86FD2C5023]
ON [dbo].[entries] ([status])
INCLUDE ([client_id], [creation_date])
WITH (ONLINE = ON)After that, query executes around 20 seconds, but that is still too slow. I must have results in less than second.
Do you have any ideas how to optimize it (in any way, either rebuilding table or optimizing query)?
Solution
If you have a separate table
Also, use filtered index to match your query:
The query (assuming you have
Clients with the list of all 1000 client IDs, it may be faster to seek 500 times into 1M rows table instead of reading all 1M rows. This is another common variant to write a top-n-per-group query. The following question discussed it in great details: Retrieving n rows per group.Also, use filtered index to match your query:
CREATE NONCLUSTERED INDEX [IX] ON [dbo].[entries]
(
[client_id] ASC,
[creation_date] ASC
)
INCLUDE
(
[status],
[uuid]
)
WHERE (status <> 'DONE')The query (assuming you have
Clients table):SELECT TOP(500)
CA.*
FROM
dbo.Clients
CROSS APPLY
(
SELECT TOP(1)
dbo.entries.*
FROM
dbo.entries
WHERE
dbo.entries.client_id = dbo.Clients.client_id
AND dbo.entries.status <> 'DONE'
ORDER BY dbo.entries.creation_date
) AS CA
;Code Snippets
CREATE NONCLUSTERED INDEX [IX] ON [dbo].[entries]
(
[client_id] ASC,
[creation_date] ASC
)
INCLUDE
(
[status],
[uuid]
)
WHERE (status <> 'DONE')SELECT TOP(500)
CA.*
FROM
dbo.Clients
CROSS APPLY
(
SELECT TOP(1)
dbo.entries.*
FROM
dbo.entries
WHERE
dbo.entries.client_id = dbo.Clients.client_id
AND dbo.entries.status <> 'DONE'
ORDER BY dbo.entries.creation_date
) AS CA
;Context
StackExchange Database Administrators Q#188545, answer score: 2
Revisions (0)
No revisions yet.