gotchasqlMajor
Why does adding a TOP 1 dramatically worsen performance?
Viewed 0 times
whytopaddingperformancedoesworsendramatically
Problem
I have a fairly simple query
That is giving me horrible performance (like never bothered to wait for it to finish). The query plan looks like this:
However if I remove the
Correct PK & indexing below.
The fact that the
Note: I've read the results from this post and understand the concept of a
Edit For people reading this after the fact here are a few extra pieces of information.
When I started there were no other indexes. I ended up with one on Correspondence_Journal (Document_Id, File_Number)
SELECT TOP 1 dc.DOCUMENT_ID,
dc.COPIES,
dc.REQUESTOR,
dc.D_ID,
cj.FILE_NUMBER
FROM DOCUMENT_QUEUE dc
JOIN CORRESPONDENCE_JOURNAL cj
ON dc.DOCUMENT_ID = cj.DOCUMENT_ID
WHERE dc.QUEUE_DATE <= GETDATE()
AND dc.PRINT_LOCATION = 2
ORDER BY cj.FILE_NUMBERThat is giving me horrible performance (like never bothered to wait for it to finish). The query plan looks like this:
However if I remove the
TOP 1 I get a plan that looks like this and it runs in 1-2 seconds:Correct PK & indexing below.
The fact that the
TOP 1 changed the query plan doesn't surprise me, I'm just a bit surprised that it makes it so much worse. Note: I've read the results from this post and understand the concept of a
Row Goal etc. What I'm curious about is how I can go about changing the query so that it uses the better plan. Currently I'm dumping the data into a temp table then pulling the first row off of it. I'm wondering if there is a better method.Edit For people reading this after the fact here are a few extra pieces of information.
- Document_Queue - PK/CI is D_ID and it has ~5k rows.
- Correspondence_Journal - PK/CI is FILE_NUMBER, CORRESPONDENCE_ID and it has ~1.4 mil rows.
When I started there were no other indexes. I ended up with one on Correspondence_Journal (Document_Id, File_Number)
Solution
Since you get the correct plan with the
In my mind, the query plan for the
ORDER BY, maybe you could just roll your own TOP operator?SELECT DOCUMENT_ID, COPIES, REQUESTOR, D_ID, FILE_NUMBER
FROM (
SELECT dc.DOCUMENT_ID,
dc.COPIES,
dc.REQUESTOR,
dc.D_ID,
cj.FILE_NUMBER,
ROW_NUMBER() OVER (ORDER BY cj.FILE_NUMBER) AS _rownum
FROM DOCUMENT_QUEUE dc
INNER JOIN CORRESPONDENCE_JOURNAL cj
ON dc.DOCUMENT_ID = cj.DOCUMENT_ID
WHERE dc.QUEUE_DATE <= GETDATE()
AND dc.PRINT_LOCATION = 2
) AS sub
WHERE _rownum=1;In my mind, the query plan for the
ROW_NUMBER() above should be the same as if you had an ORDER BY. The query plan should now have a Segment, Sequence Project and finally a Filter operator, the rest should look just like your good plan.Code Snippets
SELECT DOCUMENT_ID, COPIES, REQUESTOR, D_ID, FILE_NUMBER
FROM (
SELECT dc.DOCUMENT_ID,
dc.COPIES,
dc.REQUESTOR,
dc.D_ID,
cj.FILE_NUMBER,
ROW_NUMBER() OVER (ORDER BY cj.FILE_NUMBER) AS _rownum
FROM DOCUMENT_QUEUE dc
INNER JOIN CORRESPONDENCE_JOURNAL cj
ON dc.DOCUMENT_ID = cj.DOCUMENT_ID
WHERE dc.QUEUE_DATE <= GETDATE()
AND dc.PRINT_LOCATION = 2
) AS sub
WHERE _rownum=1;Context
StackExchange Database Administrators Q#127639, answer score: 30
Revisions (0)
No revisions yet.