snippetsqlModerate
Why am I getting a sort when I have an index?
Viewed 0 times
indexwhygettingwhensorthave
Problem
Azure SQL Database.
I have a table from which I need to get the first and most recent rows for
I have an index like so:
My query to get the first row is (plan here):
The index scan is using the index (
My query to get the most recent row (plan here):
Again, the index scan is using the index (
The first one straight after the index scan.
Isn't the optimiser clever enough to read the index in reverse order?
Again, what's the second sort for?
The actual table is quite large, so you can imagine the sorts are costly. How can I best optimise here?
I have a table from which I need to get the first and most recent rows for
Col1 and Col2 based on CreateDate.CREATE TABLE dbo.table1 (
Id INT IDENTITY(1,1) PRIMARY KEY ,
Col1 VARCHAR(255) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL ,
Col2 VARCHAR(255) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL ,
CreateDate DATETIME NOT NULL
) ;I have an index like so:
CREATE INDEX IX__table1_ASC
ON dbo.table1 (Col1, Col2, CreateDate );My query to get the first row is (plan here):
--Get the first row
SELECT TOP (1) WITH TIES
*
FROM table1
ORDER BY ROW_NUMBER()
OVER (PARTITION BY Col1, Col2
ORDER BY CreateDate );The index scan is using the index (
IX__table1_ASC) I created, but why am I getting a sort?My query to get the most recent row (plan here):
--get latest row
SELECT TOP (1) WITH TIES
*
FROM table1
ORDER BY ROW_NUMBER()
OVER (PARTITION BY Col1, Col2
ORDER BY CreateDate DESC); --desc hereAgain, the index scan is using the index (
IX__table1_ASC), but this time I'm getting two sorts.The first one straight after the index scan.
Isn't the optimiser clever enough to read the index in reverse order?
Again, what's the second sort for?
The actual table is quite large, so you can imagine the sorts are costly. How can I best optimise here?
Solution
The index scan is using the index (IX__table1_ASC) I created but why
am I getting a sort?
Because you are using an inefficient way of selecting the top row per group.
Just use
The
In your first execution plan the sort is not there to calculate the row number but is there to order the rows by the result of that row numbering after it was already calculated without a sort.
Regarding your second query this is a long standing optimser limitation - you can get the backwards ordered index scan and no sort with the below.
The
A presentation-level
In SQL Server 2000 some people used to get an "ordered view" by adding a
Personally I would hope that any spare engineering effort goes into improving the optimisation so this sort of hack isn't necessary first though!
am I getting a sort?
Because you are using an inefficient way of selecting the top row per group.
Just use
WITH T
AS (SELECT *,
ROW_NUMBER()
OVER (
PARTITION BY Col1, Col2
ORDER BY CreateDate ) AS RN
FROM table1)
SELECT *
FROM T
WHERE RN = 1The
TOP (1) WITH TIES here is just a more obfuscated and less efficient way of selecting all rows where the row number equals 1. Unfortunately some answerers on StackOverflow use this method for no good reason that I can discern except liking the novelty.In your first execution plan the sort is not there to calculate the row number but is there to order the rows by the result of that row numbering after it was already calculated without a sort.
Regarding your second query this is a long standing optimser limitation - you can get the backwards ordered index scan and no sort with the below.
WITH T
AS (SELECT *,
ROW_NUMBER()
OVER (
PARTITION BY Col1, Col2
ORDER BY CreateDate DESC ) AS RN
FROM table1
ORDER BY Col1 DESC, Col2 DESC, CreateDate DESC
OFFSET 0 ROWS
)
SELECT *
FROM T
WHERE RN = 1The
OFFSET 0 ROWS is a hack to allow ORDER BY in a derived table, which is not normally allowed in SQL Server. The important thing is to give the optimizer a separate reason to consider the optimal ordering.A presentation-level
ORDER BY would serve the same purpose, but I prefer placing the hack closer to the thing that requires it. This approach also allows you to specify a different presentation order. Bear in mind the OFFSET 0 could be optimized away some day, like TOP (100) PERCENT is.In SQL Server 2000 some people used to get an "ordered view" by adding a
TOP 100 PERCENT ... ORDER BY. The effect of this, at least most of the time, was that just doing a plain SELECT from the view without any ORDER BY on the outer query returned the rows in the desired order. This was never guaranteed and in SQL Server 2005 logic was added to the optimiser that just optimised TOP 100 PERCENT out in this type of case as logically redundant. Potentially the same could happen in the future to OFFSET 0 ROWS as it is similarly redundant.Personally I would hope that any spare engineering effort goes into improving the optimisation so this sort of hack isn't necessary first though!
Code Snippets
WITH T
AS (SELECT *,
ROW_NUMBER()
OVER (
PARTITION BY Col1, Col2
ORDER BY CreateDate ) AS RN
FROM table1)
SELECT *
FROM T
WHERE RN = 1WITH T
AS (SELECT *,
ROW_NUMBER()
OVER (
PARTITION BY Col1, Col2
ORDER BY CreateDate DESC ) AS RN
FROM table1
ORDER BY Col1 DESC, Col2 DESC, CreateDate DESC
OFFSET 0 ROWS
)
SELECT *
FROM T
WHERE RN = 1Context
StackExchange Database Administrators Q#317999, answer score: 16
Revisions (0)
No revisions yet.