patternsqlModerate
Why are there execution plan differences between OFFSET ... FETCH and the old-style ROW_NUMBER scheme?
Viewed 0 times
whytheschemeareoffsetfetchplandifferencesbetweenstyle
Problem
The new
One would assume that the optimizer recognizes both and optimizes them (trivially) to the fullest.
Here is a very simple case where
One can vary this test case by creating a CI on
OFFSET ... FETCH model introduces with SQL Server 2012 offers simple and faster paging. Why are there any differences at all considering that the two forms are semantically identical and very common?One would assume that the optimizer recognizes both and optimizes them (trivially) to the fullest.
Here is a very simple case where
OFFSET ... FETCH is ~2x faster according to the cost estimate.SELECT * INTO #objects FROM sys.objects
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY object_id) r
FROM #objects
) x
WHERE r >= 30 AND r < (30 + 10)
ORDER BY object_id
SELECT *
FROM #objects
ORDER BY object_id
OFFSET 30 ROWS FETCH NEXT 10 ROWS ONLYOne can vary this test case by creating a CI on
object_id or adding filters but it is impossible to remove all plan differences. OFFSET ... FETCH is always faster because it does less work at execution time.Solution
The examples in the question do not quite produce the same results (the
The
The
That is a saving of 0.000098 estimated cost units (though the
Where constant literal values are used (e.g.
One example for the third point above occurs where the paging set is quite wide. It can be much more efficient to seek the keys needed using a nonclustered index and manually lookup against the clustered index compared with scanning the index with
Overall, it is probably better that people make an informed decision to change their paging queries to use
OFFSET example has an off-by-one error). The updated forms below fix that issue, remove the extra sort for the ROW_NUMBER case, and use variables to make the solution more general:DECLARE
@PageSize bigint = 10,
@PageNumber integer = 3;
WITH Numbered AS
(
SELECT TOP ((@PageNumber + 1) * @PageSize)
o.*,
rn = ROW_NUMBER() OVER (
ORDER BY o.[object_id])
FROM #objects AS o
ORDER BY
o.[object_id]
)
SELECT
x.name,
x.[object_id],
x.principal_id,
x.[schema_id],
x.parent_object_id,
x.[type],
x.type_desc,
x.create_date,
x.modify_date,
x.is_ms_shipped,
x.is_published,
x.is_schema_published
FROM Numbered AS x
WHERE
x.rn >= @PageNumber * @PageSize
AND x.rn < ((@PageNumber + 1) * @PageSize)
ORDER BY
x.[object_id];
SELECT
o.name,
o.[object_id],
o.principal_id,
o.[schema_id],
o.parent_object_id,
o.[type],
o.type_desc,
o.create_date,
o.modify_date,
o.is_ms_shipped,
o.is_published,
o.is_schema_published
FROM #objects AS o
ORDER BY
o.[object_id]
OFFSET @PageNumber * @PageSize - 1 ROWS
FETCH NEXT @PageSize ROWS ONLY;The
ROW_NUMBER plan has an estimated cost of 0.0197935:The
OFFSET plan has an estimated cost of 0.0196955:That is a saving of 0.000098 estimated cost units (though the
OFFSET plan would require extra operators if you want to return a row number for each row). The OFFSET plan will still be slightly cheaper, generally speaking, but do remember that estimated costs are exactly that - real testing is still required. The bulk of the cost in both plans is the cost of the full sort of the input set, so helpful indexes would benefit both solutions.Where constant literal values are used (e.g.
OFFSET 30 in the original example) the optimizer can use a TopN Sort instead of a full sort followed by a Top. When the rows needed from the TopN Sort is a constant literal and - Having some paging queries automatically transformed and not others could be confusing
- The
OFFSETplan is not guaranteed to be better in all cases
One example for the third point above occurs where the paging set is quite wide. It can be much more efficient to seek the keys needed using a nonclustered index and manually lookup against the clustered index compared with scanning the index with
OFFSET or ROW_NUMBER. There are additional issues to consider if the paging application needs to know how many rows or pages there are in total. There is another good discussion of the relative merits of the 'key seek' and 'offset' methods here.Overall, it is probably better that people make an informed decision to change their paging queries to use
OFFSET, if appropriate, after thorough testing.Code Snippets
DECLARE
@PageSize bigint = 10,
@PageNumber integer = 3;
WITH Numbered AS
(
SELECT TOP ((@PageNumber + 1) * @PageSize)
o.*,
rn = ROW_NUMBER() OVER (
ORDER BY o.[object_id])
FROM #objects AS o
ORDER BY
o.[object_id]
)
SELECT
x.name,
x.[object_id],
x.principal_id,
x.[schema_id],
x.parent_object_id,
x.[type],
x.type_desc,
x.create_date,
x.modify_date,
x.is_ms_shipped,
x.is_published,
x.is_schema_published
FROM Numbered AS x
WHERE
x.rn >= @PageNumber * @PageSize
AND x.rn < ((@PageNumber + 1) * @PageSize)
ORDER BY
x.[object_id];
SELECT
o.name,
o.[object_id],
o.principal_id,
o.[schema_id],
o.parent_object_id,
o.[type],
o.type_desc,
o.create_date,
o.modify_date,
o.is_ms_shipped,
o.is_published,
o.is_schema_published
FROM #objects AS o
ORDER BY
o.[object_id]
OFFSET @PageNumber * @PageSize - 1 ROWS
FETCH NEXT @PageSize ROWS ONLY;Context
StackExchange Database Administrators Q#30210, answer score: 14
Revisions (0)
No revisions yet.