HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMajor

Order By causes a scan on a large table

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
orderscanlargetablecauses

Problem

I have the following query;

SELECT TOP 100 ID
FROM [dbo].[TableName] WITH (NOLOCK)
WHERE TypeId = 2
    AND DateTimeUTC  '4/26/2022 7:36:36 AM'
ORDER BY ID ASC


The table [dbo].[TableName] (Not its real name, btw) has just over 118 million rows.

I've created the following Index on this table;

CREATE INDEX [ix_TableName_DateTimeUTC_TypeId] 
ON [dbo].[TableName] (DateTimeUTC, TypeId)
    WITH FILLFACTOR = 90;


If I run this query (excluding the ORDER BY), the query performs a SEEK on the above index, and completes instantly. However, as soon as I include the ORDER BY, the query performs a SCAN instead on the PK, reading all 118+ million rows. As you can imagine, this tanks the performance and the query takes a long time to finish.

The simplest way to resolve this problem is to just remove the ORDER BY clause altogether, however I don't think that's possible because the application (which makes this call) requires the data to be returned in order.

Any suggestions on how to improve this?

Solution

sortie

I would change the index to look like this:

CREATE INDEX 
    [TypeId_Id_DateTimeUTC] 
ON [dbo].[TableName] 
(
    TypeId, 
    Id, 
    DateTimeUTC
)
WITH 
(
    FILLFACTOR = 100,
    SORT_IN_TEMPDB = ON
);


The idea is to make the initial data location and sorting free, and also support the range predicate.

Having Id as the second column is so that the TypeId = 2 portion can be seeked and then the index rows are logically ordered by Id - SQL Server then just needs to read the index rows in their existing order until 100 rows matching the DateTimeUTC predicate are read. i.e. it is to avoid any sort operation.

I discuss this in some detail in these blog posts:

Let’s Design A SQL Server Index Together Part 1, Part 2, Part 3.

It is usually better, as a practical matter, to avoid a sort than a residual predicate.

Code Snippets

CREATE INDEX 
    [TypeId_Id_DateTimeUTC] 
ON [dbo].[TableName] 
(
    TypeId, 
    Id, 
    DateTimeUTC
)
WITH 
(
    FILLFACTOR = 100,
    SORT_IN_TEMPDB = ON
);

Context

StackExchange Database Administrators Q#315205, answer score: 20

Revisions (0)

No revisions yet.