HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Why isn't the primary (clustered) key being used in this query?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thisclusteredwhytheprimaryusedquerybeingisnkey

Problem

I have a SQL Server 2008 R2 table whose schema structure looks as follows:

CREATE TABLE [dbo].[CDSIM_BE]
(
    [ID] [bigint] NOT NULL,
    [EquipmentID] [varchar](50) NOT NULL,
    [SerialNumber] [varchar](50) NULL,
    [PyrID] [varchar](50) NULL,
    [MeasMode] [varchar](50) NULL,
    [ReadTime] [datetime] NOT NULL,
    [SubID] [varchar](15) NULL,
    [ProbePosition] [float] NULL,
    [DataPoint] [int] NULL,

    CONSTRAINT [PK_CDSIM_BE] 
    PRIMARY KEY CLUSTERED ([ID] ASC, [EquipmentID] ASC, [ReadTime] ASC)
         WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
               IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
               ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])
) ON [MonthlyArchiveScheme9]([ReadTime])

CREATE NONCLUSTERED INDEX [idx_CDSIM_BE__SubID_ProbePosition] 
ON [dbo].[CDSIM_BE] ([SubID] ASC, [ProbePosition] ASC)
INCLUDE ([EquipmentID], [ReadTime], [BECorr]) 
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
          SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])

CREATE NONCLUSTERED INDEX [IX_CDSIM_BE_ProbePosition] 
ON [dbo].[CDSIM_BE] ([ProbePosition] ASC)
INCLUDE ([SerialNumber], [SubID]) 
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
          SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])

CREATE NONCLUSTERED INDEX [IX_CSDIM_Readtime] 
ON [dbo].[CDSIM_BE]([ReadTime] ASC)
INCLUDE ([EquipmentID]) 
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
          SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])


And I'm executing this simple query:

Select Max(Id)
From dbo.CDSIM_BE


There are ~2.5B rows in the table.

The query plan shows an index scan being done on

Solution

The clustered index is partitioned on ReadTime so it couldn't use the PK as you describe. It would need to find the Max(Id) for each partition and then find the max of those. It is possible to rewrite the query to get such a plan however.

Using an example based on the article here a possible rewrite might be

SELECT MAX(ID) AS ID
FROM   sys.partitions AS P
       CROSS APPLY (SELECT MAX(ID) AS ID
                    FROM   [dbo].[CDSIM_BE]
                    WHERE  $PARTITION.MonthlyArchiveFunction9(ReadTime) 
                                                    = P.partition_number) AS A
WHERE  P.object_id = OBJECT_ID('dbo.CDSIM_BE')
       AND P.index_id <= 1;


To process each partition in turn.

Note the plan still has a scan (with a seek predicate to select the partition) but this is not a full scan of the partition.

The scan is in index order with direction "BACKWARD". The TOP iterator can stop requesting rows from the scan after the first one is received.

Code Snippets

SELECT MAX(ID) AS ID
FROM   sys.partitions AS P
       CROSS APPLY (SELECT MAX(ID) AS ID
                    FROM   [dbo].[CDSIM_BE]
                    WHERE  $PARTITION.MonthlyArchiveFunction9(ReadTime) 
                                                    = P.partition_number) AS A
WHERE  P.object_id = OBJECT_ID('dbo.CDSIM_BE')
       AND P.index_id <= 1;

Context

StackExchange Database Administrators Q#104610, answer score: 7

Revisions (0)

No revisions yet.