HiveBrain v1.2.0
Get Started
← Back to all entries
gotchasqlMinor

Unexpected Table Scan with Parameterized LIKE

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
scanwithlikeunexpectedparameterizedtable

Problem

I'm experiencing an unexpected table scan on SQL Server 2005 against a heap table when parameterizing a LIKE statement... but when the same value as the variable is hard-coded, the expected Index Seek occurs instead.

The issue only happens given this specific scenario... so I'm not confused as to how to resolve the issue, I'm confused as to why this is happening.

The following T-SQL should recreate the issue on SQL Server 2005:

IF (OBJECT_ID('tempdb.dbo.#tblTest') IS NOT NULL)
    DROP TABLE dbo.#tblTest
GO

CREATE TABLE dbo.#tblTest (
    ID INT IDENTITY(1, 1),
    SerialNumber VARCHAR(50)
)
GO

-- Populate the table with 10,000 rows
SET NOCOUNT ON
DECLARE @i INT
SET @i = 0

WHILE @i < 10000
BEGIN
    INSERT INTO dbo.#tblTest VALUES(CAST(@i AS VARCHAR(10)))
    SET @i = @i + 1
END
GO

-- To recreate the issue, the table must be a heap.
ALTER TABLE dbo.#tblTest ADD CONSTRAINT PK_tblTest PRIMARY KEY NONCLUSTERED (ID)
GO

-- Create a (non-covering) index on serial number.
CREATE NONCLUSTERED INDEX IX_tblTest_SerialNumber ON dbo.#tblTest (SerialNumber)
GO

DECLARE @Criteria VARCHAR(50)
SET @Criteria = '1234%'

-- This produces a Table Scan.
SELECT * 
FROM dbo.#tblTest
WHERE SerialNumber LIKE @Criteria

-- This produces an Index Seek
SELECT *
FROM dbo.#tblTest
WHERE SerialNumber LIKE '1234%'


I was directed towards this article by Paul White which seems very closely related, but the conclusions / explanations don't match my specific issue.

Any insight is appreciated.

Solution

The claim that it occurs only for non-clustered indexes is due to the fact that you only had two columns - one being the indexed column, the other being

  • the clustering key (i.e. the other column) if there were a clustered index



  • the RID if there were no clustered index



In the second case, to satisfy the SELECT (all columns) portion of your query, it would have to perform an expensive lookup, so the generic (robust) plan to perform a 10,000 record table scan is chosen instead. In the first case, the index is all that is required to satisfy the SELECT clause.

*It should be noted that the number of records and index cardinality play a role in determining the plan as well.

With more columns, the plan predictably switches to an CLUSTERED INDEX SCAN for the parameterized LIKE statement even with a clustered index, per the revised test below.

IF (OBJECT_ID('tempdb.dbo.#tblTest') IS NOT NULL)
    DROP TABLE dbo.#tblTest
GO

CREATE TABLE dbo.#tblTest (
    ID INT IDENTITY(1, 1),
    SerialNumber VARCHAR(50),
    Othercolumn1 uniqueidentifier default (newid()),
    RowVer timestamp
)
GO

-- Populate the table with 10,000 rows
SET NOCOUNT ON
DECLARE @i INT
SET @i = 0

WHILE @i < 10000
BEGIN
    INSERT INTO dbo.#tblTest (serialnumber) VALUES(CAST(@i AS VARCHAR(10)))
    SET @i = @i + 1
END
GO

-- To recreate the issue, the table must be a heap.
ALTER TABLE dbo.#tblTest ADD CONSTRAINT PK_tblTest PRIMARY KEY CLUSTERED (ID)
GO

-- Create a (non-covering) index on serial number.
CREATE NONCLUSTERED INDEX IX_tblTest_SerialNumber ON dbo.#tblTest (SerialNumber)
GO

DECLARE @Criteria VARCHAR(50)
SET @Criteria = '1234%'

-- This produces a Table Scan.
SELECT * 
FROM dbo.#tblTest
WHERE SerialNumber LIKE @Criteria

-- This produces an Index Seek
SELECT *
FROM dbo.#tblTest
WHERE SerialNumber LIKE '1234%'


Here are the plans generated based on my revised table structure. For the schema in the question, the top part becomes a Table Scan and the bottom part becomes an RID Lookup instead of Key Lookup - all else being equal.

One of the more expensive operations in executing a query is first constructing the execution plan. To help this along, SQL Server has a Plan Cache that stores the text of statements and the associated SET settings. The same text using different SET settings can result in different behaviour, so that is planned anew and is stored as a separate entry.

The non-parameterized query is simple to plan - it contains the exact text '1234%'. The VARCHAR index on SerialNumber is easily searched for the portion containing the prefix "1234". SQL Server also estimates the cardinality of the query and invariably for your data will choose the INDEX SEEK plan. Any further presentation of the exact query statement (text) to SQL Server will contain the static value '1234%' and is safe to re-execute the same plan efficiently.

On the other hand the parameterized query is stored into the Plan Cache (dictionary) keyed by the statement text ... WHERE SerialNumber LIKE @Criteria. Even though @Criteria in the current batch contains the value '1234%' and can use an INDEX SEEK, it is quite possible for another user to submit the exact same query with @Criteria set to '%9' instead which would not perform as well using the INDEX SEEK + RID Lookup. This would SELECT 10% of the data which is often over the tipping point at which index seeking is no longer favourable. For robustness and reusability, the plan cached for this query (and then executed) is the Table Scan version that will cater for the widest range of @Criteria values at average efficiency across the possible values.

Code Snippets

IF (OBJECT_ID('tempdb.dbo.#tblTest') IS NOT NULL)
    DROP TABLE dbo.#tblTest
GO

CREATE TABLE dbo.#tblTest (
    ID INT IDENTITY(1, 1),
    SerialNumber VARCHAR(50),
    Othercolumn1 uniqueidentifier default (newid()),
    RowVer timestamp
)
GO

-- Populate the table with 10,000 rows
SET NOCOUNT ON
DECLARE @i INT
SET @i = 0

WHILE @i < 10000
BEGIN
    INSERT INTO dbo.#tblTest (serialnumber) VALUES(CAST(@i AS VARCHAR(10)))
    SET @i = @i + 1
END
GO

-- To recreate the issue, the table must be a heap.
ALTER TABLE dbo.#tblTest ADD CONSTRAINT PK_tblTest PRIMARY KEY CLUSTERED (ID)
GO

-- Create a (non-covering) index on serial number.
CREATE NONCLUSTERED INDEX IX_tblTest_SerialNumber ON dbo.#tblTest (SerialNumber)
GO

DECLARE @Criteria VARCHAR(50)
SET @Criteria = '1234%'

-- This produces a Table Scan.
SELECT * 
FROM dbo.#tblTest
WHERE SerialNumber LIKE @Criteria

-- This produces an Index Seek
SELECT *
FROM dbo.#tblTest
WHERE SerialNumber LIKE '1234%'

Context

StackExchange Database Administrators Q#39101, answer score: 5

Revisions (0)

No revisions yet.