HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Index converted from Seek to Scan in CTE

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
seekscancteindexfromconverted

Problem

SQL Server version:

Microsoft SQL Server 2019 (RTM-CU8-GDR) (KB4583459) - 15.0.4083.2 (X64)
Nov 2 2020 18:35:09
Copyright (C) 2019 Microsoft Corporation
Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2019 Standard 10.0 (Build 17763: ) (Hypervisor)

I have one query which is completing in less than a second, with 5 million records in both table used. Actual Execution plan shows Index Seek for both the tables and both table reads 1 or 2 rows.

But, when convert same query to CTE, it is taking around 2min. The Actual Execution Plan shows Index Scan and reads all 5 million rows.

Until now, I was under the impression that for normal query, it doesn't matter if query is within a CTE or not. Even same nested query is also very slow.

What could be affecting it?

Actual query is:
select ca.inst_bs as Base,
dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
i2.id,
ca.id as idcse,
i2.dtdue
from tbl_cse ca with(nolock)
join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
where i2.idcse = 3169


Query converted to CTE is:
with tt as
(
select ca.inst_bs as Base,
dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
i2.id,
ca.id as idcse,
i2.dtdue
from tbl_cse ca with(nolock)
join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
)
select * from tt where idcse = 3169


Actual Execution Plan for performant query: Query.sqlplan

Fast Plan visualized on Paste The Plan

Actual Execution Plan for slow query using CTE: CTE.sqlplan

Slow Plan visualized on Paste The Plan

If I convert my normal query to a view and use idcse = 3169 criteria with the view, it is working slow, same as CTE.

For a temporary solution, I created a table-valued function that accepts the criteria and it is working as expected to give an index seek. I need to keep checking for a complete solution because many of my queries a

Solution

Problems

The optimizer logic that allows a predicate to be pushed past a window function is quite limited. The predicate must reference a constant at runtime and the partitioning clause of the window function must contain the same column as the predicate.

Your database is using forced parameterization, so the constant value you supply in your query is replaced with a parameter marker @0. This prevents the pushdown as described in the linked Q & A just above.

You also need to ensure the column you are filtering on is the same one used in the partitioning clause of the ROW_NUMBER. It is not enough that the columns have the same name or alias, they must be the same reference.

In your query:

with tt as 
( 
    select ca.inst_bs as Base, 
        dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
        i2.id,
        ca.id as idcse,
        i2.dtdue 
    from tbl_cse ca with(nolock) 
    join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
)
select * from tt where idcse = 3169


  • The ROW_NUMBER is partitioned by i2.idcse



  • The outer test on idcse refers to ca.id (aliased as idcse)



Those are not the same reference, they just have the same exposed name.

They do have the same value due to the join on i2.idcse = ca.id, but that extra step of reasoning is enough to prevent the optimizer applying the transformation.
Solution

You can fix this in multiple ways. The easiest might be to return i2.idcse from the CTE rather than ca.id:

with tt as
(
select ca.inst_bs as Base,
dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
i2.id,
i2.idcse, -- CHANGED from ca.id as idcse
i2.dtdue
from tbl_cse ca with(nolock)
join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
)
select * from tt where idcse = 3169

To workaround the forced parameterization issue, add OPTION (RECOMPILE) to the query.

As I said in my Stack Overflow answer though, I'd probably stick will the table-valued function replacement. That gives you much more control over where the predicate appears.

Related reading:

  • Query is fast but becomes sluggish when created as a view



  • SQL Server Views | Inline View Expansion Guidelines on Stack Overflow



User charlieface provided a db<>fiddle demo you might like to play around with.

Code Snippets

with tt as 
( 
    select ca.inst_bs as Base, 
        dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
        i2.id,
        ca.id as idcse,
        i2.dtdue 
    from tbl_cse ca with(nolock) 
    join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
)
select * from tt where idcse = 3169

Context

StackExchange Database Administrators Q#307832, answer score: 4

Revisions (0)

No revisions yet.