patternsqlMinor
Index converted from Seek to Scan in CTE
Viewed 0 times
seekscancteindexfromconverted
Problem
SQL Server version:
Microsoft SQL Server 2019 (RTM-CU8-GDR) (KB4583459) - 15.0.4083.2 (X64)
Nov 2 2020 18:35:09
Copyright (C) 2019 Microsoft Corporation
Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2019 Standard 10.0 (Build 17763: ) (Hypervisor)
I have one query which is completing in less than a second, with 5 million records in both table used. Actual Execution plan shows Index Seek for both the tables and both table reads 1 or 2 rows.
But, when convert same query to CTE, it is taking around 2min. The Actual Execution Plan shows Index Scan and reads all 5 million rows.
Until now, I was under the impression that for normal query, it doesn't matter if query is within a CTE or not. Even same nested query is also very slow.
What could be affecting it?
Actual query is:
Query converted to CTE is:
Actual Execution Plan for performant query: Query.sqlplan
Fast Plan visualized on Paste The Plan
Actual Execution Plan for slow query using CTE: CTE.sqlplan
Slow Plan visualized on Paste The Plan
If I convert my normal query to a view and use
For a temporary solution, I created a table-valued function that accepts the criteria and it is working as expected to give an index seek. I need to keep checking for a complete solution because many of my queries a
Microsoft SQL Server 2019 (RTM-CU8-GDR) (KB4583459) - 15.0.4083.2 (X64)
Nov 2 2020 18:35:09
Copyright (C) 2019 Microsoft Corporation
Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2019 Standard 10.0 (Build 17763: ) (Hypervisor)
I have one query which is completing in less than a second, with 5 million records in both table used. Actual Execution plan shows Index Seek for both the tables and both table reads 1 or 2 rows.
But, when convert same query to CTE, it is taking around 2min. The Actual Execution Plan shows Index Scan and reads all 5 million rows.
Until now, I was under the impression that for normal query, it doesn't matter if query is within a CTE or not. Even same nested query is also very slow.
What could be affecting it?
Actual query is:
select ca.inst_bs as Base,
dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
i2.id,
ca.id as idcse,
i2.dtdue
from tbl_cse ca with(nolock)
join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
where i2.idcse = 3169
Query converted to CTE is:
with tt as
(
select ca.inst_bs as Base,
dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
i2.id,
ca.id as idcse,
i2.dtdue
from tbl_cse ca with(nolock)
join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
)
select * from tt where idcse = 3169
Actual Execution Plan for performant query: Query.sqlplan
Fast Plan visualized on Paste The Plan
Actual Execution Plan for slow query using CTE: CTE.sqlplan
Slow Plan visualized on Paste The Plan
If I convert my normal query to a view and use
idcse = 3169 criteria with the view, it is working slow, same as CTE.For a temporary solution, I created a table-valued function that accepts the criteria and it is working as expected to give an index seek. I need to keep checking for a complete solution because many of my queries a
Solution
Problems
The optimizer logic that allows a predicate to be pushed past a window function is quite limited. The predicate must reference a constant at runtime and the partitioning clause of the window function must contain the same column as the predicate.
Your database is using forced parameterization, so the constant value you supply in your query is replaced with a parameter marker
You also need to ensure the column you are filtering on is the same one used in the partitioning clause of the
In your query:
Those are not the same reference, they just have the same exposed name.
They do have the same value due to the join on
Solution
You can fix this in multiple ways. The easiest might be to return
with tt as
(
select ca.inst_bs as Base,
dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
i2.id,
i2.idcse, -- CHANGED from ca.id as idcse
i2.dtdue
from tbl_cse ca with(nolock)
join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
)
select * from tt where idcse = 3169
To workaround the forced parameterization issue, add
As I said in my Stack Overflow answer though, I'd probably stick will the table-valued function replacement. That gives you much more control over where the predicate appears.
Related reading:
User charlieface provided a db<>fiddle demo you might like to play around with.
The optimizer logic that allows a predicate to be pushed past a window function is quite limited. The predicate must reference a constant at runtime and the partitioning clause of the window function must contain the same column as the predicate.
Your database is using forced parameterization, so the constant value you supply in your query is replaced with a parameter marker
@0. This prevents the pushdown as described in the linked Q & A just above.You also need to ensure the column you are filtering on is the same one used in the partitioning clause of the
ROW_NUMBER. It is not enough that the columns have the same name or alias, they must be the same reference.In your query:
with tt as
(
select ca.inst_bs as Base,
dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
i2.id,
ca.id as idcse,
i2.dtdue
from tbl_cse ca with(nolock)
join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
)
select * from tt where idcse = 3169- The
ROW_NUMBERis partitioned byi2.idcse
- The outer test on
idcserefers toca.id(aliased asidcse)
Those are not the same reference, they just have the same exposed name.
They do have the same value due to the join on
i2.idcse = ca.id, but that extra step of reasoning is enough to prevent the optimizer applying the transformation.Solution
You can fix this in multiple ways. The easiest might be to return
i2.idcse from the CTE rather than ca.id:with tt as
(
select ca.inst_bs as Base,
dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
i2.id,
i2.idcse, -- CHANGED from ca.id as idcse
i2.dtdue
from tbl_cse ca with(nolock)
join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
)
select * from tt where idcse = 3169
To workaround the forced parameterization issue, add
OPTION (RECOMPILE) to the query.As I said in my Stack Overflow answer though, I'd probably stick will the table-valued function replacement. That gives you much more control over where the predicate appears.
Related reading:
- Query is fast but becomes sluggish when created as a view
- SQL Server Views | Inline View Expansion Guidelines on Stack Overflow
User charlieface provided a db<>fiddle demo you might like to play around with.
Code Snippets
with tt as
(
select ca.inst_bs as Base,
dateadd(mi,ROW_NUMBER() over (partition by i2.idcse,i2.idpln order by i2.dtdue,i2.id),i2.dtdue) as d1,
i2.id,
ca.id as idcse,
i2.dtdue
from tbl_cse ca with(nolock)
join tbl_Inst i2 with(nolock) on i2.idcse = ca.id
)
select * from tt where idcse = 3169Context
StackExchange Database Administrators Q#307832, answer score: 4
Revisions (0)
No revisions yet.