HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Why can't unused columns and tables be ignored in a recursive CTE

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
whycantablesignoredcolumnsunusedcterecursiveand

Problem

Usually, SQL Server can optimize away any unused columns in the execution plan, and unused joined tables are not queried. But as soon as a recursive CTE comes into play, apparently all columns and joined tables are queried, even if not necessary. Why is that? Can I help SQL Server to ignore unused columns/tables here?

Example:

;WITH cte AS
(
    SELECT cn.CatalogNodeId,
           cn.ParentId,
           cn.CatalogNodeType,
           (SELECT Name FROM dbo.LocalizedCatalogNodes WHERE CatalogNodeId = cn.CatalogNodeId) AS Name
    FROM dbo.CatalogNodes AS cn
    WHERE cn.ParentId IS NULL
    UNION ALL
    SELECT cn.CatalogNodeId,
           cn.ParentId,
           cn.CatalogNodeType,
           (SELECT Name FROM dbo.LocalizedCatalogNodes WHERE CatalogNodeId = cn.CatalogNodeId) AS Name
    FROM dbo.CatalogNodes AS cn
    JOIN cte ON cte.CatalogNodeId = cn.ParentId
)
SELECT CatalogNodeId FROM cte


This is a simple recursive CTE resolving a hierarchical parent-child structure, adding a localized name to each node.

Even though not necessary, the execution plan shows that column CatalogNodeType is retrieved and dbo.LocalizedCatalogNodes is joined for each iteration.

Now comment out the UNION ALL part making it a non-recursive CTE, and the execution plan suddenly consists of just two steps, not containing column CatalogNodeType or table dbo.LocalizedCatalogNodes.

Solution

But as soon as a recursive CTE comes into play, apparently all columns and joined tables are queried, even if not necessary. Why is that?

  • Because recursive CTEs have a minimal initial implementation that has never been improved on (as of SQL Server 2022).



  • Recursion isn't a relational concept. Most of the optimizer's abilities are founded on relational equivalences. It's usual bag of tricks mostly can't be applied within your CTE, especially in the recursive part where the semantics can be unusual at times.



As a result, the entire area of the plan representing the recursive CTE is a hard-coded iterative translation of recursion with a stack spool serving as the, well, stack.

There are very few optimizations applied to this area of the plan, limited to eliminating the recursive part if it is provably empty (leaving only the anchor), or introducing another spool after the whole thing in case it is referenced and reused many times in the calling query.

Can I help SQL Server to ignore unused columns/tables here?

Not while continuing to use the recursive CTE as written, no. The ability to do what you want is simply not available.

In some ways, the current implementation reflects what you might write using a recursive multi-statement table-valued function. Clearly, the table variable used by the function requires a fixed structure, which is populated in full regardless of the requirements of the caller of the function. A recursive CTE does not have the same nesting level limit of 32 as a recursive function call though.

In this case, you could usefully move the repeated subquery outside the CTE, either keeping it as a correlated subquery or expressing it as an outer join. This resolves one part of the question but doesn't really address the spirit of it (as I interpret it anyway).

As a side note, SQL Graph reuses and builds on several basic features of recursive CTEs in the parser and query processor, including the initial translation from recursive syntax to iteration. SQL Graph has seen more than one round of investment and uses a different logical and physical implementation, which is more amenable to relational manipulations.

Your example, when represented as a graph in SQL Server, does not require recursion but in other queries where recursion is needed, it is implemented very differently. Graph plans (recursive or otherwise) are much more amenable to the range of simplifications available to 'normal' SQL queries, including unnecessary projection removal as requested here.

Context

StackExchange Database Administrators Q#288839, answer score: 3

Revisions (0)

No revisions yet.