
Why is this join cardinality estimate so large?

Submitted by: @import:stackexchange-dba

Problem

I am experiencing what I think is an impossibly high cardinality estimate for the following query:

SELECT dm.PRIMARY_ID
FROM
(
    SELECT COALESCE(d1.JOIN_ID, d2.JOIN_ID, d3.JOIN_ID) PRIMARY_ID
    FROM X_DRIVING_TABLE dt
    LEFT OUTER JOIN X_DETAIL_1 d1 ON dt.ID = d1.ID
    LEFT OUTER JOIN X_DETAIL_LINK lnk ON d1.LINK_ID = lnk.LINK_ID
    LEFT OUTER JOIN X_DETAIL_2 d2 ON dt.ID = d2.ID
    LEFT OUTER JOIN X_DETAIL_3 d3 ON dt.ID = d3.ID
) dm
INNER JOIN X_LAST_TABLE lst ON dm.PRIMARY_ID = lst.JOIN_ID;


The estimated plan is here. I am working on a statistics copy of the tables so I cannot include an actual plan. However, I don't think that it's very relevant for this problem.

SQL Server estimates that 481577 rows will be returned from the "dm" derived table. It then estimates that 4528030000 rows will be returned after the join to X_LAST_TABLE is performed, even though JOIN_ID is the primary key of X_LAST_TABLE. I would expect a join cardinality estimate between 0 and 481577 rows. Instead, the row estimate appears to be 10% of the number of rows I would get when cross joining the outer and the inner tables. The math for this works out with rounding: 481577 * 94025 * 0.1 = 4528027742.5, which is rounded to 4528030000.
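The 10% figure can be reproduced directly. A hedged arithmetic check, assuming X_LAST_TABLE contains 94025 rows (the cardinality implied by the estimates quoted here):

```
-- Reproduce the guessed join cardinality:
-- 481577 rows from dm, an assumed 94025 rows in X_LAST_TABLE,
-- and a flat 10% guess selectivity for the equality predicate.
SELECT CONVERT(float, 481577) * 94025 * 0.1 AS guessed_cardinality;
-- 4528027742.5, which displays as 4528030000 at six significant figures
```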

I am primarily looking for a root cause for this behavior. I am interested in simple workarounds as well, but please do not suggest changing the data model or using temp tables. This query is a simplification of logic within a view. I know that doing COALESCE on a few columns and joining on them isn't a good practice. Part of the goal of this question is to figure out if I need to recommend that the data model be redesigned.

I am testing on Microsoft SQL Server 2014 with the legacy cardinality estimator enabled. TF 4199 and others are on. I can give a full list of trace flags if that ends up being relevant.
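For reference, which estimator compiled a given plan is recorded in the showplan XML: the `CardinalityEstimationModelVersion` attribute on the statement element reads 70 under the legacy CE and 120 under the SQL Server 2014 model. A minimal sketch (the query is just a placeholder):

```
-- Sketch: confirm which cardinality estimator compiled the plan.
-- Inspect CardinalityEstimationModelVersion in the XML output
-- (70 = legacy CE, 120 = SQL Server 2014 CE).
SET SHOWPLAN_XML ON;
GO
SELECT dt.ID FROM X_DRIVING_TABLE dt;
GO
SET SHOWPLAN_XML OFF;
GO
```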

Here is the most relevant table definition:

```
CREATE TABLE X_LAST_TABLE (
    JOIN_ID NUMERIC(18, 0) NOT NULL
        CONSTRAINT PK_X_LAST_TABLE PRIMARY KEY
);
```

Solution

> I know that doing COALESCE on a few columns and joining on them isn't a good practice.

Generating good cardinality and distribution estimates is hard enough when the schema is 3NF+ (with keys and constraints) and the query is relational and primarily SPJG (selection-projection-join-group by). The CE model is built on those principles. The more unusual or non-relational features there are in a query, the closer one gets to the boundaries of what the cardinality and selectivity framework can handle. Go too far and CE will give up and guess.

Most of the MCVE is simple SPJ (no G), albeit with predominantly outer equijoins (modelled as inner join plus anti-semijoin) rather than the simpler inner equijoin (or semijoin). All the relations have keys, though no foreign keys or other constraints. All but one of the joins are one-to-many, which is good.

The exception is the many-to-many outer join between X_DETAIL_1 and X_DETAIL_LINK. The only function of this join in the MCVE is to potentially duplicate rows in X_DETAIL_1. This is an unusual sort of thing.
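The duplicating effect is easy to see with a toy pair of tables (names and data invented purely for illustration):

```
-- Toy illustration (invented tables): a join that is many-to-many on
-- LINK_ID multiplies the left-hand rows rather than filtering them.
CREATE TABLE #L (LINK_ID integer NOT NULL);
CREATE TABLE #R (LINK_ID integer NOT NULL);
INSERT #L (LINK_ID) VALUES (1);
INSERT #R (LINK_ID) VALUES (1), (1), (1);

SELECT L.LINK_ID
FROM #L AS L
LEFT OUTER JOIN #R AS R
    ON L.LINK_ID = R.LINK_ID;
-- the single row in #L appears three times in the output

DROP TABLE #L, #R;
```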

Simple equality predicates (selections) and scalar operators are also better. For example, attribute compare-equal attribute/constant normally works well in the model. It is relatively "easy" to modify histograms and frequency statistics to reflect the application of such predicates.
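Those histograms and frequency statistics can be inspected directly, which helps when checking what the CE had to work with. A sketch (the statistics name here is hypothetical; use a name reported by sys.stats):

```
-- Sketch: inspect the histogram the CE adjusts for an equality predicate.
-- 'PK_X_DETAIL_1' is a hypothetical statistics name; list the real ones first:
SELECT name FROM sys.stats WHERE object_id = OBJECT_ID(N'X_DETAIL_1');

DBCC SHOW_STATISTICS (N'X_DETAIL_1', PK_X_DETAIL_1) WITH HISTOGRAM;
```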

COALESCE is built on CASE, which is in turn implemented internally as IIF (and this was true well before IIF appeared in the Transact-SQL language). The CE models IIF as a UNION with two mutually-exclusive children, each consisting of a project on a selection on the input relation. Each of the listed components has model support, so combining them is relatively straightforward. Even so, the more one layers abstractions, the less accurate the end result tends to be - a reason why larger execution plans tend to be less stable and reliable.
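In the MCVE's terms, the join predicate's COALESCE expands (roughly) to a nested CASE before optimization:

```
-- Approximate internal expansion of
-- COALESCE(d1.JOIN_ID, d2.JOIN_ID, d3.JOIN_ID):
CASE
    WHEN d1.JOIN_ID IS NOT NULL THEN d1.JOIN_ID
    WHEN d2.JOIN_ID IS NOT NULL THEN d2.JOIN_ID
    ELSE d3.JOIN_ID
END
```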

ISNULL, on the other hand, is intrinsic to the engine. It is not built up using any more basic components. Applying the effect of ISNULL to a histogram, for example, is as simple as replacing the step for NULL values (and compacting as necessary). It is still relatively opaque, as scalar operators go, and so best avoided where possible. Nevertheless, it is generally speaking more optimizer-friendly (less optimizer-unfriendly) than a CASE-based alternate.
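Given that, one simple workaround to try is expressing the coalescing with nested ISNULL instead. This is a sketch, not a guaranteed fix in every build; note also the documented semantic difference that ISNULL takes the data type of its first argument, which matters if the JOIN_ID columns are not all the same type:

```
-- Sketch: the same logic written with nested ISNULL instead of COALESCE.
SELECT dm.PRIMARY_ID
FROM
(
    SELECT ISNULL(ISNULL(d1.JOIN_ID, d2.JOIN_ID), d3.JOIN_ID) AS PRIMARY_ID
    FROM X_DRIVING_TABLE dt
    LEFT OUTER JOIN X_DETAIL_1 d1 ON dt.ID = d1.ID
    LEFT OUTER JOIN X_DETAIL_LINK lnk ON d1.LINK_ID = lnk.LINK_ID
    LEFT OUTER JOIN X_DETAIL_2 d2 ON dt.ID = d2.ID
    LEFT OUTER JOIN X_DETAIL_3 d3 ON dt.ID = d3.ID
) AS dm
INNER JOIN X_LAST_TABLE lst
    ON dm.PRIMARY_ID = lst.JOIN_ID;
```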

The CE (70 and 120+) is very complex, even by SQL Server standards. It is not a case of applying simple logic (with a secret formula) to each operator. The CE knows about keys and functional dependencies; it knows how to estimate using frequencies, multivariate statistics, and histograms; and there is an absolute ton of special cases, refinements, checks & balances, and supporting structures. It often estimates e.g. joins in multiple ways (frequency, histogram) and decides on an outcome or adjustment based on the differences between the two.

One last basic thing to cover: initial cardinality estimation runs for every operation in the query tree, from the bottom up. Selectivity and cardinality are derived for leaf operators first (base relations). Modified histograms and density/frequency information are derived for parent operators. The further up the tree we go, the lower the quality of estimates tends to be, as errors accumulate.

This single initial comprehensive estimation provides a starting point, and occurs well before any consideration is given to a final execution plan (it happens long before even the trivial plan compilation stage). The query tree at this point tends to reflect the written form of the query fairly closely (though with subqueries removed, simplifications applied, and so on).

Immediately after initial estimation, SQL Server performs heuristic join reordering, which loosely speaking tries to reorder the tree to place smaller tables and high-selectivity joins first. It also tries to position inner joins before outer joins and cross products. Its capabilities are not extensive; its efforts are not exhaustive; and it does not consider physical costs (since they do not exist yet - only statistical information and metadata information are present). Heuristic reorder is most successful on simple inner equijoin trees. It exists to provide a "better" starting point for cost-based optimization.

Why is this join cardinality estimate so large?

The MCVE has an "unusual" mostly-redundant many-to-many join, and an equijoin with COALESCE in the predicate. The operator tree also has an inner join last, which heuristic join reorder was unable to move up the tree to a more preferred position. Leaving aside all the scalars and projections, the join tree is:
```
LogOp_Join [ Card=4.52803e+009 ]
    LogOp_LeftOuterJoin [ Card=481577 ]
        LogOp_LeftOuterJoin [ Card=481577 ]
            LogOp_LeftOuterJoin [ Card=481577 ]
                LogOp_LeftOuterJoin [ Card=481577 ]
                    LogOp_Get TBL: X_DRIVING_TABLE(alias TBL: dt) [ Card=481577 ]
```

Code Snippets

```
SELECT
    PRIMARY_ID = COALESCE(d1.JOIN_ID, d2.JOIN_ID, d3.JOIN_ID)
FROM X_DRIVING_TABLE dt
LEFT OUTER JOIN X_DETAIL_1 d1
    ON dt.ID = d1.ID
LEFT OUTER JOIN X_DETAIL_LINK lnk
    ON d1.LINK_ID = lnk.LINK_ID
LEFT OUTER JOIN X_DETAIL_2 d2
    ON dt.ID = d2.ID
LEFT OUTER JOIN X_DETAIL_3 d3
    ON dt.ID = d3.ID
INNER JOIN X_LAST_TABLE lst
    ON lst.JOIN_ID = COALESCE(d1.JOIN_ID, d2.JOIN_ID, d3.JOIN_ID)
```

```
-- the moment of doom
movsd xmm0,mmword ptr [sqllang!x_Selectivity_Equal
```

Context

StackExchange Database Administrators Q#158201, answer score: 17
