HiveBrain v1.2.0
Get Started
← Back to all entries
debugsqlMinor

Why can't SQL Server find a seek plan? Is this a bug?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thiswhycanseekbugsqlplanfindserver

Problem

create table Test1 (Id int not null, H char, primary key (Id), index i1 unique (H))
create table Test2 (Id int not null, H char, primary key (Id), index i2 unique (H))
insert into Test1 values (1, 'A'), (2, 'B')
insert into Test2 values (1, 'A'), (2, 'C')


this query fails Query processor could not produce a query plan because of the hints defined in this query. (if I remove the forceseek hints then it runs but it will scan one of the two tables - even if it's extremely large)

select * from Test1 a with (forceseek)
join Test2 b with (forceseek) on a.Id = b.Id
where a.H = 'A' or b.H = 'C'


this equivalent query runs fine:

select * from Test1 a with (forceseek)
join Test2 b with (forceseek) on a.Id = b.Id
where a.H = 'A'
union
select * from Test1 a with (forceseek)
join Test2 b with (forceseek) on a.Id = b.Id
where b.H = 'C'


and gives plan

..I don't understand why SQL Server doesn't run the first query optimally. Is this a known issue? Does it have a name?

Solution

For the original query in question optimizer considers(*) two main alternatives normally (when FORCESEEK hints aren't used on any of the tables).

First one is simple join

when indexes from both tables are scanned entirely (without predicates), and predicate a.Id = b.Id AND (a.H = 'A' OR b.H = 'C') is tested at join node.

Second one is apply form (more about it here)

when index on one of the tables is scanned on the outer side of the Nested Loops join, and then index data is used on the inner side to seek into clustered index on the other table using b.Id = a.Id seek predicate and additional a.H = 'A' OR b.H = 'C' predicate. It can be expressed in T-SQL as

SELECT *
FROM Test1 a
    CROSS APPLY (
        SELECT *
        FROM Test2 b
        WHERE b.Id = a.Id AND (a.H = 'A' OR b.H = 'C')
    ) appl


When FORCESEEK is used on one of the tables, simple join alternative falls out from consideration, but optimizer additionally considers modified form of apply

which unions two seeks into clustered index on the other table. One with b.Id = a.Id seek predicate and additional b.H = 'C' predicate. And the other with b.Id = a.Id seek predicate beyond Filter with a.H = 'A' startup predicate. It can be expressed in T-SQL as

SELECT *
FROM Test1 a
    CROSS APPLY (
        SELECT DISTINCT u.Id, u.H
        FROM (
            SELECT b.Id, b.H
            FROM Test2 b
            WHERE b.Id = a.Id AND a.H = 'A'
            UNION ALL
            SELECT b.Id, b.H
            FROM Test2 b
            WHERE b.Id = a.Id AND b.H = 'C'
        ) u
    ) appl


There are more alternatives actually (that use spooling on the inner side of the apply, for example, or different physical join implementation for simple join, or non-clustered index scan instead of clustered index scan, or vice versa, etc.), but above execution plans shapes are quite representative.

When FORCESEEK is used on both tables, no new alternatives appear. Moreover, apply alternatives become rejected after consideration due to seek requirement on both tables.

So, I think we can say, that possible implementations of the original written form of the query require FORCESEEK requirement to be relaxed for at least one of the tables.

You have another equivalent query, but discovering such alternative is not implemented in the current version of query optimizer unfortunately. This is not a bug though, just imperfection.

Notice also that you add FORCESEEK to convince optimizer to seek in the non-clustered index, but in the above cases optimizer understands it in its own way and performs seek over clustered index instead. Rewriting query, when its performance is not satisfactory, is one of the first (and right) things to try often.

(*) One may find it out by analyzing final memo structure and applied transformations (using undocumented trace flags 8615, 8619 and 8621).

Code Snippets

SELECT *
FROM Test1 a
    CROSS APPLY (
        SELECT *
        FROM Test2 b
        WHERE b.Id = a.Id AND (a.H = 'A' OR b.H = 'C')
    ) appl
SELECT *
FROM Test1 a
    CROSS APPLY (
        SELECT DISTINCT u.Id, u.H
        FROM (
            SELECT b.Id, b.H
            FROM Test2 b
            WHERE b.Id = a.Id AND a.H = 'A'
            UNION ALL
            SELECT b.Id, b.H
            FROM Test2 b
            WHERE b.Id = a.Id AND b.H = 'C'
        ) u
    ) appl

Context

StackExchange Database Administrators Q#265309, answer score: 4

Revisions (0)

No revisions yet.