patternMinor
Should I nest dependent outer joins in SQL Server?
Viewed 0 times
sqlnestshouldserverouterdependentjoins
Problem
I have heard mixed information on this and am hoping for a canonical or expert opinion.
If I have multiple
For a contrived example, the
http://sqlfiddle.com/#!3/31022/5
Compared to http://sqlfiddle.com/#!3/31022/7
As shown above these produce different query plans in SS2k8
If I have multiple
LEFT OUTER JOINs , each dependent on the last, is it better to nest them?For a contrived example, the
JOIN to MyParent depends on the JOIN to MyChild:http://sqlfiddle.com/#!3/31022/5
SELECT
{columns}
FROM
MyGrandChild AS gc
LEFT OUTER JOIN
MyChild AS c
ON c.[Id] = gc.[ParentId]
LEFT OUTER JOIN
MyParent AS p
ON p.[id] = c.[ParentId]Compared to http://sqlfiddle.com/#!3/31022/7
SELECT
{columns}
FROM
MyGrandChild AS gc
LEFT OUTER JOIN
(
MyChild AS c
LEFT OUTER JOIN
MyParent AS p
ON p.[id] = c.[ParentId]
)
ON c.[Id] = gc.[ParentId]As shown above these produce different query plans in SS2k8
Solution
This is absolutely not a canonical answer but I noticed that for the nested loops query plans shown in the SQL Fiddle it was possible to apply the plan from Query 2 to Query 1 with the use of the
Query processor could not produce query plan because USE PLAN hint
contains plan that could not be verified to be legal for query. Remove
or replace USE PLAN hint. For best likelihood of successful plan
forcing, verify that the plan provided in the USE PLAN hint is one
generated automatically by SQL Server for the same query.
Disabling the optimizer transformation rule
Experimenting with greater quantities of data shows that SQL Server is certainly capable of transforming
A very contrived case where the first query performs better than the second is
Which gives plans
For me Query 1 had an elapsed time of 108 ms vs 1,163 ms for Query 2.
Query 1
Query 2
So it might be provisionally assumed that the first ("unnested") syntax is potentially beneficial as it allows more potential join orders to be considered but I haven't done exhaustive enough testing to have much confidence in this as a general rule.
It may well be entirely possible to come up with counter examples where Query 2 performs better. Try both and look at the execution plans.
USE PLAN hint but attempting the reverse operation fails with Query processor could not produce query plan because USE PLAN hint
contains plan that could not be verified to be legal for query. Remove
or replace USE PLAN hint. For best likelihood of successful plan
forcing, verify that the plan provided in the USE PLAN hint is one
generated automatically by SQL Server for the same query.
Disabling the optimizer transformation rule
ReorderLOJN prevents the previously successful plan hint from succeeding too.Experimenting with greater quantities of data shows that SQL Server is certainly capable of transforming
(A LOJ B) LOJ C to A LOJ (B LOJ C) naturally as well but I didn't see any evidence that the reverse is true.A very contrived case where the first query performs better than the second is
DROP TABLE MyGrandChild , MyChild, MyParent
CREATE TABLE MyParent
(Id int)
CREATE TABLE MyChild
(Id int PRIMARY KEY
,ParentId int,
Filler char(8000) NULL)
CREATE TABLE MyGrandChild
(Id int
,ParentId int)
INSERT INTO MyChild
(Id, ParentId)
SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY @@SPID),
ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM master..spt_values v1, master..spt_values
INSERT INTO MyGrandChild
(Id, ParentId)
OUTPUT INSERTED.Id INTO MyParent
SELECT TOP (3000) Id, Id AS ParentId
FROM MyChild
ORDER BY Id
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SELECT gc.Id AS gcId,
gc.ParentId AS gcpId,
c.Id AS cId,
c.ParentId AS cpId,
p.Id AS pId
FROM MyGrandChild AS gc
LEFT OUTER JOIN MyChild AS c
ON c.[Id] = gc.[ParentId]
LEFT OUTER JOIN MyParent AS p
ON p.[Id] = c.[ParentId]
SELECT gc.Id AS gcId,
gc.ParentId AS gcpId,
c.Id AS cId,
c.ParentId AS cpId,
p.Id AS pId
FROM MyGrandChild AS gc
LEFT OUTER JOIN( MyChild AS c
LEFT OUTER JOIN MyParent AS p
ON p.[Id] = c.[ParentId])
ON c.[Id] = gc.[ParentId]Which gives plans
For me Query 1 had an elapsed time of 108 ms vs 1,163 ms for Query 2.
Query 1
Table 'Worktable'. Scan count 0, logical reads 0
Table 'MyChild'. Scan count 0, logical reads 9196
Table 'MyGrandChild'. Scan count 1, logical reads 7
Table 'MyParent'. Scan count 1, logical reads 5Query 2
Table 'MyParent'. Scan count 1, logical reads 15000
Table 'MyChild'. Scan count 0, logical reads 9000
Table 'MyGrandChild'. Scan count 1, logical reads 7So it might be provisionally assumed that the first ("unnested") syntax is potentially beneficial as it allows more potential join orders to be considered but I haven't done exhaustive enough testing to have much confidence in this as a general rule.
It may well be entirely possible to come up with counter examples where Query 2 performs better. Try both and look at the execution plans.
Code Snippets
DROP TABLE MyGrandChild , MyChild, MyParent
CREATE TABLE MyParent
(Id int)
CREATE TABLE MyChild
(Id int PRIMARY KEY
,ParentId int,
Filler char(8000) NULL)
CREATE TABLE MyGrandChild
(Id int
,ParentId int)
INSERT INTO MyChild
(Id, ParentId)
SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY @@SPID),
ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM master..spt_values v1, master..spt_values
INSERT INTO MyGrandChild
(Id, ParentId)
OUTPUT INSERTED.Id INTO MyParent
SELECT TOP (3000) Id, Id AS ParentId
FROM MyChild
ORDER BY Id
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SELECT gc.Id AS gcId,
gc.ParentId AS gcpId,
c.Id AS cId,
c.ParentId AS cpId,
p.Id AS pId
FROM MyGrandChild AS gc
LEFT OUTER JOIN MyChild AS c
ON c.[Id] = gc.[ParentId]
LEFT OUTER JOIN MyParent AS p
ON p.[Id] = c.[ParentId]
SELECT gc.Id AS gcId,
gc.ParentId AS gcpId,
c.Id AS cId,
c.ParentId AS cpId,
p.Id AS pId
FROM MyGrandChild AS gc
LEFT OUTER JOIN( MyChild AS c
LEFT OUTER JOIN MyParent AS p
ON p.[Id] = c.[ParentId])
ON c.[Id] = gc.[ParentId]Table 'Worktable'. Scan count 0, logical reads 0
Table 'MyChild'. Scan count 0, logical reads 9196
Table 'MyGrandChild'. Scan count 1, logical reads 7
Table 'MyParent'. Scan count 1, logical reads 5Table 'MyParent'. Scan count 1, logical reads 15000
Table 'MyChild'. Scan count 0, logical reads 9000
Table 'MyGrandChild'. Scan count 1, logical reads 7Context
StackExchange Database Administrators Q#58602, answer score: 3
Revisions (0)
No revisions yet.