patternsqlMinor
Optimize a slow nested loops join
Viewed 0 times
loopsjoinslownestedoptimize
Problem
I have this query running around 15 seconds on my system. The query plan is here. I believe the main problem is around here, where the optimizer chooses nested loops and is mistaken with cardinality:
Tables _Reference65215 and _InfoRg100966 have 454 and 3332 rows respectively.
If I force SQL Server to use hash match here the execution time drops to 2 seconds.
I cannot use any kind of hints here as the code comes from the platform. The platform uses its own language which then gets interpreted to T-SQL. I can change the query. Stats are up to date. Maxdop 1 is a general recommendation from the vendor of the platform
What might be the reason of choosing nested loops and how to fix it? Maybe there is something else that is worth attention?
```
EXEC sp_executesql
N'SELECT TOP 45 T1._IDRRef,
T1._Marked,
T1._Number,
T1._Date_Time,
T1._Posted,
T1._Fld74271RRef,
CASE
WHEN T1._Fld74272RRef=@P1
THEN T1._Fld74307RRef
ELSE T1._Fld74272RRef
END,
T1._Fld74273,
T1._Fld74274RRef,
T1._Fld74281,
CASE
WHEN T1._Fld74272RRef=@P2
THEN 0x000100DD
WHEN T1._Fld74272RRef=@P3
OR T1._Fld74272RRef=@P4
OR T1._Fld74272RRef=@P5
THEN 0x0000FF7E
ELSE 0x0000FF3E
END,
CASE
WHEN T1._Fld74272RRef=@P6
THEN T1._Fld74284RRef
WHEN T1._Fld74272RRef=@P7
OR T1._Fld74272RRef=@P8
OR T1._Fld74272RRef=@P9
THEN T1._Fld74285RRef
ELSE T1._Fld74282RRef
END,
T1._Fld74289RRe
Tables _Reference65215 and _InfoRg100966 have 454 and 3332 rows respectively.
If I force SQL Server to use hash match here the execution time drops to 2 seconds.
I cannot use any kind of hints here as the code comes from the platform. The platform uses its own language which then gets interpreted to T-SQL. I can change the query. Stats are up to date. Maxdop 1 is a general recommendation from the vendor of the platform
What might be the reason of choosing nested loops and how to fix it? Maybe there is something else that is worth attention?
```
EXEC sp_executesql
N'SELECT TOP 45 T1._IDRRef,
T1._Marked,
T1._Number,
T1._Date_Time,
T1._Posted,
T1._Fld74271RRef,
CASE
WHEN T1._Fld74272RRef=@P1
THEN T1._Fld74307RRef
ELSE T1._Fld74272RRef
END,
T1._Fld74273,
T1._Fld74274RRef,
T1._Fld74281,
CASE
WHEN T1._Fld74272RRef=@P2
THEN 0x000100DD
WHEN T1._Fld74272RRef=@P3
OR T1._Fld74272RRef=@P4
OR T1._Fld74272RRef=@P5
THEN 0x0000FF7E
ELSE 0x0000FF3E
END,
CASE
WHEN T1._Fld74272RRef=@P6
THEN T1._Fld74284RRef
WHEN T1._Fld74272RRef=@P7
OR T1._Fld74272RRef=@P8
OR T1._Fld74272RRef=@P9
THEN T1._Fld74285RRef
ELSE T1._Fld74282RRef
END,
T1._Fld74289RRe
Solution
I would add
The easiest way to spot the use of row goals is when using the
The use of loop joins is more common here as it is estimating smaller result sets & using non blocking operators.
Row goals will in most cases help, but with big convoluted queries I have seen them being bothersome in the past.
By disabling these row goals you should get a more optimized plan for the entire query.
You could also disable this by adding
I tried to remove TOP 45 part - it works now less than 1 second. Where
could I get more insight on row goals?
Removing the
Paul White has some great blog posts on row goals. Parts 1 and 2 can be found here and here. There is also Inside the Optimizer: Row Goals In Depth.
If you cannot change the query, a plan guide could be added for the specific query to disable the row goal.
Another method to remove the row goal without adding the hint(s)
Following the statement:
A second general point to bear in mind is that a row goal is only set
when the goal would be less than the regular estimate
Again from: Setting and Identifying Row Goals in Execution Plans
You could specify a nonsensical
For example:
More examples:
DB<>Fiddle
In short, the
And one on the Nested loops operator right after the second
For a real world example with a row goal issue such as yours, I was able to change the plan shape with non-blocking operators such as NL joins due to a
Actual performance issue not shown in screenshots
To hash match operators by using the
This plan shape matches the query when applying the row goal disable hints.
I would still prefer to add the query hint instead of writing the two
OPTION(USE HINT 'DISABLE_OPTIMIZER_ROWGOAL') to the query to disable row goals.The easiest way to spot the use of row goals is when using the
TOP() operator, used to return 45 rows as quickly as possible in your case.The use of loop joins is more common here as it is estimating smaller result sets & using non blocking operators.
Row goals will in most cases help, but with big convoluted queries I have seen them being bothersome in the past.
By disabling these row goals you should get a more optimized plan for the entire query.
You could also disable this by adding
OPTION(QUERYTRACEON 4138) to the end of the query.I tried to remove TOP 45 part - it works now less than 1 second. Where
could I get more insight on row goals?
Removing the
TOP(45) removed the row goal & improved performance in this case.Paul White has some great blog posts on row goals. Parts 1 and 2 can be found here and here. There is also Inside the Optimizer: Row Goals In Depth.
If you cannot change the query, a plan guide could be added for the specific query to disable the row goal.
Another method to remove the row goal without adding the hint(s)
Following the statement:
A second general point to bear in mind is that a row goal is only set
when the goal would be less than the regular estimate
Again from: Setting and Identifying Row Goals in Execution Plans
You could specify a nonsensical
TOP() with such a high number that the row goal will not be set and then wrap your actual TOP() around it.For example:
SELECT TOP(5) *
FROM(
SELECT TOP(99999999999) *
FROM dbo.table
ORDER BY column ASC
) AS A;More examples:
DB<>Fiddle
In short, the
TOP(5) rowgoal is overriden by the much higher top(99999999999) in the inner query, resulting in only 2 actual row goals remaining in my example, one on the TOP(99999999) operator:PhysicalOp="Top" LogicalOp="Top" EstimateRows="5" EstimateRowsWithoutRowGoal="9.4199"
And one on the Nested loops operator right after the second
TOP() that exactly matches the EstimateRowsPhysicalOp="Nested Loops" LogicalOp="Inner Join" EstimateRows="9.4199" EstimateRowsWithoutRowGoal="9.4199"
For a real world example with a row goal issue such as yours, I was able to change the plan shape with non-blocking operators such as NL joins due to a
TOP(5) row goal:Actual performance issue not shown in screenshots
To hash match operators by using the
TOP(999999999) method.This plan shape matches the query when applying the row goal disable hints.
I would still prefer to add the query hint instead of writing the two
TOP()s. But if you are not allowed to or can't, this could help.Code Snippets
SELECT TOP(5) *
FROM(
SELECT TOP(99999999999) *
FROM dbo.table
ORDER BY column ASC
) AS A;Context
StackExchange Database Administrators Q#255380, answer score: 9
Revisions (0)
No revisions yet.