HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Optimize a slow nested loops join

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
loopsjoinslownestedoptimize

Problem

I have this query running around 15 seconds on my system. The query plan is here. I believe the main problem is around here, where the optimizer chooses nested loops and is mistaken with cardinality:



Tables _Reference65215 and _InfoRg100966 have 454 and 3332 rows respectively.
If I force SQL Server to use hash match here the execution time drops to 2 seconds.

I cannot use any kind of hints here as the code comes from the platform. The platform uses its own language which then gets interpreted to T-SQL. I can change the query. Stats are up to date. Maxdop 1 is a general recommendation from the vendor of the platform

What might be the reason of choosing nested loops and how to fix it? Maybe there is something else that is worth attention?

```
EXEC sp_executesql
N'SELECT TOP 45 T1._IDRRef,
T1._Marked,
T1._Number,
T1._Date_Time,
T1._Posted,
T1._Fld74271RRef,
CASE
WHEN T1._Fld74272RRef=@P1
THEN T1._Fld74307RRef
ELSE T1._Fld74272RRef
END,
T1._Fld74273,
T1._Fld74274RRef,
T1._Fld74281,
CASE
WHEN T1._Fld74272RRef=@P2
THEN 0x000100DD
WHEN T1._Fld74272RRef=@P3
OR T1._Fld74272RRef=@P4
OR T1._Fld74272RRef=@P5
THEN 0x0000FF7E
ELSE 0x0000FF3E
END,
CASE
WHEN T1._Fld74272RRef=@P6
THEN T1._Fld74284RRef
WHEN T1._Fld74272RRef=@P7
OR T1._Fld74272RRef=@P8
OR T1._Fld74272RRef=@P9
THEN T1._Fld74285RRef
ELSE T1._Fld74282RRef
END,
T1._Fld74289RRe

Solution

I would add OPTION(USE HINT 'DISABLE_OPTIMIZER_ROWGOAL') to the query to disable row goals.

The easiest way to spot the use of row goals is when using the TOP() operator, used to return 45 rows as quickly as possible in your case.

The use of loop joins is more common here as it is estimating smaller result sets & using non blocking operators.

Row goals will in most cases help, but with big convoluted queries I have seen them being bothersome in the past.

By disabling these row goals you should get a more optimized plan for the entire query.

You could also disable this by adding OPTION(QUERYTRACEON 4138) to the end of the query.


I tried to remove TOP 45 part - it works now less than 1 second. Where
could I get more insight on row goals?

Removing the TOP(45) removed the row goal & improved performance in this case.

Paul White has some great blog posts on row goals. Parts 1 and 2 can be found here and here. There is also Inside the Optimizer: Row Goals In Depth.

If you cannot change the query, a plan guide could be added for the specific query to disable the row goal.

Another method to remove the row goal without adding the hint(s)

Following the statement:


A second general point to bear in mind is that a row goal is only set
when the goal would be less than the regular estimate

Again from: Setting and Identifying Row Goals in Execution Plans

You could specify a nonsensical TOP() with such a high number that the row goal will not be set and then wrap your actual TOP() around it.

For example:

SELECT TOP(5) *
FROM(
SELECT TOP(99999999999) *
FROM dbo.table
ORDER BY column ASC
) AS  A;


More examples:

DB<>Fiddle

In short, the TOP(5) rowgoal is overriden by the much higher top(99999999999) in the inner query, resulting in only 2 actual row goals remaining in my example, one on the TOP(99999999) operator:

PhysicalOp="Top" LogicalOp="Top" EstimateRows="5" EstimateRowsWithoutRowGoal="9.4199"


And one on the Nested loops operator right after the second TOP() that exactly matches the EstimateRows

PhysicalOp="Nested Loops" LogicalOp="Inner Join" EstimateRows="9.4199" EstimateRowsWithoutRowGoal="9.4199"


For a real world example with a row goal issue such as yours, I was able to change the plan shape with non-blocking operators such as NL joins due to a TOP(5) row goal:

Actual performance issue not shown in screenshots

To hash match operators by using the TOP(999999999) method.

This plan shape matches the query when applying the row goal disable hints.

I would still prefer to add the query hint instead of writing the two TOP()s. But if you are not allowed to or can't, this could help.

Code Snippets

SELECT TOP(5) *
FROM(
SELECT TOP(99999999999) *
FROM dbo.table
ORDER BY column ASC
) AS  A;

Context

StackExchange Database Administrators Q#255380, answer score: 9

Revisions (0)

No revisions yet.