HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Optimize a query that's running slow with nested loops inner ioin

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
withqueryloopsslownestedrunningthatoptimizeioininner

Problem

Consider the following simple query (only 3 tables involved)

SELECT

        l.sku_id AS ProductId,
        l.is_primary AS IsPrimary,
        v1.category_name AS Category1,
        v2.category_name AS Category2,
        v3.category_name AS Category3,
        v4.category_name AS Category4,
        v5.category_name AS Category5

    FROM category c4
    JOIN category_voc v4 ON v4.category_id = c4.category_id and v4.language_code = 'en'

    JOIN category c3 ON c3.category_id = c4.parent_category_id
    JOIN category_voc v3 ON v3.category_id = c3.category_id and v3.language_code = 'en'

    JOIN category c2 ON c2.category_id = c3.category_id
    JOIN category_voc v2 ON v2.category_id = c2.category_id and v2.language_code = 'en'

    JOIN category c1 ON c1.category_id = c2.parent_category_id
    JOIN category_voc v1 ON v1.category_id = c1.category_id and v1.language_code = 'en'

    LEFT OUTER JOIN category c5 ON c5.parent_category_id = c4.category_id
    LEFT OUTER JOIN category_voc v5 ON v5.category_id = c5.category_id and v5.language_code = @lang

    JOIN category_link l on l.sku_id IN (SELECT value FROM #Ids) AND
    (
        l.category_id = c4.category_id OR
        l.category_id = c5.category_id
    )

    WHERE c4.[level] = 4 AND c4.version_id = 5


This is a pretty simple query, the only confusing part is the last category join, it's this way because category level 5 might or might not exist. At the end of the query I am looking for category info per product ID (SKU ID), and the that's where the very large table category_link comes in. Finally, the table #Ids is just a temp table containing 10'000 Ids.

When executed, I get the following actual execution plan:

As you can see, almost 90% of the time is spent in the Nested Loops (Inner Join). Here's extra information on those Nested Loops:

Note that the table names don't match exactly because I edited the query table names for readability, but it's pretty easy to match (ads_alt_category = category). I

Solution

The problem appears to be in this part of the code:

JOIN category_link l on l.sku_id IN (SELECT value FROM #Ids) AND
(
    l.category_id = c4.category_id OR
    l.category_id = c5.category_id
)


or in join conditions is always suspicious. One suggestion is to split this into two joins:

JOIN category_link l1 on l1.sku_id in (SELECT value FROM #Ids) and l1.category_id = cr.category_id
left outer join
category_link l1 on l2.sku_id in (SELECT value FROM #Ids) and l2.category_id = cr.category_id


You then have to modify the rest of the query to handle this . . . coalesce(l1.sku_id, l2.sku_id) for instance in the select clause.

Code Snippets

JOIN category_link l on l.sku_id IN (SELECT value FROM #Ids) AND
(
    l.category_id = c4.category_id OR
    l.category_id = c5.category_id
)
JOIN category_link l1 on l1.sku_id in (SELECT value FROM #Ids) and l1.category_id = cr.category_id
left outer join
category_link l1 on l2.sku_id in (SELECT value FROM #Ids) and l2.category_id = cr.category_id

Context

StackExchange Database Administrators Q#40280, answer score: 19

Revisions (0)

No revisions yet.