HiveBrain v1.2.0
Get Started
← Back to all entries
gotchasqlMinor

Unexpected extremely long query time (~5 minutes using nested WHEN-INs)

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
extremelyquerylongtimenestedunexpectedusingwhenminutesins

Problem

For the first time, I am running up against a horrifically long execution time of a MySQL query (~5 minutes).

The data in the database is highly (and not arbitrarily) normalized. It very efficient at organizing and shuffling data to be displayed in a lot of very helpful ways for different purposes, except this one particular query is throwing a wrench into it.

I can't understand the reason for it. However, some background information that may shed some light onto otherwise arbitrarily convoluted queries.

The company has divided the world up into many many teams (macroregions). Everybody belongs to one or two teams, based on their expertise.

-
For example, there are many diverse teams. A few examples are Spanish, Sahara, Iberia, Portuguese, Jungle teams. Each of the teams has considerable overlap with other teams, but is in some senses independent.

-
The Arabic team works quite closely with the Sahara team, by virtue of the fact that the database tells them they have to work together on certain assignments, because of overlapping geographic locations. The Spanish and the Portuguese team also work closely together, they both work with the Americas and Europe teams and the Portuguese team also works with the Africa team, as does the Arabic team.

-
Each team has a given set of regions, which are also not unique to that particular team. For example, the Mediterranean region belongs to around 12 teams, and when an event occurs there, they all work on it together.

-
Each country belongs to one or more regions. Turkey belongs to Central Asia, Europe and even Mediterranean, and a few others.

Given all that, it is necessary to show each individual what other people on their teams are doing, as well as people who are not on their team, but have overlapping regions.

Query 1 accomplishes this perfectly, and very quickly less than .09 seconds.

```
SELECT report_name
FROM reports
WHERE region I

Solution

Let's start with your original query

SELECT *  
    FROM reports 
    WHERE report_name IN ( 
        SELECT report_name  
        FROM reports  
        WHERE region IN ( 
            SELECT distinct region  
            FROM macroregions 
            WHERE macroregion IN ( 
                SELECT distinct macroregion 
                FROM users  
                WHERE callsign = '$thisuser' 
            ) 
        )     
    )


You can gather keys only (in stages), then join the keys with the reports table. Here is my new proposed query

SELECT reports.* FROM
(
    SELECT rpts.report_name FROM
    (
        SELECT DISTINCT regions.region FROM
        (SELECT DISTINCT macroregion
        FROM users WHERE callsign = '$thisuser') users
        INNER JOIN
        (SELECT DISTINCT macroregion,region FROM regions) regions
        USING (macroregion)
    ) regionkeys
    INNER JOIN
    (SELECT region,report_name FROM reports) rpts
    USING (region)
) reportnamekeys
INNER JOIN reports
USING (report_name);


If you are not comfortable with the USING clause here is my new propsed query without using the USING clause:

SELECT reports.* FROM
(
    SELECT rpts.report_name FROM
    (
        SELECT DISTINCT regions.region FROM
        (SELECT DISTINCT macroregion
        FROM users WHERE callsign = '$thisuser') users
        INNER JOIN
        (SELECT DISTINCT macroregion,region FROM regions) regions
        ON users.macroregion = regions.macroregion
    ) regionkeys
    INNER JOIN
    (SELECT region,report_name FROM reports) rpts
    ON regionkeys.region = rpts.region
) reportnamekeys
INNER JOIN reports
ON reportkeys.report_name = reports.report_name;


You will need some indexes to support the subqueries

ALTER TABLE users   ADD INDEX callsign_macroregion_ndx (callsign,macroregion);
ALTER TABLE regions ADD INDEX macroregion_region_ndx (macroregion,region);
ALTER TABLE reports ADD INDEX region_report_name_ndx (region,report_name);
ALTER TABLE reports ADD INDEX report_name_ndx (report_name);


The first 3 indexes are called covering indexes. They are called such because the subqueries calls for only those exact columns. Therefore, no need to read from the table. The data is only fetched from the index.

  • http://peter-zaitsev.livejournal.com/6949.html



  • http://ronaldbradford.com/blog/tag/covering-index/



  • http://www.mysqlperformanceblog.com/2006/11/23/covering-index-and-prefix-indexes/



You should also study how to refactor queries to gather keys, perform WHERE clauses early, and perform JOINs last. Here is great YouTube Video on how to do this: http://youtu.be/ZVisY-fEoMw (Based on this book : Refactoring SQL Applications)

Give it a Try !!!

Code Snippets

SELECT *  
    FROM reports 
    WHERE report_name IN ( 
        SELECT report_name  
        FROM reports  
        WHERE region IN ( 
            SELECT distinct region  
            FROM macroregions 
            WHERE macroregion IN ( 
                SELECT distinct macroregion 
                FROM users  
                WHERE callsign = '$thisuser' 
            ) 
        )     
    )
SELECT reports.* FROM
(
    SELECT rpts.report_name FROM
    (
        SELECT DISTINCT regions.region FROM
        (SELECT DISTINCT macroregion
        FROM users WHERE callsign = '$thisuser') users
        INNER JOIN
        (SELECT DISTINCT macroregion,region FROM regions) regions
        USING (macroregion)
    ) regionkeys
    INNER JOIN
    (SELECT region,report_name FROM reports) rpts
    USING (region)
) reportnamekeys
INNER JOIN reports
USING (report_name);
SELECT reports.* FROM
(
    SELECT rpts.report_name FROM
    (
        SELECT DISTINCT regions.region FROM
        (SELECT DISTINCT macroregion
        FROM users WHERE callsign = '$thisuser') users
        INNER JOIN
        (SELECT DISTINCT macroregion,region FROM regions) regions
        ON users.macroregion = regions.macroregion
    ) regionkeys
    INNER JOIN
    (SELECT region,report_name FROM reports) rpts
    ON regionkeys.region = rpts.region
) reportnamekeys
INNER JOIN reports
ON reportkeys.report_name = reports.report_name;
ALTER TABLE users   ADD INDEX callsign_macroregion_ndx (callsign,macroregion);
ALTER TABLE regions ADD INDEX macroregion_region_ndx (macroregion,region);
ALTER TABLE reports ADD INDEX region_report_name_ndx (region,report_name);
ALTER TABLE reports ADD INDEX report_name_ndx (report_name);

Context

StackExchange Database Administrators Q#12606, answer score: 3

Revisions (0)

No revisions yet.