
Help with Query Tuning

Submitted by: @import:stackexchange-dba··

Problem

EDIT - drachenstern: see this question for more information on this problem and on the source of his query below:


Group data by non-unique keys by distinct time range

Can someone please suggest some tuning options for the following query? It is fast when run for a single project, but takes hours to finish when run for all records. NUM_ROWS in the PA table: 2,101,528

I want to group data by non-unique keys over distinct time ranges.

SELECT project_nbr, status, MIN(aud_timestamp) start_dt, end_dt 
FROM (
    SELECT a.project_nbr
         , status
         , aud_timestamp
         , ( SELECT MAX(p.aud_timestamp) 
             FROM pa p 
             WHERE p.project_nbr = a.project_nbr 
               AND p.status = a.status 
               AND p.aud_timestamp >= a.aud_timestamp
               AND NOT EXISTS (
                            SELECT NULL 
                            FROM pa q
                            WHERE q.project_nbr = p.project_nbr 
                              AND q.status <> p.status 
                              AND q.aud_timestamp < p.aud_timestamp
                              AND q.aud_timestamp > a.aud_timestamp
                               )
            ) end_dt
     FROM pa a
     )
GROUP BY project_nbr, status, end_dt

Solution

This is the offending code causing your query to take hours to return results:

and not exists (
                        select null 
                        from pa q
                        where q.PROJect_NBR = p.project_nbr 
                          and q.status <> p.status 
                          and q.aud_timestamp < p.aud_timestamp 
                          and q.aud_timestamp > a.aud_timestamp
                           )


You should really convert that to a join instead. As it stands, the NOT EXISTS check sits inside a correlated subquery, so you are generating a new lookup for each of the roughly 2 million source rows.
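
One way to avoid the per-row lookup altogether is sketched below. Note that this is not the join rewrite suggested above but a swapped-in technique: the row-number-difference ("gaps and islands") approach with analytic functions, which reads pa once. It assumes aud_timestamp is unique within each project_nbr (ties between rows with different statuses would make the numbering arbitrary) and that the database supports analytic functions, as Oracle does. The grp column is a hypothetical helper name, not something from the original query. It has not been run against the original data, so compare its output with the original query for a single project before relying on it.

```sql
-- Sketch only: assumes aud_timestamp is unique per project_nbr.
SELECT project_nbr
     , status
     , MIN(aud_timestamp) AS start_dt
     , MAX(aud_timestamp) AS end_dt
FROM (
    SELECT project_nbr
         , status
         , aud_timestamp
           -- Position of the row within its project, minus its position
           -- within its (project, status) pair. The difference stays
           -- constant across a run of consecutive rows with the same
           -- status and changes whenever another status interrupts the run.
         , ROW_NUMBER() OVER (PARTITION BY project_nbr
                              ORDER BY aud_timestamp)
         - ROW_NUMBER() OVER (PARTITION BY project_nbr, status
                              ORDER BY aud_timestamp) AS grp  -- hypothetical helper column
    FROM pa
     )
GROUP BY project_nbr, status, grp
ORDER BY project_nbr, start_dt
```

Each (project_nbr, status, grp) group is one uninterrupted run of a status, so MIN and MAX over the group give the same start_dt and end_dt that the original query derives through its correlated subquery.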

Code Snippets

and not exists (
                        select null 
                        from pa q
                        where q.PROJect_NBR = p.project_nbr 
                          and q.status <> p.status 
                          and q.aud_timestamp < p.aud_timestamp 
                          and q.aud_timestamp > a.aud_timestamp
                           )

Context

StackExchange Database Administrators Q#1407, answer score: 7
