HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Optimize select on subquery with COALESCE(…)

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
coalescewithsubqueryoptimizeselect

Problem

I have a large view that I use from within an application. I think I've narrowed down my performance problem, but I'm unsure how to fix it. A simplified version of the view looks like this:

SELECT ISNULL(SEId + '-' + PEId, '0-0') AS Id,
   *,
   DATEADD(minute, Duration, EventTime) AS EventEndTime
FROM (
    SELECT se.SEId, pe.PEId,
        COALESCE(pe.StaffName, se.StaffName) AS StaffName, -- << Problem!
        COALESCE(pe.EventTime, se.EventTime) AS EventTime,
        COALESCE(pe.EventType, se.EventType) AS EventType,
        COALESCE(pe.Duration, se.Duration) AS Duration,
        COALESCE(pe.Data, se.Data) AS Data,
        COALESCE(pe.Field, se.Field) AS Field,
        pe.ThisThing, se.OtherThing
    FROM PE pe FULL OUTER JOIN SE se 
      ON pe.StaffName = se.StaffName
     AND pe.Duration = se.Duration
     AND pe.EventTime = se.EventTime
    WHERE NOT(pe.ThisThing = 1 AND se.OtherThing = 0)
) Z


That probably doesn't justify the whole reason for the query structure, but maybe gives you an idea--this view joins two very poorly designed tables that I don't have control over and tries to synthesize some information out of it.

So, since this is a view used from the application, while trying to optimize I wrap it in another SELECT, like this:

SELECT * FROM (
    -- … above code …
) Q
WHERE StaffName = 'SMITH, JOHN Q'


because the application is searching for specific staff members in the result.

The problem seems to be the COALESCE(pe.StaffName, se.StaffName) AS StaffName section, and that I'm selecting from the view on StaffName. If I change that to pe.StaffName AS StaffName or se.StaffName AS StaffName, the performance problems disappear (but see updated 2 below). But that won't do because one side or the other of the FULL OUTER JOIN could be missing, so one or the other field may be NULL.

Can I refactor this replacing the COALESCE(…) with something else, which will get rewritten down into the subquery?

Other notes:

  • I've alr

Solution

This was rather longshot but since the OP says it worked, I'm adding it as an answer (feel free to correct it if you find anything wrong).

Try to break the internal query into three parts (INNER JOIN, LEFT JOIN with WHERE IS NULL check, RIGHT JOIN with IS NULL check) and then UNION ALL the three parts. This has the following advantages:

-
The optimizer has less transformation options available for FULL joins than for (the more common) INNER and LEFT joins.

-
The Z derived table can be removed (you can do that anyway) from the view definition.

-
The NOT(pe.ThisThing = 1 AND se.OtherThing = 0) will be needed only on the INNER join part.

-
Minor improvement, the use COALESCE() will be minimal if any at all (I assumed that se.SEId and pe.PEId are not nullable. If more columns are not nullable, you'll be able to remove more COALESCE() calls.)

More important, the optimizer may push down any conditions in your queries that involve these columns (now that COALESCE() is not blocking the push.)

-
All the above will give the optimizer more options to transform/rewrite any query that uses the view so it may find an execution plan that indexes on the underlying tables can be used.

In all, the view can be written as:

SELECT 
    se.SEId + '-' + pe.PEId AS Id,
    se.SEId, pe.PEId,
    pe.StaffName, 
    pe.EventTime,
    COALESCE(pe.EventType, se.EventType) AS EventType,
    pe.Duration,
    COALESCE(pe.Data, se.Data) AS Data,
    COALESCE(pe.Field, se.Field) AS Field,
    pe.ThisThing, se.OtherThing,
    DATEADD(minute, pe.Duration, pe.EventTime) AS EventEndTime
FROM PE pe INNER JOIN SE se 
  ON pe.StaffName = se.StaffName
 AND pe.Duration = se.Duration
 AND pe.EventTime = se.EventTime
WHERE NOT (pe.ThisThing = 1 AND se.OtherThing = 0) 

UNION ALL

SELECT 
    '0-0',
    NULL, pe.PEId,
    pe.StaffName, 
    pe.EventTime,
    pe.EventType,
    pe.Duration,
    pe.Data,
    pe.Field,
    pe.ThisThing, NULL,
    DATEADD(minute, pe.Duration, pe.EventTime) AS EventEndTime
FROM PE pe LEFT JOIN SE se 
  ON pe.StaffName = se.StaffName
 AND pe.Duration = se.Duration
 AND pe.EventTime = se.EventTime
WHERE NOT (pe.ThisThing = 1)
  AND se.StaffName IS NULL

UNION ALL

SELECT 
    '0-0',
    se.SEId, NULL,
    se.StaffName, 
    se.EventTime,
    se.EventType,
    se.Duration,
    se.Data,
    se.Field,
    NULL, se.OtherThing, 
    DATEADD(minute, se.Duration, se.EventTime) AS EventEndTime
FROM PE pe RIGHT JOIN SE se 
  ON pe.StaffName = se.StaffName
 AND pe.Duration = se.Duration
 AND pe.EventTime = se.EventTime
WHERE NOT (se.OtherThing = 0)
  AND pe.StaffName IS NULL ;

Code Snippets

SELECT 
    se.SEId + '-' + pe.PEId AS Id,
    se.SEId, pe.PEId,
    pe.StaffName, 
    pe.EventTime,
    COALESCE(pe.EventType, se.EventType) AS EventType,
    pe.Duration,
    COALESCE(pe.Data, se.Data) AS Data,
    COALESCE(pe.Field, se.Field) AS Field,
    pe.ThisThing, se.OtherThing,
    DATEADD(minute, pe.Duration, pe.EventTime) AS EventEndTime
FROM PE pe INNER JOIN SE se 
  ON pe.StaffName = se.StaffName
 AND pe.Duration = se.Duration
 AND pe.EventTime = se.EventTime
WHERE NOT (pe.ThisThing = 1 AND se.OtherThing = 0) 

UNION ALL

SELECT 
    '0-0',
    NULL, pe.PEId,
    pe.StaffName, 
    pe.EventTime,
    pe.EventType,
    pe.Duration,
    pe.Data,
    pe.Field,
    pe.ThisThing, NULL,
    DATEADD(minute, pe.Duration, pe.EventTime) AS EventEndTime
FROM PE pe LEFT JOIN SE se 
  ON pe.StaffName = se.StaffName
 AND pe.Duration = se.Duration
 AND pe.EventTime = se.EventTime
WHERE NOT (pe.ThisThing = 1)
  AND se.StaffName IS NULL

UNION ALL

SELECT 
    '0-0',
    se.SEId, NULL,
    se.StaffName, 
    se.EventTime,
    se.EventType,
    se.Duration,
    se.Data,
    se.Field,
    NULL, se.OtherThing, 
    DATEADD(minute, se.Duration, se.EventTime) AS EventEndTime
FROM PE pe RIGHT JOIN SE se 
  ON pe.StaffName = se.StaffName
 AND pe.Duration = se.Duration
 AND pe.EventTime = se.EventTime
WHERE NOT (se.OtherThing = 0)
  AND pe.StaffName IS NULL ;

Context

StackExchange Database Administrators Q#71415, answer score: 4

Revisions (0)

No revisions yet.