HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Filtering UNION ALL result is much slower than filtering each subquery

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
resultmucheachallunionslowerthansubqueryfiltering

Problem

(edit: see end for a simpler example)

I'm searching in a table named "cases" (135k rows, 29 columns). Some of the rows in this table have a type of parent-child relationship (of different types), which means that for these records a mix of parent/child fields must be used for filtering and display.

I have identified four different parent-child relationships and created views for them:

  • caselist_no_specials: not a child record, use record data as is; 116106 rows total.



  • caselist_disputes_with_ipr: child record; 138 rows total.



  • caselist_mark_children: child record; 18132 rows total.



  • caselist_design_children: child record; 671 rows total.



The results from these views do not overlap and together cover 100% of the table.

When I select a union of all of them, and separately filter each view, the query takes about 9 ms. Selecting a union of all views and filtering the result of that takes about 500 ms.

I have also tested this without views, inlining the queries they contain, which did not produce a measurable improvement.

This is the fast query (explain):

SELECT  c.*
    FROM  caselist_no_specials c
    JOIN  case_clients cacl ON cacl.case_id = c.main_id
   WHERE  cacl.client_id = 12046

   UNION ALL

  SELECT  c.*
    FROM  caselist_disputes_with_ipr c
    JOIN  case_clients cacl ON cacl.case_id = c.main_id
   WHERE  cacl.client_id = 12046

   UNION ALL

  SELECT  c.*
    FROM  caselist_mark_children c
    JOIN  case_clients cacl ON cacl.case_id = c.main_id
   WHERE  cacl.client_id = 12046

   UNION ALL

  SELECT  c.*
    FROM  caselist_design_children c
    JOIN  case_clients cacl ON cacl.case_id = c.main_id
   WHERE  cacl.client_id = 12046

ORDER BY  sort_nr,
          id;


As you can see, the join and filter is duplicated for each view. Trying to avoid the duplication yielded this query, which takes a lot longer (explain):

```
SELECT x.*
FROM (
SELECT * FROM caselist_no_specials
UNION ALL
SELECT

Solution

What you're asking the DB to do in Query one is:
Give me ALL from table A FILTERED
Give me ALL from table B FILTERED
Give me ALL from table C FILTERED
Give me ALL from table D FILTERED
And then Union.

In the second query you first get all the data, and only after that you do the join and the filter. JOIN and WHERE on a UNION query, which doesn't really enable you to index anything, obviously runs slower.
(It has nothing to do with the Server variant or the OS).

Context

StackExchange Database Administrators Q#136653, answer score: 5

Revisions (0)

No revisions yet.