HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Optimising query on view that merges similar tables with a clear discriminator

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
tablesmergeswithqueryviewoptimisingthatcleardiscriminatorsimilar

Problem

Using PostgreSQL 8.4, I have a number of tables that have a very similar structure, but that belong to different categories:

CREATE TABLE table_a (
id SERIAL PRIMARY KEY,
event_time TIMESTAMP WITH TIME ZONE NOT NULL,
value_a REAL
);

CREATE TABLE table_b (
id SERIAL PRIMARY KEY,
event_time TIMESTAMP WITH TIME ZONE NOT NULL,
value_b REAL
);

CREATE TABLE table_c (
id SERIAL PRIMARY KEY,
event_time TIMESTAMP WITH TIME ZONE NOT NULL,
value_c REAL
);


I need to link these values to a central table (using joins or sub-selects depending on the query):

CREATE TABLE periods_table (
id SERIAL PRIMARY KEY,
start_time TIMESTAMP WITH TIME ZONE NOT NULL,
end_time TIMESTAMP WITH TIME ZONE NOT NULL,
category TEXT NOT NULL
);


Here, category is one of 'Category A', 'Category B' or 'Category C'.

In order to abstract the similarities between the A, B and C tables, I've created a view:

CREATE VIEW table_values AS
SELECT 'Category A' AS category, event_time, value_a AS value
FROM table_a
UNION
SELECT 'Category B' AS category, event_time, value_b AS value
FROM table_b
UNION
SELECT 'Category C' AS category, event_time, value_c AS value
FROM table_c;


A typical query would be something like:

SELECT p.start_time, p.end_time, p.category,
(SELECT SUM(v.value) FROM table_values v
WHERE v.category=p.category
AND v.event_time >= t.start_time AND v.event_time

The problem is that the
category column that could potentially be used to discriminate between the separate tables in the view is only used at the end.

Even an
EXPLAIN ANALYZE on SELECT * FROM table_values WHERE category='Category A' shows that all 3 tables are sub-queried when rows matching this criterion will ever only come from table_a:

``
QUERY PLAN
----------------------------------------------------------------------

Solution

There are a few problems with your query. It is quite clear that your 'wrapper' view - while looks like an elegant solution at first - kills the performance involving 7.7M completely unnecessary rows. This is because UNION requires all this data being sorted, and since those don't fit into memory (you can see this from Sort Method: external merge Disk: 300792kB), it gets 'swapped' to disk and sorted there, which is a very slow process.

As a first attempt, try recreating the 'wrapper' view with UNION ALL instead of simple UNION (you can find the difference here - note that in order to make the rows distinct, Postgres has to sort them first); this way you may avoid the sort. If the result is not good enough, try joining the five tables on the 'main' query one by one and UNION ALL the results.

Context

StackExchange Database Administrators Q#25622, answer score: 3

Revisions (0)

No revisions yet.