patternsqlMinor
Optimizing query using view on EAV structure
Viewed 0 times
eavqueryviewoptimizingusingstructure
Problem
An application is writing into a database that follows an EAV structure, similar to this:
(I gather that EAV is a bit controversial, but this question isn't about EAV: this legacy application can't be changed anyway.)
There can be a number of attributes, but usually, up to 200 attributes per items (often similar). Out of these 200 attributes, there's a group of about 25 that are more common than the others and that are used more often in queries.
To make it easier to write new queries based on some of those 25 attributes (the requirements tend to change and I need to be flexible), I have written a view that joins the attribute table for these 25 attributes. Following the example above, this looks like this:
A typical report would only make use of a few of those 25 attributes, for example:
Some of these queries are not quite as fast as I wish they were.
CREATE TABLE item (
id INTEGER PRIMARY KEY,
description TEXT
);
CREATE TABLE item_attr (
item INTEGER REFERENCES item(id),
name TEXT,
value INTEGER,
PRIMARY KEY (item, name)
);
INSERT INTO item VALUES (1, 'Item 1');
INSERT INTO item_attr VALUES (1, 'height', 20);
INSERT INTO item_attr VALUES (1, 'width', 30);
INSERT INTO item_attr VALUES (1, 'weight', 40);
INSERT INTO item VALUES (2, 'Item 2');
INSERT INTO item_attr VALUES (2, 'height', 10);
INSERT INTO item_attr VALUES (2, 'weight', 35);(I gather that EAV is a bit controversial, but this question isn't about EAV: this legacy application can't be changed anyway.)
There can be a number of attributes, but usually, up to 200 attributes per items (often similar). Out of these 200 attributes, there's a group of about 25 that are more common than the others and that are used more often in queries.
To make it easier to write new queries based on some of those 25 attributes (the requirements tend to change and I need to be flexible), I have written a view that joins the attribute table for these 25 attributes. Following the example above, this looks like this:
CREATE VIEW exp_item AS SELECT
i.id AS id,
i.description AS description,
ia_height.value AS height,
ia_width.value AS width,
ia_weight.value AS weight,
ia_depth.value AS depth
FROM item i
LEFT JOIN item_attr ia_height ON i.id=ia_height.item AND ia_height.name='height'
LEFT JOIN item_attr ia_width ON i.id=ia_width.item AND ia_width.name='width'
LEFT JOIN item_attr ia_weight ON i.id=ia_weight.item AND ia_weight.name='weight'
LEFT JOIN item_attr ia_depth ON i.id=ia_depth.item AND ia_depth.name='depth';A typical report would only make use of a few of those 25 attributes, for example:
SELECT id, description, height, width FROM exp_item;Some of these queries are not quite as fast as I wish they were.
Solution
This is one (of many) downsides of EAV designs.
You can't really improve the JOIN: because of the necessary complexity, a cost based optimiser won't get to the perfect plan. It finds "good enough"
Suggestions:
The first option scales better becauses a few indexes on the main EAV fact table can cover all queries nicely.
You can't really improve the JOIN: because of the necessary complexity, a cost based optimiser won't get to the perfect plan. It finds "good enough"
Suggestions:
- don't use a view: use aggregate type queries (eg COUNT(*) = 2 if I match both height and weight)
- use a trigger to maintain a real (or sparse) table and query that
The first option scales better becauses a few indexes on the main EAV fact table can cover all queries nicely.
Context
StackExchange Database Administrators Q#3492, answer score: 7
Revisions (0)
No revisions yet.