patternsqlMinor
Help optimizing this query with multiple where exists
Viewed 0 times
thiswithquerywherehelpoptimizingmultipleexists
Problem
In the application I'm building, the user is able to define 'types' where each 'type' has a set of 'attributes'.
The user is able to create instances of products by defining a value for each attribute the product's type has.
A pic of the schema:
I'm creating the query where the user specifies the attributes values and the product type and with that I should return all the product id's that meets the query.
The problem I see in my query is that I'm performing a whole
Is there a way to optimize this?
If I create an index in the column attributes_products.product_id would this query be actually optimal?
Example of a query where I'm looking for a product whose type has 3 attributes:
Many thanks.
Conclusions
So, Gareth Rees (selected answer) proposed another solution which involves multiple Joins.
Here is the explanation of its query (done by PGAdmin):
This is the explanation of the original query:
I believe that the selected answer is slightly faster, but consumes a lot more memory (because of the triple join).
I believe that my original query is slightly slower (very slightly, since there's an index on the attributes_products table) but a lot more efficient in memory.
The user is able to create instances of products by defining a value for each attribute the product's type has.
A pic of the schema:
I'm creating the query where the user specifies the attributes values and the product type and with that I should return all the product id's that meets the query.
The problem I see in my query is that I'm performing a whole
select * from attributes_products ... for each attribute that the product's type has.Is there a way to optimize this?
If I create an index in the column attributes_products.product_id would this query be actually optimal?
Example of a query where I'm looking for a product whose type has 3 attributes:
select p.id
from Products as p
where
exists(
select * from attributes_products
where
product_id = p.id AND
attribute_id = 27 AND
value = 'some_value'
) AND
exists(
select * from attributes_products
where
product_id = p.id AND
attribute_id = 28 AND
value = 'other_value'
) AND
exists(
select * from attributes_products
where
product_id = p.id AND
attribute_id = 29 AND
value = 'oother_value'
)Many thanks.
Conclusions
So, Gareth Rees (selected answer) proposed another solution which involves multiple Joins.
Here is the explanation of its query (done by PGAdmin):
This is the explanation of the original query:
I believe that the selected answer is slightly faster, but consumes a lot more memory (because of the triple join).
I believe that my original query is slightly slower (very slightly, since there's an index on the attributes_products table) but a lot more efficient in memory.
Solution
SQL allows you to join the same table multiple times, so what you need here is:
Here's the toy MySQL database that I'm using to answer this question:
With my query above, MySQL reports the following query plan:
See the MySQL documentation for an explanation of the
This looks better than the plan for the OP's query:
But results will vary from one database to another: a good query planner might be able to make something efficient out of the OP's query.
SELECT p.id FROM products AS p
JOIN attributes_products AS ap1
ON ap1.product_id = p.id AND ap1.attribute_id = 27 AND ap1.value = '...'
JOIN attributes_products AS ap2
ON ap2.product_id = p.id AND ap2.attribute_id = 28 AND ap2.value = '...'
JOIN attributes_products AS ap3
ON ap3.product_id = p.id AND ap3.attribute_id = 29 AND ap3.value = '...'Here's the toy MySQL database that I'm using to answer this question:
CREATE TABLE products (
id INTEGER PRIMARY KEY AUTO_INCREMENT
);
CREATE TABLE attributes_products (
product_id INTEGER NOT NULL,
attribute_id INTEGER NOT NULL,
value CHAR(40)
);
CREATE INDEX ap_product ON attributes_products (product_id);
CREATE INDEX ap_attribute ON attributes_products (attribute_id);
INSERT INTO products VALUES (1);
INSERT INTO products VALUES (2);
INSERT INTO attributes_products VALUES (1, 27, 'a');
INSERT INTO attributes_products VALUES (1, 28, 'b');
INSERT INTO attributes_products VALUES (1, 29, 'c');With my query above, MySQL reports the following query plan:
+----+-------------+-------+--------+-------------------------+--------------+---------+---------------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------+--------------+---------+---------------------+------+--------------------------+
| 1 | SIMPLE | ap1 | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
| 1 | SIMPLE | ap2 | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
| 1 | SIMPLE | ap3 | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | temp.ap3.product_id | 1 | Using where; Using index |
+----+-------------+-------+--------+-------------------------+--------------+---------+---------------------+------+--------------------------+See the MySQL documentation for an explanation of the
EXPLAIN output.This looks better than the plan for the OP's query:
+----+--------------------+---------------------+-------+-------------------------+--------------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------------------+-------+-------------------------+--------------+---------+-------+------+--------------------------+
| 1 | PRIMARY | p | index | NULL | PRIMARY | 4 | NULL | 2 | Using where; Using index |
| 4 | DEPENDENT SUBQUERY | attributes_products | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
| 3 | DEPENDENT SUBQUERY | attributes_products | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | attributes_products | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
+----+--------------------+---------------------+-------+-------------------------+--------------+---------+-------+------+--------------------------+But results will vary from one database to another: a good query planner might be able to make something efficient out of the OP's query.
Code Snippets
SELECT p.id FROM products AS p
JOIN attributes_products AS ap1
ON ap1.product_id = p.id AND ap1.attribute_id = 27 AND ap1.value = '...'
JOIN attributes_products AS ap2
ON ap2.product_id = p.id AND ap2.attribute_id = 28 AND ap2.value = '...'
JOIN attributes_products AS ap3
ON ap3.product_id = p.id AND ap3.attribute_id = 29 AND ap3.value = '...'CREATE TABLE products (
id INTEGER PRIMARY KEY AUTO_INCREMENT
);
CREATE TABLE attributes_products (
product_id INTEGER NOT NULL,
attribute_id INTEGER NOT NULL,
value CHAR(40)
);
CREATE INDEX ap_product ON attributes_products (product_id);
CREATE INDEX ap_attribute ON attributes_products (attribute_id);
INSERT INTO products VALUES (1);
INSERT INTO products VALUES (2);
INSERT INTO attributes_products VALUES (1, 27, 'a');
INSERT INTO attributes_products VALUES (1, 28, 'b');
INSERT INTO attributes_products VALUES (1, 29, 'c');+----+-------------+-------+--------+-------------------------+--------------+---------+---------------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------+--------------+---------+---------------------+------+--------------------------+
| 1 | SIMPLE | ap1 | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
| 1 | SIMPLE | ap2 | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
| 1 | SIMPLE | ap3 | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | temp.ap3.product_id | 1 | Using where; Using index |
+----+-------------+-------+--------+-------------------------+--------------+---------+---------------------+------+--------------------------++----+--------------------+---------------------+-------+-------------------------+--------------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------------------+-------+-------------------------+--------------+---------+-------+------+--------------------------+
| 1 | PRIMARY | p | index | NULL | PRIMARY | 4 | NULL | 2 | Using where; Using index |
| 4 | DEPENDENT SUBQUERY | attributes_products | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
| 3 | DEPENDENT SUBQUERY | attributes_products | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | attributes_products | ref | ap_product,ap_attribute | ap_attribute | 4 | const | 1 | Using where |
+----+--------------------+---------------------+-------+-------------------------+--------------+---------+-------+------+--------------------------+Context
StackExchange Code Review Q#33772, answer score: 5
Revisions (0)
No revisions yet.