patternsqlMinor
Is there a way to optimize sorting by columns of joined tables?
Viewed 0 times
sortingtablescolumnsjoinedwayoptimizethere
Problem
This is my slow query:
Average query time is 4.5s on my dataset and this is unacceptable.
Solutions i see:
Add all columns from order clause to
Is there other solution?
Explain:
```
+----+-------------+-----------------+--------+
SELECT `products_counts`.`cid`
FROM
`products_counts` `products_counts`
LEFT OUTER JOIN `products` `products` ON (
`products_counts`.`product_id` = `products`.`id`
)
LEFT OUTER JOIN `trademarks` `trademark` ON (
`products`.`trademark_id` = `trademark`.`id`
)
LEFT OUTER JOIN `suppliers` `supplier` ON (
`products_counts`.`supplier_id` = `supplier`.`id`
)
WHERE
`products_counts`.product_id IN
(159, 572, 1075, 1102, 1145, 1162, 1660, 2355, 2356, 2357, 3236, 6471, 6472, 6473, 8779, 9043, 9095, 9336, 9337, 9338, 9445, 10198, 10966, 10967, 10974, 11124, 11168, 16387, 16689, 16827, 17689, 17920, 17938, 17946, 17957, 21341, 21352, 21420, 21421, 21429, 21544, 27944, 27988, 30194, 30196, 30230, 30278, 30699, 31306, 31340, 32625, 34021, 34047, 38043, 43743, 48639, 48720, 52453, 55667, 56847, 57478, 58034, 61477, 62301, 65983, 66013, 66181, 66197, 66204, 66407, 66844, 66879, 67308, 68637, 73944, 74037, 74060, 77502, 90963, 101630, 101900, 101977, 101985, 101987, 105906, 108112, 123839, 126316, 135156, 135184, 138903, 142755, 143046, 143193, 143247, 144054, 150164, 150406, 154001, 154546, 157998, 159896, 161695, 163367, 170173, 172257, 172732, 173581, 174001, 175126, 181900, 182168, 182342, 182858, 182976, 183706, 183902, 183936, 184939, 185744, 287831, 362832, 363923, 7083107, 7173092, 7342593, 7342594, 7342595, 7728766)
ORDER BY
products_counts.inflow ASC,
supplier.delivery_period ASC,
trademark.sort DESC,
trademark.name ASC
LIMIT
0, 3;Average query time is 4.5s on my dataset and this is unacceptable.
Solutions i see:
Add all columns from order clause to
products_counts table. But i have ~10 order types in application, so i should create a lot of columns and indexes. Plus products_counts have very intensively updates/inserts/deletes, so i need to perform immediately update all order-related columns (using triggers?).Is there other solution?
Explain:
```
+----+-------------+-----------------+--------+
Solution
Reviewing your table definitions shows that you have indexes matching across the tables involved. This should cause the joins to happen as quickly as possible within the limits of
However, sorting from multiple tables is more complex.
In 2007 Sergey Petrunia described the 3
http://s.petrunia.net/blog/?m=201407
From the table definitions and joins shown above, you can see that you will never get the fastest sort. That means that you will be dependent on
However, if you design and use a Materialized View you will be able to use the fastest sort algorithm.
To see the details defined for
http://dev.mysql.com/doc/refman/5.5/en/order-by-optimization.html
For
• Increase the
• Increase the
• Use less RAM per row by declaring columns only as large as needed for the actual values to be stored. [E.g. Reduce a varchar(256) to varchar(ActualLongestString)]
• Change the
There is more detail provided in the
increase
http://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
Materialized Views - A Different Approach to Sorting Joined Tables
You alluded to Materialized Views with your question referring to using triggers. MySQL has no built in functionality to create a Materialized View but you do have the tools needed. By using triggers to spread the load you can maintain the Materialized View up to the moment.
The Materialized View is actually a table which is populated through procedural code to build or rebuild the Materialized View and maintained by triggers to keep the data up-to-date.
Since you are building a table which will have an index, then the Materialized View when queried can use the fastest sort method: Use index-based access method that produces ordered output
Since
But that is obviously too heavy a process to run after each update to the base tables where you manage the data. That is where the triggers come into play to keep the data up-to-date as changes are made. This way each
The FROMDUAL organization at http://www.fromdual.com/ has sample code for maintaining a Materialized View. So, rather than write my own samples I will point you to their samples:
http://www.fromdual.com/mysql-materialized-views
Example 1: Building a Materialized View
This gives you the Materialized View at the moment of the refresh. However, since you have a fast moving database, you also want to keep this view as up-to-date as possible.
Therefore the base data tables affected need to have triggers to propagate the changes from a base table to the Materialized View table. As one example:
Example 2: Inserting New Data Into a Materialized View
```
DELIMITER $$
CREATE TRIGGER sales_ins
AFTER INSERT ON sales
FOR EACH ROW
BEGIN
SET @old_price_sum = 0;
SET @old_amount_sum = 0;
SET @old_price_avg = 0;
SET @old_amount_avg = 0;
SET @old_sales_cnt = 0;
SELECT IFNULL(price_sum, 0), IFNULL(amount_sum, 0), IFNULL(price_avg, 0)
, IFNULL(amount_avg, 0), IFNULL(sales_cnt, 0)
FROM sales_mv
WHERE product_name = NEW.product_name
INTO @old_price_sum, @old_amount_sum, @old_price_avg
, @old_amount_avg, @old_sales_cnt
;
SET @new_price_sum = @old_price_sum + NEW.product_price;
SET @new_amount_sum = @old_amount_sum + NEW.product_amount;
SET @new_sales_cnt = @old_sales_cnt + 1;
SET @new_price_avg = @new_price_sum / @new_sales_cnt;
SET @new_amount_avg = @new
MySQL's join logic. However, sorting from multiple tables is more complex.
In 2007 Sergey Petrunia described the 3
MySQL sorting algorithms in order of speed for MySQL at: http://s.petrunia.net/blog/?m=201407
- Use index-based access method that produces ordered output
- Use
filesort()on 1st non-constant table
- Put join result into a temporary table and use
filesort()on it
From the table definitions and joins shown above, you can see that you will never get the fastest sort. That means that you will be dependent on
filesort() for the sort criteria you are using. However, if you design and use a Materialized View you will be able to use the fastest sort algorithm.
To see the details defined for
MySQL 5.5 sorting methods see:http://dev.mysql.com/doc/refman/5.5/en/order-by-optimization.html
For
MySQL 5.5 (in this example) to increase ORDER BY speed if you cannot get MySQL to use indexes rather than an extra sorting phase, try the following strategies: • Increase the
sort_buffer_size variable value. • Increase the
read_rnd_buffer_size variable value. • Use less RAM per row by declaring columns only as large as needed for the actual values to be stored. [E.g. Reduce a varchar(256) to varchar(ActualLongestString)]
• Change the
tmpdir system variable to point to a dedicated file system with large amounts of free space. (Other details are offered in the link above.)There is more detail provided in the
MySQL 5.7 documentation toincrease
ORDER speed, some of which may be slightly upgraded behaviors:http://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
Materialized Views - A Different Approach to Sorting Joined Tables
You alluded to Materialized Views with your question referring to using triggers. MySQL has no built in functionality to create a Materialized View but you do have the tools needed. By using triggers to spread the load you can maintain the Materialized View up to the moment.
The Materialized View is actually a table which is populated through procedural code to build or rebuild the Materialized View and maintained by triggers to keep the data up-to-date.
Since you are building a table which will have an index, then the Materialized View when queried can use the fastest sort method: Use index-based access method that produces ordered output
Since
MySQL 5.5 uses triggers to maintain a Materialized View, you will also need a process, script, or stored procedure to build the initial Materialized View. But that is obviously too heavy a process to run after each update to the base tables where you manage the data. That is where the triggers come into play to keep the data up-to-date as changes are made. This way each
insert, update, and delete will propagate their changes, using your triggers, to the Materialized View.The FROMDUAL organization at http://www.fromdual.com/ has sample code for maintaining a Materialized View. So, rather than write my own samples I will point you to their samples:
http://www.fromdual.com/mysql-materialized-views
Example 1: Building a Materialized View
DROP TABLE sales_mv;
CREATE TABLE sales_mv (
product_name VARCHAR(128) NOT NULL
, price_sum DECIMAL(10,2) NOT NULL
, amount_sum INT NOT NULL
, price_avg FLOAT NOT NULL
, amount_avg FLOAT NOT NULL
, sales_cnt INT NOT NULL
, UNIQUE INDEX product (product_name)
);
INSERT INTO sales_mv
SELECT product_name
, SUM(product_price), SUM(product_amount)
, AVG(product_price), AVG(product_amount)
, COUNT(*)
FROM sales
GROUP BY product_name;This gives you the Materialized View at the moment of the refresh. However, since you have a fast moving database, you also want to keep this view as up-to-date as possible.
Therefore the base data tables affected need to have triggers to propagate the changes from a base table to the Materialized View table. As one example:
Example 2: Inserting New Data Into a Materialized View
```
DELIMITER $$
CREATE TRIGGER sales_ins
AFTER INSERT ON sales
FOR EACH ROW
BEGIN
SET @old_price_sum = 0;
SET @old_amount_sum = 0;
SET @old_price_avg = 0;
SET @old_amount_avg = 0;
SET @old_sales_cnt = 0;
SELECT IFNULL(price_sum, 0), IFNULL(amount_sum, 0), IFNULL(price_avg, 0)
, IFNULL(amount_avg, 0), IFNULL(sales_cnt, 0)
FROM sales_mv
WHERE product_name = NEW.product_name
INTO @old_price_sum, @old_amount_sum, @old_price_avg
, @old_amount_avg, @old_sales_cnt
;
SET @new_price_sum = @old_price_sum + NEW.product_price;
SET @new_amount_sum = @old_amount_sum + NEW.product_amount;
SET @new_sales_cnt = @old_sales_cnt + 1;
SET @new_price_avg = @new_price_sum / @new_sales_cnt;
SET @new_amount_avg = @new
Code Snippets
DROP TABLE sales_mv;
CREATE TABLE sales_mv (
product_name VARCHAR(128) NOT NULL
, price_sum DECIMAL(10,2) NOT NULL
, amount_sum INT NOT NULL
, price_avg FLOAT NOT NULL
, amount_avg FLOAT NOT NULL
, sales_cnt INT NOT NULL
, UNIQUE INDEX product (product_name)
);
INSERT INTO sales_mv
SELECT product_name
, SUM(product_price), SUM(product_amount)
, AVG(product_price), AVG(product_amount)
, COUNT(*)
FROM sales
GROUP BY product_name;DELIMITER $$
CREATE TRIGGER sales_ins
AFTER INSERT ON sales
FOR EACH ROW
BEGIN
SET @old_price_sum = 0;
SET @old_amount_sum = 0;
SET @old_price_avg = 0;
SET @old_amount_avg = 0;
SET @old_sales_cnt = 0;
SELECT IFNULL(price_sum, 0), IFNULL(amount_sum, 0), IFNULL(price_avg, 0)
, IFNULL(amount_avg, 0), IFNULL(sales_cnt, 0)
FROM sales_mv
WHERE product_name = NEW.product_name
INTO @old_price_sum, @old_amount_sum, @old_price_avg
, @old_amount_avg, @old_sales_cnt
;
SET @new_price_sum = @old_price_sum + NEW.product_price;
SET @new_amount_sum = @old_amount_sum + NEW.product_amount;
SET @new_sales_cnt = @old_sales_cnt + 1;
SET @new_price_avg = @new_price_sum / @new_sales_cnt;
SET @new_amount_avg = @new_amount_sum / @new_sales_cnt;
REPLACE INTO sales_mv
VALUES(NEW.product_name, @new_price_sum, @new_amount_sum, @new_price_avg
, @new_amount_avg, @new_sales_cnt)
;
END;
$$
DELIMITER ;Context
StackExchange Database Administrators Q#116494, answer score: 6
Revisions (0)
No revisions yet.