HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Is there a way to optimize sorting by columns of joined tables?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
sortingtablescolumnsjoinedwayoptimizethere

Problem

This is my slow query:

SELECT `products_counts`.`cid`
FROM
  `products_counts` `products_counts`

  LEFT OUTER JOIN `products` `products` ON (
  `products_counts`.`product_id` = `products`.`id`
  )
  LEFT OUTER JOIN `trademarks` `trademark` ON (
  `products`.`trademark_id` = `trademark`.`id`
  )
  LEFT OUTER JOIN `suppliers` `supplier` ON (
  `products_counts`.`supplier_id` = `supplier`.`id`
  )
WHERE
  `products_counts`.product_id IN
  (159, 572, 1075, 1102, 1145, 1162, 1660, 2355, 2356, 2357, 3236, 6471, 6472, 6473, 8779, 9043, 9095, 9336, 9337, 9338, 9445, 10198, 10966, 10967, 10974, 11124, 11168, 16387, 16689, 16827, 17689, 17920, 17938, 17946, 17957, 21341, 21352, 21420, 21421, 21429, 21544, 27944, 27988, 30194, 30196, 30230, 30278, 30699, 31306, 31340, 32625, 34021, 34047, 38043, 43743, 48639, 48720, 52453, 55667, 56847, 57478, 58034, 61477, 62301, 65983, 66013, 66181, 66197, 66204, 66407, 66844, 66879, 67308, 68637, 73944, 74037, 74060, 77502, 90963, 101630, 101900, 101977, 101985, 101987, 105906, 108112, 123839, 126316, 135156, 135184, 138903, 142755, 143046, 143193, 143247, 144054, 150164, 150406, 154001, 154546, 157998, 159896, 161695, 163367, 170173, 172257, 172732, 173581, 174001, 175126, 181900, 182168, 182342, 182858, 182976, 183706, 183902, 183936, 184939, 185744, 287831, 362832, 363923, 7083107, 7173092, 7342593, 7342594, 7342595, 7728766)
ORDER BY
  products_counts.inflow ASC,
  supplier.delivery_period ASC,
  trademark.sort DESC,
  trademark.name ASC
LIMIT
  0, 3;


Average query time is 4.5s on my dataset and this is unacceptable.

Solutions i see:

Add all columns from order clause to products_counts table. But i have ~10 order types in application, so i should create a lot of columns and indexes. Plus products_counts have very intensively updates/inserts/deletes, so i need to perform immediately update all order-related columns (using triggers?).

Is there other solution?

Explain:

```
+----+-------------+-----------------+--------+

Solution

Reviewing your table definitions shows that you have indexes matching across the tables involved. This should cause the joins to happen as quickly as possible within the limits of MySQL's join logic.

However, sorting from multiple tables is more complex.

In 2007 Sergey Petrunia described the 3 MySQL sorting algorithms in order of speed for MySQL at:
http://s.petrunia.net/blog/?m=201407

  • Use index-based access method that produces ordered output



  • Use filesort() on 1st non-constant table



  • Put join result into a temporary table and use filesort() on it



From the table definitions and joins shown above, you can see that you will never get the fastest sort. That means that you will be dependent on filesort() for the sort criteria you are using.

However, if you design and use a Materialized View you will be able to use the fastest sort algorithm.


To see the details defined for MySQL 5.5 sorting methods see:
http://dev.mysql.com/doc/refman/5.5/en/order-by-optimization.html

For MySQL 5.5 (in this example) to increase ORDER BY speed if you cannot get MySQL to use indexes rather than an extra sorting phase, try the following strategies:

• Increase the sort_buffer_size variable value.

• Increase the read_rnd_buffer_size variable value.

• Use less RAM per row by declaring columns only as large as needed for the actual values to be stored. [E.g. Reduce a varchar(256) to varchar(ActualLongestString)]

• Change the tmpdir system variable to point to a dedicated file system with large amounts of free space. (Other details are offered in the link above.)


There is more detail provided in the MySQL 5.7 documentation to
increase ORDER speed, some of which may be slightly upgraded behaviors:


http://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html

Materialized Views - A Different Approach to Sorting Joined Tables

You alluded to Materialized Views with your question referring to using triggers. MySQL has no built in functionality to create a Materialized View but you do have the tools needed. By using triggers to spread the load you can maintain the Materialized View up to the moment.

The Materialized View is actually a table which is populated through procedural code to build or rebuild the Materialized View and maintained by triggers to keep the data up-to-date.

Since you are building a table which will have an index, then the Materialized View when queried can use the fastest sort method: Use index-based access method that produces ordered output

Since MySQL 5.5 uses triggers to maintain a Materialized View, you will also need a process, script, or stored procedure to build the initial Materialized View.

But that is obviously too heavy a process to run after each update to the base tables where you manage the data. That is where the triggers come into play to keep the data up-to-date as changes are made. This way each insert, update, and delete will propagate their changes, using your triggers, to the Materialized View.

The FROMDUAL organization at http://www.fromdual.com/ has sample code for maintaining a Materialized View. So, rather than write my own samples I will point you to their samples:

http://www.fromdual.com/mysql-materialized-views

Example 1: Building a Materialized View

DROP TABLE sales_mv;
CREATE TABLE sales_mv (
    product_name VARCHAR(128)  NOT NULL
  , price_sum    DECIMAL(10,2) NOT NULL
  , amount_sum   INT           NOT NULL
  , price_avg    FLOAT         NOT NULL
  , amount_avg   FLOAT         NOT NULL
  , sales_cnt    INT           NOT NULL
  , UNIQUE INDEX product (product_name)
);

INSERT INTO sales_mv
SELECT product_name
    , SUM(product_price), SUM(product_amount)
    , AVG(product_price), AVG(product_amount)
    , COUNT(*)
  FROM sales
GROUP BY product_name;


This gives you the Materialized View at the moment of the refresh. However, since you have a fast moving database, you also want to keep this view as up-to-date as possible.

Therefore the base data tables affected need to have triggers to propagate the changes from a base table to the Materialized View table. As one example:

Example 2: Inserting New Data Into a Materialized View

```
DELIMITER $$

CREATE TRIGGER sales_ins
AFTER INSERT ON sales
FOR EACH ROW
BEGIN

SET @old_price_sum = 0;
SET @old_amount_sum = 0;
SET @old_price_avg = 0;
SET @old_amount_avg = 0;
SET @old_sales_cnt = 0;

SELECT IFNULL(price_sum, 0), IFNULL(amount_sum, 0), IFNULL(price_avg, 0)
, IFNULL(amount_avg, 0), IFNULL(sales_cnt, 0)
FROM sales_mv
WHERE product_name = NEW.product_name
INTO @old_price_sum, @old_amount_sum, @old_price_avg
, @old_amount_avg, @old_sales_cnt
;

SET @new_price_sum = @old_price_sum + NEW.product_price;
SET @new_amount_sum = @old_amount_sum + NEW.product_amount;
SET @new_sales_cnt = @old_sales_cnt + 1;
SET @new_price_avg = @new_price_sum / @new_sales_cnt;
SET @new_amount_avg = @new

Code Snippets

DROP TABLE sales_mv;
CREATE TABLE sales_mv (
    product_name VARCHAR(128)  NOT NULL
  , price_sum    DECIMAL(10,2) NOT NULL
  , amount_sum   INT           NOT NULL
  , price_avg    FLOAT         NOT NULL
  , amount_avg   FLOAT         NOT NULL
  , sales_cnt    INT           NOT NULL
  , UNIQUE INDEX product (product_name)
);

INSERT INTO sales_mv
SELECT product_name
    , SUM(product_price), SUM(product_amount)
    , AVG(product_price), AVG(product_amount)
    , COUNT(*)
  FROM sales
GROUP BY product_name;
DELIMITER $$

CREATE TRIGGER sales_ins
AFTER INSERT ON sales
FOR EACH ROW
BEGIN

  SET @old_price_sum = 0;
  SET @old_amount_sum = 0;
  SET @old_price_avg = 0;
  SET @old_amount_avg = 0;
  SET @old_sales_cnt = 0;

  SELECT IFNULL(price_sum, 0), IFNULL(amount_sum, 0), IFNULL(price_avg, 0)
       , IFNULL(amount_avg, 0), IFNULL(sales_cnt, 0)
    FROM sales_mv
   WHERE product_name = NEW.product_name
    INTO @old_price_sum, @old_amount_sum, @old_price_avg
       , @old_amount_avg, @old_sales_cnt
  ;

  SET @new_price_sum = @old_price_sum + NEW.product_price;
  SET @new_amount_sum = @old_amount_sum + NEW.product_amount;
  SET @new_sales_cnt = @old_sales_cnt + 1;
  SET @new_price_avg = @new_price_sum / @new_sales_cnt;
  SET @new_amount_avg = @new_amount_sum / @new_sales_cnt;

  REPLACE INTO sales_mv
  VALUES(NEW.product_name, @new_price_sum, @new_amount_sum, @new_price_avg
       , @new_amount_avg, @new_sales_cnt)
  ;

END;
$$
DELIMITER ;

Context

StackExchange Database Administrators Q#116494, answer score: 6

Revisions (0)

No revisions yet.