patternsqlMinor
MySQL order by is equal, in what order are the results returned?
Viewed 0 times
returnedorderthewhatequalaremysqlresults
Problem
I use
I have rows that are ordered by an order column.
Turns out, many times, the order number is the same, say 1. But the results are returned in a fixed order.
Will return results ordered by what if fooOrder is always 1?
There is obviously an order, but I don't know which one.
mysql Ver 14.14 Distrib 5.5.44, for debian-linux-gnu (x86_64) using readline 6.3I have rows that are ordered by an order column.
Turns out, many times, the order number is the same, say 1. But the results are returned in a fixed order.
SELECT * from mytable ORDER BY fooOrder;Will return results ordered by what if fooOrder is always 1?
There is obviously an order, but I don't know which one.
Solution
Consider running the
Example:
This output will display information about how the query optimiser went about retrieving the data.
Example output:
As you can see in my example the
With this information you can then proceed to consult the MySQL documentation for each condition.
ORDER BY Optimization
Depending on what your
I'll proceed with the
The Original filesort Algorithm
The original filesort algorithm works as follows:
-
Read all rows according to key or by table scanning. Skip rows that do not match the WHERE clause.
-
For each row, store in the sort buffer a tuple consisting of a pair of values (the sort key value and the row ID).
-
If all pairs fit into the sort buffer, no temporary file is created. Otherwise, when the sort buffer becomes full, run a quicksort on it in memory and write it to a temporary file. Save a pointer to the sorted block.
-
Repeat the preceding steps until all rows have been read.
-
Do a multi-merge of up to MERGEBUFF (7) regions to one block in another temporary file. Repeat until all blocks from the first file are in the second file.
-
Repeat the following until there are fewer than MERGEBUFF2 (15) blocks left.
-
On the last multi-merge, only the row ID (the last part of the value pair) is written to a result file.
-
Read the rows in sorted order using the row IDs in the result file. To optimize this, read in a large block of row IDs, sort them, and use them to read the rows in sorted order into a row buffer. The row buffer size is the read_rnd_buffer_size system variable value. The code for this step is in the sql/records.cc source file.
One problem with this approach is that it reads rows twice: Once during WHERE clause evaluation, and again after sorting the value pairs. And even if the rows were accessed successively the first time (for example, if a table scan is done), the second time they are accessed randomly. (The sort keys are ordered, but the row positions are not.)
The Modified filesort Algorithm
The modified filesort algorithm incorporates an optimization to avoid reading the rows twice: It records the sort key value, but instead of the row ID, it records the columns referenced by the query. The modified filesort algorithm works like this:
-
Read the rows that match the WHERE clause.
-
For each row, store in the sort buffer a tuple consisting of the sort key value and the columns referenced by the query.
-
When the sort buffer becomes full, sort the tuples by sort key value in memory and write it to a temporary file.
-
After merge-sorting the temporary file, retrieve the rows in sorted order, but read the columns required by the query directly from the sorted tuples rather than by accessing the table a second time.
The tuples used by the modified filesort algorithm are longer than the pairs used by the original algorithm, and fewer of them fit in the sort buffer. As a result, it is possible for the extra I/O to make the modified approach slower, not faster. To avoid a slowdown, the optimizer uses the modified algorithm only if the total size of the extra columns in the sort tuple does not exceed the value of the max_length_for_sort_data system variable. (A symptom of setting the value of this variable too high is a combination of high disk activity and low CPU activity.)
Solution
Run your query with
If you have an index on the table it might retrieve the data according to the index. If you don't have an index, then it might fall back to the
Reference
EXPLAIN or EXPLAIN EXTENDED command in front of your SELECT statement. This will produce an explanation of how the query optimiser queries the database to retrieve the data you are asking it for.Example:
EXPLAIN EXTENDED SELECT * from mytable ORDER BY fooOrder;This output will display information about how the query optimiser went about retrieving the data.
Example output:
+----+-------------+-------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| 1 | SIMPLE | mytable | ALL | NULL | NULL | NULL | NULL | 1141 | 100.00 | Using filesort |
+----+-------------+-------------+------+---------------+------+---------+------+------+----------+-----------------------------+
1 row in set, 1 warning (0.00 sec)As you can see in my example the
EXTRA column displays additional information. Here a Using filesort. With this information you can then proceed to consult the MySQL documentation for each condition.
ORDER BY Optimization
Depending on what your
EXPLAIN EXTENDED results in, you can consult the official ORDER BY Optimization page to retrieve additional information. I'll proceed with the
Using filesort condition from my example which is explained as follows:The Original filesort Algorithm
The original filesort algorithm works as follows:
-
Read all rows according to key or by table scanning. Skip rows that do not match the WHERE clause.
-
For each row, store in the sort buffer a tuple consisting of a pair of values (the sort key value and the row ID).
-
If all pairs fit into the sort buffer, no temporary file is created. Otherwise, when the sort buffer becomes full, run a quicksort on it in memory and write it to a temporary file. Save a pointer to the sorted block.
-
Repeat the preceding steps until all rows have been read.
-
Do a multi-merge of up to MERGEBUFF (7) regions to one block in another temporary file. Repeat until all blocks from the first file are in the second file.
-
Repeat the following until there are fewer than MERGEBUFF2 (15) blocks left.
-
On the last multi-merge, only the row ID (the last part of the value pair) is written to a result file.
-
Read the rows in sorted order using the row IDs in the result file. To optimize this, read in a large block of row IDs, sort them, and use them to read the rows in sorted order into a row buffer. The row buffer size is the read_rnd_buffer_size system variable value. The code for this step is in the sql/records.cc source file.
One problem with this approach is that it reads rows twice: Once during WHERE clause evaluation, and again after sorting the value pairs. And even if the rows were accessed successively the first time (for example, if a table scan is done), the second time they are accessed randomly. (The sort keys are ordered, but the row positions are not.)
The Modified filesort Algorithm
The modified filesort algorithm incorporates an optimization to avoid reading the rows twice: It records the sort key value, but instead of the row ID, it records the columns referenced by the query. The modified filesort algorithm works like this:
-
Read the rows that match the WHERE clause.
-
For each row, store in the sort buffer a tuple consisting of the sort key value and the columns referenced by the query.
-
When the sort buffer becomes full, sort the tuples by sort key value in memory and write it to a temporary file.
-
After merge-sorting the temporary file, retrieve the rows in sorted order, but read the columns required by the query directly from the sorted tuples rather than by accessing the table a second time.
The tuples used by the modified filesort algorithm are longer than the pairs used by the original algorithm, and fewer of them fit in the sort buffer. As a result, it is possible for the extra I/O to make the modified approach slower, not faster. To avoid a slowdown, the optimizer uses the modified algorithm only if the total size of the extra columns in the sort tuple does not exceed the value of the max_length_for_sort_data system variable. (A symptom of setting the value of this variable too high is a combination of high disk activity and low CPU activity.)
Solution
Run your query with
EXPLAIN EXTENDED and see what the query optimiser tells you. Then proceed from there.If you have an index on the table it might retrieve the data according to the index. If you don't have an index, then it might fall back to the
FILESORT algorithm. The conditions dictate the ORDER BY Optimization.Reference
- 8.8.2 EXPLAIN Output Format (MySQ
Code Snippets
EXPLAIN EXTENDED SELECT * from mytable ORDER BY fooOrder;+----+-------------+-------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| 1 | SIMPLE | mytable | ALL | NULL | NULL | NULL | NULL | 1141 | 100.00 | Using filesort |
+----+-------------+-------------+------+---------------+------+---------+------+------+----------+-----------------------------+
1 row in set, 1 warning (0.00 sec)Context
StackExchange Database Administrators Q#120961, answer score: 3
Revisions (0)
No revisions yet.