HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

ORDER BY indexedColumn ridiculously slow when used with LIMIT on MySQL

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
orderwithusedridiculouslyindexedcolumnlimitslowmysqlwhen

Problem

The MyISAM table has 4.5m rows. Query with just LIMIT returns fast; query with just ORDER BY returns fast; combining both it never finishes. I've also tried USE INDEX(indexedColumn, Latitude, Longitude), but it doens't help.

SELECT * FROM stuff WHERE (Latitude BETWEEN '29.187190582784076' AND '29.761053992852936' AND Longitude BETWEEN '-101.0597705588786' AND '-99.7414111838786') LIMIT 100;


89 rows; returns in 0.71 seconds

SELECT * FROM stuff WHERE (Latitude BETWEEN '29.187190582784076' AND '29.761053992852936' AND Longitude BETWEEN '-101.0597705588786' AND '-99.7414111838786') ORDER BY indexedColumn DESC;


89 rows, returns in 0.84 seconds

SELECT * FROM stuff WHERE (Latitude BETWEEN '29.187190582784076' AND '29.761053992852936' AND Longitude BETWEEN '-101.0597705588786' AND '-99.7414111838786') ORDER BY indexedColumn DESC LIMIT 100;


hasnt returned after several minutes

EXPLAIN ORDER BY with LIMIT

+----+-------------+-------+-------+--------------------------+--------------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys            | key          | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+--------------------------+--------------+---------+------+------+-------------+
|  1 | SIMPLE      | Stuff | index | Latitude,Longitude       | indexedColumn| 5       | NULL | 9643 | Using where |
+----+-------------+-------+-------+--------------------------+--------------+---------+------+------+-------------+


LIMIT ONLY

```
+----+-------------+-------+-------+--------------------+----------+---------+------+-------+-----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+--------------------+----------+---------+------+-------+-----------------------------------------------+
| 1 | SIMPLE | s

Solution

From the EXPLAINs it seems quite clear what happens - the "limit only" one uses index on latitude because it finds it most useful, then tries all found rows for longitude until it gathers 100 of them and quits.

The "order only" uses the same path, but does not stop after 100 because it needs all of them - it then sorts all matching rows by a filesort (I suppose only a small part of those 81k are returned so the sort is fast).

But the slow one (wrongly) uses "order by .. limit" optimization - for some reason it thinks that lot of rows will satisfy the condition so goes by the indexedColumn order. But in reality there are too few matching ones so it walks a big part of the table before finding the 100 matching rows. That means that a big portion of the table needs to be read to memory from disk in the case the available memory is not big enough to contain it (or it was not loaded before).

The results - you actually do NOT want to use the index on indexedColumn as there are so few matching rows that sorting them "the slow way" is actually more effective than using the index. You can do that by ignore index(indexedColumn) and/or you can probably add a composite index over both (Latitude, Longitude). That should return only few hundreds of rows and any operation on them will become trivial.

Context

StackExchange Database Administrators Q#118907, answer score: 3

Revisions (0)

No revisions yet.