HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

MySQL query WHERE IN + ORDER BY, which EXPLAIN is better and how to avoid filesort?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
orderquerywhereandbetteravoidmysqlfilesorthowwhich

Problem

This is my table definition, without any special index for now:

CREATE TABLE `filter` (
  `field_name` varchar(32) COLLATE utf8_unicode_ci NOT NULL,
  `category_id` varchar(32) COLLATE utf8_unicode_ci NOT NULL,
  `position` smallint(6) NOT NULL,
  `options` longtext COLLATE utf8_unicode_ci COMMENT '(DC2Type:json_array)',
  PRIMARY KEY (`field_name`,`category_id`),
  KEY `IDX_702C956612469DE2` (`category_id`),
  CONSTRAINT `FK_702C956612469DE2` 
      FOREIGN KEY (`category_id`) REFERENCES `category` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;


Sample data:

+------------+-----------------------+-------------+----------+
| field_name |        options        | category_id | position |
+------------+-----------------------+-------------+----------+
| color      | {"label": "Color"}    |        1895 |        1 |
| material   | {"label": "Material"} |        1895 |        2 |
| color      | {"label": "Color"}    |        1896 |        1 |
| color      | {"label": "Color"}    |        1897 |        1 |
+------------+-----------------------+-------------+----------+


Most of the time I do a query with WHERE IN and ORDER BY position and selecting only field_name and options:

EXPLAIN SELECT 
  field_name
  options
FROM 
  filter
WHERE 
  category_id IN ('1895', '1896', '1897') 
ORDER BY 
  position ASC


This gives me Using index condition; Using filesort, quite bad uh?

``
{
"query_block":{
"select_id":1,
"ordering_operation":{
"using_filesort":true,
"table":{
"table_name":"filter",
"access_type":"range",
"possible_keys":[
"IDX_702C956612469DE2"
],
"key":"IDX_702C956612469DE2",
"used_key_parts":[
"category_id"
],
"key_length":"98",
"rows":5,
"filtered":100,
"index_condition":"(
filter.category_id` in ('18

Solution

Given the structure of multicolumn B-tree index it is not viable to do a sort on position when using IN on category_id. But as the test data suggest the position is not "global" but seems to have a meaning only for given category_id. So as I suggested in comments, it is instead possible to ORDER BY (category_id, position) - that can use two-column index on those columns to get it without filesort. http://sqlfiddle.com/#!9/aabaa/7

Because of the longtext column it is not possible to turn this into an index-only scan (text columns can be only indexed on prefix, not on entire value, as index key length is limited).

Using varchar fields as primary key has some drawbacks, but it is not "wrong" by itself so just a little suggestion - would it make sense in your desing to use some integer ids instead?

Context

StackExchange Database Administrators Q#108140, answer score: 2

Revisions (0)

No revisions yet.