HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Why is mysql using the wrong index for order by query?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
whytheorderquerymysqlusingwrongforindex

Problem

Here is my table with ~10,000,000 rows data

CREATE TABLE `votes` (
  `subject_name` varchar(32) COLLATE utf8_unicode_ci NOT NULL,
  `subject_id` int(11) NOT NULL,
  `voter_id` int(11) NOT NULL,
  `rate` int(11) NOT NULL,
  `updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`subject_name`,`subject_id`,`voter_id`),
  KEY `IDX_518B7ACFEBB4B8AD` (`voter_id`),
  KEY `subject_timestamp` (`subject_name`,`subject_id`,`updated_at`),
  KEY `voter_timestamp` (`voter_id`,`updated_at`),
  CONSTRAINT `FK_518B7ACFEBB4B8AD` FOREIGN KEY (`voter_id`) REFERENCES `users` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;


Here is the indexes cardinalities

So when I do this query:

SELECT SQL_NO_CACHE * FROM votes WHERE 
    voter_id = 1099 AND 
    rate = 1 AND 
    subject_name = 'medium'
ORDER BY updated_at DESC
LIMIT 20 OFFSET 100;


I was expecting it uses index voter_timestamp
but mysql chooses to use this instead:

explain select SQL_NO_CACHE * from votes  where subject_name = 'medium' and voter_id = 1001 and rate = 1 order by updated_at desc limit 20 offset 100;`

type:
    index_merge
possible_keys: 
    PRIMARY,IDX_518B7ACFEBB4B8AD,subject_timestamp,voter_timestamp
key:
    IDX_518B7ACFEBB4B8AD,PRIMARY
key_len:
    102,98
ref:
    NULL
rows:
    9255
filtered:
    10.00
Extra:
    Using intersect(IDX_518B7ACFEBB4B8AD,PRIMARY); Using where; Using filesort


And I got 200-400ms query time.

If I force it to use the right index like:

SELECT SQL_NO_CACHE * FROM votes USE INDEX (voter_timestamp) WHERE 
    voter_id = 1099 AND 
    rate = 1 AND 
    subject_name = 'medium'
ORDER BY updated_at DESC
LIMIT 20 OFFSET 100;


Mysql can return the results in 1-2ms

and here is the explain:

type:
    ref
possible_keys:
    voter_timestamp
key:
    voter_timestamp
key_len:
    4
ref:
    const
rows:
    18714
filtered:
    1.00
Extra:
    Using where


So why didn't mysql choose the voter_timestamp index fo

Solution

For that query, you need this index:

INDEX(voter_id, rate, subject_name, updated_at)


The updated_at must be last; the other three can be in any order. (ypercube's 3-column indexes are not very useful since they don't finish off the WHERE columns before hitting the ORDER BY column.)

As you add this index, you can probably get rid of all the other secondary keys:

KEY IDX_518B7ACFEBB4B8AD (voter_id), -- The FK can use my index
KEY subject_timestamp (subject_name,subject_id,updated_at), -- mostly redundant
KEY voter_timestamp (voter_id,updated_at), -- may have been your attempt

With the 4-column index, you have a chance of optimizing the "pagination" and avoid OFFSET. See this blog.

On another topic... When I see X_name and X_id, I assume "normalization" is going on. I would expect to see those two columns in a table, with virtually nothing else. I would not expect to see both in some other table.

(voter_id, updated_at) won't get past voter_id since it has not finished with filtering (the WHERE). Then, since a the other index is smaller, it is picked. Mine has 3 columns to take care of filtering, then the column for ORDER BY.

Code Snippets

INDEX(voter_id, rate, subject_name, updated_at)

Context

StackExchange Database Administrators Q#151376, answer score: 7

Revisions (0)

No revisions yet.