HiveBrain v1.2.0
Get Started
← Back to all entries
gotchasqlMinor

Unexpected index scan when doing query against varchar column with numeric parameter

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
scancolumnwithnumericquerydoingvarcharagainstunexpectedwhen

Problem

I have a test table with schema like this:

CREATE TABLE `indextest` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1


and here are the rows in the table:

mysql [localhost] {msandbox} (test) > select * from indextest;
+----+--------+
| id | name   |
+----+--------+
|  3 | 111222 |
|  1 | hello  |
|  2 | world  |
|  4 | wow    |
+----+--------+
4 rows in set (0.00 sec)


when I query the table against the name column with a string, it looks good:

mysql [localhost] {msandbox} (test) > explain select * from indextest where name='111222'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: indextest
         type: ref
possible_keys: idx_name
          key: idx_name
      key_len: 13
          ref: const
         rows: 1
        Extra: Using where; Using index
1 row in set (0.00 sec)


But if I use a numeric as the query parameter, explain shows that the query optimizer is doing index scan:

mysql [localhost] {msandbox} (test) > explain select * from indextest where name=111222\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: indextest
         type: index
possible_keys: idx_name
          key: idx_name
      key_len: 13
          ref: NULL
         rows: 4
        Extra: Using where; Using index
1 row in set (0.00 sec)


and under some other conditions (found in online slow queries), the query optimizer even suggests doing similar queries with table scan.

I don't understand why it behaves like this but not just raises error or converts the numeric to a string automatically.

Solution

I'd assume it's because numbers can be represented multiple ways in text. According to the docs on How MySQL Uses Indexes:


Comparison of dissimilar columns may prevent use of indexes if values cannot be compared directly without conversion. Suppose that a numeric column is compared to a string column. For a given value such as 1 in the numeric column, it might compare equal to any number of values in the string column such as '1', ' 1', '00001', or '01.e1' (sic). This rules out use of any indexes for the string column.

(I think 01.e1 is a typo which should read 0.1e1.)

Since a number does not equate to one and only one text value, the index can't be used.

On the other hand, this same situation does not apply in the opposite direction. A text representation of a number does equate to one and only one numeric value. In other words:

While numeric 1 could equate to texts '1', ' 1', '00001', or '0.1e1',

the texts '1', ' 1', '00001', and '0.1e1' all equate only to numeric 1.

So, if instead you had a numeric column and were comparing string values to it, the index can be used as you expect. (Really, I just tried it.)

Context

StackExchange Database Administrators Q#12385, answer score: 4

Revisions (0)

No revisions yet.