HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Optimize a WHERE IN condition using an index

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
conditionwhereusingoptimizeindex

Problem

I have a table that looks like the following one:

CREATE TABLE foo ( 
    type       TINYINT,    -- low cardinality (2..5)
    id         BINARY(16), -- high cardinality
    other_data OTHERTYPE(n),
    CONSTRAINT foo_PK PRIMARY KEY (type, id)
);


One application accessing the database that contains that table runs point queries by type and id, and queries that filter by multiple (type, id) pairs.

The question is: What index would perform better in these cases? Currently, I group type column and do UNION on SELECT * FROM table WHERE type = ? AND id IN (?)

The rule of thumb to use the highest cardinality column first would still work best, or should I define a (type, id) index?

Note: The database is running on top of MySQL, but answers considering other RDBMSs would be interesting too.

Solution

Caveat: This answer applies to InnoDB in MySQL. It may not apply to other MySQL Engines, nor to other RDBMSs.

A PRIMARY KEY is

  • Clustered with the data, and



  • A uniqueness constraint, and



  • An index



SELECT * FROM table WHERE type = ? AND id IN (?,?,?) is best handled by PRIMARY KEY(type, id) with the columns in that order. (Second best is INDEX(type, id)).

  • Being an index, the lookup does not need to scan the entire table.



  • Being clustered, the lookup and the (of SELECT ) are done at the same time.



  • Being unique is not relevant to this SELECT.



  • The column order is required because the = must come first.



  • If there is only one item in the IN, the IN will be optimized to =; the index is still optimal.



Cardinality of type versus id is, contrary to a well entrenched wives' tale, irrelevant, at least for this query.

Side note: The order of items in the WHERE has no impact on performance; the order in the index does.

In this case the = should come first in the index, then more complex item(s), IN in this case.

id smells like a packed UUID. For huge tables that is a terrible type of index, probably in all RDBMSs, since it is so random that caching is impractical.

Index Cookbook for MySQL/MariaDB.

Context

StackExchange Database Administrators Q#169447, answer score: 2

Revisions (0)

No revisions yet.