patternsqlMinor
Optimize a WHERE IN condition using an index
Viewed 0 times
conditionwhereusingoptimizeindex
Problem
I have a table that looks like the following one:
One application accessing the database that contains that table runs point queries by
The question is: What index would perform better in these cases? Currently, I group
The rule of thumb to use the highest cardinality column first would still work best, or should I define a
Note: The database is running on top of MySQL, but answers considering other RDBMSs would be interesting too.
CREATE TABLE foo (
type TINYINT, -- low cardinality (2..5)
id BINARY(16), -- high cardinality
other_data OTHERTYPE(n),
CONSTRAINT foo_PK PRIMARY KEY (type, id)
);One application accessing the database that contains that table runs point queries by
type and id, and queries that filter by multiple (type, id) pairs.The question is: What index would perform better in these cases? Currently, I group
type column and do UNION on SELECT * FROM table WHERE type = ? AND id IN (?)The rule of thumb to use the highest cardinality column first would still work best, or should I define a
(type, id) index?Note: The database is running on top of MySQL, but answers considering other RDBMSs would be interesting too.
Solution
Caveat: This answer applies to InnoDB in MySQL. It may not apply to other MySQL Engines, nor to other RDBMSs.
A
Cardinality of
Side note: The order of items in the
In this case the
Index Cookbook for MySQL/MariaDB.
A
PRIMARY KEY is- Clustered with the data, and
- A uniqueness constraint, and
- An index
SELECT * FROM table WHERE type = ? AND id IN (?,?,?) is best handled by PRIMARY KEY(type, id) with the columns in that order. (Second best is INDEX(type, id)).- Being an index, the lookup does not need to scan the entire table.
- Being clustered, the lookup and the
(ofSELECT) are done at the same time.
- Being unique is not relevant to this
SELECT.
- The column order is required because the
=must come first.
- If there is only one item in the
IN, theINwill be optimized to=; the index is still optimal.
Cardinality of
type versus id is, contrary to a well entrenched wives' tale, irrelevant, at least for this query.Side note: The order of items in the
WHERE has no impact on performance; the order in the index does.In this case the
= should come first in the index, then more complex item(s), IN in this case.id smells like a packed UUID. For huge tables that is a terrible type of index, probably in all RDBMSs, since it is so random that caching is impractical.Index Cookbook for MySQL/MariaDB.
Context
StackExchange Database Administrators Q#169447, answer score: 2
Revisions (0)
No revisions yet.