HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Query optimization with extreme statistic

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
withstatisticqueryoptimizationextreme

Problem

I have a table:

mysql> describe table_1; 
+---------------+------------------+------+-----+---------------------+----------------+
| Field         | Type             | Null | Key | Default             | Extra          |
+---------------+------------------+------+-----+---------------------+----------------+
| id            | int(12)          | NO   | PRI | NULL                | auto_increment |
| date          | datetime         | YES  |     | 0000-00-00 00:00:00 |                |
| col1          | tinyint(1)       | YES  |     | 1                   |                |
| col2          | longtext         | YES  |     | NULL                |                |
| col3          | tinyint(1)       | YES  |     | 1                   |                |
+---------------+------------------+------+-----+---------------------+----------------+


with the following characteristics:

mysql> select distinct(col1), count(col1) from table_1 group by col1;
+--------+---------------+
| col1   | count(col1)   |
+--------+---------------+
|      0 |           200 |
|      1 |        689747 |
+--------+---------------+


How can I optimize queries like these, and with what kind of indexes?

1. SELECT * FROM table_1 WHERE col1 = 1
AND id NOT IN (79869)  ORDER BY  date  DESC LIMIT 0, 10;
2. SELECT * FROM table_1 WHERE col1 = 1 AND id <> 18983
AND date > '2012-10-26'  AND ( col2 LIKE '%word1%' OR  col2 LIKE '%word2%')
ORDER BY date DESC LIMIT 5;

Solution

At the risk of stating the obvious, you have a cardinality problem with col1.

Looking at the two queries you posted in the question:

SELECT * FROM table_1 WHERE col1 = 1
AND id NOT IN (79869) ORDER BY date DESC LIMIT 0, 10;

SELECT * FROM table_1 WHERE col1 = 1 AND id <> 18983
AND date > '2012-10-26' AND (col2 LIKE '%word1%' OR col2 LIKE '%word2%')
ORDER BY date DESC LIMIT 5;


you need good indexes that will accommodate these two queries. In your particular case, you will need just one index.

If the table is MyISAM, here is that index

ALTER TABLE table_1 ADD INDEX col1_date_ndx (col1,date,col2,id);


If the table is InnoDB, here is that index

ALTER TABLE table_1 ADD INDEX col1_date_ndx (col1,date,col2);


How will this index help? Although you have to live with the lopsided cardinality of col1, you improve each query's chances of being effectively searched.

You could also take a chance on refactoring the queries. How ???

Perhaps you could collect the ids only then join the ids back to the original table.

QUERY #1

Here is the first query

SELECT * FROM table_1 WHERE col1 = 1
AND id NOT IN (79869) ORDER BY date DESC LIMIT 0, 10;


Collect the ids and join back to table_1

SELECT B.* FROM
(SELECT id FROM table_1 WHERE col1 = 1
AND id NOT IN (79869) ORDER BY date DESC LIMIT 0, 10) A
LEFT JOIN table_1 B USING (id);


QUERY #2

Here is the second query

SELECT * FROM table_1 WHERE col1 = 1 AND id <> 18983
AND date > '2012-10-26' AND (col2 LIKE '%word1%' OR col2 LIKE '%word2%')
ORDER BY date DESC LIMIT 5;


Collect the ids and join back to table_1

SELECT B* FROM
(SELECT * FROM table_1 WHERE col1 = 1 AND id <> 18983
AND date > '2012-10-26' AND (col2 LIKE '%word1%' OR col2 LIKE '%word2%')
ORDER BY date DESC LIMIT 5) A
LEFT JOIN table_1 B USING (id);


I cannot make promises on the queries, but, at the very least, the index suggestion should help.

Give it a Try !!!

Code Snippets

SELECT * FROM table_1 WHERE col1 = 1
AND id NOT IN (79869) ORDER BY date DESC LIMIT 0, 10;

SELECT * FROM table_1 WHERE col1 = 1 AND id <> 18983
AND date > '2012-10-26' AND (col2 LIKE '%word1%' OR col2 LIKE '%word2%')
ORDER BY date DESC LIMIT 5;
ALTER TABLE table_1 ADD INDEX col1_date_ndx (col1,date,col2,id);
ALTER TABLE table_1 ADD INDEX col1_date_ndx (col1,date,col2);
SELECT * FROM table_1 WHERE col1 = 1
AND id NOT IN (79869) ORDER BY date DESC LIMIT 0, 10;
SELECT B.* FROM
(SELECT id FROM table_1 WHERE col1 = 1
AND id NOT IN (79869) ORDER BY date DESC LIMIT 0, 10) A
LEFT JOIN table_1 B USING (id);

Context

StackExchange Database Administrators Q#33533, answer score: 3

Revisions (0)

No revisions yet.