patternsqlModerate
MySQL Explain has different row count than slow query log
Viewed 0 times
countlogthanqueryslowhasmysqldifferentexplainrow
Problem
I have this entry in slow query log:
When I explain this query, it's absolutely fine.
How do I optimize this
# User@Host: user[host] @ [ip]
# Thread_id: 1514428 Schema: db Last_errno: 0 Killed: 0
# Query_time: 2.795454 Lock_time: 0.000116 Rows_sent: 15 Rows_examined: 65207 Rows_affected: 0 Rows_read: 65207
# Bytsent: 26618
SET timestamp=1407511874;
select off.*,translated_title,translated_description
from ephpb2b_products off USE INDEX(id_viewed)
INNER JOIN ephpb2b_members mem
ON off.uid = mem.id
Left Join ephpb2b_product_language_new pol
ON off.id = pol.offer_id
and pol.language='en'
where off.approved=1
order by off.viewed
LIMIT 15;When I explain this query, it's absolutely fine.
mysql> explain select off.*,translated_title,translated_description from ephpb2b_products off USE INDEX(id_viewed) INNER JOIN ephpb2b_members mem ON off.uid = mem.id Left Join ephpb2b_product_language_new pol ON off.id = pol.offer_id and pol.language='en' where off.approved=1 order by off.viewed LIMIT 15;
+----+-------------+-------+--------+-------------------------+-------------+---------+---------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------+-------------+---------+---------------------------+------+-------------+
| 1 | SIMPLE | off | index | NULL | id_viewed | 4 | NULL | 3 | Using where |
| 1 | SIMPLE | mem | eq_ref | PRIMARY | PRIMARY | 4 | db.off.uid | 1 | Using index |
| 1 | SIMPLE | pol | ref | offer_id,id_language | offer_id | 5 | db.off.id | 4 | |
+----+-------------+-------+--------+-------------------------+-------------+---------+---------------------------+------+-------------+
3 rows in set (0.17 sec)How do I optimize this
Solution
In order to answer this question, you must understand what the rows column on explain means, and the difference between calculations based on statistics and post-execution statistics.
When you run explain, the rows column will tell you, for each table access, how many rows will be examined by using the intended filter. There are two ways to calculate that: either an index dive (that usually should give you exact results) or by using approximate statistics that each engine stores independently -up to 5.6- for each table. While the first method is preferred when it can be used (simple filters on a single indexed column), in many cases, only an approximation could be used -otherwise, the query optimiser would take as much time as the query execution itself.
In any case, the rows are the calculated rows to be read (not to be returned) per table access. Even if it were exact (and many times has differences with several order of magnitudes, but still is good enough for the optimiser), it does not predict the real number of rows accessed throughout a join. For example, if you join table
The slow log, like the handler statistics, or other profiling mechanisms tell you the real number of rows processed and sent, because those statistics are gathered after the execution, thus exact.
Regarding your particular case,
To check the actual number and kind (PK access, ref, index scan, table scan) of row operations. I would use it to test each table access individually.
For more specific help, we would need the table structure of each table and the approximate selectivity of each filter condition.
When you run explain, the rows column will tell you, for each table access, how many rows will be examined by using the intended filter. There are two ways to calculate that: either an index dive (that usually should give you exact results) or by using approximate statistics that each engine stores independently -up to 5.6- for each table. While the first method is preferred when it can be used (simple filters on a single indexed column), in many cases, only an approximation could be used -otherwise, the query optimiser would take as much time as the query execution itself.
In any case, the rows are the calculated rows to be read (not to be returned) per table access. Even if it were exact (and many times has differences with several order of magnitudes, but still is good enough for the optimiser), it does not predict the real number of rows accessed throughout a join. For example, if you join table
A (reading exactly X rows) and table B (reading exactly Y rows), in the order A -> B, the real number of rows read will be: X + # of rows returned by A (<=X) multiplied by Y, as standard mysql only supports nested loop joins.The slow log, like the handler statistics, or other profiling mechanisms tell you the real number of rows processed and sent, because those statistics are gathered after the execution, thus exact.
Regarding your particular case,
EXPLAIN is to blame because it shows that only 3 rows would be scanned for the first access, when in reality it may be doing a full index scan (as it is using the key only for sorting), which later gets multiplied for each join performed. Do not trust explain. You can use:FLUSH STATUS;
-- Execute your query here
SHOW STATUS like 'Hand%';To check the actual number and kind (PK access, ref, index scan, table scan) of row operations. I would use it to test each table access individually.
For more specific help, we would need the table structure of each table and the approximate selectivity of each filter condition.
Code Snippets
FLUSH STATUS;
-- Execute your query here
SHOW STATUS like 'Hand%';Context
StackExchange Database Administrators Q#73520, answer score: 11
Revisions (0)
No revisions yet.