patternsqlMinor
Optimizing Join between a very small and a very large table
Viewed 0 times
joinbetweensmalloptimizinglargeveryandtable
Problem
I have two tables:
The fields
The first table represents a call to a service in a SOA Bus, and the second table represents the service attributes.
The first can contains more than 100.000.000 rows and the second about 100 rows.
My goal is to select filtering by some field, and paginating by
For example, if the user wants to see the first page filtering by
This works well and the result is shown very fast.
However, the results need to be ordered by the las
CREATE TABLE `TBL_REPORT` (
`ID_TBL_REPORT` bigint(20) NOT NULL AUTO_INCREMENT,
`INDEX_NAME` varchar(100) NOT NULL,
`INDEX_VALUE` varchar(100) NOT NULL,
`FK_TBL_REPORT_PROXY` bigint(20) NOT NULL,
`PAYLOAD` mediumtext NOT NULL,
`OPERATION` varchar(200) DEFAULT NULL,
`ERROR` varchar(50) NOT NULL,
`SERVER` varchar(50) NOT NULL,
`DATA` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`ID_TBL_REPORT`),
KEY `TBL_INTER_TBL_INTER_PROXY_FK` (`FK_TBL_REPORT_PROXY`),
KEY `IX_INDEX_NAME` (`INDEX_NAME`),
KEY `IX_INDEX_VALUE` (`INDEX_VALUE`),
KEY `IX_DATA` (`DATA`),
CONSTRAINT `TBL_INTER_TBL_INTER_PROXY_FK` FOREIGN KEY (`FK_TBL_REPORT_PROXY`) REFERENCES `TBL_REPORT_PROXY` (`ID_TBL_REPORT_PROXY`)
) ENGINE=InnoDB AUTO_INCREMENT=4769095 DEFAULT CHARSET=latin1
CREATE TABLE `TBL_REPORT_PROXY` (
`ID_TBL_REPORT_PROXY` bigint(20) NOT NULL AUTO_INCREMENT,
`PROXY` varchar(100) NOT NULL,
`SERVICE_GROUP` varchar(150) NOT NULL,
PRIMARY KEY (`ID_TBL_REPORT_PROXY`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1The fields
INDEX_NAME, INDEX_VALUE and DATA has indexes created.The first table represents a call to a service in a SOA Bus, and the second table represents the service attributes.
The first can contains more than 100.000.000 rows and the second about 100 rows.
My goal is to select filtering by some field, and paginating by
limit n,10. The fields that can be used to filter are PROXY, INDEX_NAME, ERROR and DATA.For example, if the user wants to see the first page filtering by
PROXY, the query is:select R.ID_TBL_REPORT, R.INDEX_NAME, R.INDEX_VALUE, R.ERROR, R.DATA, P.PROXY, P.SERVICE_GROUP
from TBL_REPORT R, TBL_REPORT_PROXY P
where P.PROXY = 'MyProxy'
and R.FK_TBL_REPORT_PROXY = P.ID_TBL_REPORT_PROXY
limit 1,10;This works well and the result is shown very fast.
However, the results need to be ordered by the las
Solution
Wrong interpretation of the error.
You had a power failure (or some other crash). You are using MyISAM (you should use InnoDB). Those caused the "Incorrect key file for table".
If either says to "repair", then perform
If that does not work, look at this value:
Does that refer to a separate filesystem? Is that filesystem small? Perhaps you ran out of disk space in that filesystem (/tmp)? If so, consider not putting tmpdir in it. (Actually, I would expect a different error message if you ran out of disk space.)
Please use
The optimizer wants to start with the smaller table, especially since you have a WHERE clause restricting the rows there.
I have two suggestions; either or both might help:
This version tries to force it to use your index starting with DATA, which would let it quit after 10 rows:
This version switches to
(I switched from comma-JOIN to the preferred explicit JOIN.)
Please run
I have a blog on the evils of pagination using OFFSET and LIMIT. It includes a much more efficient alternative. However, you must first get past the immediate problem.
You had a power failure (or some other crash). You are using MyISAM (you should use InnoDB). Those caused the "Incorrect key file for table".
CHECK TABLE TBL_REPORT R;
CHECK TABLE TBL_REPORT_PROXY;If either says to "repair", then perform
REPAIR TABLE.If that does not work, look at this value:
SHOW VARIABLES LIKE 'tmpdir';Does that refer to a separate filesystem? Is that filesystem small? Perhaps you ran out of disk space in that filesystem (/tmp)? If so, consider not putting tmpdir in it. (Actually, I would expect a different error message if you ran out of disk space.)
Please use
SHOW CREATE TABLE; it is more descriptive than DESCRIBE. I can't tell what indexes you have. You need one starting with FK_TBL_REPORT_PROXY and/or one starting with DATA.The optimizer wants to start with the smaller table, especially since you have a WHERE clause restricting the rows there.
I have two suggestions; either or both might help:
This version tries to force it to use your index starting with DATA, which would let it quit after 10 rows:
select R.ID_TBL_REPORT, R.INDEX_NAME, R.INDEX_VALUE, R.ERROR,
R.DATA, P.PROXY, P.SERVICE_GROUP
from TBL_REPORT R FORCE INDEX(DATA)
JOIN TBL_REPORT_PROXY P IGNORE INDEX(PROXY)
ON R.FK_TBL_REPORT_PROXY = P.ID_TBL_REPORT_PROXY
where P.PROXY = 'MyProxy'
order by R.DATA desc -- ordering by the the last first
limit 1,10;This version switches to
HAVING in an attempt to trick it into not starting with the PROXY table:select R.ID_TBL_REPORT, R.INDEX_NAME, R.INDEX_VALUE, R.ERROR,
R.DATA, P.PROXY, P.SERVICE_GROUP
from TBL_REPORT R
JOIN TBL_REPORT_PROXY P
ON R.FK_TBL_REPORT_PROXY = P.ID_TBL_REPORT_PROXY
HAVING P.PROXY = 'MyProxy'
order by R.DATA desc -- ordering by the the last first
limit 1,10;(I switched from comma-JOIN to the preferred explicit JOIN.)
Please run
EXPLAIN EXTENDED SELECT ...; then SHOW WARNINGS; on each of them, and show us the outputs.I have a blog on the evils of pagination using OFFSET and LIMIT. It includes a much more efficient alternative. However, you must first get past the immediate problem.
Code Snippets
CHECK TABLE TBL_REPORT R;
CHECK TABLE TBL_REPORT_PROXY;SHOW VARIABLES LIKE 'tmpdir';select R.ID_TBL_REPORT, R.INDEX_NAME, R.INDEX_VALUE, R.ERROR,
R.DATA, P.PROXY, P.SERVICE_GROUP
from TBL_REPORT R FORCE INDEX(DATA)
JOIN TBL_REPORT_PROXY P IGNORE INDEX(PROXY)
ON R.FK_TBL_REPORT_PROXY = P.ID_TBL_REPORT_PROXY
where P.PROXY = 'MyProxy'
order by R.DATA desc -- ordering by the the last first
limit 1,10;select R.ID_TBL_REPORT, R.INDEX_NAME, R.INDEX_VALUE, R.ERROR,
R.DATA, P.PROXY, P.SERVICE_GROUP
from TBL_REPORT R
JOIN TBL_REPORT_PROXY P
ON R.FK_TBL_REPORT_PROXY = P.ID_TBL_REPORT_PROXY
HAVING P.PROXY = 'MyProxy'
order by R.DATA desc -- ordering by the the last first
limit 1,10;Context
StackExchange Database Administrators Q#96825, answer score: 2
Revisions (0)
No revisions yet.