patternsqlMinor
Innodb buffer pool and multiple joins
Viewed 0 times
innodbjoinsmultipleandpoolbuffer
Problem
assume we have 1 Billion records and want to store them in
Note: we have enough RAM to store all data in buffer pool. ( about 40 GB )
In our situation,
the question is: WHILE the whole data is ON buffer pool ( RAM ), is there any difference
In some cases it's better for us to have separated tables for other kind of light queries.
But we fear of
One or Three tables by post_type.post_id -> primary key
post_type -> ( page, group, user )
user_id -> index
date -> indexNote: we have enough RAM to store all data in buffer pool. ( about 40 GB )
In our situation,
Reading from database is too muchthe question is: WHILE the whole data is ON buffer pool ( RAM ), is there any difference
in performance to have 3 separated tables and Join them, or have 1 big table and Select from it without any Join.In some cases it's better for us to have separated tables for other kind of light queries.
But we fear of
JOINS.Solution
InnoDB Architecture
Please keep in mind what goes into the InnoDB Buffer Pool
MySQL's idea of a JOIN
Believe it or not, whether you have one big table and process a WHERE clause or you have joins on multiples, the query processing has the same mechanical behavior. (I wrote about this a year ago : Is there an execution difference between a JOIN condition and a WHERE condition?) You can also find this described in the MySQL Documentation.
You said you fears JOINs. MySQL treats every query like a JOIN. The book I quoted in my post calls a
Effects on the InnoDB Buffer Pool
A single table with all needed fields would be loaded into the buffer pool and may produce, at most, 1 temp table for gathering data satisfying a where clause. On the flipside of things, having a single table would mean denormalized data and introducing redundant data into the buffer pool as well as the physical table storage.
Doing joins may actually result in creating multiple tables. This would also load data and associated indexes into the buffer pool. Any subsequent accessing of data might actually take place in the temp tables. Accessing the Buffer Pool would no longer be necessary.
Tuning
While a large buffer pool helps in the initial processing of a query, tuning the per-connection settings would be more of a help. The parameters I have in mind are
Any query that causes need more space than what there setting are may trigger a paging of that buffer to disk. There is where a slowdown of a query would become manifest and felt.
For the InnoDB Buffer Pool, your main objective should be to have the data and index pages present as often as possible. Avoid have a single denormalized table or you may bloat the buffer pool.
For the query, you may have to run smaller queries if you fear joins. Conversely, you could tune your queries and setup better indexes.
MySQL 5.6
If you are using MySQL 5.6, I would further suggest flushing the map of the InnoDB buffer pool to disk by running this
every minute.
Add this to
and restart mysql.
Doing this will have all data and index pages availableon each restart of mysql. This gives every query you have the luxury of having data and indexes in RAM and up-to-date.
Please keep in mind what goes into the InnoDB Buffer Pool
- 16KB Data Pages for Tables that have been accessed
- 16KB Index Pages for Indexes that have been accessed
- Changes to Secondary Indexes (could take up to 50% of buffer pool in a high-write envrironment)
MySQL's idea of a JOIN
Believe it or not, whether you have one big table and process a WHERE clause or you have joins on multiples, the query processing has the same mechanical behavior. (I wrote about this a year ago : Is there an execution difference between a JOIN condition and a WHERE condition?) You can also find this described in the MySQL Documentation.
You said you fears JOINs. MySQL treats every query like a JOIN. The book I quoted in my post calls a
SELECT on a single table a degenerate join. Multiple tables just uses a little extra resources (file handles, temp tables).Effects on the InnoDB Buffer Pool
A single table with all needed fields would be loaded into the buffer pool and may produce, at most, 1 temp table for gathering data satisfying a where clause. On the flipside of things, having a single table would mean denormalized data and introducing redundant data into the buffer pool as well as the physical table storage.
Doing joins may actually result in creating multiple tables. This would also load data and associated indexes into the buffer pool. Any subsequent accessing of data might actually take place in the temp tables. Accessing the Buffer Pool would no longer be necessary.
Tuning
While a large buffer pool helps in the initial processing of a query, tuning the per-connection settings would be more of a help. The parameters I have in mind are
- join_buffer_size
- sort_buffer_size (if you are using
ORDER BYand/orGROUP BY)
Any query that causes need more space than what there setting are may trigger a paging of that buffer to disk. There is where a slowdown of a query would become manifest and felt.
For the InnoDB Buffer Pool, your main objective should be to have the data and index pages present as often as possible. Avoid have a single denormalized table or you may bloat the buffer pool.
For the query, you may have to run smaller queries if you fear joins. Conversely, you could tune your queries and setup better indexes.
MySQL 5.6
If you are using MySQL 5.6, I would further suggest flushing the map of the InnoDB buffer pool to disk by running this
SET GLOBAL innodb_buffer_pool_dump_now = 1;every minute.
Add this to
my.cnf[mysqld]
innodb_buffer_pool_load_at_startup=1
innodb_buffer_pool_dump_at_shutdown=1and restart mysql.
Doing this will have all data and index pages availableon each restart of mysql. This gives every query you have the luxury of having data and indexes in RAM and up-to-date.
Code Snippets
SET GLOBAL innodb_buffer_pool_dump_now = 1;[mysqld]
innodb_buffer_pool_load_at_startup=1
innodb_buffer_pool_dump_at_shutdown=1Context
StackExchange Database Administrators Q#61726, answer score: 5
Revisions (0)
No revisions yet.