patternsqlModerate
MySQL - fastest way to ALTER TABLE for InnoDB
Viewed 0 times
innodbwaymysqlfastestforaltertable
Problem
I have an InnoDB table that I want to alter. The table has ~80M rows, and quit a few indices.
I want to change the name of one of the columns and add a few more indices.
At this time, all I care about is speed :)
I want to change the name of one of the columns and add a few more indices.
- What is the fastest way to do it (assuming I could suffer even downtime - the server is an unused slave)?
- Is a "plain"
alter table, the fastest solution?
At this time, all I care about is speed :)
Solution
One sure way to speed up an ALTER TABLE is to remove unnecessary indexes
Here are the initial steps to load a new version of the table
Please note the following:
-
I dropped source_persona_index because it is the first column in 4 other indexes
-
I dropped target_persona_index because it is the first column in 2 other indexes
-
I dropped target_persona_relation_type_index because the first 2 columns are also in target_persona_relation_type_message_id_index
OK That takes care of unnecessary indexes. Are there any indexes that have low cardinality? Here is the way to determine that:
Run the following queries:
According to your question, there are about 80,000,000 rows. As a rule of thumb, the MySQL Query Optimizer will not use an index if the cardinality of the selected columns is greater that 5% of the table row count. In this case, that would be 4,000,000.
DROP INDEX sent_at_index;
DROP INDEX message_id_index;
DROP INDEX target_object_index;
Once the usefulness or uselessness of those indexes have been determined, you can reload the data
That's it, right? NOPE !!!
If your website has been up this whole time, there may have INSERTs running against s_relations during the loading of s_relations_new. How can you retrieve those missing rows?
Go find the maximum id in s_relations_new and append everything after that ID from s_relations. To assure that the table is frozen and used only for this update, you must have a little downtime for the sake of getting those last rows that were inserted into s_relation_new. Here is what you do:
In the OS, restart mysql so that no one else can log in but root@localhost (disables TCP/IP):
Next, login to mysql and load those last rows:
Then, restart mysql normally
Now, if you cannot take mysql down, you will have to do a bait-and-switch on s_relations. Just login to mysql and do the following:
Give it a Try !!!
CAVEAT : Once you are satisfied with this operation, you can drop the old table at your earliest convenience:
Here are the initial steps to load a new version of the table
CREATE TABLE s_relations_new LIKE s_relations;
#
# Drop Duplicate Indexes
#
ALTER TABLE s_relations_new
DROP INDEX source_persona_index,
DROP INDEX target_persona_index,
DROP INDEX target_persona_relation_type_index
;Please note the following:
-
I dropped source_persona_index because it is the first column in 4 other indexes
- unique_target_persona
- unique_target_object
- source_and_target_object_index
- source_target_persona_index
-
I dropped target_persona_index because it is the first column in 2 other indexes
- target_persona_relation_type_index
- target_persona_relation_type_message_id_index
-
I dropped target_persona_relation_type_index because the first 2 columns are also in target_persona_relation_type_message_id_index
OK That takes care of unnecessary indexes. Are there any indexes that have low cardinality? Here is the way to determine that:
Run the following queries:
SELECT COUNT(DISTINCT sent_at) FROM s_relations;
SELECT COUNT(DISTINCT message_id) FROM s_relations;
SELECT COUNT(DISTINCT target_object_id) FROM s_relations;According to your question, there are about 80,000,000 rows. As a rule of thumb, the MySQL Query Optimizer will not use an index if the cardinality of the selected columns is greater that 5% of the table row count. In this case, that would be 4,000,000.
- If
COUNT(DISTINCT sent_at)> 4,000,000
- then
ALTER TABLE s_relations_new
DROP INDEX sent_at_index;
- If
COUNT(DISTINCT message_id)> 4,000,000
- then
ALTER TABLE s_relations_new
DROP INDEX message_id_index;
- If
COUNT(DISTINCT target_object_id)> 4,000,000
- then
ALTER TABLE s_relations_new
DROP INDEX target_object_index;
Once the usefulness or uselessness of those indexes have been determined, you can reload the data
#
# Change the Column Name
# Load the Table
#
ALTER TABLE s_relations_new CHANGE sent_at sent_at_new int(11) DEFAULT NULL;
INSERT INTO s_relations_new SELECT * FROM s_relations;That's it, right? NOPE !!!
If your website has been up this whole time, there may have INSERTs running against s_relations during the loading of s_relations_new. How can you retrieve those missing rows?
Go find the maximum id in s_relations_new and append everything after that ID from s_relations. To assure that the table is frozen and used only for this update, you must have a little downtime for the sake of getting those last rows that were inserted into s_relation_new. Here is what you do:
In the OS, restart mysql so that no one else can log in but root@localhost (disables TCP/IP):
$ service mysql restart --skip-networkingNext, login to mysql and load those last rows:
mysql> SELECT MAX(id) INTO @maxidnew FROM s_relations_new;
mysql> INSERT INTO s_relations_new SELECT * FROM s_relations WHERE id > @maxidnew;
mysql> ALTER TABLE s_relations RENAME s_relations_old;
mysql> ALTER TABLE s_relations_new RENAME s_relations;Then, restart mysql normally
$ service mysql restartNow, if you cannot take mysql down, you will have to do a bait-and-switch on s_relations. Just login to mysql and do the following:
mysql> ALTER TABLE s_relations RENAME s_relations_old;
mysql> SELECT MAX(id) INTO @maxidnew FROM s_relations_new;
mysql> INSERT INTO s_relations_new SELECT * FROM s_relations_old WHERE id > @maxidnew;
mysql> ALTER TABLE s_relations_new RENAME s_relations;Give it a Try !!!
CAVEAT : Once you are satisfied with this operation, you can drop the old table at your earliest convenience:
mysql> DROP TABLE s_relations_old;Code Snippets
CREATE TABLE s_relations_new LIKE s_relations;
#
# Drop Duplicate Indexes
#
ALTER TABLE s_relations_new
DROP INDEX source_persona_index,
DROP INDEX target_persona_index,
DROP INDEX target_persona_relation_type_index
;SELECT COUNT(DISTINCT sent_at) FROM s_relations;
SELECT COUNT(DISTINCT message_id) FROM s_relations;
SELECT COUNT(DISTINCT target_object_id) FROM s_relations;#
# Change the Column Name
# Load the Table
#
ALTER TABLE s_relations_new CHANGE sent_at sent_at_new int(11) DEFAULT NULL;
INSERT INTO s_relations_new SELECT * FROM s_relations;$ service mysql restart --skip-networkingmysql> SELECT MAX(id) INTO @maxidnew FROM s_relations_new;
mysql> INSERT INTO s_relations_new SELECT * FROM s_relations WHERE id > @maxidnew;
mysql> ALTER TABLE s_relations RENAME s_relations_old;
mysql> ALTER TABLE s_relations_new RENAME s_relations;Context
StackExchange Database Administrators Q#9746, answer score: 17
Revisions (0)
No revisions yet.