patternsqlMinor
Recommendation about mysqldump?
Viewed 0 times
recommendationmysqldumpabout
Problem
I have approx 90 GB sized database, and I need to schedule a mysqldump backup. How long will it take to perform a full hot backup approximately? I'm asking because I've never had to backup a database as big as this one yet.
Solution
LE: giving the tips below I may estimate that your dump could take 20 minutes (and longer). But with these tips and high performance CPUs and lot of memory the time may be smaller.
You didn't provide enough informations regarding the specific of your data. Are there a single database or multiple? Is all data InnoDB or there are MyISAM tables around?
If you have only InnoDB tables then you may use the "--single-transaction" option of mysqldump utility.
According to manual:
--single-transaction This option sets the transaction isolation
mode to REPEATABLE READ and sends a START TRANSACTION SQL statement to
the server before dumping data. It is useful only with transactional
tables such as InnoDB, because then it dumps the consistent state of
the database at the time when START TRANSACTION was issued without
blocking any applications.
When using this option, you should keep in mind that only InnoDB
tables are dumped in a consistent state. For example, any MyISAM or
MEMORY tables dumped while using this option may still change state.
While a --single-transaction dump is in process, to ensure a valid
dump file (correct table contents and binary log coordinates), no
other connection should use the following statements: ALTER TABLE,
CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE. A consistent
read is not isolated from those statements, so use of them on a table
to be dumped can cause the SELECT that is performed by mysqldump to
retrieve the table contents to obtain incorrect contents or fail.
The --single-transaction option and the --lock-tables option are
mutually exclusive because LOCK TABLES causes any pending transactions
to be committed implicitly.
This option is not supported for MySQL Cluster tables; the results
cannot be guaranteed to be consistent due to the fact that the
NDBCLUSTER storage engine supports only the READ_COMMITTED transaction
isolation level. You should always use NDB backup and restore instead.
To dump large tables, combine the --single-transaction option with
the --quick option.
Therefore, for a consistent snapshot, if there are any chance to get "ALTER TABLE, CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE" commands on the database, then you should use the "--lock-tables" option. Also, to get a fast restore you could use the "--disable-keys".
However, there are also other techniques to speed up the dumping, such as:
Example of use (ordered by the speed!):
PIGZ: Parallel GZip, grade 6 from 9
PIGZ: 30 sec : 319976345 GZIP - multi-threading: 8 cpu by default
Arch: mysqldump -u root -p --opt DATABASE | pigz -6 -c > dumpname.sql.gz
Unarch: pigz -cdk dumpname.sql.gz | mysql -u root -p DATABASE
PBZIP2: Parallel BZip2. grade 6 from 9(my personal favourite)
PBZIP2: 45-60 sec : 231360931 BZ2 - multi-threading: maxim sau cu -p8
Arh: mysqldump -u root -p --opt DATABASE | pbzip2 -c >dumpname.sql.bz2
Unarch: pbzip2 -cdk dumpname.sql.bz2 | mysql -u root -p DATABASE
7Zip, container ZIP, metoda BZip2, grade 3 from 9
7ZA: 75 sec : 233503528 ZIP - multi-threading: maxim
Arch: mysqldump -u root -p --opt DATABASE | 7za a -tbzip2 -mx=3 -mmt=on -sidumpname.sql dumpname.sql.zip
Unarch: 7za e -tbzip2 -so dumpname.sql.gz | mysql -u root -p DATABASE
7Zip, container ZIP, metoda BZip2, grade 6 from 9
7ZA: 95 sec : 230986947 ZIP - multi-threading: maxim
Arch: mysqldump -u root -p --opt DATABASE | 7za a -tbzip2 -mx=6 -mmt=on -sidumpname.sql dumpname.sql.zip
Unarch: 7za e -tbzip2 -so dumpname.sql.zip | mysql -u root -p DATABASE
GZIP: container GZ, grade 6 from 9
GZIP: 95 sec : 318920792 - multi-threading: none
Arch: mysqldump -u root -p --opt DATABASE | gzip -6 --rsyncable > dumpname.sql.gz
Unarch: gzip -cdk dumpname.sql.gz | mysql -u root -p DATABASE
RAR: container RAR, grade 3 from 5, test -mt16
RAR: 95 sec : 248149972 RAR - multi-threading: auto - maxim 5 cpu
Arch: mysqldump -u root -p --opt DATABASE | rar a -m3 -sidumpname.sql dumpname.sql.rar
Unarch: unrar p -inul dumpname.sql.rar | mysql -u root -p DATABASE
"-si" switch undocumented in manual; may be found in rar.txt (/usr/share/doc/rar/)
RAR: container RAR, grade 4 from 5
RAR: 125 sec : 246759443 RAR - multi-threading: auto - maxim 6 cpu
Arch: mysqldump -u root -p --opt DATABASE | rar a -m4 -sid
You didn't provide enough informations regarding the specific of your data. Are there a single database or multiple? Is all data InnoDB or there are MyISAM tables around?
If you have only InnoDB tables then you may use the "--single-transaction" option of mysqldump utility.
According to manual:
--single-transaction This option sets the transaction isolation
mode to REPEATABLE READ and sends a START TRANSACTION SQL statement to
the server before dumping data. It is useful only with transactional
tables such as InnoDB, because then it dumps the consistent state of
the database at the time when START TRANSACTION was issued without
blocking any applications.
When using this option, you should keep in mind that only InnoDB
tables are dumped in a consistent state. For example, any MyISAM or
MEMORY tables dumped while using this option may still change state.
While a --single-transaction dump is in process, to ensure a valid
dump file (correct table contents and binary log coordinates), no
other connection should use the following statements: ALTER TABLE,
CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE. A consistent
read is not isolated from those statements, so use of them on a table
to be dumped can cause the SELECT that is performed by mysqldump to
retrieve the table contents to obtain incorrect contents or fail.
The --single-transaction option and the --lock-tables option are
mutually exclusive because LOCK TABLES causes any pending transactions
to be committed implicitly.
This option is not supported for MySQL Cluster tables; the results
cannot be guaranteed to be consistent due to the fact that the
NDBCLUSTER storage engine supports only the READ_COMMITTED transaction
isolation level. You should always use NDB backup and restore instead.
To dump large tables, combine the --single-transaction option with
the --quick option.
Therefore, for a consistent snapshot, if there are any chance to get "ALTER TABLE, CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE" commands on the database, then you should use the "--lock-tables" option. Also, to get a fast restore you could use the "--disable-keys".
However, there are also other techniques to speed up the dumping, such as:
- increase the "key_buffer_size" at the maximum 4GB. The variable is dynamic so you can issue "SET GLOBAL key_buffer_size=4102410241024;*" (or whatever you can) just before the dump and revert the value after the dump.
- increase the "max_heap_table_size". If you have "--order-by-primary" option issued there may be a lot of temporary tables produced. I've found that setting max_heap_table_size larger than the most tables increases the speed.
- redirect the dump directly to an archive utility which support pipeline, such as: gzip, pigz, bzip2, pbzip2, 7zip, rar, etc. This way, you can avoid using the HDD during the dump.
Example of use (ordered by the speed!):
PIGZ: Parallel GZip, grade 6 from 9
PIGZ: 30 sec : 319976345 GZIP - multi-threading: 8 cpu by default
Arch: mysqldump -u root -p --opt DATABASE | pigz -6 -c > dumpname.sql.gz
Unarch: pigz -cdk dumpname.sql.gz | mysql -u root -p DATABASE
PBZIP2: Parallel BZip2. grade 6 from 9(my personal favourite)
PBZIP2: 45-60 sec : 231360931 BZ2 - multi-threading: maxim sau cu -p8
Arh: mysqldump -u root -p --opt DATABASE | pbzip2 -c >dumpname.sql.bz2
Unarch: pbzip2 -cdk dumpname.sql.bz2 | mysql -u root -p DATABASE
7Zip, container ZIP, metoda BZip2, grade 3 from 9
7ZA: 75 sec : 233503528 ZIP - multi-threading: maxim
Arch: mysqldump -u root -p --opt DATABASE | 7za a -tbzip2 -mx=3 -mmt=on -sidumpname.sql dumpname.sql.zip
Unarch: 7za e -tbzip2 -so dumpname.sql.gz | mysql -u root -p DATABASE
7Zip, container ZIP, metoda BZip2, grade 6 from 9
7ZA: 95 sec : 230986947 ZIP - multi-threading: maxim
Arch: mysqldump -u root -p --opt DATABASE | 7za a -tbzip2 -mx=6 -mmt=on -sidumpname.sql dumpname.sql.zip
Unarch: 7za e -tbzip2 -so dumpname.sql.zip | mysql -u root -p DATABASE
GZIP: container GZ, grade 6 from 9
GZIP: 95 sec : 318920792 - multi-threading: none
Arch: mysqldump -u root -p --opt DATABASE | gzip -6 --rsyncable > dumpname.sql.gz
Unarch: gzip -cdk dumpname.sql.gz | mysql -u root -p DATABASE
RAR: container RAR, grade 3 from 5, test -mt16
RAR: 95 sec : 248149972 RAR - multi-threading: auto - maxim 5 cpu
Arch: mysqldump -u root -p --opt DATABASE | rar a -m3 -sidumpname.sql dumpname.sql.rar
Unarch: unrar p -inul dumpname.sql.rar | mysql -u root -p DATABASE
"-si" switch undocumented in manual; may be found in rar.txt (/usr/share/doc/rar/)
RAR: container RAR, grade 4 from 5
RAR: 125 sec : 246759443 RAR - multi-threading: auto - maxim 6 cpu
Arch: mysqldump -u root -p --opt DATABASE | rar a -m4 -sid
Context
StackExchange Database Administrators Q#94359, answer score: 3
Revisions (0)
No revisions yet.