Inserting a large quantity of data into a MySQL table
Problem
I need to insert about 400 000 000 rows of test data into an InnoDB table.
Table is quite simple:
CREATE TABLE IF NOT EXISTS `tgt` (
`id_1` int(10) unsigned NOT NULL,
`id_2` int(10) unsigned NOT NULL,
PRIMARY KEY (`id_1`,`id_2`),
KEY `id_1` (`id_1`),
KEY `id_2` (`id_2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Data is coming from another table:
CREATE TABLE IF NOT EXISTS `src` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
# skip some columns
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

I need to insert into tgt all possible combinations of different ids from table src, so the naive method is to execute:
INSERT INTO tgt (id_1, id_2) (SELECT a.id, b.id FROM `src` a INNER JOIN src b WHERE a.id <> b.id)

And, of course, it takes hours or even days... So I am searching for a faster method of doing that.
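For scale: n distinct ids in src produce n * (n - 1) ordered (id_1, id_2) pairs. A quick sanity check (the 20 000-row source size below is an assumption, chosen because it lands near the 400M figure):

```shell
# Hypothetical source size: n = 20000 ids in `src` yields
# n * (n - 1) ordered pairs satisfying a.id <> b.id.
n=20000
echo $((n * (n - 1)))   # prints 399980000
```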
Solution
- Dump the source data into a file:
  SELECT a.id, b.id INTO OUTFILE '/tmp/result.txt' FROM `src` a INNER JOIN src b WHERE a.id <> b.id
- Tune MySQL for large writes:
  innodb_flush_log_at_trx_commit=2
  innodb_log_file_size=2G

  See how to change the log size.
- Start a pipe to split result.txt into chunks (pt-fifo-split is part of Percona Toolkit):
  pt-fifo-split --lines 1000000 /tmp/result.txt
- Load the dump:
  while [ -e /tmp/pt-fifo-split ]; do mysql -u <user> -p<pass> -e "LOAD DATA INFILE '/tmp/pt-fifo-split' INTO TABLE tgt" <tgt db>; done

  (If you have a problem with "File '/tmp/pt-fifo-split' not found", try this solution.)
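If pt-fifo-split is not available, the same chunk-and-load idea can be sketched with coreutils `split` -- an alternative, not the answer's method (pt-fifo-split streams through a single FIFO rather than writing chunk files). The demo file, chunk size, and `resdemo` prefix are made up so the chunking step runs anywhere; the commented mysql loop mirrors the one above with placeholder credentials.

```shell
# Self-contained demo of splitting a dump into fixed-size chunks.
seq 1 10 > /tmp/resdemo.txt                       # stand-in for the OUTFILE dump
split -l 3 /tmp/resdemo.txt /tmp/resdemo_chunk_   # 3-line chunks: _aa _ab _ac _ad
ls /tmp/resdemo_chunk_* | wc -l                   # prints 4
# Each chunk would then feed one LOAD DATA INFILE, e.g.:
# for f in /tmp/resdemo_chunk_*; do
#   mysql -u <user> -p<pass> -e "LOAD DATA INFILE '$f' INTO TABLE tgt" <tgt db>
# done
```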
Context
StackExchange Database Administrators Q#65932, answer score: 4