Importing large SQL dump with millions of INSERT statements
Problem
I need to import a big .sql file (8.1 GB when unpacked) into PostgreSQL. I tried to use \i /path/to/file.sql but it is way too slow. How does one speed up the import? I need to import this data weekly.

The first 2000 lines can be found here, while the compressed 1 GB dump can be found here:
--
-- PostgreSQL database dump
--
-- Dumped from database version 9.5.3
-- Dumped by pg_dump version 9.5.2
SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET row_security = off;
--
-- Name: rpo; Type: SCHEMA; Schema: -; Owner: -
--

That's the only way I can get the data. The full file has approx. 38,000,000 lines. How can I speed up the import?
Solution
This dump was dumped as individual statements (with pg_dump --inserts):

INSERT INTO esa2010_codes VALUES (11002, 'Národn
INSERT INTO esa2010_codes VALUES (11003, 'Nefina
INSERT INTO esa2010_codes VALUES (12502, 'Národn
INSERT INTO esa2010_codes VALUES (11001, 'Verejn
INSERT INTO esa2010_codes VALUES (12602, 'Národn
INSERT INTO esa2010_codes VALUES (12603, 'Finanč
INSERT INTO esa2010_codes VALUES (12503, 'Ostatn

This is documented as being slow (from man pg_dump):

--inserts
    Dump data as INSERT commands (rather than COPY). This will make restoration very slow; it is mainly useful for making dumps that can be loaded into non-PostgreSQL databases. However, since this option generates a separate command for each row, an error in reloading a row causes only that row to be lost rather than the entire table contents. Note that the restore might fail altogether if you have rearranged column order. The --column-inserts option is safe against column order changes, though even slower.

That's why it's so slow. What you're going to want to do is turn off some of the durability settings, specifically synchronous_commit, though turning off fsync will help too.

You can do this very simply by running the following command before you run your \i file.sql:

SET synchronous_commit TO off;

That will do a lot to speed it up. Don't forget to turn the durability options back on after you're done. I bet it'll finish in a few hours after you've set that. If you need more speed, though, don't hesitate to turn off fsync and full_page_writes on the cluster until you get the data up -- though I wouldn't do it if the DB had data you needed in it, or if it was production. As a last note: if you need the speed and this is a production DB, you can go all out on your own copy and dump it with pg_dump's default options, which you'll be able to load much faster.
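If you do go as far as the cluster-wide settings, one possible shape of the round trip is sketched below (the database name mydb and the dump path are placeholder assumptions; both settings take effect on a config reload, with no server restart needed):

```shell
# Sketch only -- requires superuser, and assumes "mydb" as the target database.
psql -d mydb -c "ALTER SYSTEM SET fsync = off;"
psql -d mydb -c "ALTER SYSTEM SET full_page_writes = off;"
psql -d mydb -c "SELECT pg_reload_conf();"

psql -d mydb -f /path/to/file.sql   # the import itself

# Restore durability as soon as the load finishes:
psql -d mydb -c "ALTER SYSTEM RESET fsync;"
psql -d mydb -c "ALTER SYSTEM RESET full_page_writes;"
psql -d mydb -c "SELECT pg_reload_conf();"
```

As the answer warns, running with fsync off means a crash mid-load can corrupt the cluster, so this is only reasonable on a throwaway copy you can rebuild from the dump.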
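As a concrete sketch of the session-level fix (file names here are placeholders, not the questioner's real paths), you can prepend the SET to the dump once, so every weekly import starts with durability already relaxed:

```shell
# Stand-in for the real 8.1 GB dump (placeholder content, for illustration only):
printf "INSERT INTO esa2010_codes VALUES (11002, 'x');\n" > file.sql

# Prepend the session-level setting so the import begins with
# synchronous_commit already disabled:
printf 'SET synchronous_commit TO off;\n' | cat - file.sql > file_fast.sql

head -n 1 file_fast.sql   # first line is now the SET command
```

Then run \i /path/to/file_fast.sql as usual. Since SET is session-local, it vanishes when the psql session ends, so this particular setting needs no explicit reset afterwards.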
Context
StackExchange Database Administrators Q#216183, answer score: 15