patternsqlpostgresqlModerate
pg_dump strategies for large databases without downtime
Viewed 0 times
pg_dumppg_restoreparallel backupdirectory formatlarge database backupcustom format
Problem
pg_dump of a multi-terabyte database takes so long that the window for a consistent backup conflicts with operational requirements, and restores from plain SQL format are unacceptably slow.
Solution
Use parallel custom-format dumps and parallel restores:
# Custom format (compressed, supports parallel restore):
pg_dump -Fc -Z 5 -f mydb.dump mydb
# Parallel dump to directory format (fastest for large DBs):
pg_dump -Fd -j 8 -f mydb_dump_dir mydb
# Parallel restore:
pg_restore -Fd -j 8 -d mydb mydb_dump_dir
# Dump only schema (fast, useful for migrations):
pg_dump --schema-only -f schema.sql mydb
# Exclude large tables (dump data separately):
pg_dump --exclude-table=audit_log -f partial.dump mydb
pg_dump -t audit_log --data-only -Fc -f audit_log.dump mydb
# Custom format (compressed, supports parallel restore):
pg_dump -Fc -Z 5 -f mydb.dump mydb
# Parallel dump to directory format (fastest for large DBs):
pg_dump -Fd -j 8 -f mydb_dump_dir mydb
# Parallel restore:
pg_restore -Fd -j 8 -d mydb mydb_dump_dir
# Dump only schema (fast, useful for migrations):
pg_dump --schema-only -f schema.sql mydb
# Exclude large tables (dump data separately):
pg_dump --exclude-table=audit_log -f partial.dump mydb
pg_dump -t audit_log --data-only -Fc -f audit_log.dump mydb
Why
The directory format (-Fd) splits each table into a separate file, enabling pg_dump and pg_restore to use multiple parallel workers (-j) that process different tables simultaneously, dramatically reducing total time.
Gotchas
- pg_dump takes a consistent snapshot at start time using REPEATABLE READ; long-running transactions on the source can cause it to wait for locks
- Parallel dump (-j) requires PostgreSQL 9.3+; parallel restore requires pg_restore from 9.3+
- Do not use -j with -Fc (custom single-file format); only -Fd (directory) supports parallel dump
- pg_dumpall is needed to dump roles and tablespaces; pg_dump does not include them
Code Snippets
Parallel pg_dump and pg_restore commands
# Parallel dump with compression:
pg_dump -Fd -j 8 -Z 5 -f /backups/mydb_$(date +%Y%m%d) mydb
# Parallel restore with pre/post data separation:
pg_restore -Fd -j 8 --no-owner --no-privileges -d mydb /backups/mydb_20240315Context
Production databases too large for a simple pg_dump plain text backup within operational windows
Revisions (0)
No revisions yet.