patternsqlMinor
What is the "last" data that get's backed up, when running pg_dump on a live server?
Viewed 0 times
lastthepg_dumpwhatlivegetthatrunningbackedwhen
Problem
I'm wondering: If I start doing a pg_dump of a very large database (it would take hours), that is still running actively receiving writes, what is then the last data that goes into the backup? Is it:
Bonus question: If I'm trying to dump a database as part of trying to rescue it from corrupt data, will it then make any difference whether I use the
- The data as-it-was, at the point in time where the pg_dump command was initiated.
- The last changes that it encountered at some point where it was dumping that individual record.
- Something else.
Bonus question: If I'm trying to dump a database as part of trying to rescue it from corrupt data, will it then make any difference whether I use the
directory format or the custom format?Solution
It's the data at the start of the command for an entire database. According to the manpage:
It makes consistent backups even if the database is being used
concurrently. pg_dump does not block other users accessing the
database (readers or writers).
and in SQL Dump:
Dumps created by pg_dump are internally consistent, meaning, the dump
represents a snapshot of the database at the time pg_dump began
running
Dumping in parallel (
For a consistent backup, the database server needs to support
synchronized snapshots, a feature that was introduced in PostgreSQL
9.2
I don't think the output format makes any difference in the rescue operation. Note that for a parallel dump,
It makes consistent backups even if the database is being used
concurrently. pg_dump does not block other users accessing the
database (readers or writers).
and in SQL Dump:
Dumps created by pg_dump are internally consistent, meaning, the dump
represents a snapshot of the database at the time pg_dump began
running
Dumping in parallel (
--jobs) may be problematic with changing data, but only when targeting less recent versions:For a consistent backup, the database server needs to support
synchronized snapshots, a feature that was introduced in PostgreSQL
9.2
I don't think the output format makes any difference in the rescue operation. Note that for a parallel dump,
directory is the only possible format.Context
StackExchange Database Administrators Q#62857, answer score: 2
Revisions (0)
No revisions yet.