patternsqlMinor
pg_dump does not finish
Viewed 0 times
pg_dumpfinishdoesnot
Problem
I am trying to backup our postgres database (8.4.17), which is about 25GB in size.
unfortunately, the backup stops after about 600MB in, and does not continue. It always stops when trying to dump a specific table (fb_crawler_event). I was able to successfully dump all the other tables using the
server log after a restart of the DB (looks ok to me):
verbose output from pg_dump (nothing suspicious, but the obvious table, which doesn't let the database continue to dump)
And then it just halts.
output from the pg_locks table, which seems to be huge to me (294 entries all together):
```
locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted
------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------
relation | 16384 | 2674 | | | | | | | | 1/23
pg_dump database_name > database_db_dump_2014-05-05.sqlunfortunately, the backup stops after about 600MB in, and does not continue. It always stops when trying to dump a specific table (fb_crawler_event). I was able to successfully dump all the other tables using the
--table swtich. I have already stopped any other process that could interact with the database.server log after a restart of the DB (looks ok to me):
2014-05-05 14:34:46 CEST LOG: all server processes terminated; reinitializing
2014-05-05 14:34:46 CEST LOG: database system was interrupted; last known up at 2014-05-05 14:32:50 CEST
2014-05-05 14:34:46 CEST LOG: database system was not properly shut down; automatic recovery in progress
2014-05-05 14:34:46 CEST LOG: record with zero length at 1A1/AD6A78C0
2014-05-05 14:34:46 CEST LOG: redo is not required
2014-05-05 14:34:47 CEST LOG: database system is ready to accept connections
2014-05-05 14:34:47 CEST LOG: autovacuum launcher startedverbose output from pg_dump (nothing suspicious, but the obvious table, which doesn't let the database continue to dump)
... lots of lines ...
pg_dump: restoring data for table "django_site"
pg_dump: dumping contents of table django_site
pg_dump: restoring data for table "fb_crawler_event"
pg_dump: dumping contents of table fb_crawler_event
And then it just halts.
output from the pg_locks table, which seems to be huge to me (294 entries all together):
```
locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted
------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------
relation | 16384 | 2674 | | | | | | | | 1/23
Solution
I was able to figure out which file in the database storage was the culprit, by copying all the files to /dev/null.
(The path to your DB files might differ)
The currupt file could not be copied, but there was nothing I could do to change that. (so it was most probably a FS error or hardware failure)
So I restarted the server with a forced fsck (e.g.
After reboot I could finally access the inaccessable table again, but I am not sure, if the data contained is corrupted or not. Anyway, I do have a backup now, which I can disect to find out, and my server can go back online for now...
I recommend reading the wiki of postgres about corruption and the slides of this FOSDEM presentation for some more info on DB corruption
cp -vR /usr/lib/postgresql/8.4 /dev/null(The path to your DB files might differ)
The currupt file could not be copied, but there was nothing I could do to change that. (so it was most probably a FS error or hardware failure)
So I restarted the server with a forced fsck (e.g.
touch /forcefsck), to make sure the FS would do the best to fix itself. This might not be the way you'll want to go, since it is possible to have a total data loss afterwards, but I was able to preserve the most precious data already beforehand, so I took the risk.After reboot I could finally access the inaccessable table again, but I am not sure, if the data contained is corrupted or not. Anyway, I do have a backup now, which I can disect to find out, and my server can go back online for now...
I recommend reading the wiki of postgres about corruption and the slides of this FOSDEM presentation for some more info on DB corruption
Code Snippets
cp -vR /usr/lib/postgresql/8.4 /dev/nullContext
StackExchange Database Administrators Q#64481, answer score: 4
Revisions (0)
No revisions yet.