
PostgreSQL DELETE FROM fails with `Error: attempted to delete invisible tuple`

Submitted by: @import:stackexchange-dba

Problem

The error

Trying to delete tuples containing invalid timestamps with

DELETE FROM comments WHERE date > '1 Jan 9999' OR date < '1 Jan 2000' OR date_found > '1 Jan 9999' OR date_found < '1 Jan 2000';

ends in

ERROR:  attempted to delete invisible tuple


There is a mailing list thread from 2009 discussing the exact same error message, in which the OP eventually got it fixed, but I can find no explanation of how he did it or of what might have led to this error.

I am stuck: Google turns up almost nothing, and my knowledge of PostgreSQL internals is limited.

What led to corruption

I had a PostgreSQL 9.5.5 server (~4 TB of data, all default settings except for raised memory limits) running on Debian 8 when the OS kernel panicked, probably while rebuilding /dev/md1, where swap was located. Prior to that, PostgreSQL had eaten up almost all of the disk space with a 400 GB log file. The OS never booted again, but the disk checks came back OK, so I booted from a LiveCD and backed up each block device to an image, just in case. I successfully rebuilt the / filesystem from /dev/md2, fsck showed a clean filesystem, and I backed up the PGDATA folder to an external HDD.

What I did to attempt recovery

After I had formatted the md devices and reinstalled the OS along with a fresh postgresql-9.5, I stopped the PostgreSQL server, moved the PGDATA folder into place and chowned it to the postgres user, and started the server. Everything seemed fine; there were no errors.

As soon as I started pg_dumpall, it died with

Error message from server: ERROR:  timestamp out of range


I naturally tried to delete the offending tuples, only to end up with the same `attempted to delete invisible tuple` error over and over again.

Things I tried

First off, the DELETE queries failed because of damaged pages, so I set the following options:

zero_damaged_pages = on
ignore_system_indexes = on
enable_indexscan = off
enable_bitmapscan = off
enable_indexonlyscan = off


Now I've noticed that when I run the same queries again, the server zeros out the same pages over and over. I'm not sure what that means; presumably zero_damaged_pages only zeroes the damaged page image in memory rather than rewriting it on disk, so each fresh scan reads the damaged page again.

Solution

Well, I managed to automate the recovery process of SELECT and INSERT INTO, skipping failing ranges and waiting for the server to come back when it crashes. I first wrote it in Node.js; it ripped the undamaged data from comments and is still running.

Yesterday I decided to try Go, and here is a repo with the Go code: https://github.com/kaivi/pg_ripper I will update it soon so that it really works around individual bad tuples, rather than giving up on the whole range containing one.

Context

StackExchange Database Administrators Q#157533, answer score: 2
