HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

PostgreSQL: Corrupt primary key, inconsistent table

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlprimaryinconsistentcorrupttablekey

Problem

While recovering from a cloud failure, I found that some tables on a PostgreSQL database are behaving strangely. These tables are indexed using a primary key, but a pg_dump yielded duplicate fields, failing a pg_restore on a backup server.

I have tried to REINDEX:

REINDEX INDEX rank_details_pkey;
ERROR:  could not create unique index "rank_details_pkey"
DETAIL:  Table contains duplicated values.


The index is defined as:


Indexes:
    "rank_details_pkey" PRIMARY KEY, btree (user_id)


And, oddly,

SELECT user_id, COUNT(*) FROM  GROUP BY 1 HAVING COUNT(*) > 1;
 user_id | count 
---------+-------
(0 rows)


To conclude - I have duplicate values in my table which can not be found or cleared.

Any ideas how to fix this? This is a production server, so all fixes should be done without affecting service.

Solution

There are various ways this can happen in Oracle - I'm not sure about postgres, but I think I would call this an "integrity violation" rather than "corruption"

Perhaps you can do one of the things suggested here, ie set enable_indexscan = off or

begin;
drop index rank_details_pkey;
select user_id, count(*) from rank_details group by user_id having count(*) > 1;
rollback;


But "there are likely some locking issues with this, so be careful with it in production"

The idea is to force the query to scan the table rather than just the index (which does not have the duplicates). You may also, and more simply, be able to acheive the same by:

select user_id, f(), count(*)
from rank_details
group by user_id, f()
having count(*) > 1


where f() returns a constant, which may trick the planner into a table scan.

Code Snippets

begin;
drop index rank_details_pkey;
select user_id, count(*) from rank_details group by user_id having count(*) > 1;
rollback;
select user_id, f(<some other column>), count(*)
from rank_details
group by user_id, f(<some other column>)
having count(*) > 1

Context

StackExchange Database Administrators Q#4755, answer score: 5

Revisions (0)

No revisions yet.