HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMajor

Checking whether two tables have identical content in PostgreSQL

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqltablescheckingidenticaltwocontentwhetherhave

Problem

This has already been asked on Stack Overflow, but only for MySQL. I'm using PostgreSQL. Unfortunately (and surprisingly) PostgreSQL does not seem to have something like CHECKSUM table.

A PostgreSQL solution would be fine, but a generic one would be better. I found http://www.besttechtools.com/articles/article/sql-query-to-check-two-tables-have-identical-data, but I don't understand the logic used.

Background: I re-wrote some database generating code, so I need to check whether the old and new code produce identical results.

Solution

You can use the EXCEPT operator. For example, if the tables have identical structure, the following will return all rows that are in one table but not the other (so 0 rows if the tables have identical data):

(TABLE a EXCEPT TABLE b)
UNION ALL
(TABLE b EXCEPT TABLE a) ;


Or with EXISTS to return just a boolean value or a string with one of the 2 possible results:

SELECT CASE WHEN EXISTS (TABLE a EXCEPT TABLE b)
              OR EXISTS (TABLE b EXCEPT TABLE a)
            THEN 'different'
            ELSE 'same'
       END AS result ;


Tested at SQLfiddle

Also not the that EXCEPT removes duplicates (that should not be a worry if your tables have some PRIMARY KEY or UNIQUE constraint but it may be if you are comparing results of arbitrary queries that can potentially produce duplicate rows).

Another thing that EXCEPT keyword does is that it treats NULL values as identical, so if table A has a row with (1,2,NULL) and table B has a row with (1,2,NULL), the first query will not show these rows and the second query will return 'same' if the two tables have no other row.

If you want to count such rows as different, you could use a variation on gsiems' FULL JOIN answer, to get all the (different) rows:

SELECT *
FROM a NATURAL FULL JOIN b
WHERE a.some_not_null_column IS NULL 
   OR b.some_not_null_column IS NULL ;


and to get a yes/no answer:

SELECT CASE WHEN EXISTS
            ( SELECT *
              FROM a NATURAL FULL JOIN b
              WHERE a.some_not_null_column IS NULL 
                 OR b.some_not_null_column IS NULL
            )
            THEN 'different'
            ELSE 'same'
       END AS result ;


If all the columns of the two tables are not nullable, the two approaches will give identical answers.

Code Snippets

(TABLE a EXCEPT TABLE b)
UNION ALL
(TABLE b EXCEPT TABLE a) ;
SELECT CASE WHEN EXISTS (TABLE a EXCEPT TABLE b)
              OR EXISTS (TABLE b EXCEPT TABLE a)
            THEN 'different'
            ELSE 'same'
       END AS result ;
SELECT *
FROM a NATURAL FULL JOIN b
WHERE a.some_not_null_column IS NULL 
   OR b.some_not_null_column IS NULL ;
SELECT CASE WHEN EXISTS
            ( SELECT *
              FROM a NATURAL FULL JOIN b
              WHERE a.some_not_null_column IS NULL 
                 OR b.some_not_null_column IS NULL
            )
            THEN 'different'
            ELSE 'same'
       END AS result ;

Context

StackExchange Database Administrators Q#72641, answer score: 45

Revisions (0)

No revisions yet.