HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMajor

Easily show rows that are different between two tables or queries

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
showrowstablesaredifferenttwobetweenthatquerieseasily

Problem

Imagine you have two different tables/queries that are supposed to have/return identical data. You want to verify this. What's an easy way to show any unmatched rows from each table just like the example below, comparing every column? Assume there are 30 columns in the tables, many of which are NULLable.

When there is no PK or there could be duplicates per PK, joining on just PK columns isn't enough, and it would be a disaster to have to do a FULL JOIN with 30 join conditions that properly handle NULLs, plus a nasty WHERE condition to exclude the matched rows.

Usually it is when I'm writing a new query against unscrubbed or not-fully-understood data that the problem is worst and the likelihood of a PK being logically available is extremely low. I cook up two different ways to solve the problem and then compare their results, the differences highlighting special cases in the data that I was unaware of.

The result needs to look like this:

Which   Col1   Col2   Col3   ... Col30
------  ------ ------ ------     ------
TableA  Cat    27     86               -- mismatch
TableB  Cat    27     105              -- mismatch
TableB  Cat    27     87               -- mismatch 2
TableA  Cat    128    92               -- no corresponding row
TableB  Lizard 83     NULL             -- no corresponding row


If [Col1, Col2] do happen to be a composite key and we order by them in our final result, then we can easily see that A and B have one row different that should be the same, and each has one row that is not in the other.

In the above example, seeing the first row twice is not desirable.

Here's DDL and DML to set up sample tables and data:

```
CREATE TABLE dbo.TableA (
Col1 varchar(10),
Col2 int,
Col3 int,
Col4 varchar(10),
Col5 varchar(10),
Col6 varchar(10),
Col7 varchar(10),
Col8 varchar(10),
Col9 varchar(10),
Col10 varchar(10),
Col11 varchar(10),
Col12 varchar(10),
Col13 varchar(10),
Col14 varchar(10),
Col15 varchar(

Solution

This can be handled using EXCEPT and/or INTERSECT.
http://msdn.microsoft.com/en-us/library/ms188055.aspx

First find all records that are in table1 that are not in table 2, then find all records that are in table 2 that are not in table one.

SELECT * FROM table1
EXCEPT
SELECT * FROM table2

UNION

SELECT * FROM table2
EXCEPT
SELECT * FROM table1


There is undoubtedly a more efficient way to do this, but it is the first "quick and dirty" solution off the top of my head. Also, I do not recommend using a * wildcard, but it suits here for brevity.

Alternately, you could use an INTERSECT operator and exclude all the results from it.

Code Snippets

SELECT * FROM table1
EXCEPT
SELECT * FROM table2

UNION

SELECT * FROM table2
EXCEPT
SELECT * FROM table1

Context

StackExchange Database Administrators Q#12580, answer score: 23

Revisions (0)

No revisions yet.