HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Except duplicates from duplicates based on columns

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
columnsbasedfromexceptduplicates

Problem

Here is what I am trying to do,

-
Get duplicates based on 2 columns (let say returns 500 rows)

-
Get duplicates based on above 2 columns + another column (let say returns 100 rows)

Now I want to get remaining 400 rows. In simple words, I want all duplicates where there are not duplicates because of columnC...

-- get duplicates based on ColumnA, ColumnB
SELECT '-'
    ,ColumnA
    ,ColumnB
    ,ColumnC
    ,COUNT(*)
FROM MainTable
     ...SOME joins(INNER AND left)
WHERE ColumnA IS NOT NULL
GROUP BY ColumnA
    ,ColumnB
    ,ColumnC
HAVING COUNT(*) > 1

EXCEPT

-- get duplicates based on ColumnA, ColumnB, ColumnC
SELECT '-'
    ,ColumnA
    ,ColumnB
    ,ColumnC
    ,COUNT(*)
FROM MainTable
     ...SOME joins(INNER AND left)
WHERE ColumnA IS NOT NULL
GROUP BY ColumnA
    ,ColumnB
    ,ColumnC
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC


I am just not able to complete this query :(

Solution

Using window functions will likely be simpler here:

DB<>Fiddle

WITH cte AS
( 
    SELECT 
        ColumnA,
        ColumnB,
        ColumnC,
        COUNT(*) OVER (PARTITION BY ColumnA, ColumnB)
          AS count_ab,
        COUNT(*) OVER (PARTITION BY ColumnA, ColumnB, ColumnC)
          AS count_abc
    FROM MainTable
         ...SOME joins(INNER AND left)
    WHERE ColumnA IS NOT NULL
)
SELECT
    ColumnA,
    ColumnB,
    ColumnC,
    count_ab
FROM 
    cte
WHERE
    count_ab > 1
  AND
    count_abc = 1 ;

Code Snippets

WITH cte AS
( 
    SELECT 
        ColumnA,
        ColumnB,
        ColumnC,
        COUNT(*) OVER (PARTITION BY ColumnA, ColumnB)
          AS count_ab,
        COUNT(*) OVER (PARTITION BY ColumnA, ColumnB, ColumnC)
          AS count_abc
    FROM MainTable
         ...SOME joins(INNER AND left)
    WHERE ColumnA IS NOT NULL
)
SELECT
    ColumnA,
    ColumnB,
    ColumnC,
    count_ab
FROM 
    cte
WHERE
    count_ab > 1
  AND
    count_abc = 1 ;

Context

StackExchange Database Administrators Q#241015, answer score: 8

Revisions (0)

No revisions yet.