HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Checking for Duplicates, Except

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
duplicatescheckingexceptfor

Problem

I have a query that returns records that have duplicate names within a given DBSTATUS. I used to do this with a direct connection to the remote oracle database server with this query. (Names changed to protect the innocent, but otherwise, this is my code.)

SELECT  SUBSTR(NAME, 1, 3) AS CATID, 
    TABLE1.NAME,
    TABLE1.DESC1 AS DESCRIPTION, 
    TABLE1.DBKEY, 
    TABLE1.DBSTATUS, 
    ISSUES.NAME_CT
FROM  (SELECT  NAME, DBSTATUS, COUNT(NAME) AS NAME_CT
       FROM  TABLE1
       WHERE  DBSTATUS IN (1,2)
       GROUP BY NAME, DBSTATUS
       HAVING COUNT(NAME) > 1) ISSUES, 
       TABLE1
WHERE ISSUES.NAME = TABLE1.NAME 
    AND ISSUES.DBSTATUS = TABLE1.DBSTATUS
ORDER BY CATID, NAME, DBKEY


The requirements have changed and now I need to exclude a list of CATIDs from this report, but only when they have a DBSTATUS of 2. This is what I came up with, but I don't much like the duplicated select in the WHERE statement. Since the exclusion table resides on the local server, I have switched to an OPENQUERY instead of a direct connection to the remote server. How can I improve this?

SELECT *
FROM OPENQUERY(REMOTE_ORACLE_SERVER,
'SELECT  SUBSTR(NAME, 1, 3) AS CATID, 
    TABLE1.NAME,
    TABLE1.DESC1 AS DESCRIPTION, 
    TABLE1.DBKEY, 
    TABLE1.DBSTATUS, 
    ISSUES.NAME_CT
FROM  (SELECT  NAME, DBSTATUS, COUNT(NAME) AS NAME_CT
       FROM  TABLE1
       WHERE  DBSTATUS IN (1,2)
       GROUP BY NAME, DBSTATUS
       HAVING COUNT(NAME) > 1) ISSUES, 
       TABLE1
WHERE ISSUES.NAME = TABLE1.NAME 
    AND ISSUES.DBSTATUS = TABLE1.DBSTATUS
ORDER BY CATID, NAME, DBKEY
')drv
WHERE (CATID IN(SELECT CATID FROM RACK_CATIDS) AND DBSTATUS = 1)
OR (CATID NOT IN(SELECT CATID FROM RACK_CATIDS))

Solution

Let me point out some things that I know can be improved so that other answers can focus on my problem.

SELECT *


I shouldn't be selecting all like this. Even if there's not really a performance hit because open query needs to get all of the meta data anyway, this is a bad habit to get into. Selects should be explicit in what we're selecting.

'SELECT  SUBSTR(NAME, 1, 3) AS CATID,


CATID isn't a very good alias. CategoryId would be much better and clearer to someone following me someday.

WHERE  DBSTATUS IN (1,2)


This needs a comment saying what these DBSTATUSes actually represent. Mr. Maintainer would have to go find the lookup table in order to find out. (And this query gives no hint that the lookup table even exists.)

Code Snippets

'SELECT  SUBSTR(NAME, 1, 3) AS CATID,
WHERE  DBSTATUS IN (1,2)

Context

StackExchange Code Review Q#73726, answer score: 6

Revisions (0)

No revisions yet.