HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to write this self join based on three columns

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thisthreecolumnsjoinwritebasedhowself

Problem

Hello there I have a following table

------------------------------------------
| id | language | parentid | no_daughter |
------------------------------------------
| 1  |     1    |    0     |      2      |
------------------------------------------
| 1  |     1    |    0     |      2      |
------------------------------------------
| 2  |     1    |    1     |      1      |
------------------------------------------
| 2  |     2    |    1     |      1      |
------------------------------------------
| 3  |     1    |    1     |      0      |
------------------------------------------
| 3  |     2    |    1     |      0      |    
------------------------------------------
| 4  |     1    |    2     |      0      |
------------------------------------------
| 4  |     2    |    2     |      0      |
------------------------------------------
| 5  |     1    |    2     |      0      |
------------------------------------------
| 5  |     2    |    2     |      1      |
-----------------------------------------
| 5  |     1    |    4     |      1      |
------------------------------------------
| 5  |     2    |    4     |      1      |
------------------------------------------


Scenario

Every record has more than one rows in table with different language ids. parentid tells who is the parent of this record. no_daughter columns tells against each record that how many child one record has. Means in Ideal scenario If no_daughter has value 2 of id = 1 , it means 1 should be parentid of 2 records in same table. But If a record has more than one exitance with respect to language, it will be considered as one record.

My Problem

I need to find out those records where no_daughter value is not correct. It means if no_daughter is 2, there must be two records whoes parentid has that id. In above case record with id = 1 is valid. But record having id = 2 is not valid because the no_daughter = 1 but actual daughter of this record is 2

Solution

The problem with your provided query is that you're doing an INNER JOIN. Because there are no rows with a parentid of 5, the subquery would return NULL on those rows and be excluded.

What you want is a LEFT JOIN, something like this (sqlfiddle):

SELECT t.*, parentchildrelation.parentid, childs
FROM tbl_info t
LEFT JOIN 
  (SELECT 
    parentid,
    COUNT(DISTINCT id) AS childs 
  FROM
    tbl_info  
  GROUP BY parentid) AS parentchildrelation 
  ON t.id = parentchildrelation.parentid 
WHERE t.no_daughter != parentchildrelation.childs 
   OR (t.no_daughter > 0 AND parentchildrelation.parentid IS NULL)
GROUP BY t.id, t.parentid;


I included more columns, because where id=5 and parentid=2 has two different values for no_daughter. You should be able to adjust those as needed.

Code Snippets

SELECT t.*, parentchildrelation.parentid, childs
FROM tbl_info t
LEFT JOIN 
  (SELECT 
    parentid,
    COUNT(DISTINCT id) AS childs 
  FROM
    tbl_info  
  GROUP BY parentid) AS parentchildrelation 
  ON t.id = parentchildrelation.parentid 
WHERE t.no_daughter != parentchildrelation.childs 
   OR (t.no_daughter > 0 AND parentchildrelation.parentid IS NULL)
GROUP BY t.id, t.parentid;

Context

StackExchange Database Administrators Q#37521, answer score: 2

Revisions (0)

No revisions yet.