MySQL check duplicate with group by using wildcard?
Problem
+----+---------------+-----+-----------+----------+
| ID | NAME          | AGE | ADDRESS   | SALARY   |
+----+---------------+-----+-----------+----------+
| 1  | Ramesh Olive  | 32  | Ahmedabad | 2000.00  |
| 2  | Tan Kau       | 25  | Delhi     | 1500.00  |
| 3  | Jason Tan Kau | 25  | Delhi     | 2000.00  |
| 4  | Chaitali      | 25  | Mumbai    | 6500.00  |
| 5  | Hardik        | 27  | Bhopal    | 8500.00  |
| 6  | Hardik Jass   | 27  | Bhopal    | 4500.00  |
| 7  | Muffy John    | 24  | Indore    | 10000.00 |
| 8  | Muffy Lee     | 24  | Indore    | 10000.00 |
+----+---------------+-----+-----------+----------+
In the example above, say the table is named "table_a", and
1) "Tan Kau" is a duplicate of "Jason Tan Kau" and 2) "Hardik" is a duplicate of "Hardik Jass".
How can I write SQL that produces the output shown after the query below?
I think the following query will work, but it is likely to be very slow. Any ideas to improve it?
SELECT A.*, IF(B.ID IS NULL, "", "DUP") AS DUP
FROM table_a A
LEFT JOIN table_a B
ON A.NAME LIKE CONCAT("%", B.NAME, "%") AND A.ID != B.ID
+----+---------------+-----+-----------+----------+-----+
| ID | NAME          | AGE | ADDRESS   | SALARY   | DUP |
+----+---------------+-----+-----------+----------+-----+
| 1  | Ramesh Olive  | 32  | Ahmedabad | 2000.00  |     |
| 2  | Tan Kau       | 25  | Delhi     | 1500.00  | Dup |
| 3  | Jason Tan Kau | 25  | Delhi     | 2000.00  | Dup |
| 4  | Chaitali      | 25  | Mumbai    | 6500.00  |     |
| 5  | Hardik        | 27  | Bhopal    | 8500.00  | Dup |
| 6  | Hardik Jass   | 27  | Bhopal    | 4500.00  | Dup |
| 7  | Muffy John    | 24  | Indore    | 10000.00 |     |
| 8  | Muffy Lee     | 24  | Indore    | 10000.00 |     |
+----+---------------+-----+-----------+----------+-----+
Solution
Your query returns the expected results once you add the reverse condition (the examples here use a table named persons in place of table_a):
SELECT a.*, IF(b.ID IS NULL, "", "DUP") AS DUP
FROM persons a
LEFT JOIN persons b
ON a.ID <> b.ID
AND (a.Name LIKE CONCAT("%", b.Name, "%") OR b.Name LIKE CONCAT("%", a.Name, "%"))
ORDER BY a.ID;
I don't know if it will be faster, but another way to do it is to use INSTR:
SELECT a.*, IF(b.ID IS NULL, "", "DUP") AS DUP
FROM persons a
LEFT JOIN persons b
ON a.ID <> b.ID
AND (INSTR(a.Name, b.Name) > 0 OR INSTR(b.Name, a.Name) > 0)
ORDER BY a.ID;
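One caveat with both LEFT JOIN queries: if a name matches more than one other row, that row appears once per match. A possible variation, offered only as a sketch, moves the comparison into an EXISTS subquery so that each row is returned exactly once. It assumes the same persons table and the ID and Name columns used above. It has not been benchmarked, so whether it is any faster is an open question.

```sql
-- Sketch: flag each row once, however many other rows its name matches.
-- Assumes the persons table with ID and Name columns from the queries above.
SELECT a.*,
       CASE
           WHEN EXISTS (
               SELECT 1
               FROM persons b
               WHERE b.ID <> a.ID
                 -- match in either direction: a contains b, or b contains a
                 AND (INSTR(a.Name, b.Name) > 0 OR INSTR(b.Name, a.Name) > 0)
           ) THEN 'DUP'
           ELSE ''
       END AS DUP
FROM persons a
ORDER BY a.ID;
```

On the sample data this should flag IDs 2, 3, 5 and 6. It still compares each row against the other rows, and a substring test such as INSTR or LIKE with a leading "%" cannot use an ordinary index on Name, so the cost is likely to stay roughly quadratic in the number of rows.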
Context
StackExchange Database Administrators Q#28647, answer score: 3