patternsqlMinor
MySQL: unique combination of two columns
Viewed 0 times
combinationuniquecolumnsmysqltwo
Problem
This is my
I have about 200k rows like this.
I want to delete duplicates. For example if
I want unique combinations across the two columns. This means if (1, 2) is there then (2, 1) should not be there and should be deleted.
Then I can make a friend list query like this
My question is how to delete these duplicate rows?
Please also help me understand how to find the mutual friends of 2 users (userid of them may reside in either userid column or friendid in friends table).
friends tableuserid(int) | friendid(int) | confirmed(tiny int)
--------------------
1 | 2 | 1
2 | 1 | 1
1 | 3 | 1
3 | 1 | 1
1 | 5 | 0
2 | 4 | 1
4 | 2 | 1
4 | 1 | 1
1 | 4 | 1I have about 200k rows like this.
I want to delete duplicates. For example if
userid=1 and friendid=2 then I don't want the row userid=2, friendid=1.I want unique combinations across the two columns. This means if (1, 2) is there then (2, 1) should not be there and should be deleted.
Then I can make a friend list query like this
select * from friends where (userid=1 or friendid=1) and confirmed=1My question is how to delete these duplicate rows?
Please also help me understand how to find the mutual friends of 2 users (userid of them may reside in either userid column or friendid in friends table).
Solution
You can use a non-standard MySQL construction and delete from a self-join:
when you have cleaned up the table, next step is to prevent this from happening again. The obvious choice would be a CHECK constraint guaranteeing that userid < friendid:
and a unique constraint on these two columns:
Unfortunately CHECK constraint is not (AFAIK) supported by MySQL. You can however mimic this with generated columns:
Now we can add a unique constraint on the generated columns:
To find common friends among for example 3 and 5 you can use a query like:
delete f1.*
from friends f1
join friends f2
on f1.userid = f2.friendid
and f1.friendid = f2.userid
and f1.userid > f2.userid;when you have cleaned up the table, next step is to prevent this from happening again. The obvious choice would be a CHECK constraint guaranteeing that userid < friendid:
alter table friends add constraint ... check(userid < friendid)and a unique constraint on these two columns:
alter table friends add constraint ... unique (userid, friendid);Unfortunately CHECK constraint is not (AFAIK) supported by MySQL. You can however mimic this with generated columns:
alter table friends
add leastid int as (least(userid, friendid))
persistent after friendid;
alter table friends
add greatestid int as (greatest(userid, friendid))
persistent after friendid;Now we can add a unique constraint on the generated columns:
alter table friends add constraint ak1_friends
unique (leastid, greatestid);
insert into friends (userid, friendid) values (1,2);
ERROR 1062 (23000): Duplicate entry '1-2' for key 'ak1_friends'
insert into friends (userid, friendid) values (2,1);
ERROR 1062 (23000): Duplicate entry '1-2' for key 'ak1_friends'To find common friends among for example 3 and 5 you can use a query like:
select f1.userid, f1.friendid, f2.userid, f2.friendid
from friends f1
join friends f2
on f1.userid in (f2.userid, f2.friendid)
or f1.friendid in (f2.userid, f2.friendid)
where 5 in (f1.userid, f1.friendid)
and 3 in (f2.userid, f2.friendid);Code Snippets
delete f1.*
from friends f1
join friends f2
on f1.userid = f2.friendid
and f1.friendid = f2.userid
and f1.userid > f2.userid;alter table friends add constraint ... check(userid < friendid)alter table friends add constraint ... unique (userid, friendid);alter table friends
add leastid int as (least(userid, friendid))
persistent after friendid;
alter table friends
add greatestid int as (greatest(userid, friendid))
persistent after friendid;alter table friends add constraint ak1_friends
unique (leastid, greatestid);
insert into friends (userid, friendid) values (1,2);
ERROR 1062 (23000): Duplicate entry '1-2' for key 'ak1_friends'
insert into friends (userid, friendid) values (2,1);
ERROR 1062 (23000): Duplicate entry '1-2' for key 'ak1_friends'Context
StackExchange Database Administrators Q#187328, answer score: 4
Revisions (0)
No revisions yet.