patternsqlMinor
High school social network query for friends in common
Viewed 0 times
socialqueryhighschoolcommonfornetworkfriends
Problem
This is sourced from the Stanford Coursera self study DB class SQL quizzes:
Students at your hometown high school have decided to organize their
social network using databases. So far, they have collected
information about sixteen students in four grades, 9-12. Here's the
schema:
English: There is a high school student with unique ID and a given
first name in a certain grade.
English: The student with ID1 is friends with the student with ID2.
Friendship is mutual, so if (123, 456) is in the Friend table, so is
(456, 123).
English: The student with ID1 likes the student with ID2. Liking
someone is not necessarily mutual, so if (123, 456) is in the Likes
table, there is no guarantee that (456, 123) is also present.
Database
here
Prompt: For each student A who likes a student B where the two are not friends, find if they have a friend C in common (who can introduce
them!). For all such trios, return the name and grade of A, B, and C.
I would appreciate a brief code review of this working query:
Students at your hometown high school have decided to organize their
social network using databases. So far, they have collected
information about sixteen students in four grades, 9-12. Here's the
schema:
Highschooler (ID, name, grade)English: There is a high school student with unique ID and a given
first name in a certain grade.
Friend (ID1, ID2)English: The student with ID1 is friends with the student with ID2.
Friendship is mutual, so if (123, 456) is in the Friend table, so is
(456, 123).
Likes (ID1, ID2)English: The student with ID1 likes the student with ID2. Liking
someone is not necessarily mutual, so if (123, 456) is in the Likes
table, there is no guarantee that (456, 123) is also present.
Database
here
Prompt: For each student A who likes a student B where the two are not friends, find if they have a friend C in common (who can introduce
them!). For all such trios, return the name and grade of A, B, and C.
I would appreciate a brief code review of this working query:
select distinct a.name, a.grade, b.name, b.grade, c.name, c.grade
from highschooler a, highschooler b, highschooler c, likes, friend
where a.id=likes.id1
and b.id=likes.id2
and a.id=friend.id1
and b.id not in (select id2 from friend where a.id=id1)
and b.id in (select id1 from friend where id2=c.id)
and a.id in (select id1 from friend where id2=c.id)Solution
Generic issues
Query formulation
-
The a-b and b-c friendship relationships should be symmetrical:
$$\newcommand{rel}[2]{\overset{\mathtt{#1}}{\underset{\mathtt{#2}}{\longleftrightarrow}}}
\fbox{a} \rel{Friend}{ac} \fbox{c} \rel{Friend}{bc} \fbox{b}
$$
Therefore, I would expect two
Here's one way to write it:
However, joins would be better written using the
- The outline of the query should be made visible at a glance using indentation. This
SELECThas aFROMclause and aWHEREclause. Within theWHEREclause, there are multipleANDed conditions.
- It is conventional to use
ALL CAPSfor SQL keywords (though some people dislike that convention and disregard it).
- All selected columns should be given their own distinct names. Otherwise, MySQL will automatically name them, and in this case, you'll get three columns each called
nameandgrade.
- Use the same capitalization for table and column names as given in the question, since MySQL identifiers are case-sensitive on some platforms.
Query formulation
SELECT DISTINCTshould be avoided wherever possible. Deduplicating the results could hurt performance, and is a sign of a "wrong" query. A properly formulated query based on a well designed schema and accurate data should almost never require deduplicating the results.
-
The a-b and b-c friendship relationships should be symmetrical:
$$\newcommand{rel}[2]{\overset{\mathtt{#1}}{\underset{\mathtt{#2}}{\longleftrightarrow}}}
\fbox{a} \rel{Friend}{ac} \fbox{c} \rel{Friend}{bc} \fbox{b}
$$
Therefore, I would expect two
Friend tables included in the join on equal footing.- In fact, I'm not convinced that your query is correct. You have the condition
a.id=friend.id1, but there is no mention offriend.id2at all. Your query merely specifies that a is not friendless.
- It should be possible to formulate the join without so many subselects.
Here's one way to write it:
SELECT a.name AS a_name, a.grade AS a_grade
, b.name AS b_name, b.grade AS b_grade
, c.name AS c_name, c.grade AS c_grade
FROM Highschooler a, Highschooler b, Highschooler c, Likes, Friend ac, Friend bc
WHERE
a.ID = Likes.ID1 AND b.ID = Likes.ID2
AND a.ID = ac.ID1 AND ac.ID2 = bc.ID1 AND b.ID = bc.ID2
AND a.ID = ac.ID1
AND b.ID = ac.ID2
AND c.ID = bc.ID2
AND NOT EXISTS (
SELECT *
FROM Friend ab
WHERE ab.ID1 = a.ID AND ab.ID2 = b.ID
);However, joins would be better written using the
INNER JOIN syntax:SELECT a.name AS a_name, a.grade AS a_grade
, b.name AS b_name, b.grade AS b_grade
, c.name AS c_name, c.grade AS c_grade
FROM Likes
INNER JOIN Friend AS ac
ON Likes.ID1 = ac.ID1
INNER JOIN Friend AS bc
ON Likes.ID2 = bc.ID2
INNER JOIN Highschooler AS a
ON ac.ID1 = a.ID
INNER JOIN Highschooler AS b
ON bc.ID2 = b.ID
INNER JOIN Highschooler AS c
ON ac.ID2 = c.ID
WHERE
ac.ID2 = bc.ID1
AND NOT EXISTS (
SELECT *
FROM Friend AS ab
WHERE ab.ID1 = a.ID AND ab.ID2 = b.ID
);Code Snippets
SELECT a.name AS a_name, a.grade AS a_grade
, b.name AS b_name, b.grade AS b_grade
, c.name AS c_name, c.grade AS c_grade
FROM Highschooler a, Highschooler b, Highschooler c, Likes, Friend ac, Friend bc
WHERE
a.ID = Likes.ID1 AND b.ID = Likes.ID2
AND a.ID = ac.ID1 AND ac.ID2 = bc.ID1 AND b.ID = bc.ID2
AND a.ID = ac.ID1
AND b.ID = ac.ID2
AND c.ID = bc.ID2
AND NOT EXISTS (
SELECT *
FROM Friend ab
WHERE ab.ID1 = a.ID AND ab.ID2 = b.ID
);SELECT a.name AS a_name, a.grade AS a_grade
, b.name AS b_name, b.grade AS b_grade
, c.name AS c_name, c.grade AS c_grade
FROM Likes
INNER JOIN Friend AS ac
ON Likes.ID1 = ac.ID1
INNER JOIN Friend AS bc
ON Likes.ID2 = bc.ID2
INNER JOIN Highschooler AS a
ON ac.ID1 = a.ID
INNER JOIN Highschooler AS b
ON bc.ID2 = b.ID
INNER JOIN Highschooler AS c
ON ac.ID2 = c.ID
WHERE
ac.ID2 = bc.ID1
AND NOT EXISTS (
SELECT *
FROM Friend AS ab
WHERE ab.ID1 = a.ID AND ab.ID2 = b.ID
);Context
StackExchange Code Review Q#63918, answer score: 6
Revisions (0)
No revisions yet.