HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

High school social network query for friends in common

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
socialqueryhighschoolcommonfornetworkfriends

Problem

This is sourced from the Stanford Coursera self study DB class SQL quizzes:


Students at your hometown high school have decided to organize their
social network using databases. So far, they have collected
information about sixteen students in four grades, 9-12. Here's the
schema:

Highschooler (ID, name, grade)




English: There is a high school student with unique ID and a given
first name in a certain grade.

Friend (ID1, ID2)




English: The student with ID1 is friends with the student with ID2.
Friendship is mutual, so if (123, 456) is in the Friend table, so is
(456, 123).

Likes (ID1, ID2)




English: The student with ID1 likes the student with ID2. Liking
someone is not necessarily mutual, so if (123, 456) is in the Likes
table, there is no guarantee that (456, 123) is also present.


Database
here


Prompt: For each student A who likes a student B where the two are not friends, find if they have a friend C in common (who can introduce
them!). For all such trios, return the name and grade of A, B, and C.

I would appreciate a brief code review of this working query:

select distinct a.name, a.grade, b.name, b.grade, c.name, c.grade
from highschooler a, highschooler b, highschooler c, likes, friend
where a.id=likes.id1
and b.id=likes.id2
and a.id=friend.id1
and b.id not in (select id2 from friend where a.id=id1)
and b.id in (select id1 from friend where id2=c.id)
and a.id in (select id1 from friend where id2=c.id)

Solution

Generic issues

  • The outline of the query should be made visible at a glance using indentation. This SELECT has a FROM clause and a WHERE clause. Within the WHERE clause, there are multiple ANDed conditions.



  • It is conventional to use ALL CAPS for SQL keywords (though some people dislike that convention and disregard it).



  • All selected columns should be given their own distinct names. Otherwise, MySQL will automatically name them, and in this case, you'll get three columns each called name and grade.



  • Use the same capitalization for table and column names as given in the question, since MySQL identifiers are case-sensitive on some platforms.



Query formulation

  • SELECT DISTINCT should be avoided wherever possible. Deduplicating the results could hurt performance, and is a sign of a "wrong" query. A properly formulated query based on a well designed schema and accurate data should almost never require deduplicating the results.



-
The a-b and b-c friendship relationships should be symmetrical:

$$\newcommand{rel}[2]{\overset{\mathtt{#1}}{\underset{\mathtt{#2}}{\longleftrightarrow}}}
\fbox{a} \rel{Friend}{ac} \fbox{c} \rel{Friend}{bc} \fbox{b}
$$

Therefore, I would expect two Friend tables included in the join on equal footing.

  • In fact, I'm not convinced that your query is correct. You have the condition a.id=friend.id1, but there is no mention of friend.id2 at all. Your query merely specifies that a is not friendless.



  • It should be possible to formulate the join without so many subselects.



Here's one way to write it:

SELECT a.name AS a_name, a.grade AS a_grade
     , b.name AS b_name, b.grade AS b_grade
     , c.name AS c_name, c.grade AS c_grade
    FROM Highschooler a, Highschooler b, Highschooler c, Likes, Friend ac, Friend bc
    WHERE
        a.ID = Likes.ID1 AND b.ID = Likes.ID2
        AND a.ID = ac.ID1 AND ac.ID2 = bc.ID1 AND b.ID = bc.ID2
        AND a.ID = ac.ID1
        AND b.ID = ac.ID2
        AND c.ID = bc.ID2
        AND NOT EXISTS (
            SELECT *
                FROM Friend ab
                WHERE ab.ID1 = a.ID AND ab.ID2 = b.ID
        );


However, joins would be better written using the INNER JOIN syntax:

SELECT a.name AS a_name, a.grade AS a_grade
     , b.name AS b_name, b.grade AS b_grade
     , c.name AS c_name, c.grade AS c_grade
    FROM Likes
        INNER JOIN Friend AS ac
            ON Likes.ID1 = ac.ID1
        INNER JOIN Friend AS bc
            ON Likes.ID2 = bc.ID2
        INNER JOIN Highschooler AS a
            ON ac.ID1 = a.ID
        INNER JOIN Highschooler AS b
            ON bc.ID2 = b.ID
        INNER JOIN Highschooler AS c
            ON ac.ID2 = c.ID
    WHERE
        ac.ID2 = bc.ID1
        AND NOT EXISTS (
            SELECT *
                FROM Friend AS ab
                WHERE ab.ID1 = a.ID AND ab.ID2 = b.ID
        );

Code Snippets

SELECT a.name AS a_name, a.grade AS a_grade
     , b.name AS b_name, b.grade AS b_grade
     , c.name AS c_name, c.grade AS c_grade
    FROM Highschooler a, Highschooler b, Highschooler c, Likes, Friend ac, Friend bc
    WHERE
        a.ID = Likes.ID1 AND b.ID = Likes.ID2
        AND a.ID = ac.ID1 AND ac.ID2 = bc.ID1 AND b.ID = bc.ID2
        AND a.ID = ac.ID1
        AND b.ID = ac.ID2
        AND c.ID = bc.ID2
        AND NOT EXISTS (
            SELECT *
                FROM Friend ab
                WHERE ab.ID1 = a.ID AND ab.ID2 = b.ID
        );
SELECT a.name AS a_name, a.grade AS a_grade
     , b.name AS b_name, b.grade AS b_grade
     , c.name AS c_name, c.grade AS c_grade
    FROM Likes
        INNER JOIN Friend AS ac
            ON Likes.ID1 = ac.ID1
        INNER JOIN Friend AS bc
            ON Likes.ID2 = bc.ID2
        INNER JOIN Highschooler AS a
            ON ac.ID1 = a.ID
        INNER JOIN Highschooler AS b
            ON bc.ID2 = b.ID
        INNER JOIN Highschooler AS c
            ON ac.ID2 = c.ID
    WHERE
        ac.ID2 = bc.ID1
        AND NOT EXISTS (
            SELECT *
                FROM Friend AS ab
                WHERE ab.ID1 = a.ID AND ab.ID2 = b.ID
        );

Context

StackExchange Code Review Q#63918, answer score: 6

Revisions (0)

No revisions yet.