patternsqlMinor
Simplify SQL query that utilizes GROUP BY and HAVING BY and uses aggregation
Viewed 0 times
utilizesgroupsqlqueryhavingusessimplifythatandaggregation
Problem
I am taking Stanford's Introduction to Databases Self-Paced online course. I have gone through the videos in the SQL mini-course, and I am having trouble completing the exercises.
The following is the question from the SQL Movie-Rating Query Exercises, Question 6:
For all cases where the same reviewer rated the same movie twice and
gave it a higher rating the second time, return the reviewer's name
and the title of the movie.
The database can be found here.
My answer to this question is very complex; so much so that I the length of my query exceeds the query character limit in class. I will post my solution in steps, since I think that might help surface some of my conceptual misunderstandings.
Step 1:Give me all the movies where the same reviewer reviewed the same movie twice
Step 2: Let's join the three tables so that all the information is in one place:
Step 3: Filter the result set in Step 2 so that it only contains tuples that are also in Step 1
Step 4: Return the result set from Step 3 twice so that we can compare them the rating, reviewer, movie, stars and date
```
SELECT view_one.name, view_one.title
FROM (SELECT *
FROM
(SELECT Movie.mID, Movie.title, Movie.yea
The following is the question from the SQL Movie-Rating Query Exercises, Question 6:
For all cases where the same reviewer rated the same movie twice and
gave it a higher rating the second time, return the reviewer's name
and the title of the movie.
The database can be found here.
My answer to this question is very complex; so much so that I the length of my query exceeds the query character limit in class. I will post my solution in steps, since I think that might help surface some of my conceptual misunderstandings.
Step 1:Give me all the movies where the same reviewer reviewed the same movie twice
SELECT Rating.rID, Rating.mID
FROM Rating
GROUP BY Rating.rID, Rating.mID
HAVING count(*) > 1;Step 2: Let's join the three tables so that all the information is in one place:
SELECT Movie.mID, Movie.title, Movie.year, Movie.director, Rating.rID,
Rating.stars, Rating.ratingDate, Reviewer.name
FROM Movie, Rating, Reviewer
WHERE Movie.mID = Rating.mID and Rating.rID = Reviewer.rID;Step 3: Filter the result set in Step 2 so that it only contains tuples that are also in Step 1
SELECT *
FROM
(SELECT Movie.mID, Movie.title, Movie.year, Movie.director, Rating.rID,
Rating.stars, Rating.ratingDate, Reviewer.name
FROM Movie, Rating, Reviewer
WHERE Movie.mID = Rating.mID and Rating.rID = Reviewer.rID) compiled
WHERE compiled.mID in
(SELECT Rating.mID
FROM Rating
GROUP BY Rating.rID, Rating.mID
HAVING count(*) > 1) and compiled.rID in (SELECT Rating.rID
FROM Rating
GROUP BY Rating.rID, Rating.mID
HAVING count(*) > 1);Step 4: Return the result set from Step 3 twice so that we can compare them the rating, reviewer, movie, stars and date
```
SELECT view_one.name, view_one.title
FROM (SELECT *
FROM
(SELECT Movie.mID, Movie.title, Movie.yea
Solution
SELECT Rating.rID, Rating.mID
FROM Rating
GROUP BY Rating.rID, Rating.mID
HAVING count(*) > 1;This returns all ratings where the same person reviewed the same movie more than once, but that's not what we want. We want all the ratings where the same person reviewed the same movie twice, giving it a higher rating the second time.
SELECT DISTINCT S2.rID, S2.mID
FROM Rating S1 INNER JOIN Rating S2 ON S1.rID = S2.rID AND S1.mID = S2.mID
WHERE S1.ratingDate > S2.ratingDate AND S1.stars > S2.stars;Now let's look at your step 2:
SELECT Movie.mID, Movie.title, Movie.year, Movie.director, Rating.rID,
Rating.stars, Rating.ratingDate, Reviewer.name
FROM Movie, Rating, Reviewer
WHERE Movie.mID = Rating.mID and Rating.rID = Reviewer.rID;Why do we need so many columns? The assignment is to return the movie title and reviewer name, so let's just do that:
SELECT Movie.title, Reviewer.name
FROM Movie
INNER JOIN Rating S ON Movie.mID = S.mID
INNER JOIN Reviewer ON S.rID = Reviewer.rID;So we can finish with a new step 3:
SELECT Movie.title, Reviewer.name
FROM Movie
INNER JOIN (
SELECT DISTINCT S2.rID, S2.mID
FROM Rating S1 INNER JOIN Rating S2 ON S1.rID = S2.rID AND S1.mID = S2.mID
WHERE S1.ratingDate > S2.ratingDate AND S1.stars > S2.stars
) S ON Movie.mID = S.mID
INNER JOIN Reviewer ON S.rID = Reviewer.rID;This uses a single subselect to replace
Rating from step 2 with a new version that only includes those Reviewer/Movie pairs where the second rating was higher than the first.Code Snippets
SELECT Rating.rID, Rating.mID
FROM Rating
GROUP BY Rating.rID, Rating.mID
HAVING count(*) > 1;SELECT DISTINCT S2.rID, S2.mID
FROM Rating S1 INNER JOIN Rating S2 ON S1.rID = S2.rID AND S1.mID = S2.mID
WHERE S1.ratingDate > S2.ratingDate AND S1.stars > S2.stars;SELECT Movie.mID, Movie.title, Movie.year, Movie.director, Rating.rID,
Rating.stars, Rating.ratingDate, Reviewer.name
FROM Movie, Rating, Reviewer
WHERE Movie.mID = Rating.mID and Rating.rID = Reviewer.rID;SELECT Movie.title, Reviewer.name
FROM Movie
INNER JOIN Rating S ON Movie.mID = S.mID
INNER JOIN Reviewer ON S.rID = Reviewer.rID;SELECT Movie.title, Reviewer.name
FROM Movie
INNER JOIN (
SELECT DISTINCT S2.rID, S2.mID
FROM Rating S1 INNER JOIN Rating S2 ON S1.rID = S2.rID AND S1.mID = S2.mID
WHERE S1.ratingDate > S2.ratingDate AND S1.stars > S2.stars
) S ON Movie.mID = S.mID
INNER JOIN Reviewer ON S.rID = Reviewer.rID;Context
StackExchange Code Review Q#79916, answer score: 2
Revisions (0)
No revisions yet.