SQL Movie-Ratings Exercises


You’ve started a new movie-rating website, and you’ve been collecting data on reviewers’ ratings of various movies. There’s not much data yet, but you can still try out some interesting queries. A SQL file to set up the schema and data for these exercises is downloadable here. This schema and data can be loaded as specified in the file into SQLite, MySQL, or PostgreSQL; see our quick guide for installing and using these systems.

Schema:
Movie ( mID, title, year, director )
English: There is a movie with ID number mID, a title, a release year, and a director.

Reviewer ( rID, name )
English: The reviewer with ID number rID has a certain name.

Rating ( rID, mID, stars, ratingDate )
English: The reviewer rID gave the movie mID a number of stars rating (1-5) on a certain ratingDate.
 
The questions are organized into sections. If you’ve watched the videos up to the point indicated in the section title, then you should be able to write every query in the section. (A few queries may become easier once you’ve learned more advanced methods, but you should be able to write all of the queries using the constructs presented in the videos up to that point.) Unless a specific result ordering is asked for, you can return results in any order.

As a guide to test the accuracy of your SQL queries, the correct query results over the provided data can be seen by pressing the button at the bottom of the page.


Basic SELECT, Table Variables, Set Operators

1.  Favorite director: Find the titles of all movies directed by Steven Spielberg.

2.  What were some good years for movies? Find all years that have a movie that received a rating of 4 or 5, and sort them in increasing order.

3.  "After all, tomorrow is another day…" Find the names of all reviewers who rated Gone with the Wind.

4.  Conflict of interest? For any rating where the reviewer is the same as the director of the movie, return the reviewer name, movie title, and number of stars.

5.  Let’s make the Ratings table human-readable. Write a query to return the ratings data in a more readable format: reviewer name, movie title, stars, and ratingDate. Also, sort the data, first by reviewer name, then by movie title, and lastly by number of stars. (Blanks in the displayed result are NULL values.)

6.  All names in order: Return all reviewer names and movie names together, alphabetized. (Sorting by the first name of reviewer and first word in the title is fine; no need for last names or removing “The”.)

7.  What should I watch? Find the titles of all movies not yet reviewed by Chris Jackson. (Hint: Without subqueries, you might think of this as all movies minus the ones Chris has already rated.)

8.  Sometimes when you see a movie again, you like it better the second time. For all cases where the same reviewer rated the same movie twice and gave it a higher rating the second time, return the reviewer’s name and the title of the movie.

9.  I’ve seen that one too: For all pairs of reviewers such that both reviewers gave a rating to the same movie, return the names of both reviewers. Eliminate duplicates, don’t pair reviewers with themselves, and include each pair only once.


Subqueries in WHERE, FROM, SELECT


10.  What movies have poor coverage in our database? Find the titles of all movies that have no ratings.

11.  Harshest ratings: Find the reviewer name, movie title, and number of stars for all movies that have received the lowest rating currently in the database.

12.  Keep the best, ignore the rest: For each movie that has at least one rating, find the highest number of stars that movie received. Return the movie title and number of stars. Sort by movie title. (Hint: You CAN write this query without aggregation.)


Aggregation


13.  Let’s compute some statistics: List movie titles and average ratings, from highest-rated to lowest-rated.

14.  What are the most controversial movies? For each movie, return the title and the ‘rating spread’, that is, the difference between highest and lowest ratings given to that movie. Sort by rating spread from highest to lowest, then by movie title.

15.  Power users: Find the names of all reviewers who have made 3 or more ratings. (As an extra challenge, try writing the query without HAVING or without COUNT.)

16.  Busy directors: Some directors directed more than one movie. For all such directors, return the titles of all movies directed by them, as well as the director name. Sort by director name, then movie title. (As an extra challenge, try writing the query both with and without COUNT.)

17.  Best on average: Find the movie(s) with the highest average rating. Return the movie title(s) and average rating. (Hint: This query may be more difficult to write in SQLite than other systems; you might think of it as finding the highest average rating and then choosing the movie(s) with that average rating.)

18.  Worst on average: Find the movie(s) with the lowest average rating. Return the movie title(s) and average rating. (Analogous to previous query; see hint for that one)

19.  Are older movies better-rated? Find the difference between the average rating of movies released before 1980 and the average rating of movies released after 1980. (Make sure to calculate the average rating for each movie, then the average of those averages for movies before 1980 and movies after. Don’t just calculate the overall average rating before and after 1980.)


NULLs

20.  Timeless ratings: Some reviewers didn’t provide a date with their rating. Find the names of all reviewers who have ratings with a NULL value for the date.

21.  What’s the best film from each director? For each director, return their name, the title of the movie(s) they directed that received the highest individual rating, and the value of that rating. Ignore movies whose director is not specified in the database.


Data Modification Statements

Each of the questions in this section (except the first) assumes the database is in the state after the modifications from the previous question. If you find you’ve made a mistake or want to restart, just reload the schema and data from scratch.

22.  We have a new reviewer: Add the reviewer Roger Ebert to our database, with an rID of 209. To verify, write a query to return the revised number of reviewers in the database.

23.  A prolific and generous reviewer: Insert 5-star ratings by Roger Ebert for all movies in the database. Leave the review date as NULL. To verify, find the new average rating given to Snow White.

24.  Limited extended edition director’s cut re-release: For all movies that have an average rating of 4 stars or higher, add 25 to the release year. (Update the existing tuples; don't insert new tuples.) To verify, return the number of movies now with release years before 1990.

25.  Filter by decade: Now, remove all movies from the Movie table except those with years in the range 2000 to 2010. To verify, return the titles of all movies remaining in the Movie table.

26.  Whoops! Now many of the ratings refer to movies no longer in the Movie table. Delete all ratings where there is no corresponding movie in the Movie table. To verify, return the number of remaining ratings.

27.  Don’t need you any more: Now many of our reviewers have no ratings left. Remove all reviewers who have no ratings in the Ratings table. To verify, return the names of all remaining reviewers.