Recall that the MongoDB version of our movie database consists of three collections:
movies containing documents about moviespeople containing documents about actors and directorsoscars containing documents about Academy Awards.In addition, remember that:
Whenever we refer to a person or movie, we also embed the associated entity’s name.
Person documents include one or both of the boolean fields
hasActed and hasDirected. These fields are only included when
their values are true.
Here are some sample documents:
{ _id: "0499549", name: "Avatar", year: 2009, rating: "PG-13", runtime: 162, genre: "AVYS", earnings_rank: 4, actors: [ { id: "0000244", name: "Sigourney Weaver" }, { id: "0002332", name: "Stephen Lang" }, { id: "0735442", name: "Michelle Rodriguez" }, { id: "0757855", name: "Zoe Saldana" }, { id: "0941777", name: "Sam Worthington" } ], directors: [ { id: "0000116", name: "James Cameron" } ] } { _id: "0000059", name: "Laurence Olivier", dob: "1907-5-22", pob: "Dorking, Surrey, England, UK", hasActed: true, hasDirected: true } { _id: ObjectID("528bf38ce6d3df97b49a0569"), year: 2013, type: "BEST-ACTOR", person: { id: "0000358", name: "Daniel Day-Lewis" }, movie: { id: "0443272", name: "Lincoln" } }
If you have installed and configured everything according to our directions, feel free to try out your queries using MongoDB Compass. However, doing so is not required.
Notes:
You may only use aspects of the MongoDB query language that we discussed in lecture.
The results of the query should include only the requested
information, with no extraneous fields. In particular, you should
exclude the _id field from the results of your queries unless
the problem indicates otherwise.
You do not need to worry about the order of the fields in the results.
Here are the problems:
Find the top ten highest grossing movies in the database. For each such movie, output a document with the movie’s name and the name(s) of the director(s).
How could we solve this problem using the find method?
How could we solve this problem using an aggregation pipeline
(i.e., the aggegrate method)? In addition to finding the
same results as before, rename the field for the names of the
directors, giving it the name directors.
How many actors in the database were born in California?
How could we solve this problem using the count method?
How could we solve this problem using an aggregation pipeline?
The result should be a single document with a single field
named num_actors.
Hint: In order to obtain a single count from a pipeline (as
opposed to a separate count for each of several subgroups), we
can use a $group stage with an _id value of null. Doing
so will create a single subgroup containing all of the
documents that are passed into that stage, and thus we can
obtain a single count of all of those documents.
Which people in the database have directed a movie in which Tom
Hanks has acted? Use the distinct method to produce a list
containing the names of these directors, with each name appearing
only once.
Find all years before 1970 that have 6 or more movies in the
database. Each result document should include a field called
year for the year and a field called count for the number of
movies from that year.
For each actor in the database, create a document with two fields:
one called actor whose value is the name of the actor
one called num_movies whose value is the number of movies
in which the person has acted.
Hint: Begin by using an $unwind stage to “unwind” the arrays of actors in all of the movie documents. Doing so will allow you to create a separate group for each actor.
Some of the material needed for these problems will be covered in Friday’s lecture.
Building on your answer to the previous problem, find the person
in the database who has acted in the most movies. The final result
should be a single document with fields named actor and
num_movies. You may assume that actor names are unique. If
there are multiple actors who are tied for the most movies, the
result can be any one of them.
Who directs the longest movies? Of all directors who have
directed 5 or more movies, find the 10 directors whose movies have
the longest average runtime. The final result should consist of
documents with fields named director (for the director’s name),
avg_runtime, and num_movies.
Last updated on November 20, 2025.