Let’s consider the problem of designing a database of airline information—more specifically, information about flights, aircraft and employees.
The following is an ER diagram for this problem domain:

Here are some facts about the diagram:
Answer the following questions:
Why is there an arrow from Has Type to Aircraft Type (or, more precisely, from Airplane through Has Type to Aircraft Type)?
Because a given aircraft has at most one aircraft type. Thus, Has Type is a many-to-one relationship set from Airplane to Aircraft Type.
Why isn’t there an arrow from Is Certified to Aircraft Type?
Because a given employee may be certified to fly more than one type of aircraft. Thus, Is Certified is a many-to-many relationship set.
Why is there a thick line from Airplane to Has Type?
Because every aircraft has at least one aircraft type. Thus, Airplane has total participation in Has Type.
Why isn’t there a thick line from Employee to Is Certified?
Because not every employee is certified to fly. Thus, Employee has partial participation in Is Certified.
In transforming this diagram to a relational schema, would we need a separate relation to represent the Has Type relationship set? Why or why not?
No. Because Has Type is many-to-one from Airplane to Aircraft Type, we can combine the separate relations that would ordinarily be formed for Airplane and Has Type into a single relation:
Airplane(num, type_id)
Now, transform this diagram to a relational schema. Specify the name and attribute names of each relation in the schema, underlining the primary-key attributes of each relation.
Notes:
As discussed above, we did not give Has Type its own relation. Similarly, because Uses is many-to-one, we are able to combine its relation with the relation for Flight.
We would be unlikely to give Has Type its own relation, since the entity set that we combined it with (Airplane) has only one attribute of its own. However, we could have chosen to give Uses its own relation. If we did, you would end up with the following schema for it:
In addition, we would remove airplane_num from the relation for Flight.
Note that the primary key of Uses only includes the primary key from the constrained entity set (Flight). This allows us to ensure that a given flight appears at most once in Uses.
Finally, specify any foreign keys that are present in your relations along with the associated referential-integrity constraints.
airplane_num in Flight, which references Airplane.num; each value of airplane_num must match a value of the num attribute from Airplane.
type in Airplane, which references AircraftType.id; each value of type match a value of the id attribute from AircraftType.
employee_id in IsCertified, which references Employee.id; each value of employee_id must match a value of the id attribute from Employee.
aircraft_type_id in IsCertified, which references AircraftType.id; each value of aircraft_type_id must match a value of the id attribute from AircraftType.
If we gave Uses its own relation, both of its attributes would be foreign keys:
flight_num, which references Flight.flno; each value of flight_num must match a value of the flno attribute from Flight.
airplane_num, which references Airplane.num; each value of airplane_num must match a value of the num attribute from Airplane.
In lecture, we’ve been working with a university database. Here is one possible schema for this database:
Schema of the database
Student(id, name)
Department(name, office)
Room(id, name, capacity)
Course(name, start_time, end_time, room_id)
MajorsIn(student_id, dept_name)
Enrolled(student_id, course_name, credit_status)
You can also see sample contents of the relations/tables here.
Let’s say that we want to use relational algebra to find the names of all students in the database who are enrolled in CS 460, along with the credit status for which they are enrolled.
Why doesn’t the following query work?
π name, credit_status (σ course_name = ‘CS 460’ (Student x Enrolled))
How could we fix the query so that we get only the names of students who are actually enrolled in CS 460?
Notes:
This query still begins with the full Cartesian product of Student and Enrolled.
Next, it performs a selection (σ) that keeps only the logical combinations of tuples from the two relations – those in which the value of id from Student matches the value of student_id from Enrolled – and that also have ‘CS 460’ as the value of course_name.
Finally, it performs a projection (π) that keeps only the name and credit_status attributes from those combinations.
One alternate approach would be to take advantage of the fact that a
theta-join is equivalent to a Cartesian product followed by a
selection. As a result, we can use a theta-join to replace the
Cartesian product and selection in the query above:
π name, credit_status (Student ⨝id = student_id AND course_name = ‘CS 460’ Enrolled)
Would the following relational-algebra query allow us to solve the
same problem?
π name, credit_status (σ course_name = ‘CS 460’ (Student ⨝ Enrolled))
However, we could get a correct alternative query by adding in
the necessary join condition:
π name, credit_status (σ course_name = ‘CS 460’ (Student ⨝id = student_id Enrolled))
Assume that a given student can have 0 or more majors. If we want
to find the id numbers of all students who are not majoring in
computer science, why wouldn’t the following query work?
π id (σ id = student_id AND dept_name != ‘comp sci’ (Student x MajorsIn))
The results won’t include students that have no major at all. Those students don’t appear in the MajorsIn table, and thus none of the combinations formed by the Cartesian product will satisfy the condition id = student_id for those students.
The results will incorrectly include students who are majoring in both computer science and one or more other departments. That’s because their other major(s) will cause those students will appear in one or more combinations from the Cartesian product that satisfy both of the conditions in the selection operation, and thus they will still appear in the final results.
How could we use relational algebra to obtain the id numbers of all students who are not majoring in computer science?
In this query, the left-hand side of the set difference produces the id numbers of all students in the database, while the right-hand side produces the id numbers of all students who are majoring in computer science. Thus, taking the set difference of these two sets leaves us with the id numbers of all students who are not majoring in computer science.
In this example, we’ll consider a database of movie information. Suppose we want to allow users to get information about:
For the purposes of this exercise, we’ll assume that each movie has a single director, and we will only concern ourselves with the following awards: Best Picture, Best Actor, Best Actress, Best Supporting Actor, and Best Supporting Actress.
Let’s construct an ER diagram for this domain. While there are many possible answers for this problem, you should try to include at least one example of each of the following:
In addition to constructing the ER diagram, specify the primary-key attributes of all entity sets by underlining the attributes on the diagram.

Here are some points worth noting about this diagram:
The diagram uses a single entity set, Person, for both actors and directors. The advantage to using a single set for both is that some people may be both actors and directors. If we used two separate entity sets, we would end up storing information about some people twice—which wastes space and makes it possible for inconsistencies to arise in the database.
The Directed and Acted In relationship sets are binary relationship sets—they each store associations between two entities. Awarded is a ternary relationship set, because it involves associations between three entities: Oscar entities, Person entities, and Movie entities, such as the fact that Tom Hanks won the Best Actor award in 1994 for his performance in “Philadelphia”.
The Awarded relationship is many-to-one, because a given Oscar statue is awarded to one person. (We are assuming that the only Oscars we care about are the four major acting awards, all of which are won by one person, and the Best Picture award, which for our purposes is won/received by the director.) We show this many-to-one relationship by using an arrow from Awarded to Person. Strictly speaking, this indicates that a given Oscar-movie combination is associated with at most one person; while this is more than we really need to say, it’s not incorrect.
We also have an arrow from Directed to Person (or, more precisely, from Movie through Directed to Person), because we’re assuming that there is a single director for each movie, and thus Directed is a function that maps a given movie to its director.
A given entity set can participate either totally or partially in a given relationship set. In this case, Oscar has total participation in Awarded, because each entity in Oscar has a corresponding relationship in Awarded. Similarly, Movie participates totally in both Acted In and Directed if we assume that we always store the director and at least one actor for each movie in the database. We indicate total participation by using either a thick line (as shown here) or a double line. Thin/single lines indicate partial participation. For example, Person participates only partially in Acted In, because only some people in the database are actors.
The diagram shows an attribute for the Acted In relationship set. This salary attribute is associated with the relationship set because an actor earns a separate salary for each movie in which he or she acts.
The attributes that constitute the primary keys for the Oscar, Person, and Movie entity sets are underlined. Note that we need multiple attributes for each primary key (e.g., name and year for Movie). If we were to implement this database, we might well associate a unique ID with each entity in a given entity set, and use the IDs as the primary keys instead.
Last updated on September 12, 2025.