Feel free to ask questions at any point during the lab.
We may not finish all of the lab exercises during the actual lab session. Every week, we will post solutions to the current week’s lab on the Labs page of the course website.
We don’t grade the lab exercises. However, starting with today’s lab, your attendance at lab will count towards your participation grade. If you attend 85% of the lab sessions over the course of the semester, you will get full credit for lab participation. See the syllabus for more details.
To get full credit for participation in lab, you must work productively throughout the lab session. You will not be penalized if you cannot finish all of the lab exercises.
Feel free to work on the lab exercises with your classmates, and don’t hesitate to ask a staff member for help!
If you haven’t already created a folder named cs460
for your
work in this course, follow these
instructions to do so.
Then create a subfolder called lab2
within your cs460
folder,
and put all of the files for this lab in that folder.
If you haven’t already done so, download and install DB Browser for SQLite following the instructions in Problem Set 1.
To download the database we will be using in this lab, click on the following link:
and save the file in your lab2
folder.
Using the virtual desktop
If you encounter issues installing the DB Browser software,
you can use the copy of it that is available on the
virtual desktop. Instructions for doing so can be found
here.
Launch DB Browser for SQLite.
Click the Open Database button, and find and open the
university.sqlite
database file that you downloaded above.
Click on the Browse Data tab and explore the contents of the tables. Initially, you will see the Course table, but you should explore the other tables as well by choosing the appropriate table name from the drop-down menu.
Note that the tables have the same column names as the university database from the lecture notes, but the actual rows are different. In particular:
We have included some additional data to allow for more interesting queries and results.
We use the full names of the departments in the Department and MajorsIn tables. In the lecture notes, the department names are abbreviated.
Select the Execute SQL tab, which is where you will perform queries on the database.
For example, try entering the following SQL command in the space provided:
SELECT name, credit_status FROM Student, Enrolled WHERE id = student_id AND course_name = 'CS 460';
To run it, you can do any of the following:
What do the results of the previous SQL query represent?
They represent the names of all students in the database who are enrolled in CS 460 and the credit status for which they are enrolled.
How could we use relational algebra to solve the same problem as the previous SQL query?
The most straightforward approach is to recall how a SELECT command is logically carried out:
When there are multiple tables in the FROM clause, we begin with the full Cartesian product of those tables.
Then we perform a selection (σ) based on the WHERE clause.
Finally, the we perform a projection (π) based on the SELECT clause.
That gives you the following relational-algebra query:
π name, credit_status (σ id = student_id AND course_name = ‘CS 460’ (Student x Enrolled))
One alternate approach would be to take advantage of the fact that a
theta-join is equivalent to a Cartesian product followed by a
selection. As a result, we can use a theta-join to replace the
Cartesian product and selection in the query above:
π name, credit_status (Student ⨝id = student_id AND course_name = ‘CS 460’ Enrolled)
Would the following relational algebra allow us to solve the
same problem?
π name, credit_status (σ course_name = ‘CS 460’ (Student ⨝ Enrolled))
However, we could get a correct alternative query by adding in
the necessary join condition:
π name, credit_status (σ course_name = ‘CS 460’ (Student ⨝id = student_id Enrolled))
Download the following file:
and put it in your lab2
folder.
Open lab2_queries.py
in a Python IDE or a text editor.
Important
If you choose to use a text editor, instead of double-clicking on the downloaded file, you should launch your text editor and use its File->Open menu option to open the file. You may need to change the type of file that the text editor is looking for to All Files.
The file that you create must be a plain-text file. More information about what this means is available here.
In this task, you will continue to work with the university database from the previous task. Here is its schema:
Schema of the database
Student(id, name)
Department(name, office)
Room(id, name, capacity)
Course(name, start_time, end_time, room_id)
MajorsIn(student_id, dept_name)
Enrolled(student_id, course_name, credit_status)
When constructing each query, we recommend that you follow the rules of thumb given in the lecture notes.
Feel free to work with another student on these problems, and to ask the TA for help as needed.
Write a query that finds the names and capacities of all rooms in the CAS and CGS buildings.
Test your query, revise it as needed, and copy the final version
into your file, putting it between the triple quotes for query1
.
SELECT name, capacity FROM Room WHERE name LIKE 'CAS%' OR name LIKE 'CGS%';
Write a query that finds the number of students who are enrolled
in a course for undergraduate ('ugrad'
) credit.
Test your query. The correct result is 6. If you don’t get this
result, revise it as needed. Copy the final version into your
file, putting it between the triple quotes for query2
.
SELECT COUNT(DISTINCT student_id) FROM Enrolled WHERE credit_status = 'ugrad';
Write a query that finds the name and start time of the CS course with the earliest (i.e., smallest) start time. In order to adhere to standard SQL, you will need to use a subquery.
Important: You should not do something like the following:
SELECT name, MIN(start_time) ...
Although this works in SQLite, it is not allowed in standard
SQL, because the SELECT clause combines a regular column (name
)
with an aggregate function (MIN(start_time)
) and we are not
grouping on that attribute.
Test your query, revise it as needed, and copy the final version
into your file, putting it between the triple quotes for query3
.
SELECT name, start_time FROM Course WHERE name LIKE 'CS%'; AND start_time = (SELECT MIN(start_time) FROM Course WHERE name LIKE 'CS%');
Write a query that finds, for each type of credit status, the number of students who are enrolled in a course with that credit status.
Test your query, revise it as needed, and copy the final version
into your file, putting it between the triple quotes for query4
.
SELECT credit_status, COUNT(DISTINCT student_id) FROM Enrolled GROUP BY credit_status;
Note: For this query, you are allowed to combine a regular column with an aggregate function. Why is it allowed in this case?
Because we were grouping on the column that was being combined with the aggregate function.
Write a query that finds the names of all students majoring in
'computer science'
. Hint: Don’t forget to
include an appropriate join condition.
Test your query, revise it as needed, and copy the final version
into your file, putting it between the triple quotes for query5
.
SELECT name FROM Student, MajorsIn WHERE id = student_id AND dept_name = 'computer science';
Write a query that finds the names of all students who are not enrolled in CS 460. Hint: You will need a subquery.
Test your query, revise it as needed, and copy the final version
into your file, putting it between the triple quotes for query6
.
SELECT name FROM Student WHERE id NOT IN (SELECT student_id FROM Enrolled WHERE course_name = 'CS 460');
On paper, write a relational-algebra query for each of the following problems from Task 3:
problem 1 (names and capacities of all rooms in CAS and CGS)
problem 5 (names of students majoring in computer science)
problem 6 (names of students not enrolled in CS 460).
Hint: Take advantage of the set difference operator (-
).
Also, in order to make this query easier to write, you may assume that
all student names are unique.
The solutions to these problems can be found here.
Last updated on September 19, 2024.