Part I due by 11:59 p.m. on Thursday, March 5, 2026.
Part II due by 11:59 p.m. on Tuesday, March 24, 2026.
In your work on this assignment, make sure to abide by the collaboration policies of the course.
If you have questions, please come to office hours, post them on
Piazza, or email cs460-staff@cs.bu.edu.
Make sure to submit your work on Gradescope, following the procedures found at the end of Part I and Part II.
30 points total
This part of the assignment asks you to construct XPath and XQuery queries for an XML version of our entire movie database. The schema of this XML database is described here.
It is worth noting that our movie database was last updated in September of last year, so it doesn’t include information about the most recent movies.
Create a subfolder called ps3 within your
cs460 folder, and put all of the files for this assignment in that
folder.
To allow you to check your work, we’ll make use of a freely available XML DBMS called BaseX. You should begin by following the instructions for installing and using it that are available here.
As outlined in our instructions, you can perform queries by taking the following steps:
Start up the BaseX GUI by double-clicking on the JAR file that you downloaded.
Select the Database->New menu option, click the Browse button,
and use the resulting dialog box to find the imdb.xml
file that you downloaded above.
Click Open to select the file, and click OK to create the database.
To execute a query, enter it in the Editor pane in BaseX, and click the green play button to execute it. (You can also use Ctrl+Enter or Ctrl+Return for this purpose.)
The results (if any) will be displayed in the Result pane.
If you have trouble getting BaseX to work on your machine, see the troubleshooting tips on our BaseX page.
If you’re using a Mac, you should disable smart quotes, because they may lead to errors in BaseX and in our testing. There are instructions for doing so here.
ps3_queries.py is a Python file, so you could use a Python IDE
to edit it, but a regular text editor like TextEdit or Notepad++
would also be fine. However, if you use a text editor, you must
ensure that you save it as a plain-text
file.
Construct the XQuery commands needed to solve the problems given below. Test each command in BaseX to make sure that it works.
Once you have finalized the XQuery command for a given problem, copy
the command into your ps3_queries.py file, putting it between
the triple quotes provided for that query’s variable. We have
included a sample query to show you what the format of your
answers should look like.
Each of the problems must be solved by means of a single query. Unless the problem specifies otherwise, you may use either a standalone XPath expression or an XQuery FLWOR expression.
Unless the problem specifies otherwise, you must limit yourself to features of XQuery that we have discussed in lecture. See our general query-writing guidelines for more details.
The only place that you may use a subquery (i.e., a nested FLWOR
expression) is in the results clause of an outer FLWOR
expression. You should NOT have a nested FLWOR expression in a
for clause or a let clause.
The order of the clauses in each query/subquery must follow the
FLWOR acronym: a for clause (F), followed optionally by a let
clause (L), followed optionally by a where clause (W), followed
optionally by an order by clause (O), followed by a return clause
(R). You should not put the clauses in a different order –
e.g., for, followed by let, followed by another for, etc. BaseX
may allow you to do this, but it is never necessary to do so, and
such a query will often fail to run to completion in the Autograder.
Your queries should only use information provided in the problem itself. In addition, they should work for any XML database that follows the schema that we have specified.
You do not need to worry about indenting and line breaks in the results of your queries.
5 points
Make sure to read and follow the guidelines given above.
In Problem Set 1, you wrote a
SQL query to find information about two Oscar nominees from the movie
Hamnet: Jesse Buckley and Chloe Zhao. Write a standalone XPath
expression (not a FLWOR expression) to find only the places of
birth of these women. The result of the query should be two pob
elements.
5 points
In Problem Set 1, you
wrote a SQL query to find information about horror movies that have
won an Oscar of type 'BEST-PICTURE'. Write a FLWOR expression to solve a similar
problem. The results of the query should be new elements of type
horror_best_pic that each include two nested child elements:
the name child element of a movie that won Best Picture
and is a horror movie (i.e., that has the letter H somewhere
in the value of its genre attribute)
the year child element of the Best Picture Oscar that the
movie won.
For example:
<horror_best_pic> <name>The Silence of the Lambs</name> <year>1992</year> </horror_best_pic>
Hints:
Remember that you do not need to worry about indenting or line breaks in the results of your queries.
You will need to use the contains function to
perform substring matching.
You will need to use curly braces and commas as part of your
return clause. In lecture and lab, we’ve seen
examples of queries that use these delimiters to transform the
elements selected by a query into new types of elements.
5 points
In Problem Set
1, you wrote a
SQL query to find the number of top-grossing movies from 2015 until
2025. Write a FLWOR expression to solve a similar problem –
but expand it to include the average runtime of each year’s top
grossers. The result of the query should be new elements of type
top_grosser_stats that each have three nested child elements:
year for the year whose statistics are being
computed; see below for more details about this elementnum_top that has as its value
the number of the top grossers from that yearavg_runtime that has as its value
the average runtime of the top grossers from that year.For example:
<top_grosser_stats> <year>2025</year> <num_top>7</num_born_there> <avg_runtime>121</avg_runtime> </top_grosser_stats>
Sort the results from the 2025 to 2015.
Hints:
To ensure that you only consider a given year once, your query
should use the distinct-values function. For example, to
iterate over distinct Oscar types, you could do the following:
for $t in distinct-values(//oscar/type)
distinct-values gives you the text values of the
corresponding elements, not the elements themselves. As a
result, your return clause will need to construct new year
elements by adding back the begin and end tags that were
removed by distinct-values.
You should take advantage of the fact that our database only
includes an earnings_rank element for a movie if the movie
is a top grosser. Recall that you can check for the presence
of an element or attribute by just including its name in your
predicate as if the name were a boolean expression.
You will need to use one or more of the following built-in
aggregate functions: count(), sum(), avg(), min() and
max(). In lecture and lab, we’ve seen examples of
queries that illustrate how to use this type of function.
As we saw in PS 1, the year 2020 has no top-grossing movies.
However, it should still appear in the results with a num_top
value of 0. In addition, because there are no top grossers
from that year, XQuery will end up giving you an empty
element for its avg_runtime. In other words, the results for
2020 should look like this:
<top_grosser_stats> <year>2020</year> <num_top>0</num_top> <avg_runtime/> </top_grosser_stats>
In XQuery, when you ask it to sort the results, it puts them
in ascending/increasing order by default. To obtain
descending/decreasing order, put the word descending after
the thing that you are sorting on.
Some of the hints for the previous problems also apply here.
5 points
In Problem Set 2, you
wrote a SQL query to find the number of Oscars won by each movie
directed by Steven Spielberg, including movies that won no
Oscars. Write a FLWOR expression to solve a similar problem. The
results of the query should be new elements of type spielberg_movie
that include the following nested child elements:
the name child element of one of the movies that he directed
the year child element of that movie
a new element of type num_oscars that has as its
value the number of Oscars won by that movie; if the movie
did not win any Oscars, the value of this element should be 0
for each Oscar won by the movie (if any), the type child
element of the corresponding oscar element; if the movie
did not win any Oscars, the resulting spielberg_movie element
should not have any nested type elements.
For example, here is what the results for the movie Schindler’s List should look like:
<spielberg_movie> <name>Schindler's List</name> <year>1993</year> <num_oscars>2</num_oscars> <type>BEST-DIRECTOR</type> <type>BEST-PICTURE</type> </spielberg_movie>
You must sort the results in the following ways:
Sort the spielberg_movie elements in ascending order by year –
i.e., from oldest to most recent. If there are multiple movies
from the same year, sort them in ascending order by the name of the
movie. See the hints below for more about sorting.
Within a given spielberg_movie element, sort the
type child elements in ascending order.
Hints:
You will need to use a subquery (i.e., a nested FLWOR
expression). Remember that these are only allowed in the return
clause of the outer query. In lecture and lab, we’ve
seen examples of queries that illustrate how to use a nested FLWOR
expression to nest an arbitrary number of child elements within
one of the result elements returned by the outer query.
When sorting, you can break ties in the same way that we did in
SQL, by providing additional quantities to sort on.
For example, let’s say that our query was assigning movie
elements to a variable $m and we wanted to sort the results by
the movie’s rating and then by the movies’s runtime. To
do so, we would use the following clause:
order by $m/rating, $m/runtime
Some of the hints for the previous problems also apply here.
5 points
In Problem Set 2, you wrote a SQL query to find information about the two movies in the database named West Side Story. Write a FLWOR expression to solve a similar problem – but in addition to finding the year and director(s) of these movies, you should also find their actors.
The result of the query should be new elements of type
west_side_story that each include the following nested child
elements:
the year child element of one of the movies
named West Side Story
for each director of that movie, a new element of type director
that has as its value the name of the director; sort these
elements in ascending order by name
for each actor in that movie, a new element of type actor
that has as its value the name of the actor; sort these
elements in ascending order by name as well.
For example:
<west_side_story> <year>1961</year> <director>Jerome Robbins</director> <director>Robert Wise</director> <actor>George Chakiris</actor> <actor>Natalie Wood</actor> <actor>Richard Beymer</actor> <actor>Rita Moreno</actor> <actor>Russ Tamblyn</actor> </west_side_story>
Hints:
You will need to use two different subqueries: one to obtain the movie’s directors, and one to obtain its actors. When doing so, you should put a comma between the two nested FLWOR expressions.
Some of the hints for the previous problems also apply here.
5 points
In Problem Set 1, you
wrote a SQL query to find information about directors who have
directed at least four of the 200 top-grossing films. Write a FLWOR
expression to solve a similar problem. The result of the query should
be new elements of type top_grossing_director that each include the
following nested child elements:
one for the name child element of a person who has directed
four or more of the top-grossing movies
for each top-grossing movie directed by the person, a new element
of type top_grosser that has as its value the name and earnings
rank values of the movie, separated by a hyphen.
For example:
<top_grossing_director> <name>Jon Favreau</name> <top_grosser>Iron Man - 99</top_grosser> <top_grosser>Iron Man 2 - 104</top_grosser> <top_grosser>The Jungle Book - 67</top_grosser> <top_grosser>The Lion King - 18</top_grosser> </top_grossing_director>
You must sort the results in the following ways:
Sort the top_grossing_director elements in descending order by
the number of top-grossing movies. If multiple directors have the
same number of top grossers, sort them in ascending order by the
name of the director.
Within a given top_grossing_director element, sort the
top_grosser child elements in ascending order by the name
of the movie.
Some of the hints for the previous problems also apply here.
Coming soon
70 points total
In the next two problems, you will implement portions of a simple relational database management system that supports a subset of the SQL language. We have provided you with two of the three components of the system:
Your job is to implement parts of the “middle layer” of the system, which takes the parsed version of a SQL command and performs the necessary lower-level actions to execute the command. To help you, we have given you a code framework for the middle layer that already provides some of the necessary functionality.
You should begin by downloading the necessary files and configuring your work environment. The steps for doing so can be found here.
Please do this ASAP, so that you can be sure that you don’t run into any problems later on.
After configuring everything, you should spend some time
familiarizing yourself with the files that we have given you in the
dbms folder, and with Berkeley DB. In particular, you should
review/read the following resources:
The following additional resources may also be helpful:
highly recommended
Before you begin coding, we strongly encourage you to answer the questions found here:
code-reading and design questions
In the code that you write, you must limit yourself to the packages that we’ve imported at the top of the starter files. You must not use classes from any other Java package. In addition, you must not use any Java features that were not present in Java 8.
As discussed on the separate configuration page, you will need to compile and run the code from the command line in the Terminal window of VS Code.
Important: If you haven’t enabled Auto Save in VS Code, make sure to save any changes that you make to the code before you attempt to compile and run it. See our configuration guidelines for details about how to enable Auto Save.
to compile:
javac -cp 'lib/*' -d classes *.java
(see below for the expected warning messages)
to run on Windows:
java -cp 'lib/*;classes' DBMS
to run on macOS:
java -cp 'lib/*:classes' DBMS
Note: The two commands for running the program are almost
identical, but in the Windows version there is a semi-colon (;)
before the word classes, whereas the macOS version uses a
colon (:).
You will see one or more warnings when compiling your code (e.g., “Note: Parser.java uses unchecked or unsafe operations.”). These warnings are to be expected and should be ignored. Messages labeled as errors (not warnings) will keep your code from compiling and will need to be addressed. You shouldn’t see any errors when you compile the starter code that we’ve given you. If you do, let us know.
After making changes to the code, you will need to save the changes and recompile the code before you can try to re-run it. When you are at the command line of the Terminal, using the up arrow will allow you to access and reenter previously entered commands without needing to re-type them!
The code that we’ve given you can be run before you make any changes. It will begin by printing the following prompt:
Enter command (q to quit):
If you enter a valid SQL command, the program will parse the command
and display a summary of some of the command’s components (see the
notes on the DEBUG constant below for how to disable this summary).
Entering a lower-case q will allow you to quit the program.
When you run the program for the first time, it will create a
directory called db within your code directory. This is the home
directory for the Berkeley DB environment, and it will be used to
store the files that BDB creates for your database. If your program
crashes for any reason, these files may be corrupted. As a result,
we recommend that you remove all files from this directory after
a crash.
There is a constant named DEBUG that is defined in DBMS.java. When
it is set to true (as it is in the file that we have given you),
the values of many of the tokens generated by the parser are printed
after each SQL command is entered by the user. You may find this
information helpful as you implement the various types of commands.
You may also wish to add additional debugging code that is only
executed when this constant is set to true. To eliminate the
debugging messages, set DEBUG to false.
20 points; pair-optional
This is one of only two problems that you may complete with a partner. See the rules for working with a partner on pair-optional problems for details about how this type of collaboration must be structured.
Important
Before you begin coding, make sure that you have completed the necessary preparation, including performing the tasks listed under the Getting Started section and answering the code-reading questions.
In order to insert rows into a table, your DBMS needs to be able to
marshall a collection of column values into a single Berkeley DB
key/value pair. In this problem, you will add support for marshalling
by implementing the key method of the InsertRow class.
As you saw when completing the code-reading
questions, an InsertRow object is used by
the execute() method for INSERT commands (the one in the
InsertStatement class). That execute() method creates an
InsertRow object to represent the row to be inserted, and it calls
that object’s marshall() method to prepare the marshalled
key/value pair for the row.
We have already implemented some of the other methods of this class for you:
InsertRow constructor that initializes the state of object.
It takes two parameters: an already opened Table object for
the table to which the row will be added, and an array of type
Object containing the values in the row to be inserted. We
assume that the values are in the appropriate order – i.e.,
that element 0 of the array contains a value for the first
column in the table, element 1 contains a value for for the
second column in the table, etc. We also assume that the values
are valid and that they have been adjusted as needed to
correspond to the types of the columns.getKeyBuffer() method that returns a RowOutput object for
the key portion of the marshalled key/value pair.getValueBuffer() method that returns a RowOutput object for
the value portion of the marshalled key/value pair.toString() method that returns a String representation
that includes:offsets field, which is an
array in which your code will store the field offsets that
you compute at the beginning of marshalling (see below)RowOutput for the key)RowOutput for the value).
This toString() method should help you when debugging your
marshalling code.You will implement the marshall() method, which should take the
column values of the InsertRow object and marshall them into byte arrays
for the key/value pair that will eventually be inserted into the
B+tree for the table.
Important: marshall() should not interact with Berkeley DB at
all. In particular, it should not create any DatabaseEntry
objects or attempt to add them to the BDB database.
Rather, marshall() should only do the following:
Determine the correct offset values and store them in the
array to which the offsets field in the InsertRow object refers.
Write the appropriate values into the buffers represented by
the keyBuffer and valueBuffer fields, each of which refers to
a RowOutput object.
See below for more detail about each of these tasks.
Notes:
Each key/value pair should have the format that we discussed in the lecture notes on the logical-to-physical mapping. The key portion of the key/value pair should be based on the value of the primary-key column. The value portion should consist of a header of offsets followed by the values of the non-primary-key, non-null columns.
The key portion of the key/value pair will be stored in the
RowOutput object assigned to the keyBuffer field of the
InsertRow object. The value portion will be stored in the
RowOutput object assigned to the valueBuffer field.
Because RowOutput objects fill their associated byte
arrays from left to right, you will need to determine all of
the offsets that belong in the header before you begin marshalling the
column values themselves. Store these offsets in the array to
which the InsertRow object’s offsets field refers.
Once all of the offsets have been computed and stored in
the offsets array, you can begin the process of writing into the
RowOutput objects using the appropriate methods.
The InsertRow constructor takes a reference to the
corresponding Table object as a parameter, and it stores that
reference in a field called table. Your code can obtain any
column information that it needs from the Table object and its
associated Column objects.
The getLength() method in a Column object gives the actual
length in bytes of all columns except VARCHARs. In the case of
VARCHARs, you should determine the length by invoking the
String.length() method on the actual value.
Because the column values are stored in an array of type Object,
you will need to use type casts in order to treat them as objects
of their actual types. For example, to treat values[i] as a
String, you would need to do something like (String)values[i].
Consult the Column class for the method you should use to determine
the type of a given column.
Integer values are stored in the values array as objects of
Java’s Integer class, and real values are stored as objects of
Java’s Double class. When marshalling these values, you will
need to convert them to primitive values of type int and
double, and you should use the Integer.intValue() and
Double.doubleValue() methods to do so. For example, if you have
an Integer object named val, you can convert it to an int by
making the method call val.intValue().
The RowOutput methods that you will use for writing the offsets
and column values are inherited from the DataOutputStream class,
so you should make sure to review the
API
of that class.
When marshalling a String value, you should use
the writeBytes() method, not the writeUTF() method.
You should assume that all offset values are small enough to be
represented by a two-byte integer, and thus you should use the
writeShort() method for them.
To keep the marshall() method from getting too large, you may
want to add one or more private helper methods that can be called
to do part of the overall task.
Important: If you write a helper method that uses one or
more of the RowOutput methods, you must include a throws
clause in the header of the method like the one we’ve given you
for the marshall method:
public void marshall() throws IOException {
Review the Table, Column, RowOutput, and
DataOutputStream classes as needed.
marshall() methodYou should test your marshall() method thoroughly.
We’ve given you the following tools for doing so:
The RowOutput class includes a toString() method that
shows the current contents of the underlying byte array.
The InsertRow class includes its own toString() method that shows
the current values in the InsertRow object’s offsets array and
the contents of the byte arrays underlying the RowOutput objects
assigned to its keyBuffer and valueBuffer fields.
The code that we’ve given you in the execute() method of
InsertStatement will create the necessary InsertRow object,
call your marshall() method, and – if the DEBUG constant in
the DBMS class is true – print the InsertRow object so that
you can examine the values of its fields.
Given these tools, you can:
Compile and run the DBMS program as described above.
Create a table using a CREATE TABLE command. The starter code
already includes everything needed to carry out this type of command.
Enter one or more INSERT commands for the newly created table, and
see if the output from printing the InsertRow object looks correct.
For example, let’s say that you enter these two SQL commands:
CREATE TABLE Movie(id CHAR(7) PRIMARY KEY, name VARCHAR(64), runtime INT);
INSERT INTO Movie VALUES ('2294629', 'Frozen', 102);
If your marshall() command is working correctly, you should see the
following as part of the output of the debugging print statement:
for the offsets field:
[-2, 8, 14, 18]
Because there are three columns, there are four offsets. The -2
indicates that the first column (id) is the primary key.
The next two offsets (8 and 14) are the offets of the name
and runtime column values, and the 18 is the offset
of the end of the record.
for the key buffer (i.e., the keyBuffer field):
[50, 50, 57, 52, 54, 50, 57]
The numbers in this byte array represent the ASCII codes for
the characters in the id value '2294629': 50 for the
character '2', 57 for the character '9', etc.
for the value buffer (i.e., the valueBuffer field):
[-1, -2, 0, 8, 0, 14, 0, 18, 70, 114, 111, 122, 101, 110, 0, 0, 0, 102]
This byte array begins with 8 bytes for the offset table:
The first two bytes ([-1, -2]) represent the special -2
offset for the primary-key column. When -2 is represented
using a two-byte integer, the individual bytes
end up being the 8-bit representations of -1 and -2.
In general, when you use multiple bytes to store a
negative number whose absolute value is relatively small,
the rightmost byte will show the negative number itself,
and all of the remaining bytes will show -1. For example,
if we stored -3 using two bytes, we would see [-1, -3] as
its two bytes. If we stored -10 using four bytes, we
would see [-1, -1, -1, -10].
The next two bytes ([0, 8]) represent the offset of the
name column, which has an offset of 8 bytes because it
comes immediately after the offset table, which has a
length of 4*2 = 8 bytes.
The next two bytes ([0, 14]) represent the offset of the
runtime column, which has an offset of 8 + 6 = 14 bytes
in this particular row.
The next two bytes ([0, 18]) represent the offset of the
end of the record, which is 14 + 4 = 18 in this particular
row.
The next 6 bytes represent the ASCII codes for 'Frozen':
70 for 'F', 114 for 'r', etc.
The final 4 bytes ([0, 0, 0, 102]) represent the 4-byte
integer stored for the runtime value of 102.
Note: When you store larger integers, the resulting bytes can be harder to interpret. Here are some examples:
If you stored a runtime of 150 in the Movie table that we
created above, you would see the bytes [0, 0, 0, -106] for
the runtime. This stems from the fact that when only one byte
is used to store a signed integer (one that could be
negative), it can store any value between -128 and 127. When
we store 150 using two or more bytes, the 8 bits in the
rightmost byte look like they represent a negative number,
because 150 can’t actually be represented using an 8-bit signed
integer.
If you stored a runtime of 300, you would see the bytes
[0, 0, 1, 44] for the runtime. That’s because we need more
than 8 bits to store 300 as a binary number. In fact, when
we convert 300 to binary, we get a 9-bit number: 100101100.
When these 9 bits are stored as part of a 32-bit integer, we
get:
00000000 00000000 00000001 00101100
The bits in the rightmost byte represent the integer 44, and the bits in the byte to its left represent the integer 1.
Try inserting other rows as well, and convince yourself that your
marshall() method is working in all cases. For example, does it
work correctly when one of the column values is NULL?
20 points; pair-optional
This is one of only two problems that you may complete with a partner. See the rules for working with a partner on pair-optional problems for details about how this type of collaboration must be structured.
In order to execute a SELECT command, your DBMS needs to be able to
iterate over the rows in one or more tables, and to access the values
of the columns in those rows. In this problem, you will complete the
implementation of a table iterator that will be able to iterate over
all or some of the rows in a single table and access the values of the
columns. We can associate a WHERE clause with such an iterator, in
which case it will only visit rows that satisfy the WHERE clause.
Each table iterator will be an instance of the provided TableIterator
class. We have already implemented most of the methods of this class
for you, including:
a TableIterator constructor that takes an already opened table
object and initializes the state needed by the table iterator,
including:
DatabaseEntry objects called key
and value. These DatabaseEntry objects will be used
by the cursor methods to retrieve the current key/value pair.The constructor also examines the columns mentioned in the SQL
statement for which this iterator is needed, and it associates
this iterator with those columns; doing so allows the code that
evaluated the WHERE clause to use the iterator to obtain the
column values that it needs.
a first() method that positions the iterator on the first
tuple of the table.
a next() method that advances the iterator to the next tuple
specified by the SELECT command.
a getColumn() method that takes an index nand returns a
Column object for the nth column in the table associated
with the iterator. The leftmost column has an index of 0.
a close() method that closes the cursor associated with
the iterator.
a printAll() method that will be called to iterate over all rows
in the associated table and print them out.
For this assignment, you should implement the method called
getColumnVal() that takes an index colIndex and returns the value
of the column with that index in the tuple on which the iterator is
currently positioned. To do so, it will need to unmarshall the
appropriate value from the BDB key/value pair associated with that
tuple, and it should use the metadata that you included when you
marshalled the tuple to efficiently access the value of the specified
column. See the notes below for more detail.
Notes:
We have already given you the code needed to handle the two types of exceptions that are mentioned in the comments before the method.
You code that you write should assume that the underlying cursor
has already been positioned on an appropriate key/value pair. The
key can be accessed using the DatabaseEntry object to which the
TableIterator‘s key field refers, and the value can be
accessed using the DatabaseEntry object to which the
TableIterator‘s value field refers.
Your code will need to use one or two RowInput objects to
unmarshall the value of the specified column.
For example, to create a RowInput object that is based on the
value portion of the current key/value pair, you would do something
like the following:
RowInput valIn = new RowInput(this.value.getData());
Your getColumnVal() method should not perform unnecessary
reads. Rather, it should only read (1) the offset or offsets
needed to determine where the column value is located and (when
necessary) the length of the column value, and (2) the column
value itself.
The RowInput class includes two methods for each type of value:
readIntAtOffset() and
readDoubleAtOffset()). These methods jump to the specified
position in the underlying byte array before performing the read.readNextInt() and readNextDouble()).
When the RowInput object is created, the
current offset is set to 0. After each read, the current
offset is updated to be the offset of the byte that comes
immediately after the value that was just read.The RowInput class also includes a toString() method that you
may find useful when debugging. It returns a string that includes
the contents of the underlying byte array and the current offset
within that array.
When unmarshalling the key portion of the key-value pair, you
can use the getSize() method in its DatabaseEntry object
to determine the key’s length in bytes.
Review the Table, Column, and RowInput classes as needed, as
well as the Berkeley DB DatabaseEntry class.
Once you complete Problem 8, you can create a database, add rows to
it, and then perform SELECT * commands involving that table.
The following is a sample interaction with the DBMS. Note: We set
the DEBUG constant to false in DBMS.java before we ran these
commands.
Enter command (q to quit): CREATE TABLE Course(id CHAR(5) PRIMARY KEY, name VARCHAR(20));
Created table Course.
Enter command (q to quit): INSERT INTO Course VALUES ('01000', 'CS 460');
Added 1 row to Course.
Enter command (q to quit): INSERT INTO Course VALUES ('00050', 'Math 123');
Added 1 row to Course.
Enter command (q to quit): INSERT INTO Course VALUES ('02050', NULL);
Added 1 row to Course.
Enter command (q to quit): SELECT * FROM Course;
| id | name |
----------------------------------
| 00050 | Math 123 |
| 01000 | CS 460 |
| 02050 | null |
Selected 3 tuples.
Enter command (q to quit): q
Make sure to try to lots of different examples so that you can ensure that your unmarshalling code is working correctly! See below for an additional tool for testing.
30 points total
Still to come
Last updated on February 25, 2026.