Note: In addition to the questions below, we strongly encourage you to consult the solutions to Lab 5, which includes a number of query problems that are similar to the ones in this assignment.
My query has a syntax error, but I can’t find it. Do you have any suggestions?
Here are some things to double-check:
Variable names should begin with a dollar sign ($).
You should have commas between different components of a
for clause, let clause and return clause.
Each condition in a where clause should be a boolean
expression – something that would give you true or false –
and multiple conditions should be separated by either
and or or.
If you are defining new start and end tags in your
return clause, make sure that your end tag includes
a backslash. For example, the following is incorrect:
<location>{ string($b) }<location>
Instead, it should be:
<location>{ string($b) }</location>
My query doesn’t have any syntax errors (BaseX says it is OK), but I’m not getting any results or I’m missing some of the results that I should be seeing. Do you know why that would be?
One thing to double-check is that all of your element and attribute names are correct. To remind yourself of what the elements and attributes look like in our database, you can consult our description of its schema.
For example, if you wanted to obtain all movie elements, the correct XPath expression is:
//movie
If you accidentally included an s at the end of the word
(//movies), you wouldn’t produce an error, but you wouldn’t
get any results!
In addition:
Make sure that all attribute names are preceded by an @
symbol.
If you use a predicate in an XPath expression, make sure it is in the correct position within the expression. For example, in Lab 5 we considered the following XPath expression:
//account[branch = $b]/balance
The following incorrect version of this expression would not produce any results:
//account/balance[branch = $b]
If you are testing for the presence of a single ID value
within the value of an IDREFS attribute, make sure that you
use the contains function and not the = operator.
When using contains, make sure that the first input
is the larger string and the second input is the substring
that you are looking for.
See the next question for some other suggestions for debugging your query.
My query produces some results, but they’re not fully correct. Do you have any suggestions for how to debug my query?
One thing that can help is to try executing simpler versions of your query to see if the various components of the query are giving you what you expect them to.
For example, consider the following incorrect query, which is similar to one from Lab 5:
for $b in distinct-values(//branch) let $balances := //account/balance[branch = $b] return <branch> { <location>{ $b }</location>, <total_balance>{ sum($balances) }</total_balance> } </branch>
You could start by running just the XPath expression used in
the for clause:
distinct-values(//branch)
If you did so, you would see that it gives you the expected results: the names of all of the branches in the bank database from Lab 5.
Next, you could simplify the query as follows to see if the
let clause is producing the correct results:
for $b in distinct-values(//branch) let $balances := //account/balance[branch = $b] return ($b, $balances)
If you ran this simpler query, you would only see the branch
names, which indicates that $balances is always being assigned
an empty set. This would help you to realize that you need to
revise the XPath expression in the let clause.
Next, you could try a revised version of that XPath expression:
for $b in distinct-values(//branch) let $balances := //account[branch = $b]/balance return ($b, $balances)
This time, you would see the correct balance elements for each
branch, so you would know that the let clause is functioning
correctly.
Finally, you could restore the original return clause and
confirm that it works as well.
I know we need to use curly braces in some of our return
clauses, but I’m not sure when we need them.
You need them if you are creating new types of elements. For example, consider the following query from Lab 5:
for $c in //customer let $c_accounts := //account[contains(@owners, $c/@customer_num)] return <customer> { $c/name, for $a in $c_accounts return <account> { string($a/branch), "-", string($a/balance) } </account> } </customer>
We’re creating new elements of type customer, so we need to
specify the start and end tags of those elements, and we also
need curly braces inside of those tags so that XQuery will
evaluate the expressions needed to produce the contents of the new
elements.
In addition, we are creating new elements of type account, so
we need to specify their start and end tags and have curly braces
for their contents as well.
However, we are using the existing name child element of the
customer element assigned to $c – without changing its
existing tags – so we can simply say $c/name, without
specifying start and end tags and without an additional set
of curly braces.
I’m able to compile the starter code using the specified instructions, but I’m not able to run it. Any suggestions?
Try the following:
Uninstall your current installation of VS Code along with any programs that include the words “Java” or “JDK” in their names.
Reinstall VS Code using the Alternative method described in Lab 0.
If this doesn’t work, you can use VS Code on the virtual desktop.
Can you review the big picture of how we will be using Berkeley DB databases for these problems?
Each table is stored in its own Berkeley DB database (i.e., its own B+tree of key/value pairs). Each row/tuple in a given table will be represented by a single key/value pair in that table’s B+tree. The key portion of the key/value pair will be the value of the primary-key attribute in that row. The value portion of the key/value pair will be a marshalled form of the remaining values in that row. See the lecture notes on the logical-to-physical mapping for more detail.
My program crashed when I was testing it, and now I get an error message whenever I try to run it.
After running the program the first time, you should see a folder
called db within your dbms folder. This is the home directory
for the Berkeley DB environment, and it will be used to store the
files that BDB creates for your database. If your program crashes
for any reason, these files may be corrupted. As a result, we
recommend that after a crash, you either remove all files from
this db directory or simply delete the entire directory.
Note that you may also need to do this if you closed VS Code
while you were in the middle of running the program – before
entering q to quit the program.
If you end up needing to remove the files in the db folder,
you will lose any databases that you created for testing. As a
result, you will need to re-enter the CREATE TABLE and INSERT
commands needed to build them.
Should the marshall() method add the key-value pair for the
row represented by the InsertRow object to the B+tree
(i.e., to the Berkeley DB database)?
No – your marshal method marshall() will not interact with Berkeley
DB at all.
Rather, it should do the following:
Determine the correct offset values and store them in the
array to which the offsets field refers.
Write the appropriate values into the buffers represented by
the keyBuffer and valueBuffer fields.
See the Notes section in Problem 1 for more details on both of these tasks.
The code that we’ve given you for the InsertStatement class will
take care of creating the DatabaseEntry objects for the
key/value pair and adding them to the Berkeley DB database.
How can I get information about a given column in the InsertRow
object – e.g., the type of the column?
You should take the Table object stored in the field called
table and:
use its getColumn() method to get the Column object for
the column in question
use the appropriate method in the Column object to get the
information that you need.
How many bytes do REAL values take up?
8 bytes, since we are using the Java type double for them, and
double values are 8 bytes. Note that you can find the number of
bytes needed for any fixed-length column by:
using the Table object’s getColumn() method to get the
Column object for the column in question
using the getLength() method in the Column object.
I’m trying to test my marshall() method using the example in the
assignment, but I get an error that says:
Movie: a table with this name already exists. Could not create table Movie.
What should I do?
You only need to enter the CREATE TABLE command for a given table
once. After that, you should just enter INSERT commands to test
your marshall() method.
In the assignment, you give us one example of marshalling and
the output that the debugging print statement should produce.
Could you give us another one?
Sure – but make sure that you also perform additional tests that you devise!
If you enter the following commands:
CREATE TABLE Student (name VARCHAR(64), id CHAR(5) PRIMARY KEY);
INSERT INTO Student VALUES ('Alan Turing', '12345');
you should see:
for the offsets field:
[6, -2, 17]
Because there are two columns, there are three offsets:
The 6 is the offset of the value of the first column
(name).
The -2 indicates that the second column (id) is the
primary key.
The 17 is the offset of the end of the record,
which is 6 + 11 = 17 in this particular row.
for the key buffer (i.e., the keyBuffer field):
[49, 50, 51, 52, 53]
The numbers in this byte array represent the ASCII codes for
the characters in the id value '12345': 49 for the
character '1', 50 for the character '2', etc.
for the value buffer (i.e., the valueBuffer field):
[0, 6, -1, -2, 0, 17, 65, 108, 97, 110, 32, 84, 117, 114, 105, 110, 103]
This byte array begins with 6 bytes for the offset table: [0, 6]
for the first offset, [-1, -2] for the special -2
offset, and [0, 17] for the offset of the end of the record.
The remaining 11 bytes represent the ASCII codes for 'Alan Turing':
65 for 'A', 108 for 'l', etc.
I’m trying to debug my marshall() method by creating a table
and inserting rows into it, but I’m getting an error saying that it
cannot insert the row.
This means that an exception is being thrown somewhere in the
execute() method for InsertStatement – and that the exception is
being caught by the catch block that we provided at the end of
that method.
It’s possible that the exception is being thrown by your
marshall() method, or it may be getting thrown by one of the
statements in execute() itself.
You might try adding temporary print() statements at various
points in marshall() to see how much progress it makes and whether
it fully completes. That would allow you to rule out the
possibility that the the exception is being thrown by
marshall().
Sometimes you can get an exception if BDB ends up in a bad state
for some reason – e.g., because you did not quit the program
cleanly by using q. In such cases, you should remove the db
subdirectory, rerun the program, and recreate the table(s) that
you are using for testing. See the last question in the previous
section (Problems 1 and 3) for more details.
In InsertRow, I’ve included one or more helper methods
that are called by marshall. However, when I try to compile my
code, I get an error message that says something like “Unhandled
exception type IOException”. What am I doing wrong?
If you write a helper method that uses one or more of the
RowOutput methods, you must include a throws clause in the
header of the method like the one we’ve given you for the
marshall method:
public void marshall() throws IOException {
I don’t understand how the WHERE-clause evaluation code works. How
can it access the values in the rows?
The WHERE clause is represented by a tree structure that is
constructed by the parser. There’s more info about it in part 2 of
the DBMS code overview that
accompanies the assignment.
The WHERE-clause evaluation code is able to access the values in the
rows because the TableIterator constructor goes through all of the
columns in the WHERE clause and connects each column from the
corresponding table to the table iterator.
Then, when the WHERE clause is evaluated, the getValue()
method is invoked to get the value of each column, and this method
in turn invokes the table iterator’s getColumnVal() method. As a
result, all you need to do for WHERE-clause evaluation is to
implement this getColumnVal() method.
Where does the unmarshalled data actually go? Do we need to store it somewhere?
No. At any point in time, the DBMS only has access to a single row
from a given table. The execute() method that you write for
SELECT commands will call the printAll() method in the
TableIterator object for the table, and printAll() will use
the TableIterator‘s other methods (including the method
that you are writing) to access the appropriate rows from the
table one at a time, and to get the column values for the current
row and print them.
With CHAR fields, the adjustValue() method in Column
objects (which gets called in InsertStatement.execute()) pads
values that are shorter than the specified width with spaces. Are
we supposed to be able to get back the original width when we
unmarshall the value?
No. The DBMS adds spaces as needed to bring CHAR values to their
specified width, and you should unmarshall the value with the
added spaces. When writing a query involving such values, you can
use the LIKE operator (e.g., foo LIKE 'val%') if you don’t
want to enter the extra spaces.
In the getColumnVal() method, how do we access the raw data that
we need to unmarshall?
When implementing getColumnVal(), you can assume that the
DatabaseEntry objects that are represented by the TableIterator‘s
key and value fields hold the key/value pair for the row
whose column value you need to unmarshall.
This works because the calls to cursor.getNext() that are made
in the TableIterator‘s first() and next() methods take
those DatabaseEntry objects as inputs, and cursor.getNext()
uses them to return the next key/value pair.
Given the current key/value pair, your getColumnVal() method
will need to use one or two RowInput objects to unmarshall the
value of the specified column.
For example, to create a RowInput object that is based on the
value portion of the current key/value pair, you would do something
like the following:
RowInput valIn = new RowInput(this.value.getData());
When I try to perform a SELECT command, I’m not seeing any rows, and I’m getting an error message that says something about the buffer being null. What am I doing wrong?
Make sure that your getColumnVal() method is not creating its
own DatabaseEntry objects for the key and the value. The
TableIterator constructor already creates DatabaseEntry
objects for the key and value and assigns those objects to
appropriately named fields, and you must use those fields in your
method.
Last updated on March 2, 2026.