Consider this schedule of two transactions:
T1 | T2 |
---|---|
r(X) | |
r(X) | |
w(Y) | |
w(Y) | |
commit | |
commit |
Is this schedule:
conflict serializable?
To see this, note that there is only one pair of conflicting actions: T1’s write of A, and T2’s write of A. This means that T1 must come before T2 in any equivalent serial schedule. And since this is the only conflict-based constraint, the original schedule is equivalent to T1;T2.
An alternative way of showing this is to note that T2’s read of X doesn’t conflict with T1’s write of Y because the two operations involve different data items. Therefore, we can swap those two operations to produce the serial schedule T1;T2. (Note that the commits do not matter when it comes to determining conflict serializability).
serializable?
If we didn’t already know that the schedule is conflict serializable, we could also show that it is serializable by showing that its effects are equivalent to those of the serial schedule T1;T2:
T1 | T2 |
---|---|
r(X) | |
w(Y) | |
commit | |
r(X) | |
w(Y) | |
commit |
In both the original schedule and the schedule T1;T2:
Thus, the schedule above is serializable.
recoverable?
cascadeless?
A cascading rollback occurs when:
One transaction (call it T) writes a data item A.
Another transaction (call it U) reads the value that T wrote before T commits; that is a dirty read.
T is rolled back, and thus the old value of A is restored.
U must also be rolled back, because it read a value of A that no longer exists.
Since there are no dirty reads in this schedule, there will never be a reason for a cascading rollback, and thus the schedule is cascadeless.
Consider this schedule of two transactions:
T1 | T2 |
---|---|
r(Y) | |
r(Y) | |
w(X) | |
r(X) | |
w(Y) | |
w(Y) | |
commit | |
commit |
Is this schedule:
conflict serializable?
r1(Y)...w2(Y)
(the fact that T1 reads Y before T2 writes Y)
means that T1 must come before T2 in the equivalent serial
schedule.
r2(Y)...w1(Y)
means that T2 must come before T1.
These constraints contradict each other, so the original schedule is not conflict serializable.
serializable?
We can show this by showing the schedule is not equivalent to either of the two possible serial schedules, T1;T2 or T2;T1.
It’s not equivalent to T1;T2 because:
In the schedule above, T2 reads the initial value of Y, and it performs the final writes of both X and Y.
In T1;T2, T2 still performs the final writes of X and Y, but it would read the value of Y written by T1 and thus it might write different values for those data items.
It’s not equivalent to T2;T1 because:
In the schedule above, T2, writes the final value of Y.
In T2;T1, T1 would write the final value of Y.
Thus, it is not serializable.
recoverable?
cascadeless?
Would any of the answers for schedule 2 change if we swapped the order of the two commits?
Now that T1 commits before T2 does, if the system were to crash after T1 committed but before T2 committed—or if T2 were to be rolled back after T1 committed—the system could be put into an inconsistent state.
This is because:
T1 performed a dirty read of the value of X that T2 wrote (and may have based its write of Y on that value).
A rollback of T2 will undo T2’s write of X.
Because T1 read that value, it should be rolled back, too, but if T2’s rollback happens after T1 commits, it’s too late to roll back T1.
Precedence graphs can help us to determine if a schedule is conflict serializable.
Consider this schedule of four transactions:
T1 | T2 | T3 | T4 | |
---|---|---|---|---|
1 | r(X) | |||
2 | r(X) | |||
3 | w(Y) | |||
4 | r(Y) | |||
5 | r(Y) | |||
6 | w(X) | |||
7 | r(W) | |||
8 | w(Y) | |||
9 | r(W) | |||
10 | r(Z) | |||
11 | w(W) | |||
12 | r(Z) | |||
13 | w(Z) |
Draw a precedence graph for this schedule.
If the schedule is conflict serializable, state one possible equivalent serial schedule. If the schedule is not conflict serializable, explain briefly why the schedule is not equivalent to the serial schedule T1; T2; T3; T4. Your explanation should include a discussion of why particular actions from the original schedule are not consistent with that serial schedule.
We add an edge from transaction A to transaction B in the graph if transaction A would have to come before transaction B in an equivalent serial schedule.
Here are all of the conflicting actions from different transactions and their corresponding constraints:
conflicting actions | ordering constraint |
---|---|
r1(X)...w2(X) | T1 → T2 |
w1(Y)...r3(Y) | T1 → T3 |
w1(Y)...r2(Y) | T1 → T2 |
w1(Y)...w3(Y) | T1 → T3 |
r2(Y)...w3(Y) | T2 → T3 |
r3(W)...w4(W) | T3 → T4 |
r4(Z)...w1(Z) | T4 → T1 |
r1(X)...w2(X) | T1 → T2 |
And here’s the precedence graph:
As you can see, there are two cycles in the graph:
T1→T3→T4→T1 T1→T2→T3→T4→T1
This means that the schedule is not conflict serializable.
The schedule isn’t equivalent to the serial schedule T1;T2;T3;T4 because T4’s read of Z (r4(Z)) comes before T1’s write of Z (w1(Z)), and thus T4 would need to come before T1 in any equivalent serial schedule, which isn’t the case in the serial schedule T1;T2;T3;T4.
In PS 3: Part II, you will implement portions of a simple relational DBMS.
The assignment includes a set of code-reading and design questions that will help you to become familiar with the provided code, and to map out a preliminary design for the code you will write.
Let’s work together on questions 8-10 from the section of the code-reading questions on marshalling.
We would use an offset of -2 for the primary key value, which indicates that it will go in the key portion of the key-value pair.
We would use an offset of -1 for a null value.
For a table Foo(a INT PRIMARY KEY, b VARCHAR(20))
:
The row (1, 'hello')
would be stored as follows:
key = the four-byte integer 1
value = the following record:
where we begin with a header of three offsets:
offsets[0] = -2
, since the first column is the primary key,
and thus its value can be found in the key portion of the
key-value pair.
offsets[1] = 6
, since the second column’s value ('hello'
)
comes after the 3 offsets, which are 2 bytes each (3*2 = 6).
offsets[2] = offsets[1] + len('hello')
= 6 + 5
= 11
This is the offset of the end of the record.
After the offsets, we have the value of the non-primary-key column (‘hello’).
The row (2, null)
would be stored as follows:
key = the four-byte integer 2
value = the following record:
where we just need a header of three offsets:
offsets[0] = -2
as before
offsets[1] = -1
, because the value of the second column
is null.
offsets[2] = 6
, which is the offset of the end of the
record. Since the record only includes the three 2-byte
offsets, its length is 6.
There is nothing after the offsets because all of the column values are either in the key or null!
Later, you should do questions 11 and 12 on your own, along with the rest of the code-reading and design questions.
As you work on the assignment, you should also consult the API pages for both the DBMS code framework and for Berkeley DB Java Edition.
Last updated on October 24, 2024.