CS 357: Database Systems Implementation
Solution to Homework Assignment 8
1.
Suppose you want
to take the product of STUDENT with STUDENT, in order to find all pairs of
students.
a) One way is to
create a single table plan on STUDENT, and use it twice in the product. Explain why this will produce incorrect (and strange)
behavior when the scan is executed.
The problem is that
the variables s1 and s2 in the product scan refer to the same object. So a call to s1.next will also reposition s2,
and vice versa. Thus the code will only consider
a record paired with itself.
b) A better way is to
create two different table plans on STUDENT, and create a product plan on
them. This returns all combinations of
STUDENT records, but has a problem. What
is it?
The problem is that
the variables s1 and s2 in the product scan have the same fields. Thus the product scan will never be able to
access the field values of the second scan.
That is, a call to getInt or getString will always return the values
from the first scan.
2. Consider the
following relational algebra query:
T1 = select(DEPT, DName='math')
T2 = select(STUDENT,
GradYear=2008)
product(T1, T2)
a) Calculate the
number of disk accesses required to execute the operation.
Table T1 requires a
single scan through the DEPT table, and will return a single record.
B(T1) = 2 R(T1)
= 1
Table T2 requires a
single scan through the STUDENT table, and will return 1/50th of the
records:
B(T2) = 4,500 R(T2) = 900
The product of T1
and T2 is calculated as follows:
B(T1xT2) = B(T1) + R(T1)*B(T2) = 4,502
R(T1xT2) = R(T1)*R(T2) = 900
b) Calculate the number of disk accesses
required to execute the operation if the arguments to product are exchanged.
B(T2xT1) = B(T2) + R(T2)*B(T1) = 6,300
3. The parser method create
does not correspond to the SQL grammar of Figure 18-4.
a) Explain why the
grammar rule for <Create> is too ambiguous to be used for
recursive-descent parsing.
Because all three
options expect their first token to be a “create” token. Thus there is no way to know which option to
take.
b) Revise the grammar
so that it corresponds to how the create
method actually works.
We need to factor
the “create” token into the rule for <Create>:
<Create>
:= CREATE <CreateTable> | <CreateView> | <CreateIndex>
<CreateTable>
:= TABLE IdTok ( <FieldDefs> )
<CreateView> := VIEW IdTok AS <Query>
<CreateIndex>
:= INDEX IdTok ON IdTok ( <Field> )
4a. We need to modify the rule for <SelectList>, and add another one (for <SelectField>):
<SelectList> := <SelectField> [ , <SelectList> ]
<SelectField> := <Field> [ AS <Field> ]
4b. The method for SelectList needs to return both the old and new field names. To do so, I created the new class SelectData. I then changed the methods query and selectList in the class Parser, and added the method selectField. Here is the code for those methods.
I also had to modify the class Lexer to include the keyword "as".
4c. The revision to BasicQueryPlanner
is straightforward. The planner grabs the collection of SelectField
objects
from the parser. It creates the project scan by adding the old
field names to a collection of projected fields. It
then creates the necessary rename scans for each SelectField object that has a non-null new field value.