CS 357: Database Systems Implementation
Solution to Homework Assignment 8


1.  Suppose you want to take the product of STUDENT with STUDENT, in order to find all pairs of students. 
a)  One way is to create a single table plan on STUDENT, and use it twice in the product.  Explain why this will produce incorrect (and strange) behavior when the scan is executed.
 
The problem is that the variables s1 and s2 in the product scan refer to the same object.  So a call to s1.next will also reposition s2, and vice versa.  Thus the code will only consider a record paired with itself.
 
b)  A better way is to create two different table plans on STUDENT, and create a product plan on them.  This returns all combinations of STUDENT records, but has a problem.  What is it?
 
The problem is that the variables s1 and s2 in the product scan have the same fields.  Thus the product scan will never be able to access the field values of the second scan.  That is, a call to getInt or getString will always return the values from the first scan.
 


2.  Consider the following relational algebra query:
 
T1 = select(DEPT, DName='math')
T2 = select(STUDENT, GradYear=2008)
product(T1, T2)
 
a)  Calculate the number of disk accesses required to execute the operation.
   
Table T1 requires a single scan through the DEPT table, and will return a single record.
 
     B(T1) = 2     R(T1) = 1
 
Table T2 requires a single scan through the STUDENT table, and will return 1/50th of the records:
 
     B(T2) = 4,500 R(T2) = 900
 
The product of T1 and T2 is calculated as follows:
 
     B(T1xT2) = B(T1) + R(T1)*B(T2) = 4,502
     R(T1xT2) = R(T1)*R(T2) = 900
 
b)  Calculate the number of disk accesses required to execute the operation if the arguments to product are exchanged.
 
     B(T2xT1) = B(T2) + R(T2)*B(T1) = 6,300



3.  The parser method create does not correspond to the SQL grammar of Figure 18-4.
a)  Explain why the grammar rule for <Create> is too ambiguous to be used for recursive-descent parsing.
 
Because all three options expect their first token to be a “create” token.  Thus there is no way to know which option to take.
 
b)  Revise the grammar so that it corresponds to how the create method actually works. 
 
We need to factor the “create” token into the rule for <Create>:
 
<Create> := CREATE <CreateTable> | <CreateView> | <CreateIndex>
<CreateTable> := TABLE IdTok ( <FieldDefs> )
<CreateView>  := VIEW IdTok AS <Query>
<CreateIndex> := INDEX IdTok ON IdTok ( <Field> )



4a.  We need to modify the rule for <SelectList>, and add another one (for <SelectField>):
 
<SelectList> := <SelectField> [ , <SelectList> ]
<SelectField> := <Field> [ AS <Field> ]



4b.  The method for SelectList needs to return both the old and new field names.  To do so, I created the new class SelectData.  I then changed the methods query and selectList in the class Parser, and added the method selectField.  Here is the code for those methods.
I also had to modify the class Lexer to include the keyword "as".



4c.  The revision to BasicQueryPlanner is straightforward.  The planner grabs the collection of SelectField objects from the parser.  It creates the project scan by adding the old field names to a collection of projected fields.  It then creates the necessary rename scans for each SelectField object that has a non-null new field value.