CS 1101 Computer Science I
Spring 2016

Computer Science Department
The Morrissey College of Arts and Sciences
Boston College

Problem Set 8: Overlap Graphs

Assigned: Monday April 18, 2016
Due: Monday April 25, 2016
Points: 9 Points (up to 12 Points with the extra credit part)

This is an individual problem set. You may consult with friends, but each of you must write your own code independently. The problem set has a required part and an optional extra credit part.

Part 1: (Required, 9 Points): Overlap Graphs

Rosalind is a terrific website with problems in bioinformatics. The website is named after Dr. Rosalind Franklin, whose X-ray diffraction work was central to the discovery of the helical structure of DNA.

Solve the Overlap Graphs problem. Your solution should be a self-contained OCaml program in a single file named overlap.ml. When compiled from the Unix shell with:


> ocamlc -o overlap overlap.ml

and then run from the shell as in:


> ./overlap a.fas 3

where a.fas is the name of a FASTA file of the form shown on the Overlap Graphs page, the program should create a new file a.graph containing a representation of the overlap graph as specified on the Rosalind page.
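
One way to handle the output side is sketched below, under some assumptions. The helper names graphName and writeEdges are hypothetical, and the one-pair-per-line format is an assumption that should be checked against the Rosalind page. Filename.remove_extension needs OCaml 4.04 or later; Filename.chop_extension is the older counterpart.

```ocaml
(* graphName : string -> string
 *
 * Hypothetical helper: replaces the extension of the input file
 * name with ".graph", so "a.fas" becomes "a.graph".
 *)
let graphName fastaName =
  Filename.remove_extension fastaName ^ ".graph"

(* writeEdges : string -> (string * string) list -> unit
 *
 * Hypothetical helper: writes each (source, target) pair on its own
 * line, separated by a space. The format is an assumption.
 *)
let writeEdges filename edges =
  let ouch = open_out filename in
  List.iter (fun (s, t) -> Printf.fprintf ouch "%s %s\n" s t) edges;
  close_out ouch
```

A call such as writeEdges (graphName "a.fas") edges would then create a.graph in the current directory.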

Notes

  1. There is no harness code for this problem set. Use SublimeText to create both the sample FASTA files and the source file overlap.ml. Feel free to use a variation of the following snippet of code for reading the input text from the FASTA file.

    
    (* readLines : string -> string list
     *
     * The call (readLines filename) returns a list of strings with
     * one list-entry for each line in filename, in file order.
     *)
    let readLines filename =
      let inch = open_in filename in
      (* lines accumulates in reverse, so it is reversed on return. *)
      let rec repeat lines =
        try
          repeat ((input_line inch)::lines)
        with
          End_of_file -> close_in inch;
                         List.rev lines
      in
      repeat []
    
    

  2. Remember that the Unix command-line arguments to your program can be found in the string array Sys.argv. Sys.argv.(0) holds the program name itself, so your own arguments start at index 1.

  3. You'll want to make use of several of the functions in OCaml's String module.
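
The notes above can be combined into a short sketch. The names overlaps and parseArgs are hypothetical, and the sketch assumes the overlap test is the usual one, in which the last k characters of one string equal the first k characters of another. Check that against the Rosalind page before relying on it. Only String.length and String.sub are used from the String module.

```ocaml
(* overlaps : int -> string -> string -> bool
 *
 * Hypothetical helper: true when the last k characters of s equal
 * the first k characters of t. Returns false when k is not positive
 * or exceeds the length of either string.
 *)
let overlaps k s t =
  let n = String.length s in
  k > 0 && k <= n && k <= String.length t
  && String.sub s (n - k) k = String.sub t 0 k

(* parseArgs : string array -> string * int
 *
 * Hypothetical helper: expects an array shaped like Sys.argv, with
 * the program name at index 0, the FASTA file name at index 1 and
 * the overlap length at index 2.
 *)
let parseArgs argv =
  if Array.length argv <> 3 then
    failwith "usage: ./overlap <fasta-file> <k>"
  else
    (argv.(1), int_of_string argv.(2))
```

In the main program this would be called as parseArgs Sys.argv. Whether a string may overlap with itself is left to the Rosalind specification and is not decided by this sketch.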

Part 2: (Optional, 3 Points): Locating Restriction Sites

Do the Locating Restriction Sites problem on Rosalind.
Created on 01-19-2016 23:09.