Project 2: Parsing

Due October 4, 11:59pm.


In this assignment you will implement two lexers and parsers for a small language called Fish (short for Fortran-ish). You will use parsing combinators to construct the first lexer and parser. You will use the tools Lex and Yacc to build the second front-end.


Repo Access and Submitting

(a) Instructions to access the psets repo

  1. Click on the assignment link and sign in using your personal GitHub account.
  2. Create a new group or join an existing group. For the sake of blind grading, please make sure your group name has no personal identifiers! (For example, you can generate a random name here.)
  3. You have now created a new repo under the HarvardCS153–2018-Fall organization; commit and push your assignment here!

(b) Instructions to submit

  1. Make sure you are using OCaml version 4.07.0, or at the least that your submission compiles and runs with OCaml 4.07.0. This is our “supported” version and is what we will be using to grade your problem set!
  2. Please list your HUID(s) in the file huids.txt. This will help us perform blind grading.
  3. Your last commit before the deadline to the master branch of your repo is your final submission. Please note that no late submissions will be accepted. Don’t forget to push!

Background: Fish

The Fish language is very simple and only includes integer expressions and statements. Fish is intended to have syntax similar to that of C and Java (in fact, we’ll soon scale Fish up to have more features of these languages).

Here is an example Fish program:

/* this is a comment */
x = 42;
y = 31;
for (s = 0; x > 0; x = x - 1) {
  y = y + 1;
}
return y;

More example programs can be found in the test directory.

We have defined the abstract syntax of Fish in ast.ml. However, in the interest of not giving too much away, we have intentionally not specified the language’s concrete syntax thoroughly. In the course of this assignment you will design grammars defining the concrete syntax of Fish. Here are some guidelines that we expect you to follow:

In general, we expect that the concrete syntax you develop for Fish will support “reasonable” programs, such as the ones in the test directory (which are not exhaustive). Feel free to post questions clarifying syntax on Piazza.


Part 1: Parsing with combinators

The file lcombinators.ml contains parsing combinator definitions similar to the ones that we saw in class and which you should use to define your first lexer and parser for Fish. The files comblexer.ml and combparser.ml contain basic definitions for the lexer and parser. You will modify these files to build a parser for Fish programs. Additionally, please write the BNF grammar that your parser implements in a comment in combparser.ml. This grammar must be unambiguous.

To help you get started, we have written a basic parser that can parse some arithmetic expressions and return statements. You can test the parser like so:

$ make
...
$ cat test/01cexpr_01add.fish 
return 12 + 4;
$ ./proj2comb test/01cexpr_01add.fish
answer = 16

Feel free to change or remove any of the skeleton code as you see fit.

Notes

Deliverables

In summary, there are three deliverables for this part: 1. A lexer defined in comblexer.ml. 1. An unambiguous BNF grammar for the concrete syntax of Fish in combparser.ml. 1. A parser implementing this grammar in combparser.ml.


Part 2: Parsing with Lex and Yacc

The files lex.mll, parse.mly, and fishyacc.ml contain the boilerplate code that you’ll need for your Lex and Yacc-based front-end (you should only have to make changes to the first two of these files). Tutorials for Ocamllex and Ocamlyacc can be found here and here, respectively.

Notes

Deliverables

In summary, there are two deliverables for this part: 1. A lexer with rules defined in lex.mll and tokens defined in parse.mly. 1. A parser defined in parse.mly