Project 4: Building a Back-End for Cish

Due October 25, 11:59pm.


In this assignment, you will write a compiler for Cish, an extension of Fish with procedures and local variables.


Repo Access and Submitting

(a) Instructions to access the psets repo

  1. Click on the assignment link and sign in using your personal GitHub account.
  2. Create a new group or join an existing group. For the sake of blind grading, please make sure your group name has no personal identifiers! (For example, you can generate a random name here.)
  3. You have now created a new repo under the HarvardCS153–2018-Fall organization; commit and push your assignment here!

(b) Instructions to submit

  1. Make sure you are using OCaml version 4.07.0, or at the least that your submission compiles and runs with OCaml 4.07.0. This is our “supported” version and is what we will be using to grade your problem set!
  2. Please list your HUID(s) in the file huids.txt. This will help us perform blind grading.
  3. Your last commit before the deadline to the master branch of your repo is your final submission. Please note that no late submissions will be accepted. Don’t forget to push!

Background: Cish

Your job for this assignment is to implement a compiler that maps Cish source code down to MIPS assembly. Cish is quite similar to Fish except that it adds functions, function calls, and local variables.

A Cish program is a list of functions. Each function is of the form

var(var1,...,varn) { stmt }

mimicking functions in C. Here, var is the name of the function and var1,...,varn are the formal parameters of the function. The body of the function is a statement that should return an integer value to the caller. The distinguished function main is used to launch the program. For Cish, main should take no arguments.

Cish has local variables. In the concrete syntax, a local variable is declared using a statement of the form val x = exp;. The scope of a local variable is similar to the scope of a local variable in C. To support local variables, we have added a new type of node to the AST:

Let of var * exp * stmt

The semantics of this node are as follows. First, exp is evaluated to a value v. Then, a new local variable var is defined with the initial value v. The scope of var extends only across stmt and is unavailable outside. Note that variables in Cish can shadow each other. For example, this code should return 0:

val x = 42;
{
    val x = 0;
    return x;
}

However, this code should return 42:

val x = 42;
{
    val x = 0;
}
return x;

Cish programs do not have global variables.

To support function calls, we we have added a new form of expression: var(exp1,...,expn). Here, var is the name of the function. Similar to C, Cish does not specify the order in which the arguments to a function are evaluated, and instead leaves it up to the compiler implementor (i.e., you!). For example, consider this code:

val x = 0;
val y = f(x, x = 42);
...

A Cish compiler is allowed to generate code that calls f(0, 42) or f(42, 42).


Assignment

We’ve provided the abstract syntax, lexer, parser, and updated interpreter. You have to provide the compiler. Please follow the calling convention from the class notes, with the following exceptions: * You do not need to worry about keeping the stack pointer double-word aligned. * You do not need to enforce the minimum frame size. * A caller of a function with n arguments should set aside space on the stack for max(n, 4) arguments (so, even if the caller is calling a function with two arguments, it should set aside enough space on the stack for four).

When calling a function, make sure to save any caller-save registers that you need preserved across the call, and within a function, make sure to save any callee-save registers used by the function. Please post any questions about the calling convention to Piazza!

A simple strategy for compilation is to keep an environment around that maps variables (including formal parameters and local variables) to integer offsets relative to the frame-pointer. One option is to make a pass over the code and determine how many distinct variables there are and where they will live before compiling the code. After this pass, you will be able to generate the prologue and epilogue. Another strategy is to “push” locally-defined variables on the stack and “pop” them off as you encounter them. Regardless, you’ll need to keep track of where each variable lives relative to either the stack or the frame pointer.

In addition to handling local variables, you will need to have a strategy to handle temporary values. We suggest you start by just pushing temporary values on the stack and popping them off (as sketched in the class notes). Once you have this working, you can move on to something more sophisticated (if you want; it’s not necessary to do so, though).

Running make in the current directory generates an executable proj4, which expects a file to compile and outputs MIPS assembly. This executable automatically adds some debugging code to the code your compiler generates. The assembly code can be tested using the MIPS simulator SPIM.

Notes