Assignment 5: A Heap of Characters
Due: Friday, Oct 30, 11:59PM EDT
The goal of this assignment is to extend a compiler with data types that require memory allocation and dereferencing.
You are given a repository with a starter compiler similar to the Hustle language we studied in class. You are tasked with:
incorporating the Con+ features you added in Assignment 3,
incorporating the Fraud+ features you added in Assignment 4,
extending the language to include a string data type,
implementing a number of primitives,
As usual, the code in ast.rkt will need to be studied in order to understand the structure of the AST and how to traverse/operate on Hustle+ programs.
From Fraud+ to Hustle+
Implement all the features of Con+ and Fraud+, extended to Hustle+.
In order to get all the points for this section of the assignment you will need to modify the following files:
compile.rkt
syntax.rkt
Strung out
In the last assignment, you implemented a character data type for representing single letters. In this assignment, you will implement a String data type for representing arbitrarily long sequences of characters.
Strings are disjoint from all other data types and are essentially a fixed-size array of characters. Literal strings are written by enclosing the characters within the string in double quotes ("). Strings can include double quotes by using the escape sequence \".
You must add the following operations to Hustle+:
string? ; Any -> Boolean, which determines if its argument is a string.
string-ref ; String Natural -> Char, which extracts the character at the given index (using 0-based counting). An error is signalled if the index is out of bounds for the given string.
string-length ; String -> Natural, which computes the length of the string.
make-string ; Natural Char -> Natural, which constructs a string of the given length, filled with the given character.
The run-time system has been updated to account for a string type. It assumes a representation where the length of the string is stored in memory, followed by the characters of the string, in order. You are free to change the representation if you’d like, but you will have to update the run-time system to properly print strings. Otherwise, no changes to the run-time system should be necessary.
If you want to understand the details of how strings are implemented in the run-time system. See the function print_string() in main.c.
In order to get all the points for this section of the assignment you will need to modify the following files:
compile.rkt
syntax.rkt
More operations
Add the following operations to the Hustle+ language:
box? ; Any -> Boolean, which determines if a value is a box.
empty? ; Any -> Boolean, which determines if a value is the empty list.
cons? ; Any -> Boolean, which determines if a value is a pair.
= ; Number Number -> Boolean, which determines if two numbers are equal.
< ; Number Number -> Boolean, which determines if the first number is less than the second.
<= ; Number Number -> Boolean, which determines if the first number is less than or equal to the second.
char=? ; Number Number -> Boolean, which determines if two characters are equal.
boolean=? ; Boolean Boolean -> Boolean, which determines if two booleans are equal.
Tests for these primitives have not been provided in test/compile.rkt, therefore you will need to write appropriate tests for these primitives.
In order to get all the points for this section of the assignment you will need to modify the following files:
compile.rkt
syntax.rkt
test/compile.rkt
Extending the Parser
You should not need to make any changes to lex.rkt or to parse.rkt, but you do need to understand the structure of the resulting AST, which is why the information below is provided. The grammar, along with the definitions and utility functions in ast.rkt provide all the information necessary to work with the AST in implementing the new features of Hustle+.
The parser has been extended for you from the Fraud+ parser for the Hustle+ language based on the following grammar:
<expr> ::= integer |
| character |
| boolean |
| variable |
| string |
| empty |
| ( <compound> ) |
| [ <compound> ] |
|
<compound> ::= <prim1> <expr> |
| <prim2> <expr> <expr> |
| - <expr> <maybe-expr> |
| if <expr> <expr> <expr> |
| cond <clause>* <else> |
| let <bindings> <expr> |
|
<prim1> ::= add1 | sub1 | abs | zero? | integer->char | char->integer |
| char? | integer? | boolean? | string? | box? | empty? | cons? |
| box | unbox | car | cdr | string-length |
|
<prim2> ::= make-string | string-ref | = | < | <= |
| char=? | boolean=? | + | cons |
|
<maybe-expr> ::= |
| <expr> |
|
<clause> ::= ( <expr> <expr> ) |
| [ <expr> <expr> ] |
|
<else> ::= ( else <expr> ) |
| [ else <expr> ] |
|
<bindings> ::= ( <binding>* ) |
| [ <binding>* ] |
|
<binding> ::= ( variable <expr> ) |
| [ variable <expr> ] |
There is a lexer given to you in lex.rkt, which provides two functions: lex-string and lex-port, which consume a string or an input port, respectively, and produce a list of tokens, which are defined as follows (only the new parts are shown):
; type Token = ; ... ; | '() ; | String ; type Prim = Prim1 | Prim2 | '- ; type Prim1 = ; | 'add1 ; | 'sub1 ; | 'zero? ; | 'abs ; | 'integer->char ; | 'char->integer ; | 'char? ; | 'boolean? ; | 'integer? ; | 'string? ; | 'box? ; | 'empty? ; | 'cons? ; | 'box ; | 'unbox ; | 'car ; | 'cdr ; | 'string-length ; type Prim2 = ; | 'make-string ; | 'string-ref ; | '= ; | '< ; | '<= ; | 'char=? ; | 'boolean=? ; | '+ ; | 'cons
The lexer will take care of reading the #lang racket header and remove any whitespace.
The code in parse.rkt implements the parser which constructs an AST representing a valid Hustle+ expression, if possible, from a list of tokens.
As an example, parse should produce (prim-e `add1 (list (prim-e `sub1 (list (int-e 7))))) if given
'(lparen (prim add1) lparen (prim sub1) 7 rparen rparen eof)
Testing
You can test your code in several ways:
Using the command line raco test . from the directory containing the repository to test everything.
Using the command line raco test <file> to test only <file>.
- Pushing to github. You can see test reports at:
(You will need to be signed in in order see results for your private repo.)
Note that only a small number of tests are given to you, so you should write additional test cases.
There is separate a repository for tests! The testing set-up is slightly different for this assignment. There is a whole other repository that contains tests. When you push your code, Travis will automatically run your code against the tests. If you would like to run the tests locally, clone the following repository into the directory that contains your compiler and run raco test . to test everything:
https://github.com/cmsc430/assign05-test.git
This repository will evolve as the week goes on, but any time there’s a significant update it will be announced on ELMS.
Submitting
Pushing your local repository to github “submits” your work. We will grade the latest submission that occurs before the deadline.