4 Agreement: a language of numbers between friends
It’s just a phase!
4.1 Let’s Agree
After looking at Abscond: a language of numbers, it may seem a little odd that we are building both a compiler and an interpreter. Furthermore, the difference between the two may not be immediately clear from our first encounter with them.
When implementing a language there are many reasons for choosing to implement an interpreter or a compiler or, increasingly, both. (In fact, many modern language implementations have a compiler as part of their interpreter; this is known as JIT compilation.)
4.2 Abstract syntax for Agreement
Abscond was a simple language with the following grammar:
; type expr = integer
An Agreement program, like an Abscond program, consists of a single expression, and the grammar of expressions introduces one new concept:
; type expr = integer
;           | get-int
So 0, 120, and -42 are programs, just as in Abscond, but so is get-int. get-int is a primitive (something that we, the language implementers, provide in the language) that retrieves an integer from standard input (i.e., the user provides it).
Now that our programs can take more than one shape (barely!) it’s a good time to make our Abstract Syntax Tree explicit:
#lang racket
(provide (all-defined-out))

; type Expr = integer
;           | get-int
(struct int-e (i))
(struct get-i ())

(define (print-ast a)
  (match a
    [(int-e i) `(int-e ,i)]
    [(get-i) 'get-i]))
We are using Racket’s struct feature to provide the nodes of our AST. The int-e (pronounced ‘int expression’) struct takes a single argument, the integer value, while the get-i node takes no arguments, as we don’t know what value it will represent.
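To make the two node shapes concrete, here is a minimal sketch that builds one node of each kind and passes it to print-ast. The definitions are repeated from the AST module only so that the example runs on its own; nothing beyond what the module already defines is assumed.

```racket
#lang racket
;; Repeated from the AST module so this example is self-contained.
(struct int-e (i))
(struct get-i ())

;; Expr -> S-expression
;; Convert an AST node to a printable form.
(define (print-ast a)
  (match a
    [(int-e i) `(int-e ,i)]
    [(get-i) 'get-i]))

;; An integer literal carries its value in the node.
(print-ast (int-e 42))  ; '(int-e 42)

;; get-i carries nothing: its value is only known at run time.
(print-ast (get-i))     ; 'get-i
```

Note that the struct for the primitive is named get-i, while get-int is the name of the primitive in the source language.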
4.3 Meaning of Agreement programs
We can write an “interpreter” that consumes an expression and produces its meaning:
#lang racket
(provide (all-defined-out))
(require "primitives.rkt")
(require "ast.rkt")

; Expr -> Integer
; Interpret given expression
(define (interp p)
  (match p
    [(int-e i) i]
    [(get-i) (get-int)]))
Notice that the meaning of get-int is ‘just’ the call to the Racket function get-int. Understanding how that function is implemented is not important for our purposes and we can treat it as a black box.
> (interp (parse 42))
42
> (interp (parse -8))
-8
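The examples above only exercise integer literals. The sketch below also runs a get-int program by supplying the input from a string. Neither parse.rkt nor primitives.rkt is shown in these notes, so the parse and get-int definitions here are hypothetical stand-ins written for illustration, not the course's actual implementations.

```racket
#lang racket
(struct int-e (i))
(struct get-i ())

;; ASSUMPTION: a stand-in for parse.rkt. S-expression -> Expr
(define (parse s)
  (match s
    [(? integer? i) (int-e i)]
    ['get-int (get-i)]
    [_ (error "parse error:" s)]))

;; ASSUMPTION: a stand-in for the get-int primitive in primitives.rkt.
;; Reads one datum from the current input port and insists on an integer.
(define (get-int)
  (let ([v (read)])
    (if (integer? v)
        v
        (error "get-int: expected an integer, got" v))))

;; Expr -> Integer (same shape as the interpreter above)
(define (interp p)
  (match p
    [(int-e i) i]
    [(get-i) (get-int)]))

;; A literal needs no input.
(interp (parse 42))

;; For get-int, redirect the current input port to a string.
(with-input-from-string "1024"
  (λ () (interp (parse 'get-int))))
```

Redirecting the input port keeps the example repeatable; at the command line the same value would arrive on standard input.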
Earlier I mentioned that the compiler and interpreter differ in how things run. Notice that for the above, we need a working Racket system in order to run the programs through the interpreter and get a result. Soon we will contrast this with how we run the program that the compiler produces.
We can add a command line wrapper program for interpreting Agreement programs saved in files:
#lang racket
(provide (all-defined-out))
(require "interp.rkt")
(require "parse.rkt")

;; String -> Void
;; Parse and interpret contents of given filename,
;; print result on stdout
(define (main fn)
  (begin
    (define prog
      (with-input-from-file fn
        (λ ()
          (writeln "Interpreting..." (current-output-port))
          (let ((c (read-line)) ; ignore #lang racket line
                (p (read)))
            (parse p)))))
    (interp prog)))
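The wrapper reads a program in two steps: read-line discards the #lang racket line, then read takes the next S-expression as the program. A minimal sketch of just that step follows, using a string in place of a file. The names read-program and example-source are hypothetical and exist only for this illustration.

```racket
#lang racket
;; Mirror the reading step of main: drop the first line, then read one datum.
(define (read-program)
  (read-line)   ; discard the "#lang racket" line
  (read))       ; read the program as an S-expression

;; Hypothetical file contents, supplied from a string instead of a file.
(define example-source "#lang racket\nget-int\n")

;; Produces the symbol get-int, which is what parse would then receive.
(with-input-from-string example-source read-program)
```

Because read consumes exactly one datum, anything after the first expression in the file is ignored, which matches the rule that a program consists of a single expression.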
shell
A few observations
Our program now uses our primitive: get-int
When interpreting our program, we pipe the value 1024 as input
We print the friendly message “Interpreting...” as we start interpreting our program
When we ran our program the second time, we saw “Interpreting...” again
Unlike Abscond, we will not be providing an Operational Semantics for Agreement, as to do so fully would require some more advanced techniques that are more appropriate for a later stage of the course.
4.4 Running an Agreement
Unlike in our interpreter, we cannot use a Racket function for get-int. This is because Racket functions are not available to use when we execute our assembly programs. So instead we expand our runtime system to provide us with this functionality.
The other side of this coin is that we do not need Racket in order to run our programs, only to compile them.
In Agreement, the runtime system calls the function and prints the result as before, but now it also provides us with the implementation of the get-int primitive operation.
4.5 Compiling an Agreement
The distinction between the compiler for Abscond and for Agreement is not important for this stage in the course, and as such we omit the details here. Later in the semester we will cover the techniques used in compiling this primitive operation.
That said, we can still observe some important points from running some examples. Let’s start with compiling the same program we interpreted.
shell
> echo '#lang racket\nget-int' > example.rkt
> make example.run
make[1]: Entering directory `/home/travis/build/cmsc430/www/www/notes/agreement'
racket -t compile-file.rkt -m example.rkt > example.s
nasm -f elf64 -o example.o example.s
gcc main.o example.o -o example.run
rm example.o example.s
make[1]: Leaving directory `/home/travis/build/cmsc430/www/www/notes/agreement'
"Compiling..."
Notice that we now see “Compiling...” when our compiler is run on the input program. Let’s run the compiled program now:
shell
> echo '#lang racket\nget-int' > example.rkt
> make example.run
make[1]: Entering directory `/home/travis/build/cmsc430/www/www/notes/agreement'
racket -t compile-file.rkt -m example.rkt > example.s
nasm -f elf64 -o example.o example.s
gcc main.o example.o -o example.run
rm example.o example.s
make[1]: Leaving directory `/home/travis/build/cmsc430/www/www/notes/agreement'
"Compiling..."
> echo 1024 | ./example.run
result: 1024
> echo 2028 | ./example.run
result: 2028
Even though we run the program multiple times, we only see “Compiling...” when we run our compiler. This illustrates an important aspect of compilation: for the most part, you compile a program once, but can run it many times. In our case, because the target language is x86_64, we never need Racket to be installed in order to run our programs. This means that we can compile our programs on a system with a Racket environment, but run the program on any system that is compatible with the executable format we get.
This is very different from how our interpreter works. In order to run a program with our interpreter, we need Racket!
Below we have the rest of the necessary files for Agreement, in case you’re interested:
#lang racket
(provide (all-defined-out))

;; Asm -> String
(define (asm->string a)
  (foldr (λ (i s) (string-append (instr->string i) s)) "" a))

;; Instruction -> String
(define (instr->string i)
  (match i
    [`(mov ,a1 ,a2)
     (string-append "\tmov " (arg->string a1) ", " (arg->string a2) "\n")]
    [`(call ,ad)
     (string-append "\tcall " (label->string ad) "\n")]
    [`(and ,a1 ,a2)
     (string-append "\tand " (arg->string a1) ", " (arg->string a2) "\n")]
    [`ret "\tret\n"]
    [l (string-append (label->string l) ":\n")]))

;; Arg -> String
(define (arg->string a)
  (match a
    [`rax "rax"]
    [`rsp "rsp"]
    [`r15 "r15"]
    [n (number->string n)]))

;; Label -> String
;; prefix with _ for Mac
(define label->string
  (match (system-type 'os)
    ['macosx (λ (s) (string-append "_" (symbol->string s)))]
    [_ symbol->string]))

;; Asm -> Void
(define (asm-display a)
  ;; entry point will be first label
  (let ((g (findf symbol? a)))
    (display
     (string-append "\tglobal " (label->string g) "\n"
                    "\textern " (label->string 'get_int) "\n"
                    "\tsection .text\n"
                    (asm->string a)))))
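To see what the printer produces, the sketch below runs a condensed copy of asm->string on a small instruction list. Two things are assumptions made for this illustration: the label name entry and the instruction list itself are hypothetical (the compiler's real output is not shown in these notes), and labels are printed with plain symbol->string, which leaves out the macOS underscore prefix.

```racket
#lang racket
;; Condensed copy of the printer so the example runs on its own.
;; ASSUMPTION: labels are printed without the macOS "_" prefix.

;; Arg -> String
(define (arg->string a)
  (match a
    ['rax "rax"]
    ['rsp "rsp"]
    ['r15 "r15"]
    [n (number->string n)]))

;; Instruction -> String
(define (instr->string i)
  (match i
    [`(mov ,a1 ,a2)
     (string-append "\tmov " (arg->string a1) ", " (arg->string a2) "\n")]
    [`(call ,ad)
     (string-append "\tcall " (symbol->string ad) "\n")]
    ['ret "\tret\n"]
    [l (string-append (symbol->string l) ":\n")]))

;; Asm -> String
(define (asm->string a)
  (foldr (λ (i s) (string-append (instr->string i) s)) "" a))

;; ASSUMPTION: a hypothetical instruction list, not actual compiler output.
(define example-asm '(entry (mov rax 42) ret))

;; Prints a label line followed by two tab-indented instructions.
(display (asm->string example-asm))
```

A bare symbol other than ret falls through to the last match clause and is printed as a label, which is why entry becomes the line entry: in the output.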
#lang racket
(provide (all-defined-out))
(require "compile.rkt" "asm/printer.rkt" "parse.rkt")

;; String -> Void
;; Compile contents of given file name,
;; emit asm code on stdout
(define (main fn)
  (with-input-from-file fn
    (λ ()
      (writeln "Compiling..." (current-error-port))
      (let ((c (read-line)) ; ignore #lang racket line
            (p (read)))
        (asm-display (compile (parse p)))))))
UNAME := $(shell uname)
.PHONY: test
.PHONY: clean

ifeq ($(UNAME), Darwin)
  format=macho64
else
  format=elf64
endif

%.run: %.o main.o
	gcc main.o $< -o $@

main.o: main.c
	gcc -c main.c -o main.o

%.o: %.s
	nasm -f $(format) -o $@ $<

%.s: %.rkt
	racket -t compile-file.rkt -m $< > $@

clean:
	rm *.o *.s *.run

test: 42.run
	@test "$(shell ./42.run)" = "42"