6 Con: branching with conditionals
When you come to a fork in the road, take it.
6.1 Conditional execution
Let’s now consider add a notion of conditionals to our target language.
We’ll call it Con.
We will use the following syntax: (if (zero? e0) e1 e2).
Together this leads to the following grammar for Con:
Which can be modeled with the following definitions:
#lang racket (provide (all-defined-out)) ;; type Expr = ;; | Integer ;; | `(add1 ,Expr) ;; | `(sub1 ,Expr) ;; | `(if (zero? ,Expr) ,Expr ,Expr) (struct int-e (i)) (struct add1-e (e)) (struct sub1-e (e)) (struct if-e (i t f)) (define (ast->sexpr a) (match a [(int-e i) `(int-e ,i)] [(add1-e e) `(add1-e ,(ast->sexpr e))] [(sub1-e e) `(sub1-e ,(ast->sexpr e))] [(if-e i t f) `(if-e (zero? ,(ast->sexpr i)) ,(ast->sexpr t) ,(ast->sexpr f))]))
We will also need a predicate for well-formed Con expressions, but let’s return to this after considering the semantics and interpreter.
Because it is tedious to write out the AST directly all the time, we define a helper function sexpr->ast that can automate the process (but only if it’s given a well-formed Con expression).
6.2 Meaning of Con programs
The meaning of Con programs depends on the form of the expression and the new form is an if-expression.
the meaning of a if expression (if (zero? e0) e1 e2) is the meaning of e1 if the meaning of e1 if the meaning of e0 is 0 and is the meaning of e2 otherwise.
Let’s consider some examples:
'(if (zero? 0) (add1 2) 4) means 3.
'(if (zero? 1) (add1 2) 4) means 4.
'(if (zero? (if (zero? (sub1 1)) 1 0)) (add1 2) 4) means 4.
'(if (zero? (add1 0)) (add1 2) (if (zero? (sub1 1)) 1 0)) means 1.
The semantics is inductively defined as before. There are two new rules added for handling if-expressions: one for when the test expression means 0 and one for when it doesn’t.
The interpreter has an added case for if-expressions, which recursively evaluates the test expression and branches based on its value.
#lang racket (provide (all-defined-out)) (require "ast.rkt") ;; Expr -> Integer (define (interp e) (match e [(int-e i) i] [(add1-e e0) (+ (interp e0) 1)] [(sub1-e e0) (- (interp e0) 1)] [(if-e i t f) (if (zero? (interp i)) (interp t) (interp f))]))
We can confirm the interpreter computes the right result for the examples given earlier:
Examples
> (interp (sexpr->ast '(if (zero? 0) (add1 2) 4))) 3
> (interp (sexpr->ast '(if (zero? 1) (add1 2) 4))) 4
> (interp (sexpr->ast '(if (zero? (if (zero? (sub1 1)) 1 0)) (add1 2) 4))) 4
> (interp (sexpr->ast '(if (zero? (add1 0)) (add1 2) (if (zero? (sub1 1)) 1 0)))) 1
The argument for the correctness of the interpreter follows the same structure as for Blackmail, but with an added case for if-expressions.
6.3 An Example of Con compilation
Suppose we want to compile '(if (zero? 8) 2 3)...
We already know how to compile the 8, 2, and 3 part.
What needs to happen?
Execute the code for 8 leaving the result in 'rax,
check whether 'rax holds zero,
if it does, execute the code for 2,
if it doesn’t, execute the code for 3.
We can determine whether 8 evaluates to 0 using a comparison instruction: '(cmp rax 0). To do the conditional execution, we will need to jump to different parts of the code to either execute the code for 2 or 3. There are several ways we could accomplish this, but we take the following approach: immediately after the comparison, do a conditional jump to the code for the else branch when non-zero. Should the jump not occur, the next instructions will carry out the evaluation of the then branch, then (unconditionally) jump over the else branch code.
To accomplish this, we will need two new labels: one for the else branch code and one for the end of the else branch code. The gensym function can be used to generate symbols that have not appeared before.
In total, the code for this example would look like:
'((mov rax 8) (cmp rax 0) (jne if-else-begin) (mov rax 2) (jmp if-else-end) if-else-begin (mov rax 3) if-else-end)
6.4 A Compiler for Con
Notice that the '(mov rax 8), '(mov rax 3) and '(mov rax 2) parts are just the instructions generated by compiling 8, 2 and 3. Generalizing from this, we arrive at the following code for the compiler:
(let ((c0 (compile-e e0)) (c1 (compile-e e1)) (c2 (compile-e e2)) (l0 (gensym "if")) (l1 (gensym "if"))) `(,@c0 (cmp rax 0) (jne ,l0) ,@c1 (jmp ,l1) ,l0 ,@c2 ,l1))
This will require extending our representation of x86 instructions; in particular, we add 'jmp, 'jne, and 'cmp instructions:
#lang racket ;; type Asm = [Listof Instruction] ;; type Instruction = ;; | `ret ;; | `(mov ,Arg ,Arg) ;; | `(add ,Arg ,Arg) ;; | `(sub ,Arg ,Arg) ;; | `(cmp ,Arg ,Arg) ;; | `(jmp ,Label) ;; | `(je ,Label) ;; | `(jne ,Label) ;; | Label ;; type Label = Symbol ;; type Arg = ;; | Reg ;; | Integer ;; type Reg = ;; | `rax
We omit the printer code, which is mundane. See asm/printer.rkt for details.
The complete compiler code is:
#lang racket (provide (all-defined-out)) (require "ast.rkt") ;; Expr -> Asm (define (compile e) `(entry ,@(compile-e e) ret)) ;; Expr -> Asm (define (compile-e e) (match e [(int-e i) `((mov rax ,i))] [(add1-e e1) (let ((c1 (compile-e e1))) `(,@c1 (add rax 1)))] [(sub1-e e1) (let ((c1 (compile-e e1))) `(,@c1 (sub rax 1)))] [(if-e i t f) (let ((c1 (compile-e i)) (c2 (compile-e t)) (c3 (compile-e f)) (l1 (gensym "if")) (l2 (gensym "if"))) `(,@c1 (cmp rax 0) ; zero? <result of executing code for i> (je ,l1) ,@c3 (jmp ,l2) ,l1 ,@c2 ,l2))]))
Examples
> (asm-display (compile (sexpr->ast '(if (zero? 8) 2 3))))
global entry
section .text
entry:
mov rax, 8
cmp rax, 0
je if2486
mov rax, 3
jmp if2487
if2486:
mov rax, 2
if2487:
ret
> (asm-display (compile (sexpr->ast '(if (zero? 0) 1 2))))
global entry
section .text
entry:
mov rax, 0
cmp rax, 0
je if2488
mov rax, 2
jmp if2489
if2488:
mov rax, 1
if2489:
ret
> (asm-display (compile (sexpr->ast '(if (zero? 0) (if (zero? 0) 8 9) 2))))
global entry
section .text
entry:
mov rax, 0
cmp rax, 0
je if2492
mov rax, 2
jmp if2493
if2492:
mov rax, 0
cmp rax, 0
je if2490
mov rax, 9
jmp if2491
if2490:
mov rax, 8
if2491:
if2493:
ret
> (asm-display (compile (sexpr->ast '(if (zero? (if (zero? 2) 1 0)) 4 5))))
global entry
section .text
entry:
mov rax, 2
cmp rax, 0
je if2494
mov rax, 0
jmp if2495
if2494:
mov rax, 1
if2495:
cmp rax, 0
je if2496
mov rax, 5
jmp if2497
if2496:
mov rax, 4
if2497:
ret
Examples
> (asm-interp (compile (sexpr->ast '(if (zero? 8) 2 3)))) 3
> (asm-interp (compile (sexpr->ast '(if (zero? 0) 1 2)))) 1
> (asm-interp (compile (sexpr->ast '(if (zero? 0) (if (zero? 0) 8 9) 2)))) 8
> (asm-interp (compile (sexpr->ast '(if (zero? (if (zero? 2) 1 0)) 4 5)))) 4
6.5 Correctness and random testing
The statement of correctness follows the same outline as before:
Compiler Correctness: For all expressions e and integers i, if (e,i) in , then (asm-interp (compile e)) equals i.
Again, we formulate correctness as a property that can be tested:
Examples
> (define (check-compiler s) (let ((e (sexpr->ast s))) (check-equal? (asm-interp (compile e)) (interp e) e)))
Generating random Con programs is essentially the same as Blackmail programs, and are provided in a random.rkt module.
Examples
> (require "random.rkt") > (random-expr) '(if (zero? (if (zero? (sub1 (if (zero? 6) -3 4))) (if (zero? (add1 5)) (if (zero? -5) 2 -3) (add1 -1)) (add1 2))) -2 (if (zero? -2) (add1 8) (sub1 (if (zero? -10) -3 -2))))
> (random-expr) 4
> (random-expr) '(sub1 (sub1 (if (zero? (add1 1)) (add1 3) -7)))
> (random-expr) '(add1 (add1 0))
> (for ([i (in-range 10)]) (check-compiler (random-expr)))