7 Extort: when errors exist
The greatest mistake is to imagine that we never err.
7.1 Errors
We have added multiple, disjoint types, but mostly swept issues of errors under the rug by considering type mismatches as meaningless. Now let’s redesign the semantics to specify the error behavior of such programs.
We’ll call it Extort.
Nothing changes in the syntax of Extort from Dupe, although we will need to talk about two kinds of results from evaluating programs: values and errors. We will say that evaluation produces an answer, which is either a value or error:
7.2 Meaning of Extort programs
Languages adopt several approaches to type mismatches:
Prohibit such programs statically with a type system (e.g. OCaml, Java)
Coerce values to different types (e.g. JavaScript)
Signal a run-time error (e.g. Racket)
Leave the behavior unspecified (e.g. Scheme, C)
We’ve previously seen the last approach. Now let’s do what Racket does and signal an error.
The meaning of Extort programs that have type errors will now be defined as 'err:
Now what does the semantics say about (add1 #f)? What about (if 7 #t -2)?
The signature of the interpreter is extended to produce answers. Each use of a Racket primitive is guarded by checking the type of the arguments and an error is produced if the check fails. Errors are also propagated when a subexpression produces an error:
#lang racket (provide (all-defined-out)) ;; type Answer = Value | 'err ;; Expr -> Answer (define (interp e) (match e [(? integer? i) i] [(? boolean? b) b] [`(add1 ,e0) (match (interp e0) [(? integer? i) (add1 i)] [_ 'err])] [`(sub1 ,e0) (match (interp e0) [(? integer? i) (sub1 i)] [_ 'err])] [`(zero? ,e0) (match (interp e0) [(? integer? i) (zero? i)] [_ 'err])] [`(if ,e0 ,e1 ,e2) (match (interp e0) ['err 'err] [v (if v (interp e1) (interp e2))])]))
We can confirm the interpreter computes the right result for the examples given earlier:
Examples
> (interp '(add1 #f)) 'err
> (interp '(zero? #t)) 'err
> (interp '(if (zero? #f) 1 2)) 'err
The statement of correctness stays the same, but now observe that there is no way to crash the interpreter with any Expr value.
7.3 A Compiler for Extort
Suppose we want to compile (add1 #f), what needs to happen? Just as in the interpreter, we need to check the integerness of the argument’s value before doing the addition operation.
We extend the run-time system with a C function called error that prints "err" and exits:
#include <stdio.h> #include <inttypes.h> #include <stdlib.h> #define typeof_mask 1 #define val_shift 1 #define type_fixnum 0 #define type_bool 1 int64_t entry(); int main(int argc, char** argv) { int64_t result = entry(); switch (typeof_mask & result) { case type_fixnum: printf("%" PRId64 "\n", result >> val_shift); break; case type_bool: printf("#%c\n", result >> val_shift ? 't' : 'f'); break; } return 0; } void error() { printf("err"); exit(1); }
The compiler now emits code to check the type of arguments:
#lang racket (provide (all-defined-out)) ;; Expr -> Asm (define (compile e) `(entry ,@(compile-e e) ret err (push rbp) ; push before calling (call error))) ;; Expr -> Asm (define (compile-e e) (match e [(? integer? i) `((mov rax ,(* i 2)))] [(? boolean? b) `((mov rax ,(if b #b11 #b01)))] [`(add1 ,e0) (let ((c0 (compile-e e0))) `(,@c0 ,@assert-integer (add rax 2)))] [`(sub1 ,e0) (let ((c0 (compile-e e0))) `(,@c0 ,@assert-integer (sub rax 2)))] [`(zero? ,e0) (let ((c0 (compile-e e0)) (l0 (gensym)) (l1 (gensym))) `(,@c0 ,@assert-integer (cmp rax 0) (mov rax #b01) ; #f (jne ,l0) (mov rax #b11) ; #t ,l0))] [`(if ,e0 ,e1 ,e2) (let ((c0 (compile-e e0)) (c1 (compile-e e1)) (c2 (compile-e e2)) (l0 (gensym)) (l1 (gensym))) `(,@c0 (cmp rax #b01) ; compare to #f (je ,l0) ; jump to c2 if #f ,@c1 (jmp ,l1) ; jump past c2 ,l0 ,@c2 ,l1))])) (define assert-integer `((mov rbx rax) (and rbx 1) (cmp rbx 0) (jne err)))
Examples
> (asm-display (compile '(add1 #f)))
global entry
extern error
section .text
entry:
mov rax, 1
mov rbx, rax
and rbx, 1
cmp rbx, 0
jne err
add rax, 2
ret
err:
push rbp
call error
Examples
> (asm-interp (compile #t)) #t
> (asm-interp (compile #f)) #f
> (asm-interp (compile '(zero? 0))) #t
> (asm-interp (compile '(zero? -7))) #f
> (asm-interp (compile '(if #t 1 2))) 1
> (asm-interp (compile '(if #f 1 2))) 2
> (asm-interp (compile '(if (zero? 0) (if (zero? 0) 8 9) 2))) 8
> (asm-interp (compile '(if (zero? (if (zero? 2) 1 0)) 4 5))) 4
> (asm-interp (compile '(add1 #t))) 'err
> (asm-interp (compile '(sub1 (add1 #f)))) 'err
> (asm-interp (compile '(if (zero? #t) 1 2))) 'err
Since the interpreter and compiler have well defined specifications for what should happen when type errors occur, we can test in the usual way again:
Examples
> (define (check-correctness e) (check-equal? (asm-interp (compile e)) (interp e) e)) > (check-correctness '(add1 7)) > (check-correctness '(add1 #f))