8 Extort: when errors exist

7.4

8 Extort: when errors exist

The greatest mistake is to imagine that we never err.

8.1 Errors

We have added multiple, disjoint types, but mostly swept issues of errors under the rug by considering type mismatches as meaningless. Now let’s redesign the semantics to specify the error behavior of such programs.

We’ll call it Extort.

Nothing changes in the syntax of Extort from Dupe, although we will need to talk about two kinds of results from evaluating programs: values and errors. We will say that evaluation produces an answer, which is either a value or error:

8.2 Meaning of Extort programs

Languages adopt several approaches to type mismatches:

Prohibit such programs statically with a type system (e.g. OCaml, Java)
Coerce values to different types (e.g. JavaScript)
Signal a run-time error (e.g. Racket)
Leave the behavior unspecified (e.g. Scheme, C)

We’ve previously seen the last approach. Now let’s do what Racket does and signal an error.

The meaning of Extort programs that have type errors will now be defined as 'err:

(add1 #f): means 'err.
(zero? #t): means 'err.
(if (zero? #f) 1 2): means 'err.

There are three ways in which an error can be introduced:

And there are four rules for propagating errors from subexpressions:

Now what does the semantics say about (add1 #f)? What about (if 7 #t -2)?

The signature of the interpreter is extended to produce answers. Each use of a Racket primitive is guarded by checking the type of the arguments and an error is produced if the check fails. Errors are also propagated when a subexpression produces an error:

extort/interp.rkt
  #lang racket
  (provide (all-defined-out))

  (require "ast.rkt")

  ;; type Answer = Value | 'err

  ;; type Value =
  ;; | Integer
  ;; | Boolean


  ;; Expr -> Answer
  (define (interp e)
    (match e
      [(int-e i) i]
      [(bool-e b) b]
      [(add1-e e0)
       (match (interp e0)
         [(? integer? i) (add1 i)]
         [_ 'err])]
      [(sub1-e e0)
       (match (interp e0)
         [(? integer? i) (sub1 i)]
         [_ 'err])]
      [(zero?-e e0)
       (match (interp e0)
         [(? integer? i) (zero? i)]
         [_ 'err])]
      [(if-e p e1 e2)
       (match (interp p)
         ['err 'err]
         [v
          (if v
              (interp e1)
              (interp e2))])]))

We can confirm the interpreter computes the right result for the examples given earlier:

Examples
> (interp (add1-e (bool-e #f)))
'err
> (interp (zero?-e (bool-e #t)))
'err
> (interp (if-e (zero?-e (bool-e #f)) (int-e 1) (int-e 2)))
'err

The statement of correctness stays the same, but now observe that there is no way to crash the interpreter with any Expr value.

8.3 A Compiler for Extort

Suppose we want to compile (add1 #f), what needs to happen? Just as in the interpreter, we need to check the integerness of the argument’s value before doing the addition operation.

We extend the run-time system with a C function called error that prints "err" and exits:

extort/main.c

#include <stdio.h>
#include <inttypes.h>
#include <stdlib.h>

#define typeof_mask  1
#define val_shift    1
#define type_fixnum  0
#define type_bool    1

int64_t entry();

int main(int argc, char** argv) {
  int64_t result = entry();
  switch (typeof_mask & result) {
  case type_fixnum:    
    printf("%" PRId64 "\n", result >> val_shift);
    break;
  case type_bool:
    printf("#%c\n", result >> val_shift ? 't' : 'f');
    break;  
  }
  return 0;
}

void error() {
  printf("err\n");
  exit(1);
}

The compiler now emits code to check the type of arguments:

extort/compile.rkt
  #lang racket
  (provide (all-defined-out))

  (require "ast.rkt")

  ;; Expr -> Asm
  (define (compile e)
    `(entry
      ,@(compile-e e)
      ret
      err
      (push rbp)
      (call error)))

  ;; Expr -> Asm
  (define (compile-e e)
    (match e
      [(int-e i)
       `((mov rax ,(* i 2)))]
      [(bool-e b)
       `((mov rax ,(if b #b11 #b01)))]
      [(add1-e e1) (let ((c1 (compile-e e1)))
                  `(,@c1
                    ,@assert-integer
                    (add rax 2)))]
      [(sub1-e e1) (let ((c1 (compile-e e1)))
                  `(,@c1
                    ,@assert-integer
                    (sub rax 2)))]
      [(zero?-e e1) (let ((c1 (compile-e e1))
                          (l1 (gensym "nzero")))
                   `(,@c1
                    ,@assert-integer
                    (cmp rax 0)
                    (mov rax #b01) ; #f
                    (jne ,l1)
                    (mov rax #b11) ; #t
                    ,l1))]
      [(if-e p t f) (let ((c1 (compile-e p))
                          (c2 (compile-e t))
                          (c3 (compile-e f))
                          (l1 (gensym "if"))
                          (l2 (gensym "if")))
                      `(,@c1
                        (cmp rax #b01) ; compare to false
                        (jne ,l1)      ; maybe jump to code for true branch
                        ,@c3
                        (jmp ,l2)      ; jump to end
                        ,l1
                        ,@c2
                        ,l2))]))

  (define assert-integer
    `((mov rbx rax)
      (and rbx 1)
      (cmp rbx 0)
      (jne err)))

Here’s the code we generate for '(add1 #f):

Examples
> (asm-display (compile (add1-e (bool-e #f))))
global entry
extern error
section .text
entry:
mov rax, 1
mov rbx, rax
and rbx, 1
cmp rbx, 0
jne err
add rax, 2
ret
err:
push rbp
call error

Here are some examples running the compiler:

Examples
> (asm-interp (compile (sexpr->ast #t)))
#t
> (asm-interp (compile (sexpr->ast #f)))
#f
> (asm-interp (compile (sexpr->ast '(zero? 0))))
#t
> (asm-interp (compile (sexpr->ast '(zero? -7))))
#f
> (asm-interp (compile (sexpr->ast '(if #t 1 2))))
1
> (asm-interp (compile (sexpr->ast '(if #f 1 2))))
2
> (asm-interp (compile (sexpr->ast '(if (zero? 0) (if (zero? 0) 8 9) 2))))
8
> (asm-interp (compile (sexpr->ast '(if (zero? (if (zero? 2) 1 0)) 4 5))))
4
> (asm-interp (compile (sexpr->ast '(add1 #t))))
'err
> (asm-interp (compile (sexpr->ast '(sub1 (add1 #f)))))
'err
> (asm-interp (compile (sexpr->ast '(if (zero? #t) 1 2))))
'err

Since the interpreter and compiler have well defined specifications for what should happen when type errors occur, we can test in the usual way again:

Examples
> (define (check-correctness e)
    (check-equal? (asm-interp (compile e))
                  (interp e)
                  e))
> (check-correctness (add1-e (int-e 7)))
> (check-correctness (add1-e (bool-e #f)))

← prev up next →

	Syllabus
	Texts
	Schedule
	Notes
	Assignments
	Midterms
	Project
	Racket

1	What is a Compiler?
2	From OCaml to Racket
3	Abscond: a language of numbers
4	Agreement: a language of numbers between friends
5	Blackmail: incrementing and decrementing
6	Con: branching with conditionals
7	Dupe: a duplicity of types
8	Extort: when errors exist
9	Fraud: local binding and variables
10	Refactoring the Compiler
11	Grift: binary operations
12	Hustle: heaps and lists
13	Iniquity: function definitions and calls
14	Jig: jumping to tail calls
15	Knock: first-class function (pointers)
16	Using Graphviz/ dot to visualize our AST
17	Loot: lambda the ultimate
18	Shakedown: Calling functions C functions

8.1	Errors
8.2	Meaning of Extort programs
8.3	A Compiler for Extort