17 Loot: lambda the ultimate
17.1 Functions in their most general form
We’ve added function calls and function definitions, but what we don’t have and really should is function values.
Programming with functions as values is a powerful idiom that is at the heart of both functional programming and object-oriented programming, which both center around the idea that computation itself can be packaged up in a suspended form as a value and later run.
Now we’re ready to deal with functions in their most general form: λ-expressions.
Let’s call it Loot.
We add λ-expressions to the syntax of expressions:
(λ (x0 ...) e0)
Here x0 ... are the formal parameters of the function and e0 is the body.
The syntax is evocative of function definitions:
(define (f x0 ...) e0)
However, you’ll notice:
There is no function name in the λ-expression; it is an anonymous function.
The new form is an expression—
it can appear any where as a subexpression in a program, whereas definitions were restricted to be at the top-level.
There also is a syntactic relaxation on the grammar of application expressions (a.k.a. function calls). Previously, a function call consisted of a function name and some number of arguments:
(f e0 ...)
But since functions will now be considered values, we can generalize what’s allowed in the function position of the syntax for calls to be an arbitrary expression. That expression is expected to produce a function value (and this expectation gives rise to a new kind of run-time error when violated: applying a non-function to arguments), which can called with the value of the arguments.
Hence the syntax is extended to:
(e e0 ...)
In particular, the function expression can be a λ-expression, e.g.:
But also it may be expression which produces a function, but isn’t itself a λ-expression:
(define (adder n) (λ (x) (+ x n))) ((adder 5) 10)
Here, (adder 5) is the function position of ((adder 5) 10). That subexpression is itself a function call expression, calling adder with the argument 5. The result of that subexpression is a function that, when applied, adds 5 to its argument.
In terms of the AST, here’s how we model the extended syntax:
#lang racket ;; type Expr = ... ;; | (App Expr (Listof Expr)) ;; | (Lam (Listof Id) Expr)
So for example, the expression ((adder 5) 10) would be parsed as:
(App (App (Var 'adder) (Int 5)) (Int 10))
and (λ (x) (+ x n)) would be parsed as:
(Lam (list 'x) (Prim2 '+ (Var 'x) (Var 'n)))
We will actually use a slight tweak of this AST when it comes to representing the syntax of λ-expressions. Although functions are anynomous, it will nonetheless be useful to syntactically distinguish one λ-expression occurrence from an otherwise identical occurrence.
Consider for example:
(let ((g1 (let ((x 100)) (λ (y) (+ x y)))) (g2 (let ((x 9)) (λ (y) (+ x y))))) ...)
This program has two occurrences of the expression (λ (y) (+ x y)). Even though these expressions are identical and both evaluate to functions, they do not evaluate to the same function! One is the “add 100” function and the other is the “add 9” function.
It will be useful to distinguish these two occurrences so we can talk about this or that λ-expression.
The way we accomplish this is we will assume the AST representation of each distinct occurrence of a λ-expression has it’s own unique name (represented with a symbol). We choose to have the parser take of labelling λ-expressions by inserting a gensym’d symbol. So, we touch-up the Lam AST type definition as follows:
; type Expr = ... ; | (Lam Id (Listof Id) Expr)
and these two occurrence would be distinguished by having distinct symbols for the label of the expression:
Examples
> (Lam (gensym) (list 'x) (Prim2 '+ (Var 'x) (Var 'y)))
'#s(Lam
g146222
(x)
#s(Prim2 + #s(Var x) #s(Var y)))
> (Lam (gensym) (list 'x) (Prim2 '+ (Var 'x) (Var 'y)))
'#s(Lam
g146223
(x)
#s(Prim2 + #s(Var x) #s(Var y)))
17.2 Long Live Lambda!
Let’s start by developing the interpreter for Loot, where the relevant forms are λs and applications:
; Expr REnv Defns -> Answer (define (interp-env e r ds) (match e ; ... [(Lam _ xs e) '...] [(App e es) '...]))
These two parts of the interpreter must fit together: λ is the constructor for functions and application is deconstructor. An application will evaluate all its subexpressions and the value produced by e ought to be the kind of value constructed by λ. That value needs to include all the necessary information to, if given the values of the arguments es, evaluate the body of the function in an environment associating the parameter names with the arguments’ values.
So how should functions be represented? Here is a simple idea following the pattern we’ve used frequently in the interpreter:
Q: How can we represent strings?
A: With strings!
Q: How can we represent booleans?
A: With booleans!
Q: How can we represent numbers?
A: With numbers!
Q: How can we represent pairs?
A: With pairs!
Q: etc.
A: etc.
Q: How can we represent functions?
A: With functions!?
Great, so we will use function to represent functions. We can implement function application with function application. Let’s fill in what we know so far:
; Expr REnv Defns -> Answer (define (interp-env e r ds) (match e ; ... [(Lam _ xs e) (λ ??? '...)] [(App e es) (match (interp-env e r ds) ['err 'err] [f (match (interp-env* es r ds) ['err 'err] [vs (apply f vs)])])]))
It’s not totally clear what parameters the representation of a function should have or what we should in the body of that function. However, the code in the interpretation of an application sheds light on both. First, it’s clear a function should potentially take any number of arguments:
; Expr REnv Defns -> Answer (define (interp-env e r ds) (match e ; ... [(Lam _ xs e) (λ vs '...)] [(App e es) (match (interp-env e r ds) ['err 'err] [f (match (interp-env* es r ds) ['err 'err] [vs (apply f vs)])])]))
Second, what should happen when a function is applied? It should produce the answer produced by the body of the λ expression in an environment that associates xs with vs. Translating that to code, we get:
; Expr REnv Defns -> Answer (define (interp-env e r ds) (match e ; ... [(Lam _ xs e) (λ vs (interp-env e (zip xs vs) ds))] [(App e es) (match (interp-env e r ds) ['err 'err] [f (match (interp-env* es r ds) ['err 'err] [vs (apply f vs)])])]))
; type Value = ; | .... ; | (Value ... -> Answer)
and completed the implementation of the interpreter.
There are, however, problems.
For one, this approach does not model how λ-expressions are able to capture the environment in which they are evaluated. Consider:
(let ((y 8)) (λ (x) (+ x y)))
This evaluates to a function that, when applied, should add 8 to its argument. It does so by evaluating the body of the λ, but in an environment that both associates x with the value of the argument, but also associates y with 8. That association comes from the environment in place when the λ-expression is evaluated. The interpreter as written will consider y is unbound!
The solution is easy: in order for (Loot) functions to capture their (implicit) environment, we should capture the (explicit) environment in the (Racket) function:
; Expr REnv Defns -> Answer (define (interp-env e r ds) (match e ; ... [(Lam _ xs e) (λ vs (interp-env e (append (zip xs vs) r)) ds)] [(App e es) (match (interp-env e r ds) ['err 'err] [f (match (interp-env* es r ds) ['err 'err] [vs (apply f vs)])])]))
The last remaining issue is we should do some type and arity-checking:
; Expr REnv Defns -> Answer (define (interp-env e r ds) (match e ; ... [(Lam _ xs e) (λ vs ; check arity matches (if (= (length xs) (length vs)) (interp-env e (append (zip xs vs) r) ds) 'err))] [(App e es) (match (interp-env e r ds) ['err 'err] [f (match (interp-env* es r ds) ['err 'err] [vs (if (procedure? f) (apply f vs) 'err)])])]))
We have a final issue to deal with. What should we do about references to functions defined at the top-level of the program? In other words, how do we make function applicaton when the function was defined with define?
One possible answer to re-use our new power of lambda-expression by considering define-bound names as just regular old variables, but changing the way that variables are interpreted so that when evaluating a variable that is not bound in the local environment, we consult the program definitions and construct the function value at that moment.
There will turn out to be a better, more uniform approach, but this we will work for now and is simple.
So for now we interpret variables as follows:
; Id Env [Listof Defn] -> Answer (define (interp-var x r ds) (match (lookup r x) ['err (match (defns-lookup ds x) [(Defn f xs e) (interp-env (Lam f xs e) '() ds)] [#f 'err])] [v v]))
You’ll notice that the function is constructed by interpreting a lambda-expression corresponding to the function definition and that this happens in an empty environment; that’s because function definitions can only occur at the top-level and therefore the only variables they can reference are other define-bound functions, given in ds.
The complete interpreter is:
#lang racket (provide interp interp-env) (require "ast.rkt" "env.rkt" "interp-prims.rkt") ;; type Answer = Value | 'err ;; type Value = ;; | Integer ;; | Boolean ;; | Character ;; | Eof ;; | Void ;; | '() ;; | (cons Value Value) ;; | (box Value) ;; | (vector Value ...) ;; | (string Char ...) ;; | (Value ... -> Answer) ;; type REnv = (Listof (List Id Value)) ;; type Defns = (Listof Defn) ;; Prog -> Answer (define (interp p) (match p [(Prog ds e) (interp-env e '() ds)])) ;; Expr Env Defns -> Answer (define (interp-env e r ds) (match e [(Int i) i] [(Bool b) b] [(Char c) c] [(Eof) eof] [(Empty) '()] [(Var x) (interp-var x r ds)] [(Str s) (string-copy s)] [(Prim0 'void) (void)] [(Prim0 'read-byte) (read-byte)] [(Prim0 'peek-byte) (peek-byte)] [(Prim1 p e) (match (interp-env e r ds) ['err 'err] [v (interp-prim1 p v)])] [(Prim2 p e1 e2) (match (interp-env e1 r ds) ['err 'err] [v1 (match (interp-env e2 r ds) ['err 'err] [v2 (interp-prim2 p v1 v2)])])] [(Prim3 p e1 e2 e3) (match (interp-env e1 r ds) ['err 'err] [v1 (match (interp-env e2 r ds) ['err 'err] [v2 (match (interp-env e3 r ds) ['err 'err] [v3 (interp-prim3 p v1 v2 v3)])])])] [(If p e1 e2) (match (interp-env p r ds) ['err 'err] [v (if v (interp-env e1 r ds) (interp-env e2 r ds))])] [(Begin e1 e2) (match (interp-env e1 r ds) ['err 'err] [_ (interp-env e2 r ds)])] [(Let x e1 e2) (match (interp-env e1 r ds) ['err 'err] [v (interp-env e2 (ext r x v) ds)])] [(Lam _ xs e) (λ vs ; check arity matches (if (= (length xs) (length vs)) (interp-env e (append (zip xs vs) r) ds) 'err))] [(App e es) (match (interp-env e r ds) ['err 'err] [f (match (interp-env* es r ds) ['err 'err] [vs (if (procedure? f) (apply f vs) 'err)])])] [(Match e ps es) (match (interp-env e r ds) ['err 'err] [v (interp-match v ps es r ds)])])) ;; Value [Listof Pat] [Listof Expr] Env Defns -> Answer (define (interp-match v ps es r ds) (match* (ps es) [('() '()) 'err] [((cons p ps) (cons e es)) (match (interp-match-pat p v r) [#f (interp-match v ps es r ds)] [r (interp-env e r ds)])])) ;; Pat Value Env -> [Maybe Env] (define (interp-match-pat p v r) (match p [(PWild) r] [(PVar x) (ext r x v)] [(PLit l) (and (eqv? l v) r)] [(PBox p) (match v [(box v) (interp-match-pat p v r)] [_ #f])] [(PCons p1 p2) (match v [(cons v1 v2) (match (interp-match-pat p1 v1 r) [#f #f] [r1 (interp-match-pat p2 v2 r1)])] [_ #f])] [(PAnd p1 p2) (match (interp-match-pat p1 v r) [#f #f] [r1 (interp-match-pat p2 v r1)])])) ;; Id Env [Listof Defn] -> Answer (define (interp-var x r ds) (match (lookup r x) ['err (match (defns-lookup ds x) [(Defn f xs e) (interp-env (Lam f xs e) '() ds)] [#f 'err])] [v v])) ;; (Listof Expr) REnv Defns -> (Listof Value) | 'err (define (interp-env* es r ds) (match es ['() '()] [(cons e es) (match (interp-env e r ds) ['err 'err] [v (match (interp-env* es r ds) ['err 'err] [vs (cons v vs)])])])) ;; Defns Symbol -> [Maybe Defn] (define (defns-lookup ds f) (findf (match-lambda [(Defn g _ _) (eq? f g)]) ds)) (define (zip xs ys) (match* (xs ys) [('() '()) '()] [((cons x xs) (cons y ys)) (cons (list x y) (zip xs ys))]))
We now have the full power of λ expressions in our language. We can write recursive functions, using only anonymous functions, via the Y-combinator:
Examples
> (define (run . p) (interp (parse p)))
> (run '(λ (t) ((λ (f) (t (λ (z) ((f f) z)))) (λ (f) (t (λ (z) ((f f) z))))))) #<procedure:...ngs/loot/interp.rkt:78:5>
For example, computing the triangular function applied to 10:
Examples
> (run '(((λ (t) ((λ (f) (t (λ (z) ((f f) z)))) (λ (f) (t (λ (z) ((f f) z)))))) (λ (tri) (λ (n) (if (zero? n) 0 (+ n (tri (sub1 n))))))) 36)) 666
One of the niceties of using Racket functions to represent Loot functions is we can define Racket functions via the interpretation of Loot functions:
Examples
> (define Y (run '(λ (t) ((λ (f) (t (λ (z) ((f f) z)))) (λ (f) (t (λ (z) ((f f) z))))))))
> (define tri (run '(λ (tri) (λ (n) (if (zero? n) 0 (+ n (tri (sub1 n))))))))
And then use them from within Racket:
Examples
> ((Y tri) 36) 666
We can also “import” Racket functions in to Loot:
Examples
> (interp-env (parse-e '(expt 2 10)) (list (list 'expt expt)) '()) 1024
17.3 Lambda is Dead!
Now the question you might naturally wonder is: how does implementing functions in terms of functions help me implement functions in x86, which after all, doesn’t have λ?
The answer is that from this point, in which we have an understandable account of functions, we can iteratively revise the interpreter to eliminate the use of functions while computing equivalent results. Doing so will shed light on the lower-level implementation of functions in the compiler.
Consider again what it is that a λ-expression is doing for you:
it is packaging up the parameters, body, and environment, so that
when applied it can evaluate the body, binding the parameters, in an extension of the functions environment.
We can achive these things without using a function value by:
creating a data structure to hold the parameters, body, and environment, and
rewriting the application of the function to use those values to evaluate the body, binding the parameters, etc.
So we are changing the representation of functions from:
; type Value = ; | .... ; | (Value ... -> Answer)
To:
; type Value = ; | .... ; | (Closure [Listof Id] Expr Env)
When a λ is evaluated, a closure is created. When a function is applied, we deconstruct the closure and execute the code that used to be in the (Racket) function:
; Expr REnv Defns -> Answer (define (interp-env e r ds) (match e ; ... [(Lam _ xs e) (Closure xs e r)] [(App e es) (match (interp-env e r ds) ['err 'err] [f (match (interp-env* es r ds) ['err 'err] [vs (match f [(Closure xs e r) ; check arity matches (if (= (length xs) (length vs)) (interp-env e (append (zip xs vs) r) ds) 'err)] [_ 'err])])])]))
We can give it a try:
Examples
> (define (run . p) (interp (parse p))) > (run '(λ (x) x)) '#s(Closure (x) #s(Var x) ())
> (run '((λ (x) (λ (y) x)) 8)) '#s(Closure (y) #s(Var x) ((x 8)))
Notice in the second example how the closure contains the body of the function and the environment mapping the free variable: 'x to 8.
We can also confirm our larger example works:
Examples
> (run '(((λ (t) ((λ (f) (t (λ (z) ((f f) z)))) (λ (f) (t (λ (z) ((f f) z)))))) (λ (tri) (λ (n) (if (zero? n) 0 (+ n (tri (sub1 n))))))) 36)) 666
While can’t apply the interpretation of functions in Racket like we did previously, we can apply-function the interpretation of functions:
Examples
> (define Y (run '(λ (t) ((λ (f) (t (λ (z) ((f f) z)))) (λ (f) (t (λ (z) ((f f) z))))))))
> (define tri (run '(λ (tri) (λ (n) (if (zero? n) 0 (+ n (tri (sub1 n)))))))) ; Value Value ... -> Answer
> (define (apply-function f . vs) (match f [(Closure xs e r) ; check arity matches (if (= (length xs) (length vs)) (interp-env e (append (zip xs vs) r) '()) 'err)] [_ 'err])) > (apply-function (apply-function Y tri) 36) 666
The process we used to eliminate function values from the interpreter is an instance of a general-purpose whole-program transformation called defunctionalization for replacing function values with data structures.
17.4 Defunctionalization at work
Let’s digress for a moment and learn this very useful transformation.
Here is a data type for representing regular expressions:
; type Regexp = ; | 'zero ; | 'one ; | ‘(char ,Char) ; | ‘(times ,Regexp ,Regexp) ; | ‘(plus ,Regexp ,Regexp) ; | ‘(star ,Regexp)
The regular expression 'zero matches nothing; 'one matches the empty string; `(char ,c) matches the character c; `(times ,r1 ,r2) matches the concatenation of a string matching r1 followed by a string matching r2; `(plus ,r1 ,r2) matching either a string matching r1 or a string matching r2; and `(star ,r) matches a string made up of any number of substrings, each of which match r.
A really nice way to write a matcher is to use a continuation-passing style that keeps track of what is required of the remainder of the string after matching a prefix against the regexp:
#lang racket (provide accepts) ;; type Regexp = ;; | 'zero ;; | 'one ;; | `(char ,Char) ;; | `(times ,Regexp ,Regexp) ;; | `(plus ,Regexp ,Regexp) ;; | `(star ,Regexp) ;; Regexp String -> Boolean (define (accepts r s) (matcher r (string->list s) (λ (cs) (empty? cs)))) ;; Regexp (Listof Char) ((Listof Char) -> Bool) -> Bool (define (matcher r cs k) (match r ['zero #f] ['one (k cs)] [`(char ,c) (match cs ['() #f] [(cons d cs) (and (char=? c d) (k cs))])] [`(plus ,r1 ,r2) (or (matcher r1 cs k) (matcher r2 cs k))] [`(times ,r1 ,r2) (matcher r1 cs (λ (cs) (matcher r2 cs k)))] [`(star ,r) (letrec ((matcher* (λ (cs) (or (k cs) (matcher r cs matcher*))))) (matcher* cs))]))
Examples
> (accepts `(star (char #\a)) "aaaaa") #t
> (accepts `(star (char #\a)) "aaaab") #f
> (accepts `(star (plus (char #\a) (char #\b))) "aaaab") #t
But what if needed to program this regular expression matching without the use of function values? We can arrive at such code systematically by applying defunctionalization.
#lang racket (provide accepts) ;; type Regexp = ;; | 'zero ;; | 'one ;; | `(char ,Char) ;; | `(times ,Regexp ,Regexp) ;; | `(plus ,Regexp ,Regexp) ;; | `(star ,Regexp) ;; type K = ;; | '(k0) ;; | `(k1 ,Regexp ,K) ;; | `(k2 ,K ,Regexp) ;; Regexp String -> Boolean (define (accepts r s) (matcher r (string->list s) '(k0))) ;; Regexp (Listof Char) K -> Bool (define (matcher r cs k) (match r ['zero #f] ['one (apply-k k cs)] [`(char ,c) (match cs ['() #f] [(cons d cs) (and (char=? c d) (apply-k k cs))])] [`(plus ,r1 ,r2) (or (matcher r1 cs k) (matcher r2 cs k))] [`(times ,r1 ,r2) (matcher r1 cs `(k1 ,r2 ,k))] [`(star ,r) (apply-k `(k2 ,k ,r) cs)])) ;; K (Listof Char) -> Bool (define (apply-k k cs) (match k [`(k0) (empty? cs)] [`(k1 ,r2 ,k) (matcher r2 cs k)] [`(k2 ,k* ,r) (or (apply-k k* cs) (matcher r cs k))]))
And we get the same results:
Examples
> (accepts `(star (char #\a)) "aaaaa") #t
> (accepts `(star (char #\a)) "aaaab") #f
> (accepts `(star (plus (char #\a) (char #\b))) "aaaab") #t
17.5 Compiling Loot
Compiling a λ-expression will involve generating two different chunks of instructions:
one to implement the function, i.e. the code to be executed when the function created by the λ-expression is called, and
one to create a closure, i.e. to capture the environment at the point the λ-expression is evaluated.
17.6 Compiling Function Definitions
The first part closely follows the appoach of defining a function definition (define (f x ...) e) from our previous compilers.
Ther are two important differences from the past though:
λ-expressions don’t have a name, and
the body of the λ-expression may reference variables bound outside of the λ-expression.
To deal with the first issue, we first make a pass over the program inserting computed names for each λ-expression.
This is the reason for the generated name field in the Lam constructor.
; type Expr = ; .... ; | (Lam Id [Listof Id] Expr)
These labels are inserted by the parser. Here it is at work:
Examples
> (parse-e '(λ (t) ((λ (f) (t (λ (z) ((f f) z)))) (λ (f) (t (λ (z) ((f f) z)))))))
'#s(Lam
lambda148516
(t)
#s(App
#s(Lam
lambda148517
(f)
#s(App
#s(Var t)
(#s(Lam
lambda148518
(z)
#s(App
#s(App #s(Var f) (#s(Var f)))
(#s(Var z)))))))
(#s(Lam
lambda148519
(f)
#s(App
#s(Var t)
(#s(Lam
lambda148520
(z)
#s(App
#s(App #s(Var f) (#s(Var f)))
(#s(Var z))))))))))
Now turning to the second issue–λ-expression may reference
variables bound outside of the expression—
There are many possible solutions, but here is one. Every function can be passed an implicit first argument which will point to a section of memory that contains all of the values for the free variables.
In other words, the code for functions will accept an additional argument that plays the role of the environment for this particular instance of the function.
The first thing the function does once called is copies these values from memory to the stack and then executes the body of the function in an environment that binds both the free variables and the formal parameters.
This will have to work in concert with closure creation and function calls. When the λ-expression is evaluated, a closure will be created storing the value of z in memory. When the function is applied, the caller will need to retrieve that value and place it as the first argument on stack before calling the function’s code.
To implement this, we will need to compute the free variables, which we do with the following function:
#lang racket (require "ast.rkt") (provide fv) ;; Expr -> [Listof Id] ;; List all of the free variables in e (define (fv e) (remove-duplicates (fv* e))) (define (fv* e) (match e [(Var x) (list x)] [(Prim1 p e) (fv* e)] [(Prim2 p e1 e2) (append (fv* e1) (fv* e2))] [(Prim3 p e1 e2 e3) (append (fv* e1) (fv* e2) (fv* e3))] [(If e1 e2 e3) (append (fv* e1) (fv* e2) (fv* e3))] [(Begin e1 e2) (append (fv* e1) (fv* e2))] [(Let x e1 e2) (append (fv* e1) (remq* (list x) (fv* e2)))] [(App e1 es) (append (fv* e1) (append-map fv* es))] [(Lam f xs e) (remq* xs (fv* e))] [(Match e ps es) (append (fv* e) (append-map fv-clause* ps es))] [_ '()])) ;; Pat Expr -> [Listof Id] (define (fv-clause* p e) (remq* (bv-pat* p) (fv* e))) ;; Pat -> [Listof Id] (define (bv-pat* p) (match p [(PVar x) (list x)] [(PCons p1 p2) (append (bv-pat* p1) (bv-pat* p2))] [(PAnd p1 p2) (append (bv-pat* p1) (bv-pat* p2))] [(PBox p) (bv-pat* p)] [_ '()]))
We can now write the function that compiles a labelled λ-expression into a function in assembly:
; Lam -> Asm (define (compile-lambda-define l) (let ((fvs (fv l))) (match l [(Lam f xs e) (let ((env (append (reverse fvs) (reverse xs) (list #f)))) (seq (Label (symbol->label f)) (Mov rax (Offset rsp (* 8 (length xs)))) (Xor rax type-proc) (copy-env-to-stack fvs 8) (compile-e e env #t) (Add rsp (* 8 (length env))) ; pop env (Ret)))])))
Notice how similar it is to our previous function definition compiler:
; Defn -> Asm (define (compile-define d) (match d [(Defn f xs e) (seq (Label (symbol->label f)) (compile-e e (reverse xs) #t) (Add rsp (* 8 (length xs))) ; pop args (Ret))]))
The key difference here is that we are expecting the caller to leave the closure at the top of the stack. When called, the function fetches the closure and copies its environment to the stack, hence the body of the function has a static environment which includes the free variables followed by the parameters followed by the closure.
The copying of the values from the closure environment to the stack is achieved by this helper function:
; [Listof Id] Int -> Asm ; Copy the closure environment at given offset to stack (define (copy-env-to-stack fvs off) (match fvs ['() (seq)] [(cons _ fvs) (seq (Mov r9 (Offset rax off)) (Push r9) (copy-env-to-stack fvs (+ 8 off)))]))
When the body of the function completes, all of these elements are popped off the stack and the function returns.
Examples
> (compile-lambda-define (Lam 'f '(x) (Var 'z)))
(list
(Label 'label_f_5e96933745)
(Mov 'rax (Offset 'rsp 8))
(Xor 'rax 5)
(Mov 'r9 (Offset 'rax 8))
(Push 'r9)
(Mov 'rax (Offset 'rsp 0))
(Add 'rsp 24)
(Ret))
The compiler will need to generate one such function for each λ-expression in the program. So we use a helper function for extracting all the λ-expressions:
#lang racket (require "ast.rkt") (provide lambdas) ;; Prog -> [Listof Lam] ;; List all of the lambda expressions in p (define (lambdas p) (match p [(Prog ds e) (append (lambdas-ds ds) (lambdas-e e))])) ;; Defns -> [Listof Lam] ;; List all of the lambda expressions in ds (define (lambdas-ds ds) (match ds ['() '()] [(cons (Defn f xs e) ds) (append (lambdas-e e) (lambdas-ds ds))])) ;; Expr -> [Listof Lam] ;; List all of the lambda expressions in e (define (lambdas-e e) (match e [(Prim1 p e) (lambdas-e e)] [(Prim2 p e1 e2) (append (lambdas-e e1) (lambdas-e e2))] [(Prim3 p e1 e2 e3) (append (lambdas-e e1) (lambdas-e e2) (lambdas-e e3))] [(If e1 e2 e3) (append (lambdas-e e1) (lambdas-e e2) (lambdas-e e3))] [(Begin e1 e2) (append (lambdas-e e1) (lambdas-e e2))] [(Let x e1 e2) (append (lambdas-e e1) (lambdas-e e2))] [(App e1 es) (append (lambdas-e e1) (append-map lambdas-e es))] [(Lam f xs e1) (cons e (lambdas-e e1))] [(Match e ps es) (append (lambdas-e e) (append-map lambdas-e es))] [_ '()]))
And another for compiling each of them:
; [Listof Lam] -> Asm (define (compile-lambda-defines ls) (match ls ['() (seq)] [(cons l ls) (seq (compile-lambda-define l) (compile-lambda-defines ls))]))
The top-level compile function now extracts and compiles all the λ-expressions to functions:
; Prog -> Asm (define (compile p) (match p [(Prog ds e) (prog (externs) (Global 'entry) (Label 'entry) (Mov rbx rdi) ; recv heap pointer (compile-e e '() #t) (Ret) (compile-lambda-defines (lambdas e)) (Label 'raise_error_align) pad-stack (Call 'raise_error))]))
What remains is the issue of compiling λ-expressions to code to create a closure and using closures to provide the appropriate environment when called.
17.7 Save the Environment: Create a Closure!
The basic challenge we are faced with is designing a representation of functions as values. Like other kinds of values, functions will be disjoint kind of value, meaning bits representing a function will need to be tagged distinctly from other kinds of values. Functions will need to represent all of the run-time information in the Closure structure used in the interpreter. Looking back, a Closure contains the formal parameters of the lambda-expression, the body, and the environment in place at the time the lambda-expression was evaluated.
The parameters and body expression are relevant compile-lambda-define. What’s relevant for the closure is the label of lambda-expression and the environment. For the compiler, the environment can be represented by the sequence of values it contains at run-time.
So, the way we will represent a closure is by a tagged pointer to a sequence in memory that contains the label of the closure’s code and a sequence of values that were bound to the free variables when the lambda-expression was evaluated.
When a lambda-expression is evaluated, we allocate a closure on the heap, write the lambda’s label, followed by the values of the free variables. The result of evaluating the expression is the tagged pointer to the memory just written.
Here’s the function for emitting closure construction code:
; Id [Listof Id] Expr CEnv -> Asm (define (compile-lam f xs e c) (let ((fvs (fv (Lam f xs e)))) (seq (Lea rax (symbol->label f)) (Mov (Offset rbx 0) rax) (free-vars-to-heap fvs c 8) (Mov rax rbx) ; return value (Or rax type-proc) (Add rbx (* 8 (add1 (length fvs)))))))
It relies on a helper function for emitting instructions to copy the value of free variables, i.e. variables bound in the current environment but outside of the lambda-expression. It fetches these values just like a variable reference would: it computes the variables lexical address and fetches it from the stack, then writes it to the heap.
; [Listof Id] CEnv Int -> Asm ; Copy the values of given free variables into the heap at given offset (define (free-vars-to-heap fvs c off) (match fvs ['() (seq)] [(cons x fvs) (seq (Mov r8 (Offset rsp (lookup x c))) (Mov (Offset rbx off) r8) (free-vars-to-heap fvs c (+ off 8)))]))
That’s all there is to closure construction!
17.8 Calling Functions
The last peice of the puzzle is making function calls and closures work together. Remember that a λ-expression is compiled into a function that expects a closure plus its arguments on the stack.
So the code generated for a function call needs to manage running each subexpression, the first of which should evaluate to a function (i.e. a pointer to a label and environment in memory) and then fetching the function’s label and jumping to it.
Here is the code for the non-tail-calls:
; Expr [Listof Expr] CEnv -> Asm ; The return address is placed above the arguments, so callee pops ; arguments and return address is next frame (define (compile-app-nontail e es c) (let ((r (gensym 'ret)) (i (* 8 (length es)))) (seq (Lea rax r) (Push rax) (compile-es (cons e es) (cons #f c)) (Mov rax (Offset rsp i)) (assert-proc rax) (Xor rax type-proc) (Mov rax (Offset rax 0)) ; fetch the code label (Jmp rax) (Label r))))
Compared to the previous version of this code, it additionally executes the code for e. After all the subexpression are evaluated, it fetches the value of e off the stack, checks that it is a function, then fetches the label for the function’s code and jumps to it. Notice how the stack naturally has the function as the top-most element. This is used by the code for the function to fetch the values stored in the closure.
The code for tail calls is similar, but adapted to avoid pushing a return frame and to pop the local environment before jumping:
; Expr [Listof Expr] CEnv -> Asm (define (compile-app-tail e es c) (seq (compile-es (cons e es) c) (move-args (add1 (length es)) (length c)) (Add rsp (* 8 (length c))) (Mov rax (Offset rsp (* 8 (length es)))) (assert-proc rax) (Xor rax type-proc) (Mov rax (Offset rax 0)) (Jmp rax)))
We’ve now implemented all there is to first-class functions. It’s possible to write recursive functions using the Y-combinator, although that’s no so convenient. Next we can tackle the issue of recursive or even sets of mutually recursive functions by dealing with top-level function definitions.
17.9 Recursive Functions
Writing recursive programs with the Y-combinator is a bit inconvenient.
We previously had the ability to write recursive or even mutually recursive function definitions by defining them at the top-level with define, although that was before functions were considered first-class values.
What changes now?
Well, one view is that (define (f x) (add1 x)) is really just defining a function and giving it a name. In other words, it’s really just saying (define f (lambda (x) (add1 x))). We already know how to compile lambda-expressions and we all ready know how to bind variable names to values, so it would seem this is not so difficult to accomodate.
A program consisting of a series of function definitions followed by an expression can first compile all the function definitions, then create a series of closures, push them on the stack, then execute the main expression in an environment that includes the names of the defined functions.
That will work just fine for an example like (define (f x) (add1 x)) (f 5).
Where it breaks down is in a program like this:
(define (f n) (if (zero? n) 1 (+ n (f (sub1 n))))) (f 10)
Why? Because the (implicit) lambda-expression here has a free variable f. In the closure representation, what should the value of this variable be? It should be the function f itself. In other words, it should be a tagged pointer to the closure, meaning that the closure representation of a recursive function is a cyclic data structure!
But how can we create such a structure? In creating the closure representation of the function f we would need to write the pointer to the value we are constructing as we construct it.
To make matters worse, consider a set of mutually recursive functions like this:
(define (even? x) (if (zero? x) #t (odd? (sub1 x)))) (define (odd? x) (if (zero? x) #f (even? (sub1 x)))) (even? 101)
Both even? and odd? contain a free variable: for even? it’s odd? and for odd? it’s even?. Hence the closure representation of even? should be two words long; the first words will be the address of the label that contains even?’s code and the second word will be the tagged pointer to the odd? closure. Likewise, the closure representation of odd? will be two words long, containing the address of the label for odd? followed by the tagged pointer to the even? closure.
How can we possible construct these two closures that must each point to the other?
The solution here is to recognize that the closures can be constructed in a staged way. We can lay out the memory for each closure but delay writing the value of the free variables. This is possible because all we need to know in order to allocate the memory for a closure is the number of free variables that occur in the syntax of the lambda-expression. Once we have addresses for each closure we are constructing, we can then go back and initialize each closure writing the value of its free variables. Doing this staged initialization is safe because we know that none of these functions can be called before the initialization is complete. (Try to convince yourself of this by considering the grammar of programs.)
Using that idea, we can compile the functions defined at the top-level in a slightly different way from lambda-expressions. We will first allocate memory for all of the closures and push tagged pointers for each of them on the stack, effectively binding the defined function names to their (unitialized) closures. We then copy free variable values to memory, initializing the closures. Doing it in this way allows functions to refer back to themselves or other top-level function definitions.
First, the easy stuff: the code of a top-level function definition is compiled just like a lambda-expression:
; Defn -> Asm (define (compile-define d) (match d [(Defn f xs e) (compile-lambda-define (Lam f xs e))]))
We extend this to lists of function definitions in the obvious way:
; [Listof Defn] -> Asm (define (compile-defines ds) (match ds ['() (seq)] [(cons d ds) (seq (compile-define d) (compile-defines ds))]))
And in compiling a program (Prog ds e) we make sure to emit (compile-defines ds).
Now we have to turn to creating all of the closures for ds. To accomplish this, we write a function (compile-defines-values ds) that will create a closure for each function definition and push it on the stack.
The top-level expression e will no longer be compiled in the empty environment, but instead in an environment that includes all of the names defined as functions. So to compile (Prog ds e) we (compile-e e (reverse (define-ids ds)) #t), where define-ids is a simple function for fetching the list of function names defined by ds (the list of names is reversed because the functions are pushed on in the order they appear, hence the last function is the most recently pushed).
Here is the definition of compile-defines-values:
; Defns -> Asm ; Compile the closures for ds and push them on the stack (define (compile-defines-values ds) (seq (alloc-defines ds 0) (init-defines ds (reverse (define-ids ds)) 8) (add-rbx-defines ds 0)))
It does the staged allocation and initialization of the closures as described earlier. Once the closures are allocated and initialized, it bumps 'rbx by the total size of all the allocated closures.
The alloc-defines function allocates, but leaves uninitialized, each of the closures and pushes them on the stack:
; Defns Int -> Asm ; Allocate closures for ds at given offset, but don't write environment yet (define (alloc-defines ds off) (match ds ['() (seq)] [(cons (Defn f xs e) ds) (let ((fvs (fv (Lam f xs e)))) (seq (Lea rax (symbol->label f)) (Mov (Offset rbx off) rax) (Mov rax rbx) (Add rax off) (Or rax type-proc) (Push rax) (alloc-defines ds (+ off (* 8 (add1 (length fvs)))))))]))
The init-defines function intializes each of the closures using free-vars-to-heap:
; Defns CEnv Int -> Asm ; Initialize the environment for each closure for ds at given offset (define (init-defines ds c off) (match ds ['() (seq)] [(cons (Defn f xs e) ds) (let ((fvs (fv (Lam f xs e)))) (seq (free-vars-to-heap fvs c off) (init-defines ds c (+ off (* 8 (add1 (length fvs)))))))]))
Finally, the add-rbx-defines function computes the total size of all the closures and adjusts 'rbx appropriately:
; Defns Int -> Asm ; Compute adjustment to rbx for allocation of all ds (define (add-rbx-defines ds n) (match ds ['() (seq (Add rbx (* n 8)))] [(cons (Defn f xs e) ds) (add-rbx-defines ds (+ n (add1 (length (fv (Lam f xs e))))))]))
17.10 A Complete Compiler
Putting all the pieces together, we have the complete compile for Loot:
#lang racket (provide (all-defined-out)) (require "ast.rkt" "types.rkt" "lambdas.rkt" "fv.rkt" "compile-ops.rkt" a86/ast) ;; Registers used (define rax 'rax) ; return (define rbx 'rbx) ; heap (define rsp 'rsp) ; stack (define rdi 'rdi) ; arg ;; type CEnv = [Listof Id] ;; Prog -> Asm (define (compile p) (match p [(Prog ds e) (prog (externs) (Global 'entry) (Label 'entry) (Push rbx) ; save callee-saved register (Mov rbx rdi) ; recv heap pointer (compile-defines-values ds) (compile-e e (reverse (define-ids ds)) #f) (Add rsp (* 8 (length ds))) ;; pop function definitions (Pop rbx) ; restore callee-save register (Ret) (compile-defines ds) (compile-lambda-defines (lambdas p)) (Label 'raise_error_align) pad-stack (Call 'raise_error))])) (define (externs) (seq (Extern 'peek_byte) (Extern 'read_byte) (Extern 'write_byte) (Extern 'raise_error))) ;; [Listof Defn] -> [Listof Id] (define (define-ids ds) (match ds ['() '()] [(cons (Defn f xs e) ds) (cons f (define-ids ds))])) ;; [Listof Defn] -> Asm (define (compile-defines ds) (match ds ['() (seq)] [(cons d ds) (seq (compile-define d) (compile-defines ds))])) ;; Defn -> Asm (define (compile-define d) (match d [(Defn f xs e) (compile-lambda-define (Lam f xs e))])) ;; [Listof Lam] -> Asm (define (compile-lambda-defines ls) (match ls ['() (seq)] [(cons l ls) (seq (compile-lambda-define l) (compile-lambda-defines ls))])) ;; Lam -> Asm (define (compile-lambda-define l) (let ((fvs (fv l))) (match l [(Lam f xs e) (let ((env (append (reverse fvs) (reverse xs) (list #f)))) (seq (Label (symbol->label f)) (Mov rax (Offset rsp (* 8 (length xs)))) (Xor rax type-proc) (copy-env-to-stack fvs 8) (compile-e e env #t) (Add rsp (* 8 (length env))) ; pop env (Ret)))]))) ;; [Listof Id] Int -> Asm ;; Copy the closure environment at given offset to stack (define (copy-env-to-stack fvs off) (match fvs ['() (seq)] [(cons _ fvs) (seq (Mov r9 (Offset rax off)) (Push r9) (copy-env-to-stack fvs (+ 8 off)))])) ;; Expr CEnv Bool -> Asm (define (compile-e e c t?) (match e [(Int i) (compile-value i)] [(Bool b) (compile-value b)] [(Char c) (compile-value c)] [(Eof) (compile-value eof)] [(Empty) (compile-value '())] [(Var x) (compile-variable x c)] [(Str s) (compile-string s)] [(Prim0 p) (compile-prim0 p c)] [(Prim1 p e) (compile-prim1 p e c)] [(Prim2 p e1 e2) (compile-prim2 p e1 e2 c)] [(Prim3 p e1 e2 e3) (compile-prim3 p e1 e2 e3 c)] [(If e1 e2 e3) (compile-if e1 e2 e3 c t?)] [(Begin e1 e2) (compile-begin e1 e2 c t?)] [(Let x e1 e2) (compile-let x e1 e2 c t?)] [(App e es) (compile-app e es c t?)] [(Lam f xs e) (compile-lam f xs e c)] [(Match e ps es) (compile-match e ps es c t?)])) ;; Value -> Asm (define (compile-value v) (seq (Mov rax (value->bits v)))) ;; Id CEnv -> Asm (define (compile-variable x c) (let ((i (lookup x c))) (seq (Mov rax (Offset rsp i))))) ;; String -> Asm (define (compile-string s) (let ((len (string-length s))) (if (zero? len) (seq (Mov rax type-str)) (seq (Mov rax len) (Mov (Offset rbx 0) rax) (compile-string-chars (string->list s) 8) (Mov rax rbx) (Or rax type-str) (Add rbx (+ 8 (* 4 (if (odd? len) (add1 len) len)))))))) ;; [Listof Char] Integer -> Asm (define (compile-string-chars cs i) (match cs ['() (seq)] [(cons c cs) (seq (Mov rax (char->integer c)) (Mov (Offset rbx i) 'eax) (compile-string-chars cs (+ 4 i)))])) ;; Op0 CEnv -> Asm (define (compile-prim0 p c) (compile-op0 p)) ;; Op1 Expr CEnv -> Asm (define (compile-prim1 p e c) (seq (compile-e e c #f) (compile-op1 p))) ;; Op2 Expr Expr CEnv -> Asm (define (compile-prim2 p e1 e2 c) (seq (compile-e e1 c #f) (Push rax) (compile-e e2 (cons #f c) #f) (compile-op2 p))) ;; Op3 Expr Expr Expr CEnv -> Asm (define (compile-prim3 p e1 e2 e3 c) (seq (compile-e e1 c #f) (Push rax) (compile-e e2 (cons #f c) #f) (Push rax) (compile-e e3 (cons #f (cons #f c)) #f) (compile-op3 p))) ;; Expr Expr Expr CEnv Bool -> Asm (define (compile-if e1 e2 e3 c t?) (let ((l1 (gensym 'if)) (l2 (gensym 'if))) (seq (compile-e e1 c #f) (Cmp rax val-false) (Je l1) (compile-e e2 c t?) (Jmp l2) (Label l1) (compile-e e3 c t?) (Label l2)))) ;; Expr Expr CEnv Bool -> Asm (define (compile-begin e1 e2 c t?) (seq (compile-e e1 c #f) (compile-e e2 c t?))) ;; Id Expr Expr CEnv Bool -> Asm (define (compile-let x e1 e2 c t?) (seq (compile-e e1 c #f) (Push rax) (compile-e e2 (cons x c) t?) (Add rsp 8))) ;; Id [Listof Expr] CEnv Bool -> Asm (define (compile-app f es c t?) ;(compile-app-nontail f es c) (if t? (compile-app-tail f es c) (compile-app-nontail f es c))) ;; Expr [Listof Expr] CEnv -> Asm (define (compile-app-tail e es c) (seq (compile-es (cons e es) c) (move-args (add1 (length es)) (length c)) (Add rsp (* 8 (length c))) (Mov rax (Offset rsp (* 8 (length es)))) (assert-proc rax) (Xor rax type-proc) (Mov rax (Offset rax 0)) (Jmp rax))) ;; Integer Integer -> Asm (define (move-args i off) (cond [(zero? off) (seq)] [(zero? i) (seq)] [else (seq (Mov r8 (Offset rsp (* 8 (sub1 i)))) (Mov (Offset rsp (* 8 (+ off (sub1 i)))) r8) (move-args (sub1 i) off))])) ;; Expr [Listof Expr] CEnv -> Asm ;; The return address is placed above the arguments, so callee pops ;; arguments and return address is next frame (define (compile-app-nontail e es c) (let ((r (gensym 'ret)) (i (* 8 (length es)))) (seq (Lea rax r) (Push rax) (compile-es (cons e es) (cons #f c)) (Mov rax (Offset rsp i)) (assert-proc rax) (Xor rax type-proc) (Mov rax (Offset rax 0)) ; fetch the code label (Jmp rax) (Label r)))) ;; Defns -> Asm ;; Compile the closures for ds and push them on the stack (define (compile-defines-values ds) (seq (alloc-defines ds 0) (init-defines ds (reverse (define-ids ds)) 8) (add-rbx-defines ds 0))) ;; Defns Int -> Asm ;; Allocate closures for ds at given offset, but don't write environment yet (define (alloc-defines ds off) (match ds ['() (seq)] [(cons (Defn f xs e) ds) (let ((fvs (fv (Lam f xs e)))) (seq (Lea rax (symbol->label f)) (Mov (Offset rbx off) rax) (Mov rax rbx) (Add rax off) (Or rax type-proc) (Push rax) (alloc-defines ds (+ off (* 8 (add1 (length fvs)))))))])) ;; Defns CEnv Int -> Asm ;; Initialize the environment for each closure for ds at given offset (define (init-defines ds c off) (match ds ['() (seq)] [(cons (Defn f xs e) ds) (let ((fvs (fv (Lam f xs e)))) (seq (free-vars-to-heap fvs c off) (init-defines ds c (+ off (* 8 (add1 (length fvs)))))))])) ;; Defns Int -> Asm ;; Compute adjustment to rbx for allocation of all ds (define (add-rbx-defines ds n) (match ds ['() (seq (Add rbx (* n 8)))] [(cons (Defn f xs e) ds) (add-rbx-defines ds (+ n (add1 (length (fv (Lam f xs e))))))])) ;; Id [Listof Id] Expr CEnv -> Asm (define (compile-lam f xs e c) (let ((fvs (fv (Lam f xs e)))) (seq (Lea rax (symbol->label f)) (Mov (Offset rbx 0) rax) (free-vars-to-heap fvs c 8) (Mov rax rbx) ; return value (Or rax type-proc) (Add rbx (* 8 (add1 (length fvs))))))) ;; [Listof Id] CEnv Int -> Asm ;; Copy the values of given free variables into the heap at given offset (define (free-vars-to-heap fvs c off) (match fvs ['() (seq)] [(cons x fvs) (seq (Mov r8 (Offset rsp (lookup x c))) (Mov (Offset rbx off) r8) (free-vars-to-heap fvs c (+ off 8)))])) ;; [Listof Expr] CEnv -> Asm (define (compile-es es c) (match es ['() '()] [(cons e es) (seq (compile-e e c #f) (Push rax) (compile-es es (cons #f c)))])) ;; Expr [Listof Pat] [Listof Expr] CEnv Bool -> Asm (define (compile-match e ps es c t?) (let ((done (gensym))) (seq (compile-e e c #f) (Push rax) ; save away to be restored by each clause (compile-match-clauses ps es (cons #f c) done t?) (Jmp 'raise_error_align) (Label done) (Add rsp 8)))) ; pop the saved value being matched ;; [Listof Pat] [Listof Expr] CEnv Symbol Bool -> Asm (define (compile-match-clauses ps es c done t?) (match* (ps es) [('() '()) (seq)] [((cons p ps) (cons e es)) (seq (compile-match-clause p e c done t?) (compile-match-clauses ps es c done t?))])) ;; Pat Expr CEnv Symbol Bool -> Asm (define (compile-match-clause p e c done t?) (let ((next (gensym))) (match (compile-pattern p '() next) [(list i f cm) (seq (Mov rax (Offset rsp 0)) ; restore value being matched i (compile-e e (append cm c) t?) (Add rsp (* 8 (length cm))) (Jmp done) f (Label next))]))) ;; Pat CEnv Symbol -> (list Asm Asm CEnv) (define (compile-pattern p cm next) (match p [(PWild) (list (seq) (seq) cm)] [(PVar x) (list (seq (Push rax)) (seq) (cons x cm))] [(PLit l) (let ((fail (gensym))) (list (seq (Cmp rax (value->bits l)) (Jne fail)) (seq (Label fail) (Add rsp (* 8 (length cm))) (Jmp next)) cm))] [(PAnd p1 p2) (match (compile-pattern p1 (cons #f cm) next) [(list i1 f1 cm1) (match (compile-pattern p2 cm1 next) [(list i2 f2 cm2) (list (seq (Push rax) i1 (Mov rax (Offset rsp (* 8 (- (sub1 (length cm1)) (length cm))))) i2) (seq f1 f2) cm2)])])] [(PBox p) (match (compile-pattern p cm next) [(list i1 f1 cm1) (let ((fail (gensym))) (list (seq (Mov r8 rax) (And r8 ptr-mask) (Cmp r8 type-box) (Jne fail) (Xor rax type-box) (Mov rax (Offset rax 0)) i1) (seq f1 (Label fail) (Add rsp (* 8 (length cm))) ; haven't pushed anything yet (Jmp next)) cm1))])] [(PCons p1 p2) (match (compile-pattern p1 (cons #f cm) next) [(list i1 f1 cm1) (match (compile-pattern p2 cm1 next) [(list i2 f2 cm2) (let ((fail (gensym))) (list (seq (Mov r8 rax) (And r8 ptr-mask) (Cmp r8 type-cons) (Jne fail) (Xor rax type-cons) (Mov r8 (Offset rax 0)) (Push r8) ; push cdr (Mov rax (Offset rax 8)) ; mov rax car i1 (Mov rax (Offset rsp (* 8 (- (sub1 (length cm1)) (length cm))))) i2) (seq f1 f2 (Label fail) (Add rsp (* 8 (length cm))) ; haven't pushed anything yet (Jmp next)) cm2))])])])) ;; Id CEnv -> Integer (define (lookup x cenv) (match cenv ['() (error "undefined variable:" x)] [(cons y rest) (match (eq? x y) [#t 0] [#f (+ 8 (lookup x rest))])])) ;; Symbol -> Label ;; Produce a symbol that is a valid Nasm label (define (symbol->label s) (string->symbol (string-append "label_" (list->string (map (λ (c) (if (or (char<=? #\a c #\z) (char<=? #\A c #\Z) (char<=? #\0 c #\9) (memq c '(#\_ #\$ #\# #\@ #\~ #\. #\?))) c #\_)) (string->list (symbol->string s)))) "_" (number->string (eq-hash-code s) 16))))