Zippers
Let’s imagine we’re implementng a text input. What kind of a data definition should we use? Here’s one option.
type BadInput = (String, Int)
Here we interpret String
as the content of the input. The Int
is the cursor’s position in the input. Suppose now we wish to write an insert function that takes a text input and inserts a character before the cursor.
badPutChar :: BadInput -> Char -> BadInput badPutChar (xs, 0) c = (c : xs, 1) badPutChar (x : xt, k) c = (x : xt', k' + 1) where (xt', k') = badPutChar (xt, k - 1) c
Consider a function that inserts a whole String
instead of just a Char
.
badPutStr :: BadInput -> String -> BadInput badPutStr input s = foldl badPutChar input s
Maybe you can see why this is not a great representation for the text input. Suppose we’re editing the end of our input. Putting a single character will traverse the entire input to add a character. For badPutStr
we traverse the entire input n
times where n
is the length of the string. That’s terrible!
Can you think of a better representation?
The List Zipper
type Input = (String, String)
Here the first String
will be text before the cursor reversed. Slightly weird, but we’ll see why in a minute. The second String
is the text after the cursor. To make a connection with our previous structure, we can define a function that can take an Input
to a BadInput
.
goodToBad :: Input -> BadInput goodToBad (pre, post) = ((reverse pre) ++ post, length pre)
Now we can write putChar
and putStr
.
putChar :: Input -> Char -> Input putChar (pre, post) c = (c : pre, post) putStr :: Input -> String -> Input putStr input s = foldl putChar input s
Notice how much simpler putChar
is! The performance of our program is much better now too. A call to putChar
is constant-time. For editing operations, this representation is far superior.
While we used the example of a text input to motivate this representation, this technique is totally general. This would speed up any situation where we have a list of elements that we repeatedly make local edits.
type ListZipper a = ([a], [a])
We call the first component of the tuple the context and the second component the control. Think of the context as the original data structure with a hole in it. It’s made of a list of frames where each frame is a Cons
with a hole in it. The control is the thing needed to plug the whole to get the original structure. It turns out that this zipper transformation can work over more types than just lists. You can have a zipper for a tree that can make constant-time local edits.
Zipping Binary Trees
Here is your run-of-the-mill binary tree.
data Tree a = Empty | Node a (Tree a) (Tree a) deriving (Show)
To get a zipper we need a representation for the context.
data HolyNode a = LeftHole a (Tree a) | RightHole a (Tree a) deriving (Show) type Context a = [HolyNode a] type TreeZipper a = (Context a, Tree a)
We can now traverse this structure and make local edits.
goLeft :: TreeZipper a -> TreeZipper a goLeft (ctx, Node x l r) = (LeftHole x r : ctx, l) goRight :: TreeZipper a -> TreeZipper a goRight (ctx, Node x l r) = (RightHole x l : ctx, r) goUp :: TreeZipper a -> TreeZipper a goUp (LeftHole x r : ctx, t) = (ctx, Node x t r) goUp (RightHole x l : ctx, t) = (ctx, Node x l t)
Data Type Algebra
And now for something completely different.
The unit value
()
has type()
. I will write unit types as1
instead.Booleans have type
Bool
. I will write them as2
instead.The pair value
((), ())
has type((), ())
. I will write pairs types with*
instead. So, that value now has type1 * 1
.The
Either
valueLeft ()
has typeEither () b
. I will denoteEither
types with+
. So, that value now has type1 + b
.
In our new notation, this function f
has what return type?
f True = Left () f False = Right ()
This should be 1 + 1
. We’ll also write this as 2
. In general 1 + ... + 1
will be written as the number it adds up to. Same for products. One more exercise, what’s the return type of g
in our new notation?
g x y = (f x, f y)
This should be 2 * 2 = 4
. You should verify that if we have a type n
, then there are n
values of that type. We can see this pretty explicitly with the above example.
g True True = (Left (), Left ()) g True False = (Left (), Right ()) g False True = (Right (), Left ()) g False False = (Right (), Right ())
This seems like a cute notation, but is it more than just cute? Is there something deeper going on here?
The List Type
Recall the definition of a list type.
data List a = Nil | Cons a (List a)
I’m going to write this in a slightly more primitive way.
data List a = Either () (a, List a)
This is the same type as before except we’re not using Haskell’s nice syntax for defining types. Instead everything is just in terms of the more primitive notions. Here we represent the list [1, 2, 3]
as Right (1, Right (2, Right (3, Left ())))
.
Remember that list isn’t a type, but a whole family of types. Given an element type, we can give back a particular list type specialized to that list. We can model this as a function over types. In our new notation a list of a
has type L(a)
where L(t) = 1 + t * L(t)
.
Doing Algebra
Forget we’re talking about types. Let’s do some math.
What is the derivative of
1 + ax
with respect tox
? Should bea
.How about
1 + ax^2
with respect tox
? That’s2ax
.
Do these look familiar? They’re the types of our context frames! This works in general. Let’s try something else.
L(t) = 1 + t * L(t)
. Solving for L
we get L(t) = 1 / (1 - t)
. Does anyone remember the Taylor expansion of this? Yes, L(t) = 1 + t + t^2 + t^3 + ...
.
So we’ve derived an alternative definition of L(t)
. Turn on your type interpretation again. This is either 1
which is an empty list, t
which is a singleton list, t^2
a list with two elements, ... Looks like a list to me!
Try the same thing for binary trees. What do you get? The Catalan numbers for a good reason. Cn
is the number of full binary trees with n + 1
leaves.
Additional Reading
For the practical parts about using Haskell for zippers, see LYaH.
Gerard Huet wrote the first zipper paper.
Conor McBride discovered that the type of a zipper is the derivative of its data type.
Erik Hinton presented that paper at the Papers We Love meetup.
There is a nice series on zippers by Pavel Panchekha with even more interesting connections.
Finally, more of the mathematics can be found in a lecture note presented by Mathias Pedersen.