Functors

This lecture is adjusted from Niki Vazou who adjusted from Ranjit Jhala who adjusted from Graham Hutton.

Mapping Over Lists

As a brief review, let’s recall mapping over lists. Let’s suppose we want to take a list of integers and add one to each integer in the list. We might define that as follows

inc :: [Int] -> [Int] -- Note: [Int] is spe­cial syn­tax for [] Int
inc []     = []
inc (i:is) = i+1 : inc is

When we load this function into our trusty Haskell REPL we get

Pre­lude> inc [1,2,3,4]
[2,3,4,5]

Let’s now suppose we want to take a list of integers and square each integer in the list. We might write the following:

square :: [Int] -> [Int]
square []     = []
square (i:is) = i^2 : square is

In our REPL we get

Pre­lude> square [1,2,3,4]
[1,4,9,16]

Being good programmers, we realize there is a common pattern in these two tasks. In particular, we see that there is a function f (either inc or square that we want to apply to each element in the list. Therefore, we write the following function

imap :: (Int -> Int) -> [Int] -> [Int]
imap f []     = []
imap f (i:is) = (f i) : imap f is

Now we can rewrite our two functions above

inc' :: [Int] -> [Int]
inc' = imap ((+) 1)

square' :: [Int] -> [Int]
square' = imap (\x -> x^2)
  -- Note: we could also do imap (^2) b/c of Haskell's
  -- han­dling of infix func­tions

and

Pre­lude> inc' [1,2,3,4]
[2,3,4,5]
Pre­lude> square' [1,2,3,4]
[1,4,9,16]

Why stop at lists of integers? Let’s generalize map so that we can map over lists of any type.

map :: (a -> b) -> [a] -> [b]
map f []     = []
map f (i:is) = (f i) : map f is

Mapping Over Trees

Let’s now consider binary trees, defined as

data Tree a = Leaf | Bin a (Tree a) (Tree a)
   deriv­ing (Eq, Show)

Similar to lists, we want to take a tree and have a function that increments each element and another that squares each element in the tree.

tinc :: Tree Int -> Tree Int
tinc Leaf         = Leaf
tinc (Bin i l r)  = Bin (i+1) (tinc l) (tinc r)

tsquare :: Tree Int -> Tree Int
tsquare Leaf         = Leaf
tsquare (Bin i l r)  = Bin (i^2) (tsquare l) (tsquare r)

Again, we notice a pattern. For both functions, we apply some f to the value contained at a node and recursively apply it to both subtrees; if we reach a leaf we simply return that leaf. Let’s define map for trees. (Notice this time I skipped right to the generalized version.)

tmap :: (a -> b) -> Tree a -> Tree b
tmap f Leaf        = Leaf
tmap f (Bin i l r) = Bin (f i) (tmap f l) (tmap f r)

The Functor Typeclass

Let’s compare the types tmap and map.

tmap :: (a -> b) -> Tree a -> Tree b
map  :: (a -> b) -> List a -> List b

These look very similar. Can we abstract them into one idea? This looks like a job for typeclasses!

class Func­tor m where
   fmap :: (a -> b) -> m a -> m b

In this case, we define the typeclass Functor for which there is one function fmap. We say a type m is an instance of the Functor typeclass if we define a function fmap for it which given a function f from type a to type b, and a type m of a, we get back a type m of b. For instance, if m is the List type we have

instance Func­tor List where
   fmap f []     = []
   fmap f (i:is) = (f i) : fmap f is

and if m is Tree then we have

instance Func­tor Tree where
   fmap f Leaf     = Leaf
   fmap f (Bin i l r) = Bin (f i) (fmap f l) (fmap f r)

And there a bunch more instances of Functor in Haskell!

instance Func­tor Maybe where
   fmap f Noth­ing  = Noth­ing
   fmap f (Just x) = Just (f x)

instance Func­tor IO where
   fmap f io = do
      a <- io
      return (f a)

The Functor Laws

Recall that Haskell typeclasses can also have “laws”, which is to say properties of the typeclass functions which should be preserved for valid instances. For example, for a type to be an instance of the “Num” typeclass in Haskell, + over that type should be associative.

For the Functor typeclass, there are two laws which hold for valid instances of the type class. The first law states that mapping the identity function over the functor should simply return the original value. In Haskell terms

fmap id = id

We can see easily see this is true for List and Tree.

Pre­lude> fmap id [1,2,3,4]
[1,2,3,4]
Pre­lude> fmap id (Bin 2 (Leaf) (Leaf))
Bin 2 (Leaf) (Leaf)

The second law is a bit harder. It states that mapping the composition of two functions f and g over a functor should be the same as first mapping g over the functor and then mapping f over it. In Haskell terms

fmap (f . g) = (fmap f) . (fmap g)

So for list this might look like

Pre­lude> (fmap (\x->x^2) . fmap (\x->x+1)) [1,2,3,4]
[4,9,16,25]
Pre­lude> fmap ((\x->x^2) . (\x->x+1)) [1,2,3,4]
[4,9,16,25]

Why Functors?

So, why functors? We’ve seen that may common types, like [], Maybe, Trees, and IO can be made valid instances of the Functor typeclass. This in itself exposes the first real value of typeclasses. Similar to interfaces in Java, this allows us to write functions over functors that will work for any of the common types listed above. This type of abstraction can make code shorter and more understandable.

However, there is a second, more subtle advantage of Functors. Let’s explain this one by an example with our Tree typeclass from earlier. Let’s create a huge tree.

cre­ate­Tree :: Int -> Tree Int
cre­ate­Tree 0 = Leaf
cre­ate­Tree n = Bin n l r
  where l = cre­ate­Tree (n-1)
        r = cre­ate­Tree (n-1)

tree :: Tree Int
tree = cre­ate­Tree 30

tree has over 1 billion nodes in it!

Suppose we now want to add 1 to each element in the tree, multiply each element in the tree by 2, and then sum the result. We can do that using our tmap from earlier.

mapped­Tree1 :: Tree Int
mapped­Tree1 = tmap (+1) (tmap (*2) tree)

sum­Tree :: Tree Int -> Int
sum­Tree Leaf        = 0
sum­Tree (Bin i l r) = i + (sum­Tree l) + (sum­Tree r)

main = do
  print $ sum­Tree mapped­Tree1

This works, but can we do better? Well, do we have to call tmap twice to create mappedTree1? No, let’s write a new implementation, mappedTree2, using function composition.

mapped­Tree2 :: Tree Int
mapped­Tree2 = tmap ((+1) . (*2)) tree

This is better! Now we only iterate over the nodes once and only allocate one new tree instead of two!

This is a great optimization, but it did require some ingenuity on our part. What if this could be done automatically? Looking more closely at mappedTree1 and mappedTree2, we see that this transformation is really just the composition rule for functors. So indeed, for functors, our compiler can automatically perform this optimization, regardless of which specific functor instance (trees, lists, options, etc.).

This example exposes the second advantage of functors. Functors allow us to automatically perform rewriting due to the composition rule to reduce our cals to fmap.

Exercises for Home

Define an instance of Functor which does not obey the identity law. Now write one that does not obey the composition law.