Functors
This lecture is adjusted from Niki Vazou who adjusted from Ranjit Jhala who adjusted from Graham Hutton.
Mapping Over Lists
As a brief review, let’s recall mapping over lists. Let’s suppose we want to take a list of integers and add one to each integer in the list. We might define that as follows
inc :: [Int] -> [Int] -- Note: [Int] is special syntax for [] Int inc [] = [] inc (i:is) = i+1 : inc is
When we load this function into our trusty Haskell REPL we get
Prelude> inc [1,2,3,4] [2,3,4,5]
Let’s now suppose we want to take a list of integers and square each integer in the list. We might write the following:
square :: [Int] -> [Int] square [] = [] square (i:is) = i^2 : square is
In our REPL we get
Prelude> square [1,2,3,4] [1,4,9,16]
Being good programmers, we realize there is a common pattern in these two tasks. In particular, we see that there is a function f
(either inc
or square
that we want to apply to each element in the list. Therefore, we write the following function
imap :: (Int -> Int) -> [Int] -> [Int] imap f [] = [] imap f (i:is) = (f i) : imap f is
Now we can rewrite our two functions above
inc' :: [Int] -> [Int] inc' = imap ((+) 1) square' :: [Int] -> [Int] square' = imap (\x -> x^2) -- Note: we could also do imap (^2) b/c of Haskell's -- handling of infix functions
and
Prelude> inc' [1,2,3,4] [2,3,4,5] Prelude> square' [1,2,3,4] [1,4,9,16]
Why stop at lists of integers? Let’s generalize map so that we can map over lists of any type.
map :: (a -> b) -> [a] -> [b] map f [] = [] map f (i:is) = (f i) : map f is
Mapping Over Trees
Let’s now consider binary trees, defined as
data Tree a = Leaf | Bin a (Tree a) (Tree a) deriving (Eq, Show)
Similar to lists, we want to take a tree and have a function that increments each element and another that squares each element in the tree.
tinc :: Tree Int -> Tree Int tinc Leaf = Leaf tinc (Bin i l r) = Bin (i+1) (tinc l) (tinc r) tsquare :: Tree Int -> Tree Int tsquare Leaf = Leaf tsquare (Bin i l r) = Bin (i^2) (tsquare l) (tsquare r)
Again, we notice a pattern. For both functions, we apply some f
to the value contained at a node and recursively apply it to both subtrees; if we reach a leaf we simply return that leaf. Let’s define map for trees. (Notice this time I skipped right to the generalized version.)
tmap :: (a -> b) -> Tree a -> Tree b tmap f Leaf = Leaf tmap f (Bin i l r) = Bin (f i) (tmap f l) (tmap f r)
The Functor Typeclass
Let’s compare the types tmap
and map
.
tmap :: (a -> b) -> Tree a -> Tree b map :: (a -> b) -> List a -> List b
These look very similar. Can we abstract them into one idea? This looks like a job for typeclasses!
class Functor m where fmap :: (a -> b) -> m a -> m b
In this case, we define the typeclass Functor
for which there is one function fmap
. We say a type m
is an instance of the Functor
typeclass if we define a function fmap
for it which given a function f
from type a
to type b
, and a type m
of a
, we get back a type m
of b
. For instance, if m
is the List
type we have
instance Functor List where fmap f [] = [] fmap f (i:is) = (f i) : fmap f is
and if m
is Tree
then we have
instance Functor Tree where fmap f Leaf = Leaf fmap f (Bin i l r) = Bin (f i) (fmap f l) (fmap f r)
And there a bunch more instances of Functor in Haskell!
instance Functor Maybe where fmap f Nothing = Nothing fmap f (Just x) = Just (f x) instance Functor IO where fmap f io = do a <- io return (f a)
The Functor Laws
Recall that Haskell typeclasses can also have “laws”, which is to say properties of the typeclass functions which should be preserved for valid instances. For example, for a type to be an instance of the “Num” typeclass in Haskell, +
over that type should be associative.
For the Functor
typeclass, there are two laws which hold for valid instances of the type class. The first law states that mapping the identity function over the functor should simply return the original value. In Haskell terms
fmap id = id
We can see easily see this is true for List
and Tree
.
Prelude> fmap id [1,2,3,4] [1,2,3,4] Prelude> fmap id (Bin 2 (Leaf) (Leaf)) Bin 2 (Leaf) (Leaf)
The second law is a bit harder. It states that mapping the composition of two functions f
and g
over a functor should be the same as first mapping g
over the functor and then mapping f
over it. In Haskell terms
fmap (f . g) = (fmap f) . (fmap g)
So for list this might look like
Prelude> (fmap (\x->x^2) . fmap (\x->x+1)) [1,2,3,4] [4,9,16,25] Prelude> fmap ((\x->x^2) . (\x->x+1)) [1,2,3,4] [4,9,16,25]
Why Functors?
So, why functors? We’ve seen that may common types, like []
, Maybe
, Trees
, and IO
can be made valid instances of the Functor
typeclass. This in itself exposes the first real value of typeclasses. Similar to interfaces in Java, this allows us to write functions over functors that will work for any of the common types listed above. This type of abstraction can make code shorter and more understandable.
However, there is a second, more subtle advantage of Functors. Let’s explain this one by an example with our Tree
typeclass from earlier. Let’s create a huge tree.
createTree :: Int -> Tree Int createTree 0 = Leaf createTree n = Bin n l r where l = createTree (n-1) r = createTree (n-1) tree :: Tree Int tree = createTree 30
tree
has over 1 billion nodes in it!
Suppose we now want to add 1 to each element in the tree, multiply each element in the tree by 2, and then sum the result. We can do that using our tmap
from earlier.
mappedTree1 :: Tree Int mappedTree1 = tmap (+1) (tmap (*2) tree) sumTree :: Tree Int -> Int sumTree Leaf = 0 sumTree (Bin i l r) = i + (sumTree l) + (sumTree r) main = do print $ sumTree mappedTree1
This works, but can we do better? Well, do we have to call tmap
twice to create mappedTree1
? No, let’s write a new implementation, mappedTree2
, using function composition.
mappedTree2 :: Tree Int mappedTree2 = tmap ((+1) . (*2)) tree
This is better! Now we only iterate over the nodes once and only allocate one new tree instead of two!
This is a great optimization, but it did require some ingenuity on our part. What if this could be done automatically? Looking more closely at mappedTree1
and mappedTree2
, we see that this transformation is really just the composition rule for functors. So indeed, for functors, our compiler can automatically perform this optimization, regardless of which specific functor instance (trees, lists, options, etc.).
This example exposes the second advantage of functors. Functors allow us to automatically perform rewriting due to the composition rule to reduce our cals to fmap
.
Exercises for Home
Define an instance of
Functor
which does not obey the identity law. Now write one that does not obey the composition law.