Recreating the Haskell List Part 5: Monads

This is part 5 of a 6 part tutorial

Monads

The Haskell list type is also an instance of the Monad typeclass. There are four functions defined for a monad, but you only need to implement 2 of them: >>= (pronounced bind) and return.

The Monad Typeclass

class Monad m where
    (>>=) :: m a -> (a -> m b) -> m b
    (>>) :: m a -> m b -> m b
    return :: a -> m a
    fail :: String -> m a

The easiest function is return. It just wraps a value in the monad container, and it is exactly the same thing as pure from the Applicative class. The bind function takes a monad holding a value of type a, and a function which can change an a into the same type of monad holding something of type b.

Before making MyList an instance of Monad, it might be easier to see what happens if we make something simpler like Container from part 1 an instance first.

Making Container a Monad

instance Monad Container where
    (Container contents) >>= f = f contents
    return a = Container a
-- binding a container to a function
(Container 3) >>= (\x -> return $ x + 2)
=> Container 5

The Container on the left hand side is a monad and holds a 3, the function on the right hand side of bind always accepts one argument. In this case the implementation I wrote for bind just passes the inner value of Container to the function.

The function we passed just adds 2 to the value and rewraps it using return. This implementation is the least exciting thing that a monad can do, because Container is now the identity monad.
The identity monad performs just simple function application, and doesn’t employ any computational strategy. Making MyList into a monad is significantly more amazing.

instance Monad MyList where
    return = pure
    Empty >>= _ = Empty
    (Cons car cdr) >>= f = (f car) `mappend` (cdr >>= f)

The above says that binding any function to an empty list just returns an empty list. In the case that we have list items, the head and tail of the list are pulled apart, the bound function is applied to the head of the list, and recursion is used to bind the function to the remaining tail of the list. These results are appended together using mappend from the Monoid typeclass, resulting in one list at the end.

Since the bound function f must return a wrapped value, each item it returns is a list with one item in it such as (Cons 1 Empty) or [1] where the item inside has been modified by the function.
Then it appends all these lists into one list.

Simple usage of bind

--  Just like fmap
(Cons 1 (Cons 2 Empty)) >>= (\x -> return $ x * 2)
=> Cons {car = 2, cdr = Cons {car = 4, cdr = Empty}}
--  Native list version
[1,2] >>= (\x -> return $ x * 2)
=> [2,4]

This looks a lot like fmap, but the difference is that the function you apply has to return an already wrapped type, and so you can chain these together in an ever increasing closure, or enclosed scope inside lambdas. This means that each new lambda closure brings its argument within scope of all the rest.

Nested bind

[1,2] >>=
    (\x -> [3,4] >>=
        (\y -> [5,6] >>=
            (\z -> return (x,y,z))))
=> [(1,3,5),(1,3,6),(1,4,5),(1,4,6),(2,3,5),(2,3,6),(2,4,5),(2,4,6)]

Because of the recursion in bind, this is basically 3 nested loops, giving you every combination of the 3 lists. The same thing will work on MyList, except it will look ugly for lack of pretty printing.

This is the basis for list comprehensions in languages like in Erlang, Python, and Haskell itself. Haskell provides 2 types of syntactic sugar for this, do-expressions for monads, and list comprehensions for the list monad specifically.

List comprehension

[(x,y,z) | x <- [1,2], y <- [3,4], z <- [5,6]]
=> [(1,3,5),(1,3,6),(1,4,5),(1,4,6),(2,3,5),(2,3,6),(2,4,5),(2,4,6)]

In an imperative programming language you can write a list of things to do in a function like a recipe. When you make variables in a function scope they are available to everything within that scope, but writing Haskell is not like writing a todo list of what to do, and in what order, and carrying state from one todo item to another. It is like like declaring what something is in one expression, and it doesn’t really let you say what order anything should be evaluated. Haskell evaluates things in the order that it needs to know the value.

A monad can let you simulate sequence by nesting bind functions, because it will cause evaluation in the order that you nest the bound functions. It evaluates in a specific order because nesting makes each closure rely on the value of the previous outer closure.

It will also allow you to build scope that each function you bind can share, due to the nested lambda expressions.

The people who created Haskell made something called a do-expression that is syntactic sugar for monadic binding, and it sort of makes your code appear to be an imperative programming language with sequence and imperative style scope.

The List Monad with do-notation

combos = do
    x <- [1,2]
    y <- [3,4]
    z <- [5,6]
    return (x,y,z)
combos
=> [(1,3,5),(1,3,6),(1,4,5),(1,4,6),(2,3,5),(2,3,6),(2,4,5),(2,4,6)]