## ARTICLE

# Abstracting Computations with Type Classes

*From **Haskell in Depth** by Vitaly Bragilevsky*

*In this article we’ll discuss several type classes that originated in mathematics (mainly abstract algebra and category theory).*

Take 40% off *Haskell in Depth* by entering **fccbragilevsky** into the discount code box at checkout at manning.com.

Unfortunately, there’s a long tradition of difficulty in grasping Haskell’s type classes, which can be partially attributed to the fact that Haskell’s creators kept the mathematical names for these type classes and created a myth that you have to know the math behind them in order to work with the corresponding concepts effectively. I strongly believe that the opposite is true: many programming languages have adopted what was thought of as mathematical ideas and transformed them into purely technical stuff that can be exploited without any fear. In Haskell, we have to stick with the somewhat scary names, but a little technical intuition helps you use these type classes for practical needs.

I recommend reading the *Typeclassopedia* by Brent Yorgey as a brilliant introduction to type classes for abstracting computations for the mathematically inclined.

Keep in mind that almost every Haskell type class provides you with some generalization across concrete types. Sometimes the same methods are used with values of different types in the same way thanks to type classes. We’ll do the same in this article; the only new thing is that we’ll generalize *computations* over values of various types. The type classes I’m going to discuss here are presented in Figure 1, where solid arrows represent extending type classes.

As seen in the figure, three groups of type classes stand out:

*Operation*— For abstracting binary operations (combining two values into one of the same type)*Computational context*— For computing the value of some type in a context, and that computation can be accompanied by some*effects**Iteration*— For repeated computations over some structure, potentially collecting the results in a context

We’ll discuss some aspects of these abstract descriptions and use practical examples.

**The only operation, (<>)**

When using these two type classes, `Semigroup`

and `Monoid`

, keep one simple rule in mind— provide a generic binary operation over the values of any datatype that defines their instances. The type class `Semigroup`

declares such an operation as:

`(<>) :: Semigroup a => a -> a -> a`

Many things can be combined with it. For example:

`ghci> import Data.Semigroup ghci> [1,2,3] <> [4,5,6] [1,2,3,4,5,6] ghci> "Hello " <> "world" "Hello world" ghci> Sum 2 <> Sum 3 Sum {getSum = 5} ghci> Product 2 <> Product 3 Product {getProduct = 6}`

Remember that a `String`

is a synonym for a list of `Char`

, and the first two examples share an instance for lists (and this operation is only list concatenation). The third and fourth examples here apply instances which represent addition and multiplication—two basic operations over numeric types. They feature `newtype`

wrappers `Sum`

and `Product`

for any type `a`

with an instance of `Num a`

.

The type class `Monoid`

adds another method:

`mempty :: Monoid a => a`

which returns a *neutral* element:

`ghci> mempty :: [Int] [] ghci> mempty :: String ""`

ghci> mempty :: Sum Int Sum {getSum = 0}

ghci> mempty :: Product Int

Product {getProduct = 1}

This element is called neutral because it’s expected to satisfy *monoid laws*, and for any element `a`

:

`mempty <> a == a a <> mempty == a`

The empty list, an empty `String`

, and the numbers `Sum 0`

and `Product 1`

satisfy these rules with respect to the corresponding operation.

In general, you’d want to use the operation `(<>)`

to accumulate information, like in concatenating lists and other containers (`Data.Set`

or `Data.Sequence`

to name a few), combining configuration properties, and such things. When you encounter a new datatype it’s always a good idea to look for provided instances. If you see instances for `Semigroup`

or

`Monoid`

, then you know that values of this datatype can be combined with `(<>)`

. For example:

`ghci> import Data.Text ghci> :info Text data Text instance Monoid Text -- Defined in 'Data.Text' instance Semigroup Text -- Defined in 'Data.Text'`

The presence of these instances makes the following possible:

`ghci> :set -XOverloadedStrings ghci> h = "hello" :: Text ghci> c = ", " :: Text ghci> w = ", world" :: Text ghci> h <> c <> w "hello, world"`

Other methods in `Semigroup`

and `Monoid`

call binary operations several times:

`sconcat :: Semigroup a => Data.List.NonEmpty.NonEmpty a -> a mconcat :: Monoid a => [a] -> a`

Note the type `Data.List.NonEmpty.NonEmpty`

in the type signature of `sconcat`

: we have to use it here because `sconcat`

has no idea what to return when given an empty list. Thankfully, the type signature prevents us from running it with an empty list as an argument. The value of the `NonEmpty`

type must have a head with a potentially empty tail.

You can construct such a list with the operation `(:|)`

as follows:

`ghci> 0 :| [1,2,3] 0 :| [1,2,3] ghci> "x" :| [] "x" :| []`

The function `sconcat`

takes a nonempty list and applies `(<>)`

several times, accumulating a total value. For example:

`ghci> sconcat ("x" :| ["y", "z"]) "xyz"`

The function `mconcat`

does the same thing without this limitation, because it can return `mempty`

whenever it’s given an empty list as an argument.

The definitions of the type classes `Semigroup`

and `Monoid`

are works in progress. Some time ago we only had `Monoid`

; then `Semigroup`

was added, but at that time there were no direct connections between these two classes. In GHC version 8.4 `Semigroup`

became a superclass of `Monoid`

. If you look at their definitions chances are you’ll see traces of their history, such as the method `mappend`

, which is the same as `(<>)`

. This `mappend`

method is expected to be removed from the `Monoid`

type class at some time in the future. You can read full details about this plan here: https://prime.haskell.org/wiki/Libraries/Proposals/SemigroupMonoid.

**Mapping, applying, and sequencing in a computational context**

Let’s pick apart the title of this subsection. Merriam–Webster gives the following definition of *context*: the interrelated conditions in which something exists or occurs

C*omputational context* means these *interrelated conditions* occurred during the process of computation. In Haskell, you can use many different computational contexts: some of them allow you to have side effects, such as input/output (`IO`

) or mutable variables (`State`

, `ST`

), and others support the idea of computation with the potential absence of a result (`Maybe`

, `Either`

) or with the possibility of storing some additional information along with the computation (`Writer`

). You probably came across some of them when you started learning Haskell.

All these contexts share on abstract behavior by allowing you to *map*, *apply*, and *sequence* over them. This behavior is defined by three type classes with fancy names:

`Functor`

abstracts*mapping*over some value in a computational context.`Applicative`

enables*applying*a function in a context to a value in a context and injecting value into a context.`Monad`

allows*sequencing*of computations in a context to structure what’s done in the next computation depends on the result of the previous one.

This wording is a textual representation of the type signatures for the following methods:

`fmap :: Functor f => (a -> b) -> f a -> f b pure :: Applicative f => a -> f a (<*>) :: Applicative f => f (a -> b) -> f a -> f b`

(>>=) :: Monad m => m a -> (a -> m b) -> m b

Although these three type classes define other methods as well (you can explore them with the `:info`

GHCi command), these are the most crucial for understanding what’s going on here. We often refer to contexts as *functorial*, *applicative*, or *monadic* if there exists a corresponding instance and we want to specify a particular property.

The words *value in a context* relate to the type signatures `f a`

or `m a`

, where `a`

is a type of a value (the result of a computation) and `f`

or `m`

restricts the context. In the same way, `f (a → b)`

is a *function in a context*. Take a look also at the second argument in the type signature of `(>>=)`

, `(a → m b)`

—this is what’s meant by “the next computation depends on the result of the previous one” (we call it a *monadic function*). Some call monads “programmable semicolon” for this reason.

When you look at the particular contexts these methods are concretized and sometimes look quite different, but all of them implement the same ideas of mapping, applying, and sequencing.

You can find many introductory materials on these three type classes — the *Typeclassopedia* includes many links, and Unit 5 of *Get Programming with Haskell* by Will Kurt (Manning) describes them in great detail in a practical way — and I won’t attempt to repeat them. Instead, I’ll give you examples of three different contexts and operations over values in them, to make sure that we’re on the same page.

**Exploring different contexts in parallel**

The type classes we’re talking about share common ideas about behavior. It’s always instructive to go both ways, from common ideas to specific implementations and vice versa. In this subsection you’re advised to follow the path of understanding from particular examples toward general notions. We’ll focus on these three contexts:

`IO`

, which allows you to do input/output`Writer`

, which supports logging of additional information along with the process of computations (you can get access to this context after importing the module`Control.Monad.Writer`

)`[]`

(a list), which is the most unusual context of nondeterministic computations

First, let’s look at a few examples of what a *value in a context* is:

`getLine :: IO String`

is a value of type`String`

in a context which is responsible for acquiring it from a side effect of reading user input.`writer (42, "step 1\n") :: Writer String Int`

is a value`42`

of type`Int`

which is accompanied by the log`"step 1\n"`

in the form of a`String`

.`[1,2,3] :: [Int]`

is one of the values`1`

,`2`

, or`3`

of type`Int`

, but we aren’t sure which one is the result of the computations (this uncertainty is what I call nondeterminism).

Now let’s check the mapping using the method `fmap`

from the `Functor`

type class:

`ghci> fmap (++"!") getLine Hi`

"Hi!"

The second line, `Hi`

, is the user input, which becomes a value in a context during execution, and the last line is a result of mapping `(++"!")`

over that result.

What about `Writer`

? Take a look:

`ghci> fmap (+1) (writer (42, "step 1\n")) :: Writer String Int WriterT (Identity (43,"step 1\n"))`

Apart from the output mess with `WriterT`

and `Identity`

, you can clearly see the new value `43`

(the result of mapping `(+1)`

over `42`

) in an unaltered context—the log doesn’t change. As for the third context:

`ghci> fmap (+1) [1,2,3] [2,3,4]`

We have one of the values `2`

, `3`

, or `4`

, each of them being a potential result of nondeterministic mapping of `(+1)`

over `1`

, `2`

, or `3`

. Clearly it’s an ordinary `map`

for lists.

In the next step we’d like to look at these contexts in an applicative setting, to try to see any similarities:

`ghci> pure (++) <*> getLine <*> getLine hello world "helloworld"`

ghci> pure (+) <*> writer (42, "step 1\n") <*> writer (10, "step 2\n") :: Writer String Int WriterT (Identity (52,"step 1\nstep 2\n")) ghci> pure (+) <*> [1,2,3] <*> [4,5,6]

[5,6,7,6,7,8,7,8,9]

We have a two-argument function which is injected into the context and then applied to two values in a context, and we can clearly see the results of applying `"helloworld"`

, `52`

, or one of the values in the list `[5,6,7,6,7,8,7,8,9]`

. We can also see that the context does its job by getting two user inputs, combining logs, and exploiting nondeterminism (we had three potential results in the first value in a context and three potential results in the second value in a context, and we got nine (=3*3) of them finally).

Let’s look at `(>>=)`

, called “bind” for our last example here:

`ghci> getLine >>= putStrLn hello hello`

ghci> writer (42, "step 1\n") >>= (\a -> writer (a+1, "step 2\n")) :: Writer String Int WriterT (Identity (43,"step 1\nstep 2\n")) ghci> [1,2] >>= (\x -> [x-1, x, x+1])

[0,1,2,1,2,3]

Try to work out for yourself what’s done here with the result of the first computation and how the contexts reveal themselves. Don’t forget to check the type of `(>>=)`

before you start thinking.

**do notation**

`do`

notation is a convenient syntactic mechanism for expressing monadic computations, which includes:

- Sequencing of operations by writing them one after another (as in imperative programming languages)
- Binding values in a context to names with
`←`

- Constructing succeeding operations with the names bound previously (this is the main feature of the
`Monad`

type class)

For example, the following code:

`action t = do a <- f t`

b <- g a

h a

return (k a b)

is de-sugared into:

`action t = f t >>= \a ->`

a >>= \b ->

a >>

return (k a b)

which is effectively:

`action t = f t >>= \a -> g a >>= \b -> h a >> return (k a b)`

Linearized definitions like this one are almost impossible to read and inconvenient to write; `do`

notation helps a lot.

Note that you can use `do`

notation in any monadic context. For example, the following three fragments are absolutely legitimate. The first context is `IO`

:

`readNumber :: IO Int readNumber = do s <- getLine`

return (read s)

The second is `Writer`

, where we recursively compute the sum of numbers from `0`

to `n`

and log all intermediate arguments (logging is done via the auxiliary function `tell`

from `Control.Monad.Writer`

):

`sumN :: Int -> Writer String Int sumN 0 = writer (0, "finish") sumN n = do tell (show n ++ ",")`

s <- sumN (n-1)

return (n + s)

Running this function in the GHCi gives:

`ghci> sumN 5 WriterT (Identity (15,"5,4,3,2,1,finish"))`

You can see here that the computations resulted in the number `15`

, and a full log is also provided.

Some monads support *runners*, functions that provide unwrapped results. In the case of `Writer`

we’ve the `runWriter`

function, which returns a pair `(result, log)`

:

`ghci> runWriter (sumN 5) (15,"5,4,3,2,1,finish")`

The third context is a list:

`cartesianProduct :: [Int] -> [Int] -> [(Int, Int)] cartesianProduct xs ys = do x <- xs`

y <- ys

return (x, y)

We can apply the function `cartesianProduct`

to some lists and see the results in the following GHCi session:

`ghci> cartesianProduct [1..2] [1..3] [(1,1),(1,2),(1,3),(2,1),(2,2),(2,3)]`

Every `do`

-block corresponds to exactly one monadic context, it’s instructive to understand which `Monad`

you work in every moment as the observable results crucially depend on that. It’s a common mistake to expect something which isn’t handled by the context in use.

`do`

notation is de-sugared into the following monadic operations:

`(>>=) :: Monad m => m a -> (a -> m b) -> m b`

(>>) :: Monad m => m a -> m b -> m b

where the last operation, `(>>)`

, is a simplified version of `(>>=)`

without dependency on the result of the previous computation. You can find details about desugaring `do`

notation in the *Typeclassopedia* (https://wiki.haskell.org/Typeclassopedia#do_notation) and many other sources.

**Folding and traversing**

The last two classes are `Foldable`

and `Traversable`

—both of them capture the idea of processing a container, a collection of values, with different possibilities.

Any `Foldable`

instance should implement one of the following methods:

`class Foldable (t :: * -> *) where foldMap :: Monoid m => (a -> m) -> t a -> m`

foldr :: (a -> b -> b) -> b -> t a -> b ...

Both of them should iterate over `t a`

, a container `t`

with elements of type `a`

, process every element and combine all the results. `Foldable`

contains sixteen methods. About half of them are various sorts of folding, namely processing container elements using specific operations; others include `length`

, `elem`

, and `maximum`

. All of these were once list methods that were later generalized to support other containers too.

The type class `Traversable`

is both simpler (it has only four methods, and two of them repeat the other two in a richer context, existing only for historical reasons) and more complex due to combining the effects of `Functor`

, `Applicative`

, `Monad`

, and `Foldable`

. This is the definition of `Traversable`

:

`class (Functor t, Foldable t) => Traversable (t :: * -> *) where traverse :: Applicative f => (a -> f b) -> t a -> f (t b)`

sequenceA :: Applicative f => t (f a) -> f (t a)

mapM :: Monad m => (a -> m b) -> t a -> m (t b)

sequence :: Monad m => t (m a) -> m (t a)

{-# MINIMAL traverse | sequenceA #-}

The method `traverse`

goes over every element in a container (the second argument) and applies its first argument to an element effectively, as in `Applicative f`

. Due to constraints to `t`

and `f`

, the actions performed in `traverse`

greatly depend on the particular instances, and it’s hard to talk about it in general.

Let’s look at a simple example. From the type signature of the method `traverse`

we can see that it requires two arguments: a function `a → f b`

returning a value in an applicative context and a container `t a`

of elements of type `a`

. Let’s take a list of `Int`

values as a container and `IO`

as an applicative context. Then we need some function returning a value in `IO`

—for example:

`addNumber :: Int -> IO String`

addNumber n = pure (++) <*> pure (show n ++ " ") <*> getLine

This function’s type is compatible with `traverse`

: it takes a number, turns it into a `String`

, and concatenates it with the line from the user input. Now we’re ready to see `traverse`

in action:

`ghci> traverse addNumber [1..5] aaa bbb ccc ddd eee`

["1 aaa","2 bbb","3 ccc","4 ddd","5 eee"]

Here, I’ve entered five lines and then numbered them. It’s always useful to check the types:

`ghci> :t traverse addNumber [1..5] traverse addNumber [1..5] :: IO [String]`

We’ve a list of `String`

in the context of `IO`

and this particular call to `traverse`

has the type:

`(a -> f b) -> t a -> f (t b)`

(Int -> IO String) -> [Int] -> IO [String]

The module `Data.Foldable`

provides `traverse_`

, a function similar to `traverse`

which doesn’t collect results of the individual computations:

`traverse_ :: (Applicative f, Foldable t) => (a -> f b) -> t a -> f ()`

Always follow the types; they help you a lot in understanding abstract functions. Try to use the same approach for `sequenceA`

: choose any container (take a list as the simplest one; `Maybe`

works too) and any applicative context (try `Writer`

!) and then construct a call to `sequenceA`

. You’ll definitely see what’s going on there.

Writing code with respect to `Foldable`

and `Traversable`

, as opposed to relying on a specific container or data structure, allows you to change the container whenever you like.

As a result, you can improve the efficiency of your code without rewriting it.

If you want to see more, check out the book on Manning’s liveBook platform here.