# Examining Hackage: folds

In keeping with the rest of the “Examining Hackage” series I’d like to go through the source `folds`

package today. We’ll try to go through most of the code in an attempt to understand what exactly `folds`

does and how it does it. To be honest, I hadn’t actually heard of this one until someone mentioned it to me on /r/haskell but it looks pretty cool. It also has the word “comonadic” in the description, how can I resist?

It’s similar to Gabriel’s `foldl`

library, but it also seems to provide a wider suite of types folds. In retrospect, folds has a general framework for talking about types of folds and composing them where as `foldl`

defines only 2 types of folds, but defines a whole heap of prebuilt (left) folds.

## Poking Around

After grabbing the source and looking at the files we see that `folds`

is actually reasonable large

```
~$ cabal get folds && cd folds-0.6.2 && ag -g "hs$"
src/Data/Fold.hs
src/Data/Fold/L.hs
src/Data/Fold/L'.hs
src/Data/Fold/Class.hs
src/Data/Fold/M1.hs
src/Data/Fold/L1.hs
src/Data/Fold/R.hs
src/Data/Fold/Internal.hs
src/Data/Fold/L1'.hs
src/Data/Fold/R1.hs
src/Data/Fold/M.hs
Setup.lhs
tests/hlint.hs
```

One that jumps out at me is `Internal`

since it likely doesn’t depend on anything. We’ll start there.

## Internal

Looking at the top gives a hint for what we’re in for

```
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE DeriveDataTypeable #-}
module Data.Fold.Internal
( SnocList(..)
, SnocList1(..)
, List1(..)
, Maybe'(..), maybe'
, Pair'(..)
, N(..)
, Tree(..)
, Tree1(..)
, An(..)
, Box(..)
) where
```

This module seems to be mostly a bunch of (presumably useful) data types + their instances for `Foldable`

, `Functor`

, and `Traversable`

. Since all 3 of these are simple enough you can actually just derive them I’ll elide them in most cases.

First up is `SnocList`

, if the name didn’t give it away it is a backwards list (snoc is cons backwards)

Then we have the boilerplatey instances for `Functor`

and `Foldable`

. What’s a bit odd is that both `foldl`

and `foldMap`

are implemented where we only need `foldl`

. Presumably this is because just `foldMap`

gives worse performance but that’s a little disappointing.

Next is `SnocList1`

and `List1`

which are quite similar.

```
data SnocList1 a = Snoc1 (SnocList1 a) a | First a
deriving (Eq,Ord,Show,Read,Typeable,Data)
data List1 a = Cons1 a (List1 a) | Last a
```

If you’ve never seen this before, notice how instead of `Nil`

we have a constructor which requires an element. This means that no matter how we construct a list we need to supply at least element. Among other things this means that `head`

would be safe.

We also have a couple strict structures. Notice that these cannot be functors since they break `fmap f . fmap g = fmap (f . g)`

(why?). We have

And we have the obvious instance for `Foldable Maybe'`

and `Monoid (a, b)`

. Now it may seem a little silly to define these types, but from experience I can say anything that makes strictness a bit more explicit is wonderfully helpful. Now we can just use `seq`

on a `Pair'`

and know that both components will be forced.

Next we define a type for trees. One thing I noticed was the docs mentioned that this type reflects the structure of a `foldMap`

When we `foldMap`

each `One`

should be an element of the original collection. From there we can `fmap`

with the `map`

part of `foldMap`

, and we can imagine traversing the tree and replacing `Two l r`

with `l <> r`

, each `Zero`

with `mempty`

, and each `One a`

with a.

So that’s rather nifty. On top of this we have `Foldable`

, `Traversable`

, and `Functor`

instances.

We also have `Tree1`

which is similar but elides the `Zero`

As you’d expect, this implements the same type classes as `Tree`

.

Now is where things get a bit weird. First up is a type for reifying monoids using `reflection`

. I actually was thinking about doing a post on it and then I discovered Austin Seipp has done an outstanding one. So we have this `N`

type with the definition

Now with reflection there are two key components, there’s the type class instance floating around and a fresh type `s`

that keys it. If we have `s`

then we can easily demand a specific instance with `reflect (Proxy :: Proxy s)`

. That’s exactly what we do here. We can create a monoid instance using this trick with

```
instance Reifies s (a -> a -> a, a) => Monoid (N a s) where
mempty = N $ snd $ reflect (Proxy :: Proxy s)
mappend (N a) (N b) = N $ fst (reflect (Proxy :: Proxy s)) a b
```

So at each point we use our `s`

to grab the tuple of monoid operations we expect to be around and use them in the obvious manner. The only reason I could imagine doing this is if we had a structure which we want to use as a monoid in a number of different ways. I suppose we also could have just passed the dictionary around but maybe this was extremely ugly. We shall see later I suppose.

Last comes two data types I do not understand at all. There’s `An`

and `Box`

. The look extremely boring.

Their instances are the same everywhere as well.. I have no clue what these are for. Grepping shows they are used though so hopefully this mystery will become clearer as we go.

## Class

Going in order of the module DAG gives us `Data.Fold.Class.hs`

. This exports two type classes and one function

One thing that worries me a little is that this imports `Control.Lens`

which I don’t understand nearly as well as I’d like to.. We’ll see how this turns out.

Our first class is

```
class Choice p => Scan p where
prefix1 :: a -> p a b -> p a b
postfix1 :: p a b -> a -> p a b
run1 :: a -> p a b -> b
interspersing :: a -> p a b -> p a b
```

So right away we notice this is a subclass of `Choice`

which is in turn a subclass of `Profunctor`

. `Choice`

captures the ability to pull an `Either`

through our profunctor.

Note that we can’t do this with ordinary profunctors since we’d need a function from `Either a c -> a`

which isn’t complete.

Back to `Scan p`

. `Scan p`

takes a profunctor which apparently represents our folds. We then can prefix the input we supply, postfix the input we supply, and run our fold on a single element of input. This is a bit weird to me, I’m not sure if the intention is to write something like

```
foldList :: Scan p => [a] -> p a b -> b
foldList [x] = run1 x
foldList (x : xs) = foldList xs . prefix1 x
```

or something else entirely. Additionally this doesn’t really conform to my intuition of what a scan is. I’d expect a scan to produce all of the intermediate output involved in folding. At this point, with no instances in scope, it’s a little tricky to see what’s supposed to be happening here.

There are a bunch of default-signature based implementations of these methods if your type implements `Foldable`

. Since this is the next type class in the module let’s look at that and then skip back to the defaults.

```
class Scan p => Folding p where
prefix :: Foldable t => t a -> p a b -> p a b
prefixOf :: Fold s a -> s -> p a b -> p a b
postfix :: Foldable t => p a b -> t a -> p a b
postfixOf :: Fold s a -> p a b -> s -> p a b
run :: Foldable t => t a -> p a b -> b
runOf :: Fold s a -> s -> p a b -> b
filtering :: (a -> Bool) -> p a b -> p a b
```

At this point I looked at a few of the types and my first thought was “Oh dammit lens..” but it’s actually not so bad! The first thing to do is ignore the `*Of`

functions which work across lens’s `Fold`

type. There seems to be a nice pair for each “running” function where it can work across a `Foldable`

container or lens’s notion of a fold.

```
prefix :: Foldable t => t a -> p a b -> p a b
postfix :: Foldable t => p a b -> t a -> p a b
run :: Foldable t => t a -> p a b -> b
```

The first two functions let us create a new fold that will accept some input and supplement it with a bunch of other inputs. `prefix`

gives the supplemental input followed by the new input and `postfix`

does the reverse. We can actually supply input and run the whole thing with `run`

.

All of these are defined with `folded`

from lens which reifies a foldable container into a `Fold`

. so `foo = fooOf folded`

is the default implementation for all of these. Now for the corresponding fold functions I’m reading them as “If you give me a lens to treat `s`

as a container that I can get elements from and a fold, I’ll feed the elements of `s`

into the fold.”

The types are tricky, but this type class seems to capture what it means to run a fold across some type of structure.

Now that we’ve seen how `An`

comes in handy. It’s used as a single object `Foldable`

container. Since it’s newtyped, this should basically run the same as just passing a single element in.

So a `Scan`

here apparently means a fold over a single element at a time. Still not sure why this is deserving of the name `Scan`

but there you are.

Last but not least we have a notion of dragging a fold through an optic with `beneath`

.

```
beneath :: Profunctor p => Optic p Identity s t a b -> p a b -> p s t
beneath l f = runIdentity #. l (Identity #. f)
```

Those `#.`

’s are like `lmap`

s but only work when the function we apply is a “runtime identity”. Basically this means we should be able to tell whether or not we applied the function or just used `unsafeCoerce`

when running the code. Otherwise all we do is set up our fold `f`

to work across `Identity`

and feed it into the optic.

## Concrete Implementations

Now a lot of the rest of the code is implementing those two type classes we went over. To figure out where all these implementations are I just ran

```
~$ cabal repl
> :info Scan
....
instance Scan R1 -- Defined at src/Data/Fold/R1.hs:25:10
instance Scan R -- Defined at src/Data/Fold/R.hs:27:10
instance Scan M1 -- Defined at src/Data/Fold/M1.hs:25:10
instance Scan M -- Defined at src/Data/Fold/M.hs:33:10
instance Scan L1' -- Defined at src/Data/Fold/L1'.hs:24:10
instance Scan L1 -- Defined at src/Data/Fold/L1.hs:25:10
instance Scan L' -- Defined at src/Data/Fold/L'.hs:33:10
instance Scan L -- Defined at src/Data/Fold/L.hs:33:10
```

Looking at the names, I really don’t want to go through each of these with this much detail. Instead I’ll skip all the `*1`

’s and go over `R`

, `L'`

, and `M`

to get a nice sampling of the sort of folds we get.

### R.hs

Up first is `R.hs`

. This defines the first type for a fold we’ve seen.

Reading this as “a right fold from `a`

to `b`

” we notice a few parts here. It looks like that existential `r`

encodes our fold’s inner state and `r -> b`

maps the current state into the result of the fold. That leaves `a -> r -> r`

as the stepping function. All in all this doesn’t look *too* different from

The rest of this module is devoted to making a *lot* of instances for `R`

. Some of these are really uninteresting like `Bind`

, but quite a few are enlightening. To start with, `Profunctor`

.

```
instance Profunctor R where
dimap f g (R k h z) = R (g . k) (h . f) z
rmap g (R k h z) = R (g . k) h z
lmap f (R k h z) = R k (h . f) z
```

This should more or less by what you expect since it’s really the only the way to get the types to fit together. We fit the map from `b -> d`

onto the presentation piece of the fold and stick the map from `a -> c`

onto the stepper so it can take the new pieces of input.

Next we have the instance for `Choice`

.

```
instance Choice R where
left' (R k h z) = R (_Left %~ k) step (Left z) where
step (Left x) (Left y) = Left (h x y)
step (Right c) _ = Right c
step _ (Right c) = Right c
right' (R k h z) = R (_Right %~ k) step (Right z) where
step (Right x) (Right y) = Right (h x y)
step (Left c) _ = Left c
step _ (Left c) = Left c
```

This was slightly harder for me to read, but it helps to remember that here `_Left %~`

and `_Right %~`

are just mapping over the left and right sides of an `Either`

. That clears up the presentation bit. For the initial state, when we’re pulling our computation through the left side we wrap it in a `Left`

, when we’re pulling it through the right, we wrap it in `Right`

.

The interesting bit is the new `step`

function. It short circuits if either our state or our new value is the wrong side of an `Either`

otherwise it just applies our stepping function and wraps it back up as an `Either`

.

In addition to being a profunctor, `R`

is also a monad and comonad as well as a whole bunch of more finely grained classes built around those two. I’ll just show the `Monad`

`Applicative`

, and `Comonad`

instance here.

```
instance Applicative (R a) where
pure b = R (\() -> b) (\_ () -> ()) ()
R xf bxx xz <*> R ya byy yz = R
(\(Pair' x y) -> xf x $ ya y)
(\b ~(Pair' x y) -> Pair' (bxx b x) (byy b y))
(Pair' xz yz)
instance Comonad (R a) where
extract (R k _ z) = k z
duplicate (R k h z) = R (R k h) h z
instance Monad (R a) where
return b = R (\() -> b) (\_ () -> ()) ()
m >>= f = R (\xs a -> run xs (f a)) (:) [] <*> m
```

Looking at the `Comonad`

instance nesting a fold within a fold doesn’t change the accumulator, only the presentation. A nested fold is one that runs and returns a *new* fold which is identical except the starting state is the result of the old fold.

The `<*>`

operator here is kind of nifty. First off it zips both folds together using the strict `Pair'`

. Finally when we get to the presentation stage we map the final state for the left which gives us a function, and the final state for the right maps to its argument. Applying these two gives us our final result.

Notice that there’s some craziness happening with irrefutable patterns. When we call this function we won’t attempt to force the second argument until `bxx`

forces `x`

or `byy`

forces `y`

. This is important because it makes sure that `<*>`

preserves short circuiting.

The monad instance has a suitably boring `return`

and `>>=`

is a bit odd. We have one machine which accumulates all the elements it’s given in a list, this is an “identity fold” of sorts. From there our presentation function returns a lambda which expects an `a`

and runs `f a`

with all the input we’ve saved. We combine this with `m`

by running it in parallel with `<*>`

and feeding the result of `m`

back into the lambda generated by the right.

Now we’re finally in a position to define our `Scan`

and `Folding`

instances. Since the `Scan`

instance can be determined from the `Folding`

one I’ll show `Folding`

.

```
instance Folding R where
run t (R k h z) = k (foldr h z t)
prefix s = extend (run s)
postfix t s = run s (duplicate t)
runOf l s (R k h z) = k (foldrOf l h z s)
prefixOf l s = extend (runOf l s)
postfixOf l t s = runOf l s (duplicate t)
filtering p (R k h z) = R k (\a r -> if p a then h a r else r) z
```

It took some time, but I understand how this works! The first thing to notice is that actually running a fold just relies on the `foldr`

we have from `Foldable`

. Postfixing a fold is particularly slick with right folds. Remember that `z`

represents the accumulated state for the remainder of the items in our sequence.

Therefore, to postfix a number of elements all we need do is run the fold on the container we’re given and store the results as the new initial state. This is precisely what happens with `run s (duplicate t)`

.

Now `prefix`

is the inefficient one here. To prefix an element we want to change how presentation works. Instead of just using the default presentation function, we actually want to take the final state we get and run the fold *again* using this prefixing sequence and then presenting the result. For this we have another helpful comonandic function, `extend`

. This leaks because it holds on to the sequence a lot longer than it needs to.

The rest of these functions are basically the same thing except maybe postfixing (ha) a function with `Of`

here and there.

## L’.hs

Next up is (strict) left folds. As with right folds this module is just a data type and a bunch of instances for it.

One thing that surprised me here was that our state `r`

isn’t stored strictly! That’s a bit odd but presumably there’s a good reason for this. Now all the instances for `L'`

are the same as those for `R`

up to isomorphism because the types are well.. isomorphic.

The real difference comes in the instances for `Scan`

and `Folding`

. Remember how `Folding R`

used `foldr`

, well here we just use `foldl'`

. This has the upshot that all the strictness and whatnot is handled entirely by the foldable instance!

```
instance Folding L' where
run t (L' k h z) = k $! foldl' h z t
prefix s = run s . duplicate
postfix t s = extend (run s) t
runOf l s (L' k h z) = k $! foldlOf' l h z s
prefixOf l s = runOf l s . duplicate
postfixOf l t s = extend (runOf l s) t
filtering p (L' k h z) = L' k (\r a -> if p a then h r a else r) z
```

So everywhere we had `foldr`

we have `foldl'`

. The other interesting switch is that our definitions of `prefix`

and `postfix`

are almost perfectly swapped! This actually makes perfect sense when you think about it. In a left fold the state is propagating from the beginning to the end versus a right fold where it propagates from the end to the beginning! So to prefix something when folding to the left we add it to the initial state and when postfixing we use the presentation function to take our final state and continue to fold with it.

If you check above, you’ll find this to be precisely the opposite of what we had for right folds and since they both have the same comonad instance, we can swap the two implementations.

In fact, having read the implementation for right folds I’m noticing that almost everything in this file is *so* close to what we had before. It really seems like there is a clever abstraction just waiting to break out.

## M.hs

Now that we’ve seen how left and right folds are more or less the same, let’s try something completely different! `M.hs`

captures the notion of a `foldMap`

and looks pretty different than what we’ve seen before.

First things first, here’s the type in question.

We still have a presentation function `m -> b`

, and we still have an internal state `m`

. However, we also have a conversion function to map our inputted values onto the values we know how to fold together and we have a tensor operation `m -> m -> m`

.

Now as before we have a profunctor instance

```
instance Profunctor M where
dimap f g (M k h m e) = M (g.k) (h.f) m e
rmap g (M k h m e) = M (g.k) h m e
lmap f (M k h m e) = M k (h.f) m e
```

Which might start to look familiar from what we’ve seen so far. Next we have a `Choice`

instance which is still a little intimidating.

```
instance Choice M where
left' (M k h m z) = M (_Left %~ k) (_Left %~ h) step (Left z) where
step (Left x) (Left y) = Left (m x y)
step (Right c) _ = Right c
step _ (Right c) = Right c
right' (M k h m z) = M (_Right %~ k) (_Right %~ h) step (Right z) where
step (Right x) (Right y) = Right (m x y)
step (Left c) _ = Left c
step _ (Left c) = Left c
```

As before we use prisms and `%~`

to drag our presentation and conversion functions into `Either`

, similarly our starting state is wrapped in the appropriate constructor and we define a new stepping function with similar characteristic’s to what we’ve seen before.

As before, we’ve got a wonderful world of monads and comonads to dive into now. We’ll start with monads here to mix it up.

```
instance Applicative (M a) where
pure b = M (\() -> b) (\_ -> ()) (\() () -> ()) ()
M xf bx xx xz <*> M ya by yy yz = M
(\(Pair' x y) -> xf x $ ya y)
(\b -> Pair' (bx b) (by b))
(\(Pair' x1 y1) (Pair' x2 y2) -> Pair' (xx x1 x2) (yy y1 y2))
(Pair' xz yz)
instance Monad (M a) where
return = pure
m >>= f = M (\xs a -> run xs (f a)) One Two Zero <*> m
```

Our `return`

/`pure`

just instantiates a trivial fold that consumes `()`

s and outputs the value we gave it. For `<*>`

we run both machines strictly next to each other and apply the final result of one to the final result of the other.

Bind creates a new fold that creates a tree. This tree contains every input fed to it as it’s folding and stores each merge a node in the tree. While we run this, we also run the original `m`

we were given. Finally, when we reach the end, we apply `f`

to the result of `m`

and run this over the tree we’ve created which is foldable. If you remember back to the comment of `Tree a`

capturing `foldMap`

this is what was meant by it: we’re using a tree to suspend a `foldMap`

until we’re in a position to run it.

Now for comonad.

```
instance Comonad (M a) where
extract (M k _ _ z) = k z
duplicate (M k h m z) = M (\n -> M (k . m n) h m z) h m z
```

We can be pleasantly surprised that most of this code is the same. Extraction grabs our current state and presents it. Duplication creates a fold which will run and return a new fold. This new fold has the same initial state as the original fold, but when it goes to present its results it will merge it with the final state of the outer fold. This is very different from before and I suspect it will significantly impact our `Folding`

instance.

```
instance Folding M where
run s (M k h m (z :: m)) = reify (m, z) $
\ (_ :: Proxy s) -> k $ runN (foldMap (N #. h) s :: N m s)
prefix s (M k h m (z :: m)) = reify (m, z) $
\ (_ :: Proxy s) -> case runN (foldMap (N #. h) s :: N m s) of
x -> M (\y -> k (m x y)) h m z
postfix (M k h m (z :: m)) s = reify (m, z) $
\ (_ :: Proxy s) -> case runN (foldMap (N #. h) s :: N m s) of
y -> M (\x -> k (m x y)) h m z
filtering p (M k h m z) = M k (\a -> if p a then h a else z) m z
```

This was a little intimidating so I took the liberty of ignoring `*Of`

functions which are pretty much the same as what we have here.

To run a fold we use `foldMap`

, but `foldMap`

wants to work over monoids and we only have `z`

and `m`

. To promote this to a type class we use `reify`

and `N`

. Remember `N`

from way back when? It’s the data type that uses reflection to yank a tuple out of our context and treat it as a monoid instance. In all of this code we use `reify`

to introduce a tuple to our environment and `N`

as a pseudo-monoid that uses `m`

and `z`

.

with this in mind, this code uses `N #. h`

which uses the normal conversion function to introduce something into the `N`

monoid. Then `foldMap`

takes care of the rest and all we need do is call `runN`

to extract the results.

`prefix`

and `postfix`

are actually markedly similar. They both start by running the fold over the supplied structure which reduces it to an `m`

. From there, we create a new fold which is identical in all respects except the presentation function. The new presentation function uses `m`

to combine the pre/post-fixed result with the new result. If we’re postfixing, the postfixed result is on the right, if we’re prefixing, the left.

What’s particularly stunning is that neither of these leak! We don’t need to hold onto the structure in our new fold so we can prefix and postfix in constant memory.

## Fold.hs

Now that we’ve gone through a bunch of instances of `Folding`

and `Scanning`

, we’re in a position to actually look at what `Data.Fold`

exports.

```
module Data.Fold
( Scan(..)
, Folding(..)
, beneath
, L1(..) -- lazy Mealy machine
, L1'(..) -- strict Mealy machine
, M1(..) -- semigroup reducer
, R1(..) -- reversed lazy Mealy machine
, L(..) -- lazy Moore machine
, L'(..) -- strict Moore machine
, M(..) -- monoidal reducer
, R(..) -- reversed lazy Moore machine
, AsRM1(..)
, AsL1'(..)
, AsRM(..)
, AsL'(..)
) where
```

So aside from the folds we’ve examined before, there are 4 new classes, `AsRM[1]`

, and `AsL[1]'`

. We’ll look at the non-1 versions.

So this class covers the class of `p`

’s that know how to convert themselves to middle and right folds. Most of these instances are what you’d expect if you’ve ever done the “write `foldl`

as `foldr`

” trick or similar shenanigans.

For `M`

`asM`

is trivially identity and since `m`

is expected to be associative we don’t really care that `R`

is going to associate it strictly to the right. We just glue `h`

onto the front to map the next piece of input into something we know how to merge.

Next is `R`

For right folds we do something a bit different. We transform each value into a function of type `m -> m`

which is the back half of a folding function. We can compose these associatively with `.`

since they are just functions. Finally, when we need to present this, we apply this giant pipeline to the initial state and present the result. Notice here how we took a nonassociative function and bludgeoned it into associativity by partially applying it.

For `L'`

we do something similar

We once again build up a pipeline of functions to make everything associative and apply it at the end. We can’t just use `.`

though for composition because we need to force intermediate results. That’s why you see `\b g x -> g $! h x b`

, it’s just strict composition.

It makes sense that we’d bundle right and monoidal folds together because every right fold can be converted to a monoidal and every monoidal fold to a right. That means that every time we can satisfy one of these functions we can build the second.

This isn’t the case for left folds because we can’t convert a monoidal or right fold to a left one. For the people who are dubious of this, `foldl`

doesn’t let us capture the same amount of laziness we need. I forgot about this too and subsequently hung my machine trying to prove Edward Kmett wrong.

This means that the `AsL'`

is a fairly boring class,

```
class (AsRM p, AsL1' p) => AsL' p where
asL' :: p a b -> L' a b
instance AsL' L where
asL' (L k h z) = L' (\(Box r) -> k r) (\(Box r) a -> Box (h r a)) (Box z)
```

Now we finally see the point of `Box`

, it’s designed to stubbornly block attempts at making its contents strict. You can see this because all the instance for `L`

does is wrap everything in `Box`

es! Since `L'`

is the same as `L`

with some extra `seq`

s, we can use `Box`

to nullify those attempts at strictness and give us a normal left fold.

That’s it! We’re done!

## Wrap Up

Now that we’ve gone through a few concrete implementations and the overall structures in this package hopefully this has come together for you. I must say, I’m really quite surprised at how effectively comonadic operations can capture compositional folds. I’m certainly going to make an effort to use this package or Gabriel’s foldl a bit more in my random “tiny Haskell utility programs”.

If you’re as entranced by these nice little folding libraries as I am, I’d recommend

Trivia fact: this is the longest article out of all 52 posts on Code & Co.

*Update: I decided it might be helpful to write some utility folds for folds. I figured this might be interesting to some.*