Corecursion and anamorphisms

Unfolding what lies beneath
Some blurred out Haskell code, partially covered by the text 'corecursion', in the style of the 'Corecursive' podcast logo.

You know that feeling where you hear about something and you immediately need to look into it? I had that while listening to the most recent episode of Adam Gordon Bell's /Corecursive/ podcast today, where they were talking about where the name of the podcast came from. Up until then I had just assumed that corecursive meant mutually recursive1, but boy, was I wrong!

The C-word

According to Wikipedia, 'corecursion is type of operation that is a dual to recursion'. This quickly gets very theoretical, but the long and short of it is that corecursion can be seen as a kind of opposite of recursion: Where recursion allows you to operate on arbitrarily complex data as long as you can reduce it down to a set of base cases, corecursion allows you to generate arbitrarily complex data when given a base case.

Err ... what?

Think of it like this: Given a list of numbers (arbitrarily complex data), you can define a recursive function to sum all the numbers using simple base cases: is the list empty or are there more elements to add? Your language of choice may well have a sum function that does just this. If not, you can implement it with a fold or reduce.

However, given a number, can you create a list that when summed up would equal this number? This would be a form of corecursion, where we take simple data (a number), and generate arbitrarily complex data based on the input (the resulting list of numbers).

Let's talk about fold functions specifically. As we talked about in a previous post on folding, fold functions are catamorphisms. They take a data structure and reduce it to a 'lower' form. The opposite of a catamorphism is an anamorphism2, and the opposite of a fold is an unfold (at least in Haskell). An anamorphism generates a sequence by repeatedly applying a function onto its previous result.

According to Wikipedia, 'the anamorphism of a coinductive type denotes the assignment of a coalgebra to its unique morphism to the final coalgebra of an endofunctor.' Don't worry: you needn't understand that to understand unfolding and corecursion (I sure don't). Instead, let's try and get a feel of what we might use corecursion for.

Returning to our previous example of destructuring a number into a list of terms, let's look at a couple of ways to do it using unfold.

First, let's look at the unfoldr function itself. It is defined in the Data.List module, and its type signature is:

    unfoldr :: (b -> Maybe (a, b)) -> b -> [a]

Given a function from b to Maybe (a, b) and a b, it will produce a list of a. If the function (let's call it f) returns Just (x, y), x will be added to the result, and f will be called again with y. This continues until f returns Nothing, at which point unfoldr terminates, returning the list it has created.

With this, we can create a function that takes an integral value and returns a list that when summed, is equal to the value we passed in. We'll start with an easy variant that just destructures the input into ones. For simplicity's sake, we're ignoring non-negative numbers.

  module Unfold where

  import Data.List (unfoldr)

  unroll :: Integral a => a -> [a]
  unroll = unfoldr f
      f n
        | n <= 0 = Nothing
        | otherwise = Just (1, n - 1)

Easy enough. The function passed to unfoldr returns nothing if there is no more to sum. Otherwise, add $1$ to the list, and call again with n-1.

    unroll 5 -- [1,1,1,1,1]
    unroll 0 -- []

But we can have some more fun with this. How about we try and destructure a number into a list of the pieces we'd need to create a binary representation of it?

    binary :: Integral a => a -> [a]
    binary = unfoldr f
        f n
          | n <= 0 = Nothing
          | otherwise =
            let powerOfTwo = 2 ^ floor (logBase 2 $ fromIntegral n)
             in Just (powerOfTwo, n - powerOfTwo)

This is a bit more complicated, but only because we need to map the input value to a power of two. Luckily, we can use logBase 2 to get the exponent you'd need to get n, and then floor it to get the greatest integral exponent. This becomes the next entry to the list. What's left gets passed in to the next application of the function.

    binary 0 -- []
    binary 5 -- [4,1]
    binary 255 -- [128,64,32,16,8,4,2,1]
    binary 256 -- [256]

Pretty neat, huh? What if we take it a step further and convert the number to its binary representation instead, as if it was base 2?

    base2 :: Integral a => a -> a
    base2 = sum . unfoldr f
        f n
          | n <= 0 = Nothing
          | otherwise =
            let exponent = floor (logBase 2 $ fromIntegral n)
             in Just (10 ^ exponent, n - 2 ^ exponent)

As you'd expect:

    base2 0 -- 0
    base2 5 -- 101
    base2 10 -- 1010

Not too shabby at all.

Alright, I think we have had enough fun with corecursion for now. It's been a very unexpected, but very insightful little journey, and I thank you for taking it with me. Until next time!


I'm probably not the only one to do this. Wikipedia has a note under disambiguation on its article on corecursion that says 'not to be confused with mutual recursion'

Similar to how your body can be in either anabolic or catabolic states, for all you fitness people out there.

Thomas Heartman is a developer, writer, speaker, and one of those odd people who enjoy lifting heavy things and putting them back down again. Preferably with others. Doing his best to gain and share as much knowledge as possible.