A monad is like a functor with but with a subtle but abstractly powerful difference, and as usual an example will help to highlight that difference.
Let's start by looking at the behaviour of a Functor
Let's define two functions:
C#:
public static int Increment(int x) => x + 1;
public static int Square(int x) => x * x;
Next let's explore how we can use these two function with the
List data type; I've selected the
List data type because it is one of the types that Microsoft has conformed to algebras like
Functor and
Monad -- it all happens with the seemingly magical addition of the
Linq framework.
Let's define a list of integers and transform it with our two functions:
C#:
var numbers = new List<int> { 1, 2, 3, 4, 5 };
var result1 = numbers
.Select(Increment) // [2, 3, 4, 5, 6]
.Select(Square); // [4, 9, 16, 25, 36]
Console.WriteLine("[{0}]", string.Join(", ", result1));
// [4, 9, 16, 25, 36]
Linq's
Select extension method is an implementation of the
Functor algebra for the
List data type. In other languages and/or implementation this is more typically called
map or
fmap.
In short Linq's
Select, aka
Functor iterates over the elements in
numbers and applying a transformation to the elements, for example:
Increment function to increment each of the elements in
numbers and then a separate
Select to
Square the result of the increment.
The comments are reflective of the end result of each
Select statement.
This whilst being a powerful feature; replacing the need for many loops; has a specific limitation with regards to the type of transformations that are possible.
Let's illustrate the limitation of a functor
...by subtly adjusting the signature of the previous two functions.
C#:
public static List<int> IncrementM(int x) => new List<int> { x + 1 };
public static List<int> SquareM(int x) => new List<int> { x * x };
We now have two functions that similar the previous functions require an input of a single parameter of type
int, which we similarly
increment and
square, however we now return a new
List encapsulating the result.
Why this is needed will become apparent later on; let's for now just explore the limitation of
Select when it comes to processing these two functions.
C#:
var result2 = numbers
.Select(IncrementM) // [[2], [3], [4], [5], [6]]
.Select(SquareM); // Error CS0411
/*
Error CS0411:
The type arguments for method 'Enumerable.Select<TSource, TResult>(IEnumerable<TSource>, Func<TSource, TResult>)'
cannot be inferred from the usage.
Try specifying the type arguments explicitly. (CS0411) (Compositional)
*/
The 1st
.Select(IncrementM) operation worked just fine, returning a 2D Array structure containing our incremented values. The 2nd
.Select(SquareM); fails with a CS0411 error; which really doesn't help to explain the problem. To understand; we need to examine what elements the
SquareM function is being provided by Linq's
Select. As I said previously
Select can be thought in the same way as a loop iterating over the internal elements.
The error occurs because of the output of the
.Select(IncrementM) which unlike the previous functions has embedded the computed values in a internal
List; meaning that the input to our 2nd
Select operation is a
List<int>; from a
List within a
List, this in turn mean that our 2nd
Select is passing a
List containing a
2 to the
SquareM function; which is incompatible; because it expects just a
2
As the following example demonstrates; we need a 2nd loop to get at the elements in a 2D array:
C#:
var input = new List<List<int>> {
new List<int> { 2 },
new List<int> { 3 },
new List<int> { 4 },
new List<int> { 5 },
new List<int> { 6 }
};
foreach (var element in input) {
// element is a List<int>, so to access the value inside we would need another loop
foreach (var innerElement in element) {
// now we have access to 2, 3, ...
}
}
In Linq, we can modify our code to do this as follows:
C#:
var result2 = numbers
.Select(IncrementM) // [[2], [3], [4], [5], [6]]
.Select(x => x.Select(SquareM)); // [[[4]], [[9]], [[16]], [[25]], [[36]]]
Which works, but the 3D array structure of the result should be indicative of a growing problem, the more we execute functions of this type of signature with
Select, the more deeply entangled our result becomes, and the more loops we'll have to execute for any proceeding computation.
So how do we fix this.
If you look at the difference between the result from the 1st
.Select(IncrementM) and the previous example using the more traditional function signature
.Select(IncrementM); you should agree the problem is that the
IncrementM version returns as an
List within an
List, and an easy way to get the
IncrementM version to return the same result, would be to flatten the result from a 2D Array to a 1D Array -- which surprise, surprise is exactly what a Monad does; hence it's more formally called a
FlatMap; an transformative operation like
Map / Select, except that it flattens the structure at the end; making our 2D array into a 1D Array.
Here's the same process using FlatMap:
In
Linq, Microsoft has used a different name for a monad, so instead of
FlatMap, the extension method is called
SelectMany:
C#:
var result3 = numbers
.SelectMany(IncrementM) // [2, 3, 4, 5, 6]
.SelectMany(SquareM); // [4, 9, 16, 25, 36]
Console.WriteLine("[{0}]", string.Join(", ", result3));
You're probably again wondering why this is important and why one would ever want to construct a function that returns a value embedded in a
List, only to remove it when we flatten the result at the end, from a 2D Array to a 1D Array.
The first thing to acknowledge is that
Monad being an abstract concept is going to mean that the reason is also a bit abstract; and will require a bit more explanation and examples to clarify. However I will say bear with me, because this seemingly abstract and pointless behaviour is what underpins the power of the
Monad, and is the very reason why it's such a powerful concept and applicable for a wide variety of computing problems.
Next Post
The next post will examine this abstract behaviour of the Monad and will hopefully finally reveal for you a bit of the hidden power behind monadic operations. I'll most certainly explain why functions being able to return a result embedded in a data type is important, and why this behaviour allows us to model any imperative computing process using monadic functions.