Functors - huh!?
(NOTE: you can find the relevant source code in https://github.com/yatesco/reasonml-developer-handbook-src/blob/master/bs-src/src/functorsHuh.re)
Type systems are really all about types (duh!). A Type is really a description of a set of "common shapes". This allows you to then say things like "Everything in this list is the same sort of thing".
In Java this might look like:
import java.util.List;
import java.util.ArrayList;
public class HelloWorld {
public static void main(String args[]) {
List<String> myStrings = List.of("hi", "there");
System.out.println(myStrings);
}
}
java.util.List
is defined as a Container of 'things'. In order to construct an actual List
you must tell it exactly what it will contain. In the above case we are creating a List
of String
s.
In Reason
, or rather in OCaml
the same principle applies, but the syntax is very different. In Ocaml
you construct something called a Functor
which is strictly a mapping from one category
to another. AIUI (which is code for I haven't got a clue really, but this happened to work) there are three moving pieces:
- A generic
specification
for the type and necessary operations on that type - A generic
engine
that takes in aspecification
and "does the thing". - A concrete instance of the specific which glues the two together.
The engine
is also a factory which takes in a specification
and returns a specific instance
of the engine
baked just for you.
An example - grouping by things
Clojure
is fantastic for both its consistent set of simple abstractions and a rich standard library which operates across those abstractions.
One such utility is https://clojuredocs.org/clojure.core/group-by, which given a sequence
and a fn
, returns a map
from every unique value of (fn element)
to the set of elements that generated that value.
Imagine we have a sequence of numbers: [1 2 3 4 5 6 7 8 9 10]
and we want to split that sequence into even numbers and odd numbers. In a clojure repl:
user=> (def numbers [1 2 3 4 5 6 7 8 9 10])
#'user/numbers
user=> (defn our-fn [x] (even? x))
#'user/our-fn
user=> (group-by our-fn numbers)
{false [1 3 5 7 9], true [2 4 6 8 10]}
user=>
(and yes, smarty pants, (group-by even? (range 0 11))
gives the same result, but this is supposed to be an illuminating example ;-)).
You can see that (group-by our-fn numbers)
returned a map
from true
to all the even numbers and false
to all of the false numbers.
Grouping by things in ReasonML
To achieve the same thing in reasonml
we need to create the three things.
The Specification
First, we need a list of things (e.g. numbers). Secondly, we need something that when given a thing (e.g. number), returns the value for that thing (e.g. is it even?). Let's define that now:
module type CollectorSpec = {type e; type r; let l: list(e); let f: e => r;};
That defines the type of the e
lement and the type of the r
esult. It also defines a l
ist of e
lements, and a f
unction that takes an e
lement and returns the r
esult. Notice that we haven't actually said what the types are, only that there are two different types, a list
of e
and a function
that maps from one (instance of) e
to (an instance of) r
.
The Engine
Our second component is the generic engine
that actually does the work. In order to do anything it needs to know how to reason about the types, but it really doesn't care what those types actually are:
(NOTE: you did read the disclaimer that stated I really didn't have a clue what I am doing right? :-))
module Collector = {
module Make = (C: CollectorSpec) => {
type e = C.e;
type r = C.r;
let collect = (~l=C.l, ~f=C.f, _) => {
let result = Hashtbl.create(~random=true, List.length(l));
List.iter(
item => {
let r = f(item);
let newList = Hashtbl.mem(result, r) ? Hashtbl.find(result, r) : [];
Hashtbl.replace(result, r, [item, ...newList]);
},
l
);
result;
};
};
};
Let's unpack this a bit (explanations are underneath the code list):
module Collector = {
We create a module to store the generic logic in.
module Make = (C: CollectorSpec) => {
We create a nested module to create the constructor. Conventionally this is known as Make.
type e = C.e;
type r = C.r;
let collect = (~l=C.l, ~f=C.f, _) => {
Because we are claiming that Make
is of type CollectableSpec
we must make it type compatible, so we are teaching Make
about the shape of CollectableSpec
. The last line is creating a function
called collect
which takes in two named parameters: ~l
of type CollectableSpec.l
and ~f
of type CollectableSpec.f
. It is entirely co-incidental that the parameters are the same name as their type.
let result = Hashtbl.create(~random=true, List.length(l));
List.iter(
item => {
let r = f(item);
let newList = Hashtbl.mem(result, k) ? Hashtbl.find(result, r) : [];
Hashtbl.replace(result, k, [item, ...newList]);
},
l
);
result;
};
The above is the (almost certainly terrible and hacky) implementation of our group-by
logic. The detail doesn't matter here, simply note that other we have only restricted this to types defined in CollectorSpec,
not a specific concrete type.
The Glue
At this point we have something a specification
that states we need two types, a list of one of those types and a mapping from an instance of the first type to an instance of the second type.
Now we are ready to actually use our fancy collect
logic.
First, we must actually define the concrete
types. What exactly is CollectorSpec.e
and CollectorSpec.r
? To do that we provide theMake
factory a specification
using the types we are actually interested in. In our example we are collecting based on whether the number is even or odd:
module EvenNumbersCollectorSpec =
Collector.Make(
{
type e = int;
type r = bool;
let l = [1,2,3,4,5,6,7,8,9,10];
let f = (n) => n mod 2 === 0 ? true : false;
}
);
At this point, we have created an instance of our Collector
engine instantiated for our types and our mapping function.
Excellent!
Now, let's actually do the collecting:let m = EvenNumbersCollectorSpec.collect();
Awesome...er, really? Yep! That's done it. Let's prove it:
Js.log(Hashtbl.mem(m, true));
Js.log(Hashtbl.mem(m, false));
mem
is asking "do you have a mapping from X to anything"? You should see "1" (bizarrely, not "true") written in your browser's console indicating that yes, m
does indeed know about true
.
We now know that m
has bindings (or keys
) for true
and false
. What about a value which we know won't be there? You might expectJs.log(Hashtbl.mem(m, "not there"));
to return 0
, but it actually reports an error.
huh? An error!? You asked Hashtbl
to look up the value "not there"
which is a string
but you defined m
as only taking bool
s, and because string
s aren't bool
s the compiler knows that the question you are asking is nonsense. Awesome! Rather than crash at runtime it won't even compile!
Given there are only true
and false
we can't test a third bool
unmapped value so you will just have to trust that mem
isn't cheating :-).
Let's print out the actual numbers associated to true
and false
:
let trues = Hashtbl.find(m, true);
let falses = Hashtbl.find(m, false);
Js.log2("trues are:", Array.of_list(trues));
Js.log2("falses are:", Array.of_list(falses));
(NOTE: printing out List
s isn't great as it prints the raw (linked list) data structure, hence the conversion to Array
s first.)
You should see:
"trues are:" [10,8,6,4,2]
"falses are:" [9,7,5,3,1]
in your browser's console log.
Another Example
Numbers are great and everything, but let's do something a bit more interesting - cats and dogs. Everybody loves those.
type animalType = Dog | Cat | Horse;
type animal = {
type_: animalType,
name: string
};
let animals = [
{type_: Dog, name: "Wuffles"},
{type_: Dog, name: "Biscuits"},
{type_: Cat, name: "KillerCat"}
];
We have defined three distinct types of animals; Dog
, Cat
and Horse
. We also define an animal
as having an animalType
and a string
name. We then declared a list
of three animals
.
In order to collect
the different types of animal we define the glue:
module AnimalCollector =
Collector.Make(
{
type e = animal;
type r = animalType;
let l = animals;
let f = (n) => n.type_;
}
);
(NOTE: type
is a reserved word so by convention we name fields type_.)
This engine
will map from each distinct animalType
to all of the animals
of that animalType
.
As before, let's inspect the results:
let m = AnimalCollector.collect();
Js.log(Array.of_list(Hashtbl.find(m, Dog))); ;; === [[0,"Biscuits"],[0,"Wuffles"]]
Js.log(Array.of_list(Hashtbl.find(m, Cat))); ;; === [[1,"KillerCat"]]
Because we didn't add any Horse
s to our list of animal
s, we can verify that mem
is actually working ;-):
Js.log(Hashtbl.mem(m, Horse)); ;; === 0
excellent. and what happens if we try and find
a binding that isn't there:
Hashtbl.find(m, Horse);
This time, we don't get a type
error as Horse
is a valid type of animal
. Rather, we get a runtime exception stating Not_found,-6
. That cryptic error is telling you that there are no horses.
Q: Why on earth doesn't the standard library return option
to avoid these runtime errors?
Summary
Functors
are a way of parameterising a more general chunk of logic. There are three moving pieces, the specification
of the types and functions in play, the engine
which uses that specification and your glue
which provides an actual instance of the specification
which the engine
factory uses to build you your own personal instance.