Category Theory with Rust (pt2)
As a pedagogical exercise, I am following Bartosz Milewski's Category Theory for Programmers. These blog posts apply the challenges in Rust. The order comes from the challenges and can jump around a little.
Functors
A functor is simply defined as a mapping between objects from one category to another. Importantly, it preserves a category's structure. In Haskell, a functor is defined as:
class Functor f where
fmap :: (a -> b) -> f a -> f b
This definition is deceptively simple. For example, Rust's Option
is a functor, with
Option::map
corresponding to fmap
(I have removed the const
annotation from the
definition for brevity):
impl<T> Option<T> {
pub fn map<U, F>(self, f: F) -> Option<U>
where
F: FnOnce(T) -> U,
{
match self {
Some(x) => Some(f(x)),
None => None,
}
}
}
Similarly, arrays/vectors are functors, with (for arrays) [T]::map
acting as fmap
.
So can we abstract fmap
? This task is hard, Rust does not have a simple notion of a type
constructor that Haskell has (the f a -> f b
component). Conceptually, we are defining a
container of sorts (f a
) which we want to apply (a -> b)
to the inner value and
obtain f b
. Until GATs arrived in nightly Rust, this task to abstract a functor was
impossible.
However, GATs support is in nightly! So we could define a trait Functor
and apply it to Option
:
pub trait Functor<A> {
type Output<B>;
fn fmap<F, B>(self, f: F) -> Self::Output<B>
where
F: FnOnce(A) -> B;
}
impl<A> Functor<A> for Option<A> {
type Output<B> = Option<B>;
fn fmap<F, B>(self, f: F) -> Option<B>
where
F: FnOnce(A) -> B,
{
self.map(f)
}
}
We can also check that it upholds the functor laws of preserving identity and preserving composition:
#[test]
fn functor_laws() {
// id
let x = Some(5).fmap(id);
let y = id(Some(5));
assert_eq!(x, y);
// composition
let add2 = |i: u8| i + 2;
let into_float = |d: u8| d as f32;
let x = Some(5).fmap(compose(add2, into_float));
let y = Some(5).fmap(add2).fmap(into_float);
assert_eq!(x, y);
}
There is a problem though, the Functor::Output
is not actually bounded to Self
's type.
For example, we could write the implementation so that the output is a vector. This
compiles and would run just fine, but it voids the first functor law that it upholds
function identity: fmap id = id
(note).
impl<A> Functor<A> for Option<A> {
type Output<B> = Vec<B>;
fn fmap<F, B>(self, f: F) -> Vec<B>
where
F: FnOnce(A) -> B,
{
self.map(f).map(|x| vec![x]).unwrap_or_default()
}
}
We can use equational reasoning to show that this fails to preserve identity:
Some(5).fmap(identity); // = vec![5] by above definition
identity(Some(5)); // = Some(5)
// fmap id != id >> identity is not preserved!
So can we construct our trait to bound the output to Self
? I haven't been able to think
up a solution, since Rust does not have this notion of higher kinded types and type
constructors.
I would welcome any feedback from the Rust community!
Bifunctors
Bifunctors are similar to functors, taking two type parameters instead of one.
A definition could look like this in Rust, getting a little hairy since I wanted to
implement the bimap
function with a default. This requires that the output associated
type which is partially applied (Self::Output<C, B>
) also satisfies being a Bifunctor
(phew 😅).
pub trait Bifunctor<A, B> {
type Output<C, D>;
fn bimap<F, G, C, D>(self, f: F, g: G) -> Self::Output<C, D>
where
F: FnOnce(A) -> C,
G: FnOnce(B) -> D,
Self: Sized,
// this is a mouthful!
Self::Output<C, B>: Bifunctor<C, B, Output<C, D> = Self::Output<C, D>>,
{
self.first(f).second(g)
}
fn first<F, C>(self, f: F) -> Self::Output<C, B>
where
F: FnOnce(A) -> C;
fn second<G, D>(self, g: G) -> Self::Output<A, D>
where
G: FnOnce(B) -> D;
}
Despite the narly trait definition, the implementations become poetic in their simplicity.
// Rust's Result type can implement Bifunctor
impl<T, E> Bifunctor<T, E> for Result<T, E> {
type Output<C, D> = Result<C, D>;
fn first<F, C>(self, f: F) -> Self::Output<C, E>
where
F: FnOnce(T) -> C,
{
self.map(f)
}
fn second<G, D>(self, g: G) -> Self::Output<T, D>
where
G: FnOnce(E) -> D,
{
self.map_err(g)
}
}
// Our previously defined Either type can implement Bifunctor
impl<L, R> Bifunctor<L, R> for Either<L, R> {
type Output<C, D> = Either<C, D>;
fn first<F, C>(self, f: F) -> Self::Output<C, R>
where
F: FnOnce(L) -> C,
{
match self {
Either::Left(l) => Either::Left(f(l)),
Either::Right(r) => Either::Right(r),
}
}
fn second<G, D>(self, g: G) -> Self::Output<L, D>
where
G: FnOnce(R) -> D,
{
match self {
Either::Left(l) => Either::Left(l),
Either::Right(r) => Either::Right(g(r)),
}
}
}
// Bifunctor can also be applied to product types!
impl<A, B> Bifunctor<A, B> for (A, B) {
type Output<C, D> = (C, D);
fn first<F, C>(self, f: F) -> Self::Output<C, B>
where
F: FnOnce(A) -> C,
{
(f(self.0), self.1)
}
fn second<G, D>(self, g: G) -> Self::Output<A, D>
where
G: FnOnce(B) -> D,
{
(self.0, g(self.1))
}
}
#[test]
fn does_it_work() {
let tostr = |x: f32| x.to_string();
let strlen = |x: &str| x.len();
// with Result
let x = Ok(3.14).bimap(tostr, strlen);
assert_eq!(x, Ok(String::from("3.14")));
let x = Err("Hello, world!").bimap(tostr, strlen);
assert_eq!(x, Err(13));
// with Either
let x = Left(3.14).bimap(tostr, strlen);
assert_eq!(x, Left(String::from("3.14")));
let x = Right("Hello, world!").bimap(tostr, strlen);
assert_eq!(x, Right(13));
// with tuple
let x = (3.14, "Hello, world!").bimap(tostr, strlen);
assert_eq!(x, ("3.14".to_string(), 13));
}
Currying
Currying is having partial function application. For example, we could define a string
concatenation function, and partially apply it as a greet
function. I have used a tuple
as arguments for reasons that will soon become clear.
fn strcat((a, b): (String, &str)) -> String {
a + b
}
#[test]
fn greet() {
let greet = |who| strcat(("Hello ".to_string(), who));
assert_eq!(greet("world!"), "Hello world!".to_string());
}
This form of partial application uses a closure. Could we do differently with a function
that can bind arguments? Currying formally defines that any cartesian product of variables into a
function can also be written as a function partially applying the first variable, (a, b)
-> c = a -> (b -> c)
. This also reverses! Written in Rust, the functions would look like:
pub fn uncurry<F, G, A, B, C>(f: F) -> impl FnOnce((A, B)) -> C
where
F: FnOnce(A) -> G,
G: FnOnce(B) -> C,
{
move |(a, b)| f(a)(b)
}
pub fn curry<'a, F, A: 'a, B, C>(f: F) -> impl FnOnce(A) -> Box<dyn FnOnce(B) -> C + 'a>
where
F: Fn((A, B)) -> C + 'a,
{
move |a| Box::new(move |b| f((a, b)))
}
The uncurry
function is the simpler of the two. We take a function which returns another
function, effectively describing a -> b -> c
transformations, and return a function (a,
b) -> c
. The curry
function has a few more constraints. Firstly, we are effectively splitting f
into two
functions. Both of these functions are returned, but only the first one can use the
impl trait
syntax. The second function unfortunately has to be boxed and dynamically
dispatched.
Given we are returning a boxed closure, any references need to be tied to a lifetime
governed by f
. Returning a FnOnce
also limits how useful this form of currying truly
is.
#[test]
fn greet2() {
let greet = curry(strcat)("Hello ".to_string());
assert_eq!(greet("world!"), "Hello world!".to_string());
let greet2 = uncurry(|a| move |b| strcat((a, b)));
assert_eq!(
greet2(("Hello ".to_string(), "world!")),
"Hello world!".to_string()
);
}
I am not a fan of the ergonomics of this, a curried function usually would be called
repeatedly, but the move semantics of the function requires FnOnce
implementations.
Functions are also generally defined with argument lists rather than tuples.
We can achieve a better ergonomic function with a trait definition:
pub trait Curry<A, B, C> {
fn curry_once<'a>(self, arg: A) -> Box<dyn FnOnce(B) -> C + 'a>
where
Self: 'a,
A: 'a;
fn curry<'a>(self, arg: A) -> Box<dyn Fn(B) -> C + 'a>
where
Self: 'a,
A: 'a + Copy;
}
Notice the two different functions and the additional bound that A: Copy
for a reuseable
curry
function which returns a Fn
closure.
To test it out, we implement Curry
on any type which satisfies Fn(A, B) -> C
:
impl<F, A, B, C> Curry<A, B, C> for F
where
F: Fn(A, B) -> C,
{
fn curry_once<'a>(self, a: A) -> Box<dyn FnOnce(B) -> C + 'a>
where
Self: 'a,
A: 'a,
{
Box::new(move |b| self(a, b))
}
fn curry<'a>(self, a: A) -> Box<dyn Fn(B) -> C + 'a>
where
Self: 'a,
A: 'a + Copy,
{
Box::new(move |b| self(a, b))
}
}
Now for any function, we can call .curry(arg)
or .curry_once(arg)
if the arg is not
Copy
.
I have redefined strcat
, using an argument list rather than a tuple. There is also a
definition of count_chars
which takes its first argument as &str
, a Copy
able type.
The strcat2
can only be used with .curry_once
since it takes its argument by value.
But count_hello_world
can be called multiple times!
fn strcat2(a: String, b: &str) -> String {
a + b
}
fn count_chars(s: &str, ch: char) -> usize {
s.chars().filter(|&c| c == ch).count()
}
#[test]
fn greet3() {
let greet = strcat2.curry_once("Hello ".to_string());
assert_eq!(greet("world!"), "Hello world!".to_string());
let count_hello_world = count_chars.curry("Hello, world!");
assert_eq!(count_hello_world('l'), 3);
assert_eq!(count_hello_world('o'), 2);
}
Note 1
This depends on what the category is describing. I have implicitly described the category of types over the Option image. If a functor was defined between the category Option to the category Vec using a mapping such as:
fn to_vec<T>(opt: Option<T>) -> Vec<T> {
match opt {
Some(x) => vec![x],
None => Vec::new(),
}
}
Then we can indeed preserve identity and composition through this mapping. For instance, identity can be preserved with to_vec(id(Some(A))) = id(to_vec(Some(A)))
.