Category Theory with Rust (pt2)

13 Dec

As a pedagogical exercise, I am following Bartosz Milewski's Category Theory for Programmers. These blog posts apply the challenges in Rust. The order comes from the challenges and can jump around a little.

⬅ Part 1

Functors

A functor is simply defined as a mapping between objects from one category to another. Importantly, it preserves a category's structure. In Haskell, a functor is defined as:

class Functor f where
    fmap :: (a -> b) -> f a -> f b

This definition is deceptively simple. For example, Rust's Option is a functor, with Option::map corresponding to fmap (I have removed the const annotation from the definition for brevity):

impl<T> Option<T> {
    pub fn map<U, F>(self, f: F) -> Option<U>
    where
        F: FnOnce(T) -> U,
    {
        match self {
            Some(x) => Some(f(x)),
            None => None,
        }
    }
}

Similarly, arrays/vectors are functors, with (for arrays) [T]::map acting as fmap. So can we abstract fmap? This task is hard, Rust does not have a simple notion of a type constructor that Haskell has (the f a -> f b component). Conceptually, we are defining a container of sorts (f a) which we want to apply (a -> b) to the inner value and obtain f b. Until GATs arrived in nightly Rust, this task to abstract a functor was impossible.

However, GATs support is in nightly! So we could define a trait Functor and apply it to Option:

pub trait Functor<A> {
    type Output<B>;
    fn fmap<F, B>(self, f: F) -> Self::Output<B>
    where
        F: FnOnce(A) -> B;
}

impl<A> Functor<A> for Option<A> {
    type Output<B> = Option<B>;

    fn fmap<F, B>(self, f: F) -> Option<B>
        where
            F: FnOnce(A) -> B,
    {
        self.map(f)
    }
}

Yes, this works!

We can also check that it upholds the functor laws of preserving identity and preserving composition:

#[test]
fn functor_laws() {
    // id
    let x = Some(5).fmap(id);
    let y = id(Some(5));
    assert_eq!(x, y);

    // composition
    let add2 = |i: u8| i + 2;
    let into_float = |d: u8| d as f32;

    let x = Some(5).fmap(compose(add2, into_float));
    let y = Some(5).fmap(add2).fmap(into_float);
    assert_eq!(x, y);
}

There is a problem though, the Functor::Output is not actually bounded to Self's type. For example, we could write the implementation so that the output is a vector. This compiles and would run just fine, but it voids the first functor law that it upholds function identity: fmap id = id (note).

impl<A> Functor<A> for Option<A> {
    type Output<B> = Vec<B>;

    fn fmap<F, B>(self, f: F) -> Vec<B>
        where
            F: FnOnce(A) -> B,
    {
        self.map(f).map(|x| vec![x]).unwrap_or_default()
    }
}

We can use equational reasoning to show that this fails to preserve identity:

Some(5).fmap(identity); // = vec![5] by above definition
identity(Some(5)); // = Some(5)
// fmap id != id >> identity is not preserved!

So can we construct our trait to bound the output to Self? I haven't been able to think up a solution, since Rust does not have this notion of higher kinded types and type constructors. I would welcome any feedback from the Rust community!

Bifunctors

Bifunctors are similar to functors, taking two type parameters instead of one. A definition could look like this in Rust, getting a little hairy since I wanted to implement the bimap function with a default. This requires that the output associated type which is partially applied (Self::Output<C, B>) also satisfies being a Bifunctor (phew 😅).

pub trait Bifunctor<A, B> {
    type Output<C, D>;

    fn bimap<F, G, C, D>(self, f: F, g: G) -> Self::Output<C, D>
    where
        F: FnOnce(A) -> C,
        G: FnOnce(B) -> D,
        Self: Sized,
        // this is a mouthful!
        Self::Output<C, B>: Bifunctor<C, B, Output<C, D> = Self::Output<C, D>>,
    {
        self.first(f).second(g)
    }

    fn first<F, C>(self, f: F) -> Self::Output<C, B>
    where
        F: FnOnce(A) -> C;

    fn second<G, D>(self, g: G) -> Self::Output<A, D>
    where
        G: FnOnce(B) -> D;
}

Despite the narly trait definition, the implementations become poetic in their simplicity.

// Rust's Result type can implement Bifunctor
impl<T, E> Bifunctor<T, E> for Result<T, E> {
    type Output<C, D> = Result<C, D>;

    fn first<F, C>(self, f: F) -> Self::Output<C, E>
    where
        F: FnOnce(T) -> C,
    {
        self.map(f)
    }

    fn second<G, D>(self, g: G) -> Self::Output<T, D>
    where
        G: FnOnce(E) -> D,
    {
        self.map_err(g)
    }
}

// Our previously defined Either type can implement Bifunctor
impl<L, R> Bifunctor<L, R> for Either<L, R> {
    type Output<C, D> = Either<C, D>;

    fn first<F, C>(self, f: F) -> Self::Output<C, R>
    where
        F: FnOnce(L) -> C,
    {
        match self {
            Either::Left(l) => Either::Left(f(l)),
            Either::Right(r) => Either::Right(r),
        }
    }

    fn second<G, D>(self, g: G) -> Self::Output<L, D>
    where
        G: FnOnce(R) -> D,
    {
        match self {
            Either::Left(l) => Either::Left(l),
            Either::Right(r) => Either::Right(g(r)),
        }
    }
}

// Bifunctor can also be applied to product types!
impl<A, B> Bifunctor<A, B> for (A, B) {
    type Output<C, D> = (C, D);

    fn first<F, C>(self, f: F) -> Self::Output<C, B>
    where
        F: FnOnce(A) -> C,
    {
        (f(self.0), self.1)
    }

    fn second<G, D>(self, g: G) -> Self::Output<A, D>
    where
        G: FnOnce(B) -> D,
    {
        (self.0, g(self.1))
    }
}

#[test]
fn does_it_work() {
    let tostr = |x: f32| x.to_string();
    let strlen = |x: &str| x.len();

    // with Result
    let x = Ok(3.14).bimap(tostr, strlen);
    assert_eq!(x, Ok(String::from("3.14")));
    let x = Err("Hello, world!").bimap(tostr, strlen);
    assert_eq!(x, Err(13));

    // with Either
    let x = Left(3.14).bimap(tostr, strlen);
    assert_eq!(x, Left(String::from("3.14")));
    let x = Right("Hello, world!").bimap(tostr, strlen);
    assert_eq!(x, Right(13));

    // with tuple
    let x = (3.14, "Hello, world!").bimap(tostr, strlen);
    assert_eq!(x, ("3.14".to_string(), 13));
}

Currying

Currying is having partial function application. For example, we could define a string concatenation function, and partially apply it as a greet function. I have used a tuple as arguments for reasons that will soon become clear.

fn strcat((a, b): (String, &str)) -> String {
    a + b
}

#[test]
fn greet() {
    let greet = |who| strcat(("Hello ".to_string(), who));
    assert_eq!(greet("world!"), "Hello world!".to_string());
}

This form of partial application uses a closure. Could we do differently with a function that can bind arguments? Currying formally defines that any cartesian product of variables into a function can also be written as a function partially applying the first variable, (a, b) -> c = a -> (b -> c). This also reverses! Written in Rust, the functions would look like:

pub fn uncurry<F, G, A, B, C>(f: F) -> impl FnOnce((A, B)) -> C
where
    F: FnOnce(A) -> G,
    G: FnOnce(B) -> C,
{
    move |(a, b)| f(a)(b)
}

pub fn curry<'a, F, A: 'a, B, C>(f: F) -> impl FnOnce(A) -> Box<dyn FnOnce(B) -> C + 'a>
where
    F: Fn((A, B)) -> C + 'a,
{
    move |a| Box::new(move |b| f((a, b)))
}

The uncurry function is the simpler of the two. We take a function which returns another function, effectively describing a -> b -> c transformations, and return a function (a, b) -> c. The curry function has a few more constraints. Firstly, we are effectively splitting f into two functions. Both of these functions are returned, but only the first one can use the impl trait syntax. The second function unfortunately has to be boxed and dynamically dispatched. Given we are returning a boxed closure, any references need to be tied to a lifetime governed by f. Returning a FnOnce also limits how useful this form of currying truly is.

#[test]
fn greet2() {
    let greet = curry(strcat)("Hello ".to_string());
    assert_eq!(greet("world!"), "Hello world!".to_string());

    let greet2 = uncurry(|a| move |b| strcat((a, b)));
    assert_eq!(
        greet2(("Hello ".to_string(), "world!")),
        "Hello world!".to_string()
    );
}

I am not a fan of the ergonomics of this, a curried function usually would be called repeatedly, but the move semantics of the function requires FnOnce implementations. Functions are also generally defined with argument lists rather than tuples. We can achieve a better ergonomic function with a trait definition:

pub trait Curry<A, B, C> {
    fn curry_once<'a>(self, arg: A) -> Box<dyn FnOnce(B) -> C + 'a>
    where
        Self: 'a,
        A: 'a;

    fn curry<'a>(self, arg: A) -> Box<dyn Fn(B) -> C + 'a>
    where
        Self: 'a,
        A: 'a + Copy;
}

Notice the two different functions and the additional bound that A: Copy for a reuseable curry function which returns a Fn closure. To test it out, we implement Curry on any type which satisfies Fn(A, B) -> C:

impl<F, A, B, C> Curry<A, B, C> for F
where
    F: Fn(A, B) -> C,
{
    fn curry_once<'a>(self, a: A) -> Box<dyn FnOnce(B) -> C + 'a>
    where
        Self: 'a,
        A: 'a,
    {
        Box::new(move |b| self(a, b))
    }

    fn curry<'a>(self, a: A) -> Box<dyn Fn(B) -> C + 'a>
    where
        Self: 'a,
        A: 'a + Copy,
    {
        Box::new(move |b| self(a, b))
    }
}

Now for any function, we can call .curry(arg) or .curry_once(arg) if the arg is not Copy. I have redefined strcat, using an argument list rather than a tuple. There is also a definition of count_chars which takes its first argument as &str, a Copyable type. The strcat2 can only be used with .curry_once since it takes its argument by value. But count_hello_world can be called multiple times!

fn strcat2(a: String, b: &str) -> String {
    a + b
}

fn count_chars(s: &str, ch: char) -> usize {
    s.chars().filter(|&c| c == ch).count()
}

#[test]
fn greet3() {
    let greet = strcat2.curry_once("Hello ".to_string());
    assert_eq!(greet("world!"), "Hello world!".to_string());

    let count_hello_world = count_chars.curry("Hello, world!");
    assert_eq!(count_hello_world('l'), 3);
    assert_eq!(count_hello_world('o'), 2);
}

Note 1

This depends on what the category is describing. I have implicitly described the category of types over the Option image. If a functor was defined between the category Option to the category Vec using a mapping such as:

fn to_vec<T>(opt: Option<T>) -> Vec<T> {
    match opt {
        Some(x) => vec![x],
        None => Vec::new(),
    }
}

Then we can indeed preserve identity and composition through this mapping. For instance, identity can be preserved with to_vec(id(Some(A))) = id(to_vec(Some(A))).

Kurt Lawrence

Category Theory with Rust (pt2)

Functors

Bifunctors

Currying

Note 1

how-u-doin: A progress reporting abstraction

Quantifying Activity Influence on a Scheduling Target

Kurt Lawrence - Mining & Software Consultant