Yesterday, Rich Hickey published a blog post, “Transducers are Coming”, which attracted a lot of attention.
I have a confession to make, which I have made before: I find it very difficult to understand ideas or code not presented with types. So I decided that the only way I could possibly understand what “transducers” are would be to actually implement them in a typed language. I ended up doing so and am sharing my findings here.
Vague types in the original blog post
Rich informally gave some type signatures in his blog post:
1 2 3 4 5
This was, unfortunately, not very helpful. It is hard to make sense of this pseudo-notation for types. What is quantified over what? And what is bound to what? I’ll explain later what I mean by these questions.
First discussion thread I saw
There was much tweeting online about transducers after Rich Hickey’s initial announcement; the tweets did not help me, except for links posted to discussion elsewhere.
One of them was on Hacker News. I browsed through it but found it mostly not useful. The problem was that although a lot of interesting Haskell code was thrown around, it tended to be related to transducers but not an exact translation of the concept. I already had my own intuitions about transducers being related to well-known types such as foldables, iteratees, lenses, etc. That “ordinary function composition” was involved immediately suggested the connections, because function composition is huge in these existing Haskell libraries.
But what I wanted was to understand transducers as they are, before even thinking about generalizations and comparisons.
What are the types?
Rich Hickey informally offered some types (which he said were “a la Haskell”) to try to help out:
1 2 3 4 5
OK, by using type variables
x, that indicates what is bound to what. The blog post should have used this notation rather than
Sample Clojure code
He also posted some sample Clojure code:
Second discussion thread I saw
Then today, I saw a discussion thread on Reddit, titled “Clojure’s Transducers are Perverse Lenses”.
Actual runnable Haskell code
Rich finally posted some actual type-checked, runnable Haskell code!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
After this post, I knew it would not take me long to figure out transducers.
Refactoring his Haskell code
Two things to notice about the original code:
- It has long, low-level function types rather than types that actually name the concepts being discussed (reducers and transducers).
- It uses hardcoded list types
Type synonyms and higher-rank types
Defining lots and lots of types (whether synonyms or newtypes is standard practice when programming in a modern typed language. OK, so I defined a type synonym
But what about transducer? This is trickier.
An invalid attempt at a type would be
because the type variable
r is not bound in the type definition. And it would be incorrect to just randomly add
r on the left hand side as an extra parameter to the
Transducer type, because in fact it is critical that a transducer does not care about the underlying reducer’s return type
r. How do we write the desired type?
It turns out you need higher-rank types. Rank-1 types are not sufficient; we need a rank-2 type to quantify
r, to say that a transducer from
b is a transformation that takes a reducer to a specific
r and returns another reducer to the same
Now we can see more clearly some completely generic ways to create a transducer:
1 2 3 4 5
A bit of history
Higher-rank types are a powerful technique for expressing “hiding” of unnecessary details about types going on somewhere. My first recollection of the real world use of rank-2 types is from 1994 (the year I started using Haskell, although I did not actually use it in my work as a software engineer until 1995), when I was excited to read a paper by John Launchbury and Simon Peyton Jones that solved, using a rank-2 type, a specific, important, practical problem, “Lazy Functional State Threads”; twenty years later, their ST monad is still part of the standard library!
Introducing type classes
Clojure uses protocols as an abstraction mechanism, and the “magic” of transducers uses protocols. In Haskell, type classes are the major abstraction mechanism (this is true of Scala also).
So I abstracted away from the hardcoded list-oriented functions and values in Rich Hickey’s Haskell code:
foldlabstracted to a
conjand empty list
abstracted to a
1 2 3 4 5 6 7
Note our reliance on transducing and reducing from one type
a to another,
flatmapping is not completely generic, because it depends on something being
Foldable (implementing a
Finally, here’s the originally list-specific code that now depends only on
1 2 3 4 5 6 7 8 9 10 11 12 13 14
Here is the list-specific code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Notice some very important properties of this code:
xformhas a type that does not mention lists at all, even though it is implemented using a list and cannot compile without the list
mungealso does not mention lists, and can transform anything that is
1 2 3 4
Implementing another type to illustrate the genericity of transducers
To illustrate Rich Hickey’s main point, I implemented instances of
Conjable for a standard Haskell
Vector library as an alternate “collection-like” type.
1 2 3 4 5 6 7 8 9 10
And we can run
munge directly on a vector instead of a list, without making any changes:
1 2 3 4
This is code reuse at its best.
Note that there is nothing that ties transducers to any concrete “collection” type. We could write instances of
Conjable for some kind of “channel” abstraction, for example, and instantaneously be able to munge data coming from it and to another. In fact, this is already what is done in the real world, where Haskell and Scala are used in production at places like Facebook and Twitter to efficiently handle large amounts of data.
My code repository
My complete code is available on GitHub.
It was pretty exciting to see the announcement of the transducers library for Clojure, because it represents a level of abstraction that I think has not been expressed much in the world of dynamically typed languages, although the techniques are two decades old in the Haskell community in a statically typed setting. And I hope that I was able to convey the sheer elegance of Haskell as a way to express interesting types with practical ramifications for abstraction and pluggability.