FeML: a skeleton of a femto-ML with nothing but polymorphic variants and functions
Kragen Javier Sitaker
kragen at canonical.org
Thu Jun 14 03:37:01 EDT 2012
A year or two ago, Darius Bacon and I were talking about making a sort
of variant of Scheme with ML-like pattern-matching as the basic
primitive. You have a slightly more complicated lambda, but you don’t
need cond, car, cdr, eq, or null? in the base language, just cons or
quasiquote, plus the pattern-matching lambda and either labels or
(lambda (key ((key . value) . more)) (cons key value)
(key (x . more)) (assoc key more)
(key ()) '()))
Here `lambda` takes an arbitrary set of pattern-action pairs and
matches the patterns in order against the argument list, evaluating
and returning the action expression for the pattern that matches. In
this case, first we check to see if the key is the caar of the list of
key-value pairs, and if so, we return its (reconstructed) car;
otherwise, if there are more items in the list, we recurse down the
list; otherwise, if the list is the empty list, we return the empty
(It’s not clear what happens if you say `(assoc 3 4)`. Should that be
an error, or should it have unspecified behavior to permit compiler
This is awfully close to Prolog, although you wouldn’t do it this way
assoc(Key, [[Key | Value] | _], [Key | Value]).
assoc(Key, [_ | More], Y) :- assoc(Key, More, Y).
assoc(_, , ).
However, you don’t actually need general backtracking search or
Prolog’s unification of inputs and outputs to make it Turing-complete.
The result is an arguably simpler notation for general recursive
functions than McCarthy’s ur-Lisp, and certainly one that’s easier to
use for writing compiler-like programs, things that operate on
(This is probably in Essentials of Programming Languages somewhere.
And, if not, it’s very similar indeed to Aardappel and, for that
matter, a simple ML.)
So in April, I was thinking about this again, and it occurred to me
that you might be able to implement this without runtime type
OCaml polymorphic variants
In particular, it turns out Objective Label, which is integrated into
the versions of OCaml over the last several years, does decidable type
inference on a language including higher-order functions, mutability,
and what it calls “polymorphic variants”, which are basically Prolog
terms or Aardappel trees. The consequence is that it needs very
little runtime type checking, and is immune to runtime type errors,
catching at compile-time the `(assoc 3 4)` example above.
Here's some Objective Caml code that type-checks, compiles, and works,
without any explicit type annotations:
let rec length = function | `Nil -> 0 | `Cons(_, x) -> 1 + length x;;
For this, OCaml infers the type ``([< `Cons of 'b * 'a | `Nil ] as 'a)
-> int``, which is to say it’s a function returning int (since that’s
the return type of `+`) and taking as its argument *at most*
`` `Nil `` with no arguments or `` `Cons `` with two arguments, whose
first argument can be of any type but whose second argument must be of
the same type as the function’s argument. The “at most” means it’s
okay to call the function in a context where it is guaranteed to be
passed *only* `` `Nil ``, for example, but it is not okay to invoke it
where it might be passed things with some other kind of value.
Here's a higher-order function which invokes a given function with
`` `Nil ``:
let callnil x = x `Nil ;;
OCaml infers the type ``([> `Nil ] -> 'a) -> 'a`` for this function;
that is, it’s a function that takes a function argument that takes,
*at least*, `` `Nil ``, and returns whatever that function does. That
means that it’s okay for the function to be able to handle other kinds
of arguments as well. So we can apply this `callnil` function to
callnil length ;;
which returns 0.
This makes type inference substantially more complicated than in ML,
because you aren’t just unifying types; there’s a subtyping relation,
or I think actually two of them, and you sometimes take the meet or
join of types.
One drawback of polymorphic variants is that every occurrence of the
variant tag could be associated with a different type, which
apparently produces some slight efficiency reductions in OCaml.
Here’s a list made out of polymorphic variants in OCaml:
`Cons(1, `Cons(2, `Cons(3, `Nil)));;
OCaml infers the type `` [> `Cons of int * [> `Cons of int * [> `Cons
of int * [> `Nil ] ] ] ] `` for this list, which is to say that each
of the three Conses that make it up is of a different type; one is
``[> `Cons of int * [> `Nil ] ]``, another is ``[> `Cons of int * [>
`Cons of int * [> `Nil ] ] ]``, and the third is the entire list.
However, the above `length` function can successfully be applied to
it, so OCaml is able to unify all three of those types (plus the type
of the Nil node) with the argument type of the function.
So I was thinking that an eager language with:
* polymorphic variants as its only data structure;
* pattern-matching as its only conditional;
* (higher-order) functions as its only means of abstraction;
* tail-calls as its only means of iteration;
* and complete run-time type erasure
Ought to be:
* feasible to implement;
* as safe as ML;
* as flexible as Smalltalk (or more so, since the “messages” can
contain other messages you pattern-match on);
* easy to optimize to the neighborhood of C performance;
* relatively terse;
* and a language in the simplicity neighborhood of C, Scheme, and Lua.
Let’s call it “FeML”, for “Femto-ML”.
What’s the equivalent of classes? You need *either* λ-calculus-style
automatic currying *or* closures, but you don’t need both. (That is,
you could make it work like C, without nested functions; you could
require people to write their code directly in supercombinators.)
Interestingly, you can take advantage of polymorphic-inline-cache
optimizations from Self for your pattern-matching; at call-sites to
functions that pattern-match their arguments, you can test the common
cases for that callsite and jump straight to the clause that is
It would be pretty awesome to be able to write programs in
happy-go-lucky Python style, but with full pattern-matching and C-like
efficiency. *Predictable* C-like efficiency. And parametric
polymorphism (aka templates) without the hassle that accompanies it in
C++ or Java. And it seems like FeML might be able to deliver that.
How does this map to object-oriented programming? I’m not 100% sure
that I haven’t screwed something up with inheritance, but I don’t
think so. It requires a sort of mental inversion:
* Functions correspond to classes.
* Variables captured from outer scopes, or early arguments to curried
functions, correspond to instance variables, as is common practice
* The last argument to the function corresponds to a method name and
* Pattern-matching on that argument corresponds to method dispatch.
* The variant tag of that argument corresponds to a method name.
* The pattern-match clauses correspond to method definitions.
* And a catch-all clause in that pattern-matching that tail-calls
another function with the same argument corresponds to inheritance
from a single superclass.
You can do all of that in ordinary ML, but then the “subclasses” can’t
add new “methods”. Polymorphic variants give you that ability.
There are a couple of flies in the ointment, though.
First, OCaml’s type inference algorithm for polymorphic variants
doesn’t actually handle that implementation of inheritance:
let a = function `X -> 1 | `Y -> 2 and b x = match x with `Z -> 3 | _ -> a x ;;
Error: This expression has type [> `Z ]
but an expression was expected of type [< `X | `Y ]
The second variant type does not allow tag(s) `Z
That doesn’t, in itself, mean that this approach is infeasible. It
might be trivial to extend the type-inference algorithm to handle that
case — it’s clearly true that in the second branch of the conditional,
`x` cannot be `` `Z `` — but I don’t know enough about type inference
to be confident that that won’t, for example, make the type-inference
problem for the new language undecidable.
Second, in the above case, the “methods” of the “class” `a` are pure
functions. But what if they actually needed something out of the
object? Say, they need to call some other method on “self”? How do
you write the code so that they get a reference to “self”?
Because the language is so simple, you can use a very minimal syntax for
it. For example:
assoc key (Cons (Cons key value) _): Cons key value
assoc key (Cons _ more): assoc key more
assoc _ Nil: Nil
a X: 1
a Y: 2
b Z: 3
b x: a x
length Nil: 0
length (Cons _ x): + 1 (length x)
assoc key =
Cons (Cons key value) _: Cons key value
Cons _ more: assoc key more
x: a x
Cons _ x: + 1 (length x)
(The above makes it seem like some infix operators would be a really
good idea, for lists if nothing else.)
Aardappel has the interesting idea of not distinguishing syntactically
between functions (node heads with rewrite rules existing) and data
structures (polymorphic variants, i.e. heads without any rewrite rules
defined). Some tree-rewriting languages allow you to define rewrite
rules that rewrite a tree partway, so the same node head can serve as
either one. Intuitively I think that these would make both the
type-checking and the syntax of FeML more complicated.
More information about the Kragen-tol