gotchaMinor

Why does higher-order abstract syntax need an inverse to define catamorphisms?

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

whyorderabstractinverseneedsyntaxhigherdoesdefinecatamorphisms

Problem

In the introduction to the colorfully-named Boxes Go Bananas: Encoding Higher-Order Abstract Syntax with Parametric Polymorphism, Washburn and Weirich describe a problem in traditional formulations of higher-order abstract syntax (HOAS):

One obstacle preventing the widespread use of this technique is the difficulty in using elimination forms, such as catamorphisms, for datatypes containing functions. The general form of catamorphism for these datatypes requires that an inverse be simultaneously defined for every iteration. Unfortunately, many operations that we would like to define with catamorphisms require inverses that do not exist or are expensive to compute.

They go on to demonstrate this problem with a small example, in Haskell. They define the following datatypes to represent simple lambda terms with HOAS:

data Exp = Lam (Exp -> Exp) | App Exp Exp
data Value = Fn (Value -> Value)

Lam represents lambda terms, App represents function application, and Fn represents evaluated functions. They next attempt to define eval :: Exp -> Value, but this causes trouble, as it requires converting a function of type Exp -> Exp to one of type Value -> Value. Since Exp appears in both positive and negative positions, this requires functions from both Exp -> Value and Value -> Exp to define.

The remainder of the paper proposes a solution to that problem, but I am stuck on something else: I don’t understand why it’s a problem at all. Usually, in programming language papers, we define languages with grammars like the following one:

$$
\begin{align*}
v &::= \lambda x. e \\
e &::= v \thinspace | \thinspace e \thinspace e
\end{align*}
$$

If we define a big-step evaluation relation $e \Downarrow v$, we have no need to define some other “unevaluation” relation $v \Uparrow e$, as lambdas are already values, so $\lambda x.e \Downarrow \lambda x.e$, which is quite straightforward.

It seems to me that a natural translation of the above grammar to HOAS

Solution

By being corecursive between the types, you indeed get a representation of a grammar, and it does have binding. But now you've sort of "baked in" the unembedding by making it "definitionally id". (This is similar to the Place constructor in Fegaras and Sheard).

So you can evaluate to Value. But what if you want to evaluate to anything else? You can't, unless it is something that "passes through" Value. So consider writing an "abstract interpreter" for your language (i.e., where you evaluate only to the "type" of the expression but without computing, or where you only "count the reductions" without computing). You can't, because you need to pass through having concrete elements of Value to go under lambdas.

You can fix this by having both Place and also a normal HOAS binding form, as in the above paper. But then that still doesn't suffice, for reasons they explain (and similar to what I've described -- see section 2) and so you make Place go further abstract by parameterizing over it, etc. Follow your nose further and you'll hit free monads. I got an intuition for this stuff from Edward Kmett's phoas post.

Context

StackExchange Computer Science Q#115738, answer score: 4

Revisions (0)

No revisions yet.