patternModerate

Recursive definitions over an inductive type with nested components

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

type-theory recursion cs logic stackoverflow coq proof-assistants

withtypenestedrecursivecomponentsinductiveoverdefinitions

Problem

Consider an inductive type which has some recursive occurrences in a nested, but strictly positive location. For example, trees with finite branching with nodes using a generic list data structure to store the children.

Inductive LTree : Set := Node : list LTree -> LTree.

The naive way of defining a recursive function over these trees by recursing over trees and lists of trees does not work. Here's an example with the size function that computes the number of nodes.

Fixpoint size (t : LTree) : nat := match t with Node l => 1 + (size_l l) end
with size_l (l : list LTree) : nat := match l with
    | nil => 0
    | cons h r => size h + size_l r
  end.

This definition is ill-formed (error message excerpted):

Error:
Recursive definition of size_l is ill-formed.
Recursive call to size has principal argument equal to
"h" instead of "r".

Why is the definition ill-formed, even though r is clearly a subterm of l? Is there a way to define recursive functions on such a data structure?

If you aren't fluent in Coq syntax: LTree is an inductive type corresponding to the following grammar.

$$\begin{align}
\mathtt{LTree} ::= & \\
\vert & \mathtt{list}(\mathtt{LTree}) \\
\end{align}$$

We attempt to define the size function by induction over trees and lists. In OCaml, that would be:

type t = Node of t list
let rec size = function Node l -> 1 + size_l l
and size_l = function [] -> 0
                    | h::r -> size h + size_l r

Solution

What works

If you nest the definition of the fixpoint on lists inside the definition of the fixpoint on trees, the result is well-typed. This is a general principle when you have nested recursion in an inductive type, i.e. when the recursion goes through a constructor like list.

Fixpoint size (t : LTree) : nat :=
  let size_l := (fix size_l (l : list LTree) : nat :=
                  match l with
                    | nil => 0
                    | h::r => size h + size_l r
                  end) in
  match t with Node l =>
    1 + size_l l
  end.

Or if you prefer to write this more tersely:

Fixpoint size (t : LTree) : nat :=
  match t with Node l =>
    1 + (fix size_l (l : list LTree) : nat :=
          match l with
            | nil => 0
            | h::r => size h + size_l r
          end) l
  end.

(I have no idea who I heard it from first; this was certainly discovered independently many times.)
A general recursion predicate

More generally, you can define the “proper” induction principle on LTree manually. The automatically generated induction principle LTree_rect omits the hypothesis on the list, because the induction principle generator only understand non-nested strictly positive occurences of the inductive type.

LTree_rect = 
fun (P : LTree -> Type) (f : forall l : list LTree, P (Node l)) (l : LTree) =>
match l as l0 return (P l0) with
| Node x => f x
end
     : forall P : LTree -> Type,
       (forall l : list LTree, P (Node l)) -> forall l : LTree, P l

Let's add the induction hypothesis on lists. To fulfill it in the recursive call, we call the list induction principle and pass it the tree induction principle on the smaller tree inside the list.

Fixpoint LTree_rect_nest (P : LTree -> Type) (Q : list LTree -> Type)
                         (f : forall l, Q l -> P (Node l))
                         (g : Q nil) (h : forall t l, P t -> Q l -> Q (cons t l))
                         (t : LTree) :=
  match t as t0 return (P t0) with
    | Node l => f l (list_rect Q g (fun u r => h u r (LTree_rect_nest P Q f g h u)) l)
  end.

Why

The answer to why lies in the precise rules for accepting recursive functions. These rules are perforce subtle, because there is a delicate balance between allowing complex cases (such as this one, with nested recursion in the datatype) and unsoundness. The Coq reference manual introduces the language (the calculus of inductive constructions, which is the proof language of Coq), mostly with formally precise definitions, but if you want the exact rules regarding induction and coinduction you'll need go to the research papers, on this topic Eduardo Giménez's [1].

Starting with the Coq manual, in the notation of the Fix rule, we have the fixpoint definition of $\mathsf{Fix} f_i \, \{ f_1 : A_1 := t_1 \; ; \;\; f_2 : A_2 := t_2 \}$ where:

$$\begin{align*}
\Gamma_1 &= (x : \mathtt{LTree}) &
A_1 &= \mathrm{nat} &
t_1 &= \mathsf{case}(x, \mathtt{LTree}, \lambda y. g_1 (f_2 y)) \\
\Gamma_2 &= (l : \mathtt{list}\;\mathtt{LTree}) &
A_2 &= \mathrm{nat} &
t_2 &= \mathsf{case}(l, \mathtt{list \: LTree}, \lambda h \: r. g_2 (f_1 h) (f_2 r)) \\
\end{align*}$$

In order for the fixpoint definition to be accepted, “if $f_j$ occurs [in $t_i$] then … the argument should be syntactically recognized as structurally smaller than [the argument passed to $f_i$]” (simplifying because the functions have a single argument). Here, we need to check that

$i=1$, $j=2$: l must be structurally smaller than t in size, ok.

$i=2$, $j=1$: h must be structurally smaller than l in size_l, looks ok but isn't!

$i=2$, $j=2$: r must be structurally smaller than l in size_l, ok.

The reason why h is not structurally smaller than l according to the Coq interpreter is not clear to me. As far as I understand from discussions on the Coq-club list [1] [2], this is a restriction in the interpreter, which could in principle be lifted, but very carefully to avoid introducing an inconsistency.
References

Cocorico, the nonterminating Coq wiki: Mutual Induction

Coq-Club mailing list:

[1] Thread “Meta theory: induction over terms with abstract variables”; first message by Robbert Krebbers on 2009-10-21; explanation by Adam Chlipala; counterexample by Bruno Barras; reply by Hugo Herbelin.

[2] Thread “Termination checking with nested recursion”; first message by Gert Smolka; reply by Hugo Herbelin.

The Coq Development Team. The Coq Proof Assistant: Reference Manual. Version 8.3 (2010). [web] ch. 4.

Eduardo Giménez. Codifying guarded definitions with recursive schemes. In Types'94: Types for Proofs and Programs, LNCS 996. Springer-Verlag, 1994. doi:10.1007/3-540-60579-7_3 [Springer]

Eduardo Giménez. Structural Recursive Definitions in Type Theory. In ICALP'98: Proceedings of the 25th International Colloquium on Automata, Languages and Programming. Springer-Verlag, 1998. [PDF]

Code Snippets

Fixpoint size (t : LTree) : nat :=
  let size_l := (fix size_l (l : list LTree) : nat :=
                  match l with
                    | nil => 0
                    | h::r => size h + size_l r
                  end) in
  match t with Node l =>
    1 + size_l l
  end.

Fixpoint size (t : LTree) : nat :=
  match t with Node l =>
    1 + (fix size_l (l : list LTree) : nat :=
          match l with
            | nil => 0
            | h::r => size h + size_l r
          end) l
  end.

LTree_rect = 
fun (P : LTree -> Type) (f : forall l : list LTree, P (Node l)) (l : LTree) =>
match l as l0 return (P l0) with
| Node x => f x
end
     : forall P : LTree -> Type,
       (forall l : list LTree, P (Node l)) -> forall l : LTree, P l

Fixpoint LTree_rect_nest (P : LTree -> Type) (Q : list LTree -> Type)
                         (f : forall l, Q l -> P (Node l))
                         (g : Q nil) (h : forall t l, P t -> Q l -> Q (cons t l))
                         (t : LTree) :=
  match t as t0 return (P t0) with
    | Node l => f l (list_rect Q g (fun u r => h u r (LTree_rect_nest P Q f g h u)) l)
  end.

Context

StackExchange Computer Science Q#104, answer score: 16

Revisions (0)

No revisions yet.