patternModerate
Recursive definitions over an inductive type with nested components
Viewed 0 times
withtypenestedrecursivecomponentsinductiveoverdefinitions
Problem
Consider an inductive type which has some recursive occurrences in a nested, but strictly positive location. For example, trees with finite branching with nodes using a generic list data structure to store the children.
The naive way of defining a recursive function over these trees by recursing over trees and lists of trees does not work. Here's an example with the
This definition is ill-formed (error message excerpted):
Why is the definition ill-formed, even though
If you aren't fluent in Coq syntax:
$$\begin{align}
\mathtt{LTree} ::= & \\
\vert & \mathtt{list}(\mathtt{LTree}) \\
\end{align}$$
We attempt to define the
Inductive LTree : Set := Node : list LTree -> LTree.The naive way of defining a recursive function over these trees by recursing over trees and lists of trees does not work. Here's an example with the
size function that computes the number of nodes.Fixpoint size (t : LTree) : nat := match t with Node l => 1 + (size_l l) end
with size_l (l : list LTree) : nat := match l with
| nil => 0
| cons h r => size h + size_l r
end.This definition is ill-formed (error message excerpted):
Error:
Recursive definition of size_l is ill-formed.
Recursive call to size has principal argument equal to
"h" instead of "r".Why is the definition ill-formed, even though
r is clearly a subterm of l? Is there a way to define recursive functions on such a data structure?If you aren't fluent in Coq syntax:
LTree is an inductive type corresponding to the following grammar.$$\begin{align}
\mathtt{LTree} ::= & \\
\vert & \mathtt{list}(\mathtt{LTree}) \\
\end{align}$$
We attempt to define the
size function by induction over trees and lists. In OCaml, that would be:type t = Node of t list
let rec size = function Node l -> 1 + size_l l
and size_l = function [] -> 0
| h::r -> size h + size_l rSolution
What works
If you nest the definition of the fixpoint on lists inside the definition of the fixpoint on trees, the result is well-typed. This is a general principle when you have nested recursion in an inductive type, i.e. when the recursion goes through a constructor like
Or if you prefer to write this more tersely:
(I have no idea who I heard it from first; this was certainly discovered independently many times.)
A general recursion predicate
More generally, you can define the “proper” induction principle on
Let's add the induction hypothesis on lists. To fulfill it in the recursive call, we call the list induction principle and pass it the tree induction principle on the smaller tree inside the list.
Why
The answer to why lies in the precise rules for accepting recursive functions. These rules are perforce subtle, because there is a delicate balance between allowing complex cases (such as this one, with nested recursion in the datatype) and unsoundness. The Coq reference manual introduces the language (the calculus of inductive constructions, which is the proof language of Coq), mostly with formally precise definitions, but if you want the exact rules regarding induction and coinduction you'll need go to the research papers, on this topic Eduardo Giménez's [1].
Starting with the Coq manual, in the notation of the
$$\begin{align*}
\Gamma_1 &= (x : \mathtt{LTree}) &
A_1 &= \mathrm{nat} &
t_1 &= \mathsf{case}(x, \mathtt{LTree}, \lambda y. g_1 (f_2 y)) \\
\Gamma_2 &= (l : \mathtt{list}\;\mathtt{LTree}) &
A_2 &= \mathrm{nat} &
t_2 &= \mathsf{case}(l, \mathtt{list \: LTree}, \lambda h \: r. g_2 (f_1 h) (f_2 r)) \\
\end{align*}$$
In order for the fixpoint definition to be accepted, “if $f_j$ occurs [in $t_i$] then … the argument should be syntactically recognized as structurally smaller than [the argument passed to $f_i$]” (simplifying because the functions have a single argument). Here, we need to check that
The reason why
References
Cocorico, the nonterminating Coq wiki: Mutual Induction
Coq-Club mailing list:
The Coq Development Team. The Coq Proof Assistant: Reference Manual. Version 8.3 (2010). [web] ch. 4.
Eduardo Giménez. Codifying guarded definitions with recursive schemes. In Types'94: Types for Proofs and Programs, LNCS 996. Springer-Verlag, 1994. doi:10.1007/3-540-60579-7_3 [Springer]
Eduardo Giménez. Structural Recursive Definitions in Type Theory. In ICALP'98: Proceedings of the 25th International Colloquium on Automata, Languages and Programming. Springer-Verlag, 1998. [PDF]
If you nest the definition of the fixpoint on lists inside the definition of the fixpoint on trees, the result is well-typed. This is a general principle when you have nested recursion in an inductive type, i.e. when the recursion goes through a constructor like
list.Fixpoint size (t : LTree) : nat :=
let size_l := (fix size_l (l : list LTree) : nat :=
match l with
| nil => 0
| h::r => size h + size_l r
end) in
match t with Node l =>
1 + size_l l
end.Or if you prefer to write this more tersely:
Fixpoint size (t : LTree) : nat :=
match t with Node l =>
1 + (fix size_l (l : list LTree) : nat :=
match l with
| nil => 0
| h::r => size h + size_l r
end) l
end.(I have no idea who I heard it from first; this was certainly discovered independently many times.)
A general recursion predicate
More generally, you can define the “proper” induction principle on
LTree manually. The automatically generated induction principle LTree_rect omits the hypothesis on the list, because the induction principle generator only understand non-nested strictly positive occurences of the inductive type.LTree_rect =
fun (P : LTree -> Type) (f : forall l : list LTree, P (Node l)) (l : LTree) =>
match l as l0 return (P l0) with
| Node x => f x
end
: forall P : LTree -> Type,
(forall l : list LTree, P (Node l)) -> forall l : LTree, P lLet's add the induction hypothesis on lists. To fulfill it in the recursive call, we call the list induction principle and pass it the tree induction principle on the smaller tree inside the list.
Fixpoint LTree_rect_nest (P : LTree -> Type) (Q : list LTree -> Type)
(f : forall l, Q l -> P (Node l))
(g : Q nil) (h : forall t l, P t -> Q l -> Q (cons t l))
(t : LTree) :=
match t as t0 return (P t0) with
| Node l => f l (list_rect Q g (fun u r => h u r (LTree_rect_nest P Q f g h u)) l)
end.Why
The answer to why lies in the precise rules for accepting recursive functions. These rules are perforce subtle, because there is a delicate balance between allowing complex cases (such as this one, with nested recursion in the datatype) and unsoundness. The Coq reference manual introduces the language (the calculus of inductive constructions, which is the proof language of Coq), mostly with formally precise definitions, but if you want the exact rules regarding induction and coinduction you'll need go to the research papers, on this topic Eduardo Giménez's [1].
Starting with the Coq manual, in the notation of the
Fix rule, we have the fixpoint definition of $\mathsf{Fix} f_i \, \{ f_1 : A_1 := t_1 \; ; \;\; f_2 : A_2 := t_2 \}$ where:$$\begin{align*}
\Gamma_1 &= (x : \mathtt{LTree}) &
A_1 &= \mathrm{nat} &
t_1 &= \mathsf{case}(x, \mathtt{LTree}, \lambda y. g_1 (f_2 y)) \\
\Gamma_2 &= (l : \mathtt{list}\;\mathtt{LTree}) &
A_2 &= \mathrm{nat} &
t_2 &= \mathsf{case}(l, \mathtt{list \: LTree}, \lambda h \: r. g_2 (f_1 h) (f_2 r)) \\
\end{align*}$$
In order for the fixpoint definition to be accepted, “if $f_j$ occurs [in $t_i$] then … the argument should be syntactically recognized as structurally smaller than [the argument passed to $f_i$]” (simplifying because the functions have a single argument). Here, we need to check that
- $i=1$, $j=2$:
lmust be structurally smaller thantinsize, ok.
- $i=2$, $j=1$:
hmust be structurally smaller thanlinsize_l, looks ok but isn't!
- $i=2$, $j=2$:
rmust be structurally smaller thanlinsize_l, ok.
The reason why
h is not structurally smaller than l according to the Coq interpreter is not clear to me. As far as I understand from discussions on the Coq-club list [1] [2], this is a restriction in the interpreter, which could in principle be lifted, but very carefully to avoid introducing an inconsistency.References
Cocorico, the nonterminating Coq wiki: Mutual Induction
Coq-Club mailing list:
- [1] Thread “Meta theory: induction over terms with abstract variables”; first message by Robbert Krebbers on 2009-10-21; explanation by Adam Chlipala; counterexample by Bruno Barras; reply by Hugo Herbelin.
- [2] Thread “Termination checking with nested recursion”; first message by Gert Smolka; reply by Hugo Herbelin.
The Coq Development Team. The Coq Proof Assistant: Reference Manual. Version 8.3 (2010). [web] ch. 4.
Eduardo Giménez. Codifying guarded definitions with recursive schemes. In Types'94: Types for Proofs and Programs, LNCS 996. Springer-Verlag, 1994. doi:10.1007/3-540-60579-7_3 [Springer]
Eduardo Giménez. Structural Recursive Definitions in Type Theory. In ICALP'98: Proceedings of the 25th International Colloquium on Automata, Languages and Programming. Springer-Verlag, 1998. [PDF]
Code Snippets
Fixpoint size (t : LTree) : nat :=
let size_l := (fix size_l (l : list LTree) : nat :=
match l with
| nil => 0
| h::r => size h + size_l r
end) in
match t with Node l =>
1 + size_l l
end.Fixpoint size (t : LTree) : nat :=
match t with Node l =>
1 + (fix size_l (l : list LTree) : nat :=
match l with
| nil => 0
| h::r => size h + size_l r
end) l
end.LTree_rect =
fun (P : LTree -> Type) (f : forall l : list LTree, P (Node l)) (l : LTree) =>
match l as l0 return (P l0) with
| Node x => f x
end
: forall P : LTree -> Type,
(forall l : list LTree, P (Node l)) -> forall l : LTree, P lFixpoint LTree_rect_nest (P : LTree -> Type) (Q : list LTree -> Type)
(f : forall l, Q l -> P (Node l))
(g : Q nil) (h : forall t l, P t -> Q l -> Q (cons t l))
(t : LTree) :=
match t as t0 return (P t0) with
| Node l => f l (list_rect Q g (fun u r => h u r (LTree_rect_nest P Q f g h u)) l)
end.Context
StackExchange Computer Science Q#104, answer score: 16
Revisions (0)
No revisions yet.