patternMinor
Type system with subtyping on abstract data types?
Viewed 0 times
abstractwithsystemtypesubtypingtypesdata
Problem
Consider this example on binary trees in some ML-like language:
In the definition of rotate_left above I only want to consider rotation on trees where the right subtree is also a node:
In relation to a ML-like type system, this second version of rotate_left is partial where as the first version is total.
Is there a type system expressive enough that such functions would be considered total?
Note;
I use the term "subtyping" in the title of this question, since I think of the values of the type of the second version of rotate_left, to be a subset of the values of the type of the first version of rotate_left.
type tree =
| Null
| Node of tree * tree
let rotate_left p =
match p with
| Null -> p
| Node (a, q) ->
match q with
| Null -> p
| Node (b, c) ->
Node (Node (a, b), c)In the definition of rotate_left above I only want to consider rotation on trees where the right subtree is also a node:
let rotate_left (Node (a, Node (b, c))) =
Node (Node (a, b), c)In relation to a ML-like type system, this second version of rotate_left is partial where as the first version is total.
Is there a type system expressive enough that such functions would be considered total?
Note;
I use the term "subtyping" in the title of this question, since I think of the values of the type of the second version of rotate_left, to be a subset of the values of the type of the first version of rotate_left.
Solution
You want to write some executable code (
The problem when you have dependent types is that it's hard to keep type checking usable for a programming language. It's generally desirable to keep the type system decidable, because programmers tend to be annoyed when a programming error causes the compiler to loop forever. So at the very least the type constraints need to stick to a subset of the language that guarantees termination. And of course since type constraints are evaluated at compile time, they can't contain any side effects.
It isn't too difficult to define a subset of the language without arbitrary recursion. For example, just forbid recursive functions. And, to keep things simple, forbid the use of library functions or functions defined in a different module — if you don't do this then library functions need to be annotated, not only so that the type checker knows that they're allowed in types, but also so that the compiler knows what to make of them.
Even if type constraints are in a subset of the language that always terminates, that doesn't make them decidable. For example, if the type constraints can include enough arithmetic to encode addition and multiplication of variables — a context in which this comes up naturally is array lengths — then the type system encodes Diophantine equations and is therefore undecidable. It's possible to keep the system decidable if it only allows additions and not multiplications of variables, as in Dependent ML.
Even when type checking remains decidable, it's a lot harder to keep type inference decidable. That's not a big problem in itself, because programmers who want to benefit from the finer-grained types can just write type annotations. But it can be difficult to precisely define when type annotations are needed and when they aren't. It's still a difficult problem to find a good balance between requiring a lot of annotations, and making it easy for programmers to know when annotations are needed.
When the type constraints are purely defined in terms of pattern matching, as in your example, they can be expressed using generalized algebraic datatypes (GADT). GADT have matured enough to be supported in mainstream languages, notably OCaml. To avoid extensive analysis of which functions are allowed in types, most implementations of GADT including OCaml's use the type system to express constraints, not the term syntax. You would need to define
is_left_restartable) and use it inside a type annotation. This is, by definition, a dependent type.The problem when you have dependent types is that it's hard to keep type checking usable for a programming language. It's generally desirable to keep the type system decidable, because programmers tend to be annoyed when a programming error causes the compiler to loop forever. So at the very least the type constraints need to stick to a subset of the language that guarantees termination. And of course since type constraints are evaluated at compile time, they can't contain any side effects.
It isn't too difficult to define a subset of the language without arbitrary recursion. For example, just forbid recursive functions. And, to keep things simple, forbid the use of library functions or functions defined in a different module — if you don't do this then library functions need to be annotated, not only so that the type checker knows that they're allowed in types, but also so that the compiler knows what to make of them.
Even if type constraints are in a subset of the language that always terminates, that doesn't make them decidable. For example, if the type constraints can include enough arithmetic to encode addition and multiplication of variables — a context in which this comes up naturally is array lengths — then the type system encodes Diophantine equations and is therefore undecidable. It's possible to keep the system decidable if it only allows additions and not multiplications of variables, as in Dependent ML.
Even when type checking remains decidable, it's a lot harder to keep type inference decidable. That's not a big problem in itself, because programmers who want to benefit from the finer-grained types can just write type annotations. But it can be difficult to precisely define when type annotations are needed and when they aren't. It's still a difficult problem to find a good balance between requiring a lot of annotations, and making it easy for programmers to know when annotations are needed.
When the type constraints are purely defined in terms of pattern matching, as in your example, they can be expressed using generalized algebraic datatypes (GADT). GADT have matured enough to be supported in mainstream languages, notably OCaml. To avoid extensive analysis of which functions are allowed in types, most implementations of GADT including OCaml's use the type system to express constraints, not the term syntax. You would need to define
is_left_rotatable as a type, and to write a transformation function from matching trees to is_left_rotatable.Context
StackExchange Computer Science Q#104069, answer score: 6
Revisions (0)
No revisions yet.