Model complex structures with types

In this chapter, we’re going to explore one of the features that make Alt-Ergo’s native language great at expressing problems that turn up in program verification: its rich type system.

We’ve already encountered basic types, like bool and int which are available in some form or the other in most programming languages, or like real (whose arbitrary-precision is a bit rarer).

More generally, types can be used to represent many forms of data, and to formalize its structure. Several languages use rich type systems, both as an expressive tool for the programmer and as a tool to enforce various properties of their programs – including the absence of various categories of bugs.

In Alt-Ergo, type constructs can be used to model complex structures and prove properties about them. New types can be created, using the following features.

Abstract types.

Polymorphism: new types may depend on type variables, which could represent any type.

Algebraic datatypes: structured types created from a collection of other types.

Sum types: disjoint union of types. Their values are created by using one constructor from a fixed collection of them. These include enumerations, where all constructors are abstract (enums are tagged disjoint union of unit type).

Product types: cartesian product of types. Their values are created by specifying the value of each of their fields. They are implemented by records in Alt-Ergo’s native language.

Note that algebraic datatypes may also depend on themselves: this allows for the creation of recursive data types, such as lists and trees.

In the first section, we’ll skim over a short list of examples to illustrate these constructs and see how they can be build in Alt-Ergo’s native language. You may skim over this section to get an idea of the possibilities, and check the documentation for a more precise syntax.

We’ll then review pattern-matching, a very useful construct to navigate algebraic data types.

Finally, we’ll on a few modelisation exercises.

Examples

Example – Abstract type

type thread
    
logic     parent: thread -> thread
predicate is_ancestor_of(old:thread, young:thread) =
    old = parent(young) or is_ancestor_of(old, parent(young))
    
logic root: thread
axiom root_def: forall x:thread. is_ancestor_of(root, x)

thread is an abstract type, but it is still possible to reason about it, creating variables, functions, predicates, axioms, ..

Example – Enumeration

type suit  = Diamonds | Clubs | Hearts | Spades
type color = Red | Black

function color_of_suit(x:suit):color = match x with
    | Diamonds -> Red
    | Hearts   -> Red
    | Clubs    -> Black
    | Spades   -> Black
    end

goal g1:
    forall x, y: suit.
    color_of_suit(x) <> color_of_suit(y) -> x <> y

goal g2:
    forall x: suit.
    color_of_suit(x) = Red -> (x = Diamonds or x = Hearts)

Enums may be used to represent disjoint categories of items. Here, values of type color can be build using two constructors, which both take no arguments (this is always the case for enums) : either Red or Black.

Pattern matching is used to create the conversion function color_of_suit.

Example – Records

type point = {x:real; y:real; z:real}
type oriented_segment = {fst:point; snd:point}

predicate opposites(s1:oriented_segment, s2:oriented_segment) =
    (s1.fst = s2.snd) and (s1.snd = s2.fst)

logic p:point
logic s1, s2:oriented_segment
axiom s1_def: s1 = {fst={x=2.;y=1.;z=3.}; snd=p}
axiom s2_partial_def: s2.snd = {x=2.;y=1.;z=3.}

goal g: s2.fst = p -> opposites(s1,s2)

Records can be used to pack values together, and create abstraction. Here, point is a record with three fields, x, y and z, all of type real. Similarly, oriented_segment is a record with two fields, both of type point.

Example – Richer sum types

(* Use of the real exponentiation operator to turn x into a real number *)
function real_of_int(n:int):real = n **. 1

type summable = Int of {int_val :int} | Real of {real_val:real}
    
function plus(a: summable, b:summable):summable =
    match a with
    | Int(a_val)  -> match b with
                        | Int(b_val)  -> Int(a_val + b_val)
                        | Real(b_val) -> Real(real_of_int(a_val) + b_val)
                        end
    | Real(a_val) -> match b with
                        | Int(b_val)  -> Real(a_val + real_of_int(b_val))
                        | Real(b_val) -> Real(a_val + b_val)
                        end
    end

goal g: plus(Int(1),Real(2.)) = Real(3.)

Using records, it is possible to create sum types whose constructors accept arguments.

Here, we are able to create a new type summable, build from the disjoint union of int and real.

Example – Polymorphism

Here, option is parameterized by the type variable 'a.

We can use it with the type int to construct int option, which is used as the output type of a function which may either return an integer or fail.

(Note that Alt-Ergo actually won’t fail on division by \(0\)).

Example – Recursive datatypes

type thread
    
logic     parent: thread -> thread
predicate is_ancestor_of(old:thread, young:thread) =
    old = parent(young) or is_ancestor_of(old, parent(young))
    
logic root: thread
axiom root_def: forall x:thread. is_ancestor_of(root, x)

As algebraic datatypes can depend on themselves, it is possible to create more complex structures.

This example allows for the representation of formulas in a 3-valued logic.

Pattern matching

In the previous examples, we have used the construct match [...] with [...] end a couple times.

This construct makes it possible to use a “Really Cool Feature”, which is available in most functional languages (but not restricted to them) : pattern matching.

Pattern matching makes it possible to navigate complex data structures in an elegant and concise way.

In particular, it is extremely useful to unfold algebraic datatypes, and pairs well with them.

More concretely, pattern matching can be used as a conditional programming construct: it is used to recognize the form of values and lets the computation be guided accordingly, associating expression to compute with patterns that can be used to construct this value.

In Alt-Ergo’s native language, this can be written in the following way.

<[...]> must be replaced by a valid value to unfold, pattern, or result.

All results must have the same type.

The number of patterns is arbitrary (at least one).

_ may be used as a default pattern (used if no other pattern matches the value).

match <value> with
| <pattern1> -> <result1>
| <pattern2> -> <result2>
...
| <patternM> -> <patternM>
end

Example

Pattern matching is easily used on constructors with no arguments.

type suit  = Diamonds | Clubs | Hearts | Spades
type color = Red | Black

function color_of_suit(x:suit):color = match x with
    | Diamonds -> Red
    | Hearts   -> Red
    | Clubs    -> Black
    | Spades   -> Black
    end

Note

Pattern matching must be exhaustive, and all cases must be reachable !

If some cases are missing or redundant, the compiler will show an error.

This is one of the great things with pattern matching: the compiler helps you check that you haven’t forgotten anything !

Example

Look at what happens if a case is missing or redundant: the compiler helps you detect those errors.

type suit  = Diamonds | Clubs | Hearts | Spades
type color = Red | Black

function color_of_suit(x:suit):color = match x with
    | Diamonds -> Red
    | Hearts   -> Red
    | Clubs    -> Black
    end

type suit  = Diamonds | Clubs | Hearts | Spades
type color = Red | Black

function color_of_suit(x:suit):color = match x with
    | Diamonds -> Red
    | Hearts   -> Red
    | Clubs    -> Black
    | Spades   -> Black
    | Clubs    -> Black
    end

Pattern matching is able to extract the arguments of the constructors used to build the value that is unfolded.

Example

type point = P2 of {x2:real; y2: real} | P3 of {x3:real; y3: real; z3: real}

function x_value(p:point):real =
    match p with
    | P2(x, _)    -> x
    | P3(x, _, _) -> x
    end

Note that we didn’t have to extact the values of arguments we are not interested in: _ case be used for them.

Warning

Because of a bug, Alt-Ergo’s type checker currently fails on this example (because of the two _ ).

Todo

Fix type checker bug on double _.

Warning

In Alt-Ergo’s native language, only one kind of pattern can be used: constructors of the datatype. This isn’t the case in other languages such as OCaml.

Because of this, Alt-Ergo’s pattern matching is restricted to algebraic datatypes.

Moreover, constructors cannot be nested.

You may check which examples are correct and which aren’t.

function aux(x:int):int =
    match x with
    | 0        -> 0
    | _        -> 1
    end

type foo = X | F of {v:foo}

function aux(x:foo):int =
    match x with
    | X        -> 0
    | F(f)     -> 1 + aux(f)
    end

type foo = X | F of {v:foo}

function aux(x:foo):int =
    match x with
    | X        -> 0
    | F(X)     -> 4
    | F(f)     -> 1 + aux(f)
    end

(* X is actually interpreted as a variable in the pattern F(X), not as a constructor *)
    
goal g1:  aux(F(X)) = 4
goal g1': aux(F(X)) = 1

goal g2:   aux(F(F(X))) = 4
goal g2':  aux(F(F(X))) = 2

type foo = X | F of {v:foo}

function aux(x:foo):int =
    match x with
    | X        -> 0
    | F(X)     -> 1
    | F(F(X))  -> 2
    | F(f)     -> 1 + aux(f)
    end

Modelisation exercises

3-valued logic

Let’s start with an implementation exercise to get a hand with pattern matching before using types for modelisation.

In this exercise, for the first time in this tutorial, you’re going to navigate algebraic datatypes using pattern matching.

In the beginning of this chapter, we’ve illustrated recursive data structures with the following example, which represents propositions in some 3-valued logic.

eval_not, eval_or and eval_and have been written accordingly to the following diagram.

(F, false; U, unknown; T, true)

NOT(A)
A	¬A
F	T
U	U
T	F

AND(A, B)
A ∧ B		B
A ∧ B		F	U	T
A	F	F	F	F
	U	F	U	U
	T	F	U	T

OR(A, B)
A ∨ B		B
A ∨ B		F	U	T
A	F	F	U	T
	U	U	U	T
	T	T	T	T

Complete the code of the function eval, which computes the truth value of a prop3.

type bool3  = T | U | F
    
type prop3  = C   of {cst1:bool3}
            | Not of {not1:prop3}
            | And of {and1:prop3; and2:prop3}
            | Or  of {or1:prop3;  or2:prop3}
            
function eval_not(b: bool3):bool3 =
    match b with
    | T -> F
    | U -> U
    | F -> T
    end
    
function eval_and(b1: bool3, b2:bool3):bool3 =
    if b1=F or b2=F then
        F
    else if b1=U or b2=U then
        U
    else
        T
        
function eval_or(b1: bool3, b2:bool3):bool3 =
    if b1=T or b2=T then
        T
    else if b1=U or b2=U then
        U
    else
        F
            
function eval(p: prop3):bool3 =
    (* YOUR CODE HERE
    
    
    
    *)
    
(* Your function must pass the following tests
    Note that this is not a complete proof of correctness of your function *)

goal g0:
    forall b:bool3. eval(C(b)) = b

goal g1:
    let p   = And(Not(C(U)),Or(C(T),C(F))) in
    eval(p) = U
    
goal g2:
    let p   = And(Not(C(F)),Or(C(T),C(F))) in
    eval(p) = T

Solution

To navigate recursive datatypes, the key is recursive functions (with pattern matching).

type bool3  = T | U | F
    
type prop3  = C   of {cst1:bool3}
            | Not of {not1:prop3}
            | And of {and1:prop3; and2:prop3}
            | Or  of {or1:prop3;  or2:prop3}
            
function eval_not(b: bool3):bool3 =
    match b with
    | T -> F
    | U -> U
    | F -> T
    end
    
function eval_and(b1: bool3, b2:bool3):bool3 =
    if b1=F or b2=F then
        F
    else if b1=U or b2=U then
        U
    else
        T
        
function eval_or(b1: bool3, b2:bool3):bool3 =
    if b1=T or b2=T then
        T
    else if b1=U or b2=U then
        U
    else
        F
            
function eval(p: prop3):bool3 =
    match p with
    | C(b)        -> b
    | Not(p0)     -> eval_not(eval(p0))
    | And(p1, p2) -> eval_and(eval(p1),eval(p2))
    | Or (p1, p2) -> eval_or (eval(p1),eval(p2))
    end
    
(* Your function must pass the following tests
Note that this is not a complete proof of correctness of your function *)

goal g0:
    forall b:bool3. eval(C(b)) = b

goal g1:
    let p   = And(Not(C(U)),Or(C(T),C(F))) in
    eval(p) = U
    
goal g2:
    let p   = And(Not(C(F)),Or(C(T),C(F))) in
    eval(p) = T

Lists

Exercise

Create a datatype representing a linked list of integers.

type list = (* YOUR CODE HERE *)

(* *)

Solution

This can be done using a recursive datatype.

type list = Nil | Cons of {h:int; t:list} 

Exercise

Use polymorphism to create a linked list type parameterized by a type: for every type, you should be able to create a list having elements of this type.

Call this type list.

(* YOUR CODE HERE *)

(* *)

Solution

Use a type variable.

type 'a list = Nil | Cons of {h:'a; t:'a list} 

Exercise

Write a couple functions on lists.

length(l) should return the number of elements of the list l.

mem(e,l) should be true if and only if e appears in l.

index(e,l) should return the smallest index at which e appears in l. The index of the first element of the list is \(0\). If e doesn’t appear in l, this should return \(-1\).

(* YOUR list TYPE HERE*)

function length(l:'a list):int =
    (* YOUR CODE HERE
    
    *)
    
predicate mem(e:'a, l:'a list) =
    (* YOUR CODE HERE
    
    *)

Solution

type 'a list = Nil | Cons of {h:'a; t:'a list}

function length(l:'a list):int =
    match l with
    | Nil        -> 0
    | Cons(_, t) -> 1 + length(t)
    end
    
predicate mem(e:'a, l:'a list) =
    match l with
    | Nil        -> false
    | Cons(h, t) -> (e=h) or mem(e,t)
    end

Exercise

Complete the code of index, using the type option.

index(e,l) should return the smallest index at which e appears in l. The index of the first element of the list is \(0\). If e doesn’t appear in l, this should return None.

(* COPY THE CODE OF THE PREVIOUS EXERCISE HERE

*)
    
type int_option = None | Some of {x:int}
    
function index(e:'a, l:'a list):int_option =
    (* YOUR CODE HERE
    
    *)
        
(* Check that your fonction work on a couple of examples *)

(* YOUR CHECKS HERE

*)

Solution

Here is a solution. Note that there exists better ways to unfold the option type, using helper functions. (Monads can be used for this).

type 'a list = Nil | Cons of {h:'a; t:'a list}

function length(l:'a list):int =
    match l with
    | Nil        -> 0
    | Cons(_, t) -> 1 + length(t)
    end
    
predicate mem(e:'a, l:'a list) =
    match l with
    | Nil        -> false
    | Cons(h, t) -> (e=h) or mem(e,t)
    end
    
type int_option = None | Some of {x:int}
    
function index(e:'a, l:'a list):int_option =
    match l with
    | Nil        -> None
    | Cons(h, t) -> if e=h then 
                        Some(0) 
                    else
                        match index(e,t) with
                        | None      -> None
                        | Some(a)   -> Some(1+a)
                        end
    end

(* A few checks *)

goal g0: let l = Cons(2,Cons(3,Nil)) in
    index(0,l) = None
    
goal g2: let l = Cons(2,Cons(3,Nil)) in
    index(2,l) = Some(0)
    
goal g3: let l = Cons(2,Cons(3,Nil)) in
    index(3,l) = Some(1)

Binary trees

As a last example, have a look at binary trees. Each node may have 0, 1 or 2 children.

Implement a type tree of binary trees.

Add a fuction weight, which computes the total number of nodes in the tree.

Add a function height, which computes their height. The heights of an empty tree and of a single node are both \(0\).

Add a predicate is_heap, which checks whether a given binary tree whose values are integers is a heap.

You may have to implement helper functions.

(** YOUR tree TYPE here 

*)

function weight(t:'a tree):int =
    (** 
        YOUR CODE HERE 
    
    *)

(** 
    Add helper functions if needed

*)

function height(t:'a tree):int =
    (** 
        YOUR CODE HERE 
    
    *)
     
(** 
    Add helper functions if needed

*)

predicate is_heap(t:int tree) =
    (** 
        YOUR CODE HERE 
    *)

Solution

Here is a possible solution.

For the heap, we used a little trick: it is possible to expend the integers by adding a value greater than all other. A similar could be written without this: we would simply have to copy the code of is_heap_aux in is_heap, and drop a condition.

(* Tree type *)

type 'a tree = Nil | Node of {left:'a tree; val:'a; right:'a tree}

(* Weight *)

function weight(t:'a tree):int =
    match t with
    | Nil         -> 0
    | Node(l,_,r) -> 1 + weight(l) + weight(r)
    end

(* Height *)
    
function max(x:int, y:int):int =
    if x > y then x else y
    
function height_aux(t:'a tree):int =
    match t with
    | Nil         -> -1
    | Node(l,_,r) -> 1 + max(height_aux(l), height_aux(r))
    end
    
function height(t:'a tree):int =
    match t with
    | Nil -> 0      
    | _   -> height_aux(t)
    end
    
(* Heap *)
    
predicate is_heap_aux(t:int tree, max_key:int) =
    match t with
    | Nil         -> true
    | Node(l,v,r) -> max_key >= v and is_heap_aux(l, v) and is_heap_aux(r, v)
    end
    
logic infty:int
axiom infty_def: forall x:int. x<=infty
predicate is_heap(t:int tree) = is_heap_aux(t, infty)

(* A couple of checks *)
goal g_nil: let t = Nil in
    weight(t) = 0 and height(t) = 0 and is_heap(t)

goal g_leaf: forall v:'a. let t = Node(Nil,v,Nil) in
    weight(t) = 1 and height(t) = 0
goal g_leaf': forall v:int. let t = Node(Nil,v,Nil) in
    is_heap(t)

logic t1, t2:int tree
axiom t1_def: t1 = Node(Nil,10,Node(Nil,5,Node(Nil,7,Node(Nil,8,Nil))))
axiom t2_def: t2 = Node(Node(Nil,10,Nil),30,Node(Node(Nil,5,Nil),20,Node(Nil,2,Nil)))

goal g1:
    weight(t1) = 4 and height(t1) = 3 and not(is_heap(t1))
goal g2:
    weight(t2) = 5 and height(t2) = 2 and is_heap(t2)