Prove imperative programs

Let’s have a look at where we’re at. In the beginning of this tutorial, we’ve covered what Alt-Ergo is, and what is can be used for. We’ve then gone through the basics of Alt-Ergo’s native language: how to express problems through new symbols, axioms and goals, how to create complex logical expressions, and how to represent functions – both implicitely and explicitely.

We are now going to look at an application: the proof of imperative programs.

This chapter does not intend to provide an exhaustive list of methods which can be used to prove imperative programs. Rather, it will try illustrate the kind of ideas that can be used for this goal, using Alt-Ergo’s native language.

As we said in the introduction of this tutorial, Alt-Ergo is specialized in the proof of programs. It is usually used under the hood of some other tool such as Why3, which breaks down an higher-level specification into a problem expressed in Alt-Ergo’s native language.

The code studied in this chapter can be seen as a human-readable version of what would be automatically generated by those higher-level tools.

In this chapter, we’ll do the following.

Look out for overflows errors, including in programs which use intermediate computation and chained instructions.

Prove properties about programs which may depend of global variables, using preconditions (hypotheses) and postconditions (goals).

Prove properties about programs which use loops, using loop invariants and loop variants.

Check overflows

Integer overflows

Arithmetic overflows are the source of many bugs, such of which with devastating effects.

They are the primary cause of the crash of the 1996 maiden flight of the Ariane 5 rocket. [1]

They led to a bug in the electrical system of the Boeing 787, discovered in 2015. [2]

In 2016, they made a Casino machine print a prize ticket of $42,949,672.76 – the maximum prize was intended to be only of $10,000. [3]

They also led to a nuclear holocaust, caused by no else than Gandhi (in the video game Civilization). [4]

Such errors happen when an arithmetic operation attempts to create a numeric value outside of the range that can be represented by the datatype that is used.

Such overflow bugs may be hard to discover and diagnose using only testing, rather than formal methods for program verification. Indeed, by their very nature, they may manifest themselves only for very large input data sets, which are less likely to be used in validation tests. Moreover, it may be hard to discover (without program analysis) which input values could be prompt to create such bugs.

Here is a C snippet, which computes the arithmetic mean between two integers. This is such a simple line of code that it could be found anywhere – including in basic algorithms such as binary search or merge sort.

int arithmetic_mean(int a, int b) {
    return (a+b)/2;
}

Let’s try to prove that this simple code isn’t prone to overflow errors.

Representation of machine integers – a naive proof

For this, we’ll check that we never go outside of the range that can be represented with machine int. Signed 32 bits integers can represent values from -2 147 483 648 ( $-2^{31}$ ) to 2 147 483 647 ( $2^{31}-1$ ) included.

logic int32_min, int32_max: int
axiom int32_min_def: int32_min = -2147483648
axiom int32_max_def: int32_max =  2147483647

predicate in_range(min:int, max:int, x:int) = 
    min <= x <= max
predicate valid_int32(x:int) = 
    in_range(int32_min, int32_max, x)

Note that we could easily represent other integer datatypes by changing the constants.

A naive proof could then go on like this.

(* Prelude: int32 range in Alt-Ergo *)

logic int32_min, int32_max: int
axiom int32_min_def: int32_min = -2147483648
axiom int32_max_def: int32_max =  2147483647

predicate in_range(min:int, max:int, x:int) = 
    min <= x <= max
predicate valid_int32(x:int) = 
    in_range(int32_min, int32_max, x)
    
(* Representation of our C function *)

function arithmetic_mean(a:int, b:int):int =
    (a+b)/2

(* Naive proof *)
    
goal arithmetic_mean_valid__incorrect_proof: 
    (* Precondition: we can assume that our C function is only used on valid int32 *)
    forall a, b:int. valid_int32(a) -> valid_int32(b) ->
    (* Postcondition: we want to prove that our output still lies in the right range *)
    valid_int32(arithmetic_mean(a,b))

Here, we have represented our C function with a function in Alt-Ergo’s native language. We’ve then created a goal:

We introduce our variables (the input of the function) with a universal quantifier.

We add the hypotheses that hold at the beginning of the function (the precondition).

We add what we want to hold at the end of the function (the postcondition).

This goal is proven Valid by Alt-Ergo. However, we haven’t actually proved the absence of overflow errors: we’ve missed something.

Indeed, here’s what happens if we compute the mean of 1 billion and 2 billions.

#include <stdio.h>

int arithmetic_mean(int a, int b) {
    return (a+b)/2;
}

int main() {
    int a = 1000000000;
    int b = 2000000000;
    
    printf("The mean of %d and %d is %d\n",a,b,arithmetic_mean(a, b));
    
    return 0;
} 

This prints The mean of 1000000000 and 2000000000 is -647483648.

What happened here ? 1 billion and 2 billions were both valid 32 bits integers, and their mean (1.5 billion) is too.

However we’ve forgot about one thing: intermediate values.

Checking intermediate values – “let” expressions and local computation

The expression (a+b)/2 unfolds in the following way.

The value a+b is first computed.

It is then divided by 2 (using a bit shift).

The issue is on the first line: while :math:` leq b`, which should make the result stay in the right range, $a+b$ may overflow.

We can write out a complete goal by checking that all intermediate computations are valid.

A convenient to do this in Alt-Ergo is by using the let [...] = [...] in [...] construct, which allows for local computation. This constructs creates a local alias. For example, 2+(let x = 3+4 in x-5) is interpreted as 2+((3+4)-5).

(* Prelude: int32 range in Alt-Ergo *)

logic int32_min, int32_max: int
axiom int32_min_def: int32_min = -2147483648
axiom int32_max_def: int32_max =  2147483647

predicate in_range(min:int, max:int, x:int) = 
    min <= x <= max
predicate valid_int32(x:int) = 
    in_range(int32_min, int32_max, x)

(* Unfold the definition of arithmetic_mean to create a complete goal *)
    
goal arithmetic_mean_valid: 
    (* Introduce inputs and hypotheses *)
    forall a, b:int. valid_int32(a) -> valid_int32(b) ->
    (* A computation occurs -- check there is no overflow *)
    let sum = a+b   in valid_int32(sum) and
    (* Check output *)
    let res = sum/2 in valid_int32(res) 

Alt-Ergo says I don’t know: indeed, there is an error in our code.

As an extra check, you might want to prove that you haven’t done any mistake in unfolding the definition. This is only needed because we’re writing goals by hand and we want extra safety: a higher-level tool could unfold things by hand without making mistakes.

(* Prelude: int32 range in Alt-Ergo *)

logic int32_min, int32_max: int
axiom int32_min_def: int32_min = -2147483648
axiom int32_max_def: int32_max =  2147483647

predicate in_range(min:int, max:int, x:int) = 
    min <= x <= max
predicate valid_int32(x:int) = 
    in_range(int32_min, int32_max, x)

(* Representation of our C function *)

function arithmetic_mean(a:int, b:int):int =
    (a+b)/2

(* Unfold the definition of arithmetic_mean to create a complete goal *)
    
goal arithmetic_mean_valid: 
    (* Introduce inputs and hypotheses *)
    forall a, b:int. valid_int32(a) -> valid_int32(b) ->
    (* A computation occurs -- check there is no overflow *)
    let sum = a+b   in valid_int32(sum) and
    (* Check output - no overflow, and same as function *)
    let res = sum/2 in valid_int32(res) and res = arithmetic_mean(a, b)

Exercise: safe mean

Here is an example of C code which compute the mean of two int32 without overflows.

int arithmetic_mean_no_overflow(int a, int b) {
    return (a/2) + (b/2) + ((a%2) + (b%2))/2
}

Exercise

First, we want to prove that this new function is logically equivalent to arithmetic_mean: ignoring issues of integer precision, they always produce the same output. This will prove that we haven’t introduced a logical error when removing the overflow error.

Represent the function arithmetic_mean_no_overflow, and prove that it is logically equivalent to arithmetic_mean.

function arithmetic_mean(a:int, b:int):int =
    (a+b)/2
    
function arithmetic_mean'(a:int, b:int):int =
    (* YOUR CODE HERE *)
    
goal equiv_mean_mean':
    (** YOUR CODE HERE 
    
    
    *)

Solution

function arithmetic_mean(a:int, b:int):int =
    (a+b)/2
    
function arithmetic_mean'(a:int, b:int):int =
    (a/2) + (b/2) + ((a%2) + (b%2))/2
    
goal equiv_mean_mean':
    forall a, b:int.
    arithmetic_mean(a,b) = arithmetic_mean'(a,b)

Exercise

Let’s now prove that this new function doesn’t indeed produce overflow errors. Check that the result obtained in unfolding the definition corresponds to the higher-level representation.

You look at the code for goal arithmetic_mean_valid for inspiration.

(* Prelude: int32 range in Alt-Ergo *)

logic int32_min, int32_max: int
axiom int32_min_def: int32_min = -2147483648
axiom int32_max_def: int32_max =  2147483647

predicate in_range(min:int, max:int, x:int) = 
    min <= x <= max
predicate valid_int32(x:int) = 
    in_range(int32_min, int32_max, x)
    
(* Representation of our C function *)

function arithmetic_mean'(a:int, b:int):int =
    (a/2) + (b/2) + ((a%2) + (b%2))/2

(* Unfold the definition of arithmetic_mean' to create a complete goal *)
    
goal arithmetic_mean'_valid: 
    (** YOUR CODE HERE

    *)

Solution

Here is a solution. You might have unfolded arithmetic_mean’ in a slightly different way.

(* Prelude: int32 range in Alt-Ergo *)

logic int32_min, int32_max: int
axiom int32_min_def: int32_min = -2147483648
axiom int32_max_def: int32_max =  2147483647

predicate in_range(min:int, max:int, x:int) = 
    min <= x <= max
predicate valid_int32(x:int) = 
    in_range(int32_min, int32_max, x)
    
(* Representation of our C function *)

function arithmetic_mean'(a:int, b:int):int =
    (a/2) + (b/2) + ((a%2) + (b%2))/2

(* Unfold the definition of arithmetic_mean' to create a complete goal *)
    
goal arithmetic_mean'_valid: 
    forall a, b:int. valid_int32(a) -> valid_int32(b) ->
    let half_a  = a/2             in valid_int32(half_a)  and
    let half_b  = b/2             in valid_int32(half_b)  and
    let mod_a   = a%2             in valid_int32(mod_a)   and
    let mod_b   = b%2             in valid_int32(mod_b)   and
    let mod_sum = mod_a + mod_b   in valid_int32(mod_sum) and
    let half_ms = mod_sum/2       in valid_int32(half_ms) and
    let sum     = half_a + half_b in valid_int32(sum)     and
    let res     = sum + half_ms   in valid_int32(res)     and res = arithmetic_mean'(a,b)

Loop invariants

In the previous section, we’ve learned how we can check for properties such as absence of overflow errors using Alt-Ergo. We’ve used hypotheses on inputs, and proved properties on intermediate values and outputs.

In the rest of this chapter, we’re going to look at ways to prove programs using while loops. This section covers loop invariants, used to prove properties about loops. In particular, they can be used to prove correctness properties, assuming termination.

In the next section, we’re going to show that loops actually end (to show termination), by using loop variants.

Quick review: a simple example

Let’s review quickly out to work with loop invariants. This is only here to help you write proofs using loop invariants. If you’ve never heard of them before, you might want to check another page to learn more.

A loop invariant is a property of a program loop that is true before and after each iteration of the loop.

This property may depend on variables that are modified within the loop. At the end of the loop, the loop condition while be false, and the invariant will still be true: this can be used to prove properties about the current state just after the loop, assuming that it ends.

Have a look at the following silly example (in pseudo-code).

x = 0

while x < 10
    x := x+1

It is easy to see that the value of x is $10$ at the end of the loop.

We can prove this property by using the invariant $x\leq 0$.

This is indeed a loop invariant: if $x<10$ and $x\leq 0$ before the loop, then $x\leq 0$ is true after the loop.

Therefore, if $x\leq 0$ before the loop, then at the end of the loop $x\leq 0$ is true and $x<10$ is false.

In other words, if $x\leq 0$ is true before the loop, $x=10$ after the loop.

The hypothesis being verified in this program, $x=10$ at the end of the program.

Let’s see how this could be expressed in Alt-Ergo.

Here is a snippet proving that the invariant holds.

predicate invariant(x:int) =
    x<=10
predicate condition(x:int) =
    x<10
    
goal invariant_holds_in_loop:
    forall x:int.
    (* Invariant and condition true before the iteration *)
    invariant(x) -> condition(x) ->
    (* Local computation occurs *)
    let x' = x+1 in
    (* Check invariant at the end of the iteration *)
    invariant(x')

Let’s now integrate this snippet in a complete example, showing that $x=10$ at the end of the loop.

predicate precondition(x:int) =
    x=0
predicate postcondition(x:int) =
    x=10
predicate invariant(x:int) =
    x<=10
predicate condition(x:int) =
    x<10
    
goal program_correct:
    (* The invariant is verified at the beginning of the loop *)
    forall x:int. precondition(x) -> invariant(x)
    
    and
    
    (* The invariant is preserved by the loop, and provides the right conclusion *)
    forall x:int. invariant(x) ->
    if condition(x) then
        (* This is indeed an invariant *)
        let x'=x+1 in
        invariant(x')
    else
        (* This invariant allows the proof of the postcondition *)
        postcondition(x)

Once again, note that this is the kind of code which would be generated by an higher-level tool (although our code is more human-readable).

More interesting example: integer square root

Let’s now try to prove a more interesting example, extracted from a course given by Claude Marché at the MPRI.

This is a why3 code used to verify a function which, given a positive integer $n$, computes $\lfloor \sqrt{n} \rfloor$, its integer square root.

Note

You may try out Why3 snippets on Try Why3.

(** {1 MPRI lecture 2-36-1 "Proof of Programs"} *)

(** {2 Integer square root} *)

use int.Int

function sqr(n:int) : int = n * n

val ref n : int
val ref count : int
val ref sum : int

let isqrt ()
requires { n >= 0 }
ensures { count >= 0 }
ensures { sqr count <= n < sqr (count + 1) }
diverges (** we are not yet attempting to prove termination *)
= count <- 0; sum <- 1;
    while sum <= n do
    invariant { 0 <= count }
    invariant { sqr count <= n }
    invariant { sum = sqr (count+1) }
    count <- count + 1;
    sum <- sum + 2 * count + 1
    done

We are going to ignore all issues about overflows to focus on the logic on this program.

Let’s remove all annotations about invariants, preconditions requires and postconditions ensures, and look at a C snippet which implements and uses isqrt.

#include <stdio.h>

int n, count, sum;

void isqrt() {
    count = 0;
    sum = 1;
    
    while (sum <= n) {
        count++;
        sum += 2*count + 1;
    }
}

int main() {
    scanf("%d",&n);
    isqrt();
    printf("isqrt of %d is %d\n", n, count);

    return 0;
}

Note that isqrt works on the context, and isn’t a pure function. Why3 can notice that those global values are used in isqrt, and quantify over them in the code submitted to Alt-Ergo.

Here is an example of the code that could be generated, using the invariants, preconditions and postconditions written in Why3.

Once again, this code is much more factorized that automatically generated code, and has been factorized in a non-generic way.

function sqr(n:int):int = n * n

predicate precondition(n:int) =
    n >= 0
predicate postcondition_1(count:int) =
    count >= 0
predicate postcondition_2(n:int, count:int) =
    sqr(count) <= n < sqr(count + 1)
predicate postconditions(n:int, count:int) =
    postcondition_1(count) and postcondition_2(n, count)

predicate condition(n:int, sum:int) =
    sum <= n

predicate invariant_1(count:int) =
    0 <= count
predicate invariant_2(n:int, count:int) =
    sqr(count) <= n
predicate invariant_3(count:int, sum:int) =
    sum = sqr(count + 1)
predicate invariants(n:int, count:int, sum:int) =
    invariant_1(count) and invariant_2(n, count) and invariant_3(count, sum)

goal isqrt_correct:
    (* Global variables are used. Introduce hypotheses *)
    forall n, count, sum:int. precondition(n:int) ->

    (* Invariants are verified at the beginning of the loop *)
    (
        (* Computation occurs before the loop *)
        let count' = 0 in
        let sum'   = 1 in

        invariants(n, count', sum')
    )
    and

    (* Invariants are preserved and provide the right conlusion *)
    
    (
        forall count', sum':int. invariants(n, count', sum') ->
        if condition(n, sum') then
        (
            (* Body *)
            let count'' = count' + 1 in
            let sum''   = sum' + 2 * count'' + 1 in
            invariants(n, count'', sum'')
        )
        else
        (
            (* End of the loop *)
            postconditions(n, count')
        )
    ) 

Loop variants

Quick review: a simple example

In the last section, we’ve looked at loop invariants to prove properties about loops, assuming termination.

We’re now going to look at what Alt-Ergo proves when it examines loop variants, which are used to prove termination of loops.

Generally, a loop variant can be seen as a function with the following properties.

Its domain is the state space of the program (the value of the variant may depend on any variable in the program)
Its codomain is a well-founded set.
Its is strictly decreases at each iteration of the loop.

A specific case is that of a variant whose value is an non-negative integer. Those conditions then translates as “the variant strictly decreases at each iteration of the loop (with regards to the usual order on integers)”.

This proves that the loop terminates. Indeed, an infinite sequence in a well-founded set cannot be strictly decreasing. (For example, a strictly decreasing sequence of natural numbers cannot go below $0$, and must thus stop at some point).

Invariants can be used to help proving that variants are correct.

Let’s get back at our first example.

x = 0

while x < 10
    x := x+1

In this case, we could use the variant $10-x$.

It is indeed strictly decreasing.

Assuming that $x\geq 0$ (our invariant), $10-x$ is indeed always positive.

In Alt-Ergo’s native language, this leads to the following code (assuming the invariant is correct).

function variant(x:int):int = 10 - x

predicate invariant(x:int) =
    x<=10
predicate condition(x:int) =
    x<10
    
goal variant_correct:
    (* At each iteration of the loop, the variant strictly decreases and stays positive *)
    forall x:int. invariant(x) -> (0 <= variant(x)) -> condition(x) ->
    (* Computation occurs *)
    let x' = x+1 in
    (* Check variant properties *)
    0 <= variant(x') < variant(x)

Combining the variant and invariant proofs together, we get the following code.

function variant(x:int):int = 10 - x

predicate precondition(x:int) =
    x=0
predicate postcondition(x:int) =
    x=10
predicate invariant(x:int) =
    x<=10
predicate condition(x:int) =
    x<10
    
goal program_correct:
    forall x:int. precondition(x) -> invariant(x) and (0 <= variant(x))
    and
    forall x:int. invariant(x) -> (0 <= variant(x)) ->
    if condition(x) then
        let x'=x+1 in
        invariant(x') and (0 <= variant(x') <= variant(x))
    else
        (* Proving that the variant stays positive after the loop is unnecessary for proving termination *)
        postcondition(x)

Integer square root

Proving termination in the case of the isqrt program is actually rather easy. Indeed, the loop condition is sum <= n, and sum is strictly increasing: we can simply choose n - sum as the variant.

use int.Int

function sqr(n:int) : int = n * n

val ref n : int
val ref count : int
val ref sum : int

let isqrt ()
requires { n >= 0 }
ensures { count >= 0 }
ensures { sqr count <= n < sqr (count + 1) }
= count <- 0; sum <- 1;
    while sum <= n do
    invariant { 0 <= count }
    invariant { sqr count <= n }
    invariant { sum = sqr (count+1) }
    variant { n - sum }
    count <- count + 1;
    sum <- sum + 2 * count + 1
    done 

Note

You may try out this Why3 snippet on Try Why3.

This can be simply added in our code in Alt-Ergo’s native language.

Note that the program had to be modified a little bit, to account for the fact that we only need to prove that the loop is correct if we enter it. Otherwise, we can directly check the postcondition.

function sqr(n:int):int = n * n

function variant(n:int, sum:int):int = n - sum

predicate precondition(n:int) =
    n >= 0
predicate postcondition_1(count:int) =
    count >= 0
predicate postcondition_2(n:int, count:int) =
    sqr(count) <= n < sqr(count + 1)
predicate postconditions(n:int, count:int) =
    postcondition_1(count) and postcondition_2(n, count)

predicate condition(n:int, sum:int) =
    sum <= n

predicate invariant_1(count:int) =
    0 <= count
predicate invariant_2(n:int, count:int) =
    sqr(count) <= n
predicate invariant_3(count:int, sum:int) =
    sum = sqr(count + 1)
predicate invariants(n:int, count:int, sum:int) =
    invariant_1(count) and invariant_2(n, count) and invariant_3(count, sum)

goal isqrt_correct:
    (* Global variables are used. Introduce hypotheses *)
    forall n, count, sum:int. precondition(n:int) ->
    
    (* Computations occurs *)
    let count0 = 0 in
    let sum0   = 1 in

    if condition(n, sum0) then
    (
        (* If we enter the loop, check correctness of the loop *)
        (
            (* (In)variant properties correct at the beginning *)
            invariants(n, count0, sum0) and
            0 <= variant(n, sum0)
        )
        and
        (
            (* Iteration or exit correct *)
            forall count1,sum1:int. invariants(n,count1,sum1) ->
            if condition(n, sum1) then
            (
                (* Invariant preserved. Variant initially positive. Variant strictly decreases *)
                0 <= variant(n, sum1) and
                
                let count2 = count1 + 1 in
                let sum2   = sum1 + 2 * count2 + 1 in
                
                invariants(n, count2, sum2) and
                variant(n, sum2) < variant(n, sum1)
            )
            else
            (
                (* End of the program *)
                postconditions(n, count1)
            )
        )
        
    )
    else
    (
        (* If we never enter the loop, don't bother with it *)
        postconditions(n, count0)
    )