Functions & Induction in Athena: Part 1

Tannr October 15, 2022 [Tutorials] #Proofs #Athena #Functional Programming #Induction

Introduction

This tutorial provides a hands-on introduction to the Athena programming language. Athena is more than a programming language, though. It is also:

Tutorial Goals & Pre-requisites

Proof development, much like software development, is an iterative process. The goal of this tutorial is not only to show how to use basic features of Athena to express functions and proofs, but to demonstrate how Athena can help proof engineers through the process of proof discovery, incremental proof development, as well as presentation.

This tutorial does not assume any prior experience with the Athena language, though it does assume that the reader understands a select set of basic topics in discrete mathematics, such as induction, set theory, and logic. This tutorial also assumes that the reader has experience with at least one programming language.

I highly suggest that readers follow along by writing and running each example. The quickest way to do this is to simply open the Athena Playground in a separate browser window and create a new file to get started. You'll also find the code for this tutorial under the "tutorials" tab in the playground.

For local installations, please read Athena's installation instructions.

A Functional Program

Our goal today is to prove a particular property about factorials.

Recall that the factorial of some number $n$ is written as $n!$ and is defined like so:

$n! = n \times (n - 1) \times (n - 2) \times ... \times 2 \times 1$

Sequences & Recurrence Relations

The factorial function defines a recurrence relation. A recurrence relation is a special kind of relation that defines rules for generating sequence... a sequence is a function for which the domain is a consecutive subset of integers (such as ${0, 1, 2, 3, 4}$). Oftentimes, the domain of a sequence is referred to as the "index set". A sequence of values is analogous to "array" structure provided by many programming languages.

A recurrence relation maps the first $k$ elements of the index set to fixed values. It defines all other values in terms of one or more previously-occurring values in the sequence. The definition of the factorial function as a recurrence relation looks like this:

$F_{0} = 1$

$F_{1} = 1$

$F_{n} = n * F_{n-1} \ \ \ for \ n \gt 1$

The Factorial in Athena

Because Athena provides capabilities to write both programs and proofs, let's begin by writing a simple functional program that computes the factorial of some number.


define [- <] := [minus less?]

define factorial := lambda (n) (
                    check {
                        (n < 1) => 1
                        | else => (times (factorial(n - 1)) n)
                    }
                )


(eval factorial(5))



A few observations can be made about the above code.

First, we can bind a function to a name using the define keyword.

Second, functions can be defined via the lambda keyword, followed by parenthesized named parameters (n in this case) and then a parenthesized body: lambda (params) (body).

Third, if/else statements can be expressed in Athena with check statements. check statements have the following structure check { condition => result1 | else => result 2}.

Functions to Proofs

Now that we've defined a procedure to compute a factorial, it'd be useful if we could prove some properties about it.

For example, we know intuitively that $5! \gt 4!$. In fact, we know that for any number greater than 1, factorial n < factorial n + 1.

To prove this, we first need to axiomatize our factorial function.

Axiomatic Factorial

Let's begin by defining a module for our work. Modules allow us to encapsulate all of our definitions, axioms, and proofs into a nicely packaged theory.

We will also include as a dependency a built-in Athena module that provides an axiomatic definition of the natural numbers as well as axiomatic definitions & theorems for addition, multiplication, subtraction, and the "less-than" relation (i.e., the definition of the < as well as the <= symbol).

load "nat-less"
module Factorial {

}

Next, let's declare a function symbol, factorial whose domain and co-domain are both the natural numbers (denoted by N). We'll also give it

load "nat-less"
module Factorial {
    declare factorial: [N] -> N
}

Let's run this to ensure that the module is successfully defined. When you run this, you will notice significantly more output than when we defined factorial with a lambda expression. This output is coming from the nat-less file and describes the theorems & definitions contained therein.

You should see the following lines at the end of the output:


New symbol Factorial.factorial declared.

Module Factorial defined.

Equational Axioms

Let's now write some equational axioms. Recall that we have already defined a recurrence relation for the factorial function. All we need to do is translate that relation to Athena.

As a reminder, this is the recurrence relation we're using:

$F_{0} = 1$

$F_{n} = n * F_{n-1} \ \ \ for \ n \gt 1$

We can specify this in athena with a list of equational axioms:

define [< * one zero <=] := [N.< N.* (S N.zero) N.zero N.<= ]

assert fact-def := [
    (factorial zero = one)
    (factorial one = one)
    (forall ?n:N . factorial S n = S n * factorial n)
]

We can make this even prettier as well by removing the quantifier of the last rule, as well as the explicit ?n:N declaration (which indicates that n is a free variable representing some element in the set of natural numbers N).

The former is achieved by the assert* directive, which automatically universally quantifies the free variables we use in our sentences. We also go ahead and define some variable names to use later in proofs for convenience:

 define [n m k] := [?n:N ?m:N ?k:N]
    

assert* fact-def := [
    (factorial zero = one)
    (factorial one = one)
    (factorial S n = S n * factorial n)
]

Remember that we loaded in a module at the top of the file with load "nat-less". This module brings into scope a formal definition of the natural numbers, as well as formal definitions of < and <=. We will be using the definition of <= from that module by referencing it directly: N.Less=.<=-def. For convenience, let's just bind the axiomatic definitions to a simpler name, local to our module: define [lte-def] := N.Less=.<=-def.

Testing

Now that we have an initial axiomatic definition, we should spend a few minutes writing some simple tests to make sure we're on the right tack.

Our factorial function is defined using natural numbers N. N is an inductively defined set: datatype N := zero | (S N). You may recognize this definition as the so-called Peano representation of natural numbers.

Larger numbers are harder to read and write directly. Six, for example, looks like this: (S (S (S (S (S (S zero)))))). It'd be much more convenient if we could write factorial 6 in our test case instead of (factorial (S (S (S (S (S (S zero))))))). Similarly, we would like to view the output of a factorial evaluation as a standard integer rather than a long list of S's.

To convert any integer inputs to N in our factorial function, we can revise our declaration like so: declare factorial: [N] -> N [[int->nat]]. The list of options that appear after the function signature can include a variety of ways, including adding aliases, setting precedence level, and more.

The way we actually test our axiomatized function is by using Athena's built-in eval utility, which, given a term, will continually rewrite the term based on axioms in the assumption base about that term. Since we will be evaluating the factorial function over N, let's make eval transform its output from N to integers whenever an eval execution results in term of sort N.

(transform-output eval [nat->int])

The entire module should now look something like this (observe the test using eval at the bottom):

load "nat-less"

module Factorial {
    declare factorial: [N] -> N [factorial [int->nat]]
    
    define [< * one zero <=] := [N.< N.* (S N.zero) N.zero N.<= ]
    define [n m k] := [?n:N ?m:N ?k:N]
    
    define [lte-def] := N.Less=.<=-def

    assert* fact-def := [
        (factorial zero = one)
        (factorial one = one)
        (factorial S n = S n * factorial n)
    ]

    

    (eval factorial 5) # 120
}

If you run this, you'll notice that Athena outputs a message saying "Module Factorial was not successfully defined." and it will also print 120, indicating the result of computing $5!$.

A Proof About Factorials

With our factorial algorithm now axiomatized, we can now prove properties about the result(s) of its execution.

Mathematical Induction

The proof technique we will use is induction over the natural numbers.

A proof by induction roughly has two parts:

  1. Proof of the base case
  2. Proof that $P(a_{n}) \to P(a_{n+1})$ for some property $P$ over a sequence $a$.

Our Goal

Let's define our proof goal. As mentioned earlier in this tutorial, we want to prove that, for all natural numbers, $F_{n} < F_{n+1}$ where $F$ is the factorial function.

Let's write that in Athena:

define theorem := (forall n . factorial n <= factorial S n)

Incremental Proof Development with "Stubs"

Oftentimes, writing proofs is a gradual and creative process. Like in software development, proof development is often made easier by the ability to incrementally build towards the final deliverable, with the ability to check one's work along the way. For example, I use the Rust language on a daily basis, and when I'm building out a new module or interface, I'll often add stubs to my functions so that I can focus on the type signatures of the interface without the compiler scolding me:

pub fn my_func(a: SomeType) -> SomeOtherType {
    todo!()
}

The todo!() macro comes in really handy in these situations.

Athena provides a similar mechanism for "stubbing" portions of a proof so that we can get the structure right before moving on.

Before we jump right into Athena, let's first understand the base case and inductive step less formally:

$ Let \ P_{n} = 1 \lt n \to F_{n-1} \lt F_{n}$

$ Goal: \forall n \in \N \ | P_{n}$

$ Proof: By \ induction \ over \ n$

$ Base \ case: n = 0 \ and \ P_{0} $

$ Inductive \ step: Assume \ F_{n} < F_{n + 1}.\\ \ \ \ \ \ \ \ \ Prove \ that \ F_{n-1} \lt F_{n} \to F_{n} \lt F_{n + 1}$

Let's define this in Athena using Athena's by-induction proof method.

by-induction theorem {
    zero => ...
    | (S zero) => ...
    | (n as (S m)) => ,..
}

The body of by-induction expects zero as the base case because the natural numbers N are inductively defined, with zero as the base constructor.

When we induct over the property (forall n . factorial n <= factorial S n ), this property is instantiated with the base case into a zero-order proposition where n is substituted for the base element: factorial zero <= factorial S zero.

Notice that our proof by induction skeleton uses two base cases. This is because two out of our three axioms about factorial have, on the right side of the =, non-reducible ground terms. So they are already fully reduced or normalized.

Let's fill this skeleton in and, while we're at it, let's use Athena's !force method to "stub" our proof of each case:

by-induction theorem {
        zero => (!force (factorial zero <= factorial S zero))
        | (S zero) => (!force (factorial S zero <= factorial S S zero))
        | (n as (S m)) => (!force (factorial S m <= factorial S S m))
    }

This is the skeleton of our inductive proof. Now's a perfect time to perform a sanity check to ensure that we've correctly defined the conclusion for both base cases and the inductive step.

This is why the proof method !force is so useful: we can check that our definition of the base case(s) and inductive step are correct early on, rather than waiting until we've provided complete derivations of all conclusions, only to discover we were working towards the wrong goal the whole time.

If you've followed along, you should see the following at the end of Athena's output:

Theorem: (forall ?n:N
           (N.<= (Factorial.factorial ?n:N)
                 (Factorial.factorial (S ?n:N))))

Module Factorial defined.

This is exactly what our sentence theorem expresses. The only differences being that Athena's output is in prefix notation, outputs the module names associated with function symbols, and annotates variables such as n with its Sort (analogous to a "type").

Proving the Base Cases

Proving that factorial zero <= factorial S zero amounts to proving that they are equal (which is trivial since we have two axioms that say they are equal). After we prove they're equal, all we have to do is derive the <= relation from the = relation.

zero => (!chain-> [
                    (factorial zero)
                    = one                                        [fact-def]
                    = (factorial one)                            [fact-def]
                    ==> ((factorial zero) <= (factorial one))    [lte-def]
        ])

This proof uses a mixture of equational and implication chaining to derive the desired conclusion.

This particular chained deduction is saying: factorial zero equals one by definition of the factorial function. Further, one equals factorial one by definition of the factorial function. The factorial that factorial zero equals factorial one implies that factorial zero is less than or equal to factorial one by definition of the less-than-or-equal-to relation (which is defined in the nat-less file we loaded in from Athena's standard library at the top of our factorial file).

The second base case is similarly proved with chaining, though we have to take a few more steps.

We know that the factorial of 2 must be greater than the factorial of 1, since our axioms tell us that factorial 2 would be equal to 2 * factorial 1 which is 2. Since one is less than two, and since the factorial of one is one and the factorial of two is two, it trivially follows that factorial one is less than factorial two. Finally, we know that for any two numbers $x$ and $y$, if $x \lt y$ then we can confidently claim that $x \le y$ as well.

This is exactly what we will formalize using the same chain-> method. We'll also add a definition of two for convenience at the top of the Factorial module: define two := (S one).


(S zero) => (!chain-> [
                         (factorial two) 
                         = (two * (factorial one))               [fact-def]
                         = (two * one)              
                         = two                                          
                         ==> (one < two)
                         ==> (factorial one < factorial two)          
                         ==> (factorial one <= factorial two)         [lte-def]
                                          
        ])

Notice that not every step in the proof requires that we provide explicit justification. Athena is pretty effective at identifying justifications that already exist in the assumption base. Proof-via-chaining can be used for far more sophisticated deductions and provides mechanisms for a variety of advanced justifications. In a future tutorial, we will discuss chaining in greater depth, including such topics as when explicit justifiers are required, what kinds of justifiers are available for use, and how to order the steps in a proof chain.

With the second base case successfully derived, the entire factorial file should look like this:

load "nat-less"
(transform-output eval [nat->int])
module Factorial {
    declare factorial: [N] -> N [factorial [int->nat]]
    
    define [< * one zero <=] := [N.< N.* (S N.zero) N.zero N.<= ]
    define [n m k] := [?n:N ?m:N ?k:N]
    define two := (S one)


    assert* fact-def := [
        (factorial zero = one)
        (factorial one = one)
        (factorial S n = S n * factorial n)
    ]
    

    (eval factorial 5) # 120

    define [lte-def] := N.Less=.<=-def
    define theorem := (forall n . factorial n N.<= factorial S n)
    by-induction theorem {
        zero => (!chain-> [
                    (factorial zero)
                    = one                                   [fact-def]
                    = (factorial one)                            [fact-def]
                    ==> ((factorial zero) <= (factorial one))         [lte-def]
        ])
        | (S zero) => (!chain-> [
                         (factorial two) 
                         = (two * (factorial one))               [fact-def]
                         = (two * one)              
                         = two                                          
                         ==> (one < two)
                         ==> (factorial one < factorial two)          
                         ==> (factorial one <= factorial two)         [lte-def]
                                          
        ])
        | (n as (S m)) => (!force (factorial S m <= factorial S S m))
    }
}

Proving the Inductive Step

Now let's move onto the inductive step.

Remember that the inductive step of a proof by induction assumes the inductive hypothesis and then derives the desired conclusion.

So what is the inductive hypothesis? For any property we are proving with induction, $P(n)$, our inductive hypothesis states that $P(n - 1)$ holds.

In our case, we are attempting to prove that factorial S m <= factorial S S m. Our inductive hypothesis is therefore factorial m <= factorial S m. Athena automatically inserts the hypothesis into the inductive step's assumption base.

This makes it very easy to verify that the hypothesis is indeed what we believe it to be. We can do this in Athena by using the claim method, which derives a sentence if and only if that sentence is already in the assumption base.

(n as (S m)) => conclude goal := (factorial S m <= factorial S S m)
                            let {
                                ih := (factorial m <= factorial S m);
                                _ := (!claim ih)
                            }
                    (!force goal)

If we add the let block between the definition of the goal and the call to force and run Athena, we should see a successful output. This indicates that the inductive hypothesis we wrote is in the assumption base after all (if you need more convincing, try changing the inductive hypothesis sentence around and re-running Athena).

To prove the inductive step, we will utilize a powerful proof technique: reductio ad absurdum! This is otherwise known as proof by contradiction.

Luckily, there's a method for that.

Athena's proof-by-contradiction method takes two arguments: a sentence we wish to prove (the goal) and a deduction of the form: ~ goal ==> false. How do we derive false? Well, we can use Athena's absurd method which takes a sentence and its negation and - if both are in the assumption base (i.e., both a sentence and its negation have already been established as proven theorems or axioms) - inserts false into the assumption base.

Let's sketch this out:

conclude goal := (factorial S m <= factorial S S m)
    (!by-contradiction goal 
        assume contra := (~ goal)
    
        (!force false)
    )

Now we just have to replace that call to force with a proof that the negation of the goal leads to a contradiction.

If we assume ~ goal, we are assuming that factorial S m <= factorial S S m is not true. I hope we can all agree, as well, that if not (x <= y) then y < x. A proof of this property can be found in the nat-less file we imported in the beginning. Specifically, it is defined in a module contained within the N module called Less= and the name of the property is trichotomy1. So, we can use that property to derive the sentence factorial S S m < factorial S m:

(!by-contradiction goal 
        assume contra := (~ goal)
            let {
                ssm_lt_sm := (!chain-> [
                                contra 
                                ==> (factorial S S m < factorial S m)            [N.Less=.trichotomy1]
                                ==> (((S S m) * factorial S m) < factorial S m ) [fact-def]
                            ])
            
        }

Notice that the conclusion of this proof chain is S S m * factorial S m < factorial S m. The reason we factor out the S S m and then define the left hand of the inequality with multiplication is because it makes deriving the final contradiction easier. Essentially, we know that for any number n greater than or equal to 1, n * m < m is definitely not true: the product of n and m is equal to m if n is 1, and is greater than m otherwise. We can formalize this property: forall n m . ~ (S n * m < m). Let's use this factorial to derive the negation of the sentence ssm_lt_sm which we just proved. We will add it as a second derivation within the let block. Remember to add a semiconol after the first derivation (in let blocks, each arm except the last one must be terminated with a semicolon)

let {
    ssm_lt_sm :=
    (!chain-> [
            contra 
        ==> (factorial S S m < factorial S m) [N.Less=.trichotomy1]
        ==> (((S S m) * factorial S m) < factorial S m ) [fact-def]
    ]);
    neg_ssm_lt_sm := (!chain-> [
        ssm_lt_sm
        ==> (~ (S S m) * (factorial S m) < factorial S m) [(forall ?x ?y . ~ (S ?x * ?y < ?y))]
    ])
}

The evaluation of this proof does not succeed: Athena tells us that going from ssm_lt_sm to its negation is unsuccessful. The reason for this is that the sentence we use as justification is not in the assumption base. Having said that, it does follow from the axioms & theorems proven about multiplication in the standard library.

It looks like we need to prove this lemma before we can make further progress with our proof of the induction step. Instead of proving it ourselves, let's just define it above our entire inductive proof and ask Athena to prove it for us. This is far more convenient than taking the time to prove a property that we only need as a justifier in the proof we actually care about.

define num_times_x_gt_x := (forall n m . ~ (S n * m < m))
    
(!prove num_times_x_gt_x (ab))

Athena's prove method takes two arguments: a sentence we wish to prove and a list of assumptions that can be used in the proof. We pass the entire contents of the assumption base, since it contains all the axioms about N. The procedure ab in Athena returns all theorems and axioms in the assumption base at the location where it is called.

We can also refactor the second chaining proof to look a bit nicer:


 neg_ssm_lt_sm := (!chain-> [
                        ssm_lt_sm
                        ==> (~ ssm_lt_sm) [num_times_x_gt_x]
                    ])

With ssm_lt_sm and its negation both derived, we can use absurd to complete this proof: (!absurd neg_ssm_lt_sm ssm_lt_sm).

The final contents of the Factorial module should look like this:

load "nat-less"
(transform-output eval [nat->int])
module Factorial {
    declare factorial: [N] -> N [factorial [int->nat]]
    
    define [< * one zero <=] := [N.< N.* (S N.zero) N.zero N.<= ]
    define [n m k] := [?n:N ?m:N ?k:N]
    define two := (S one)


    assert* fact-def := [
        (factorial zero = one)
        (factorial one = one)
        (factorial S n = S n * factorial n)
    ]
    

    (eval factorial 5) # 120

    define num_times_x_gt_x := (forall n m . ~ (S n * m < m))
    
    (!prove num_times_x_gt_x (ab))


    define [lte-def] := N.Less=.<=-def
    define theorem := (forall n . factorial n N.<= factorial S n)
    by-induction theorem {
        zero => (!chain-> [
                    (factorial zero)
                    = one                                   [fact-def]
                    = (factorial one)                            [fact-def]
                    ==> ((factorial zero) <= (factorial one))         [lte-def]
        ])
        | (S zero) => (!chain-> [
                         (factorial two) 
                         = (two * (factorial one))               [fact-def]
                         = (two * one)              
                         = two                                          
                         ==> (one < two)
                         ==> (factorial one < factorial two)          
                         ==> (factorial one <= factorial two)         [lte-def]
                                          
        ])
        | (n as (S m)) => conclude goal := (factorial S m <= factorial S S m)
                            let {
                                ih := (factorial m <= factorial S m);
                                _ := (!claim ih)
                            }
                            (!by-contradiction goal 
                                assume contra := (~ goal)
                                 let {
                                     ssm_lt_sm := (!chain-> [
                                         contra 
                                        ==> (factorial S S m < factorial S m) [N.Less=.trichotomy1]
                                        ==> (((S S m) * factorial S m) < factorial S m ) [fact-def]
                                    ]);
                                    
                                    neg_ssm_lt_sm := (!chain-> [
                                        ssm_lt_sm
                                        ==> (~ ssm_lt_sm) [num_times_x_gt_x]
                                    ])
                                 }
                                (!absurd neg_ssm_lt_sm ssm_lt_sm)
                            )
    }
}

Thus, our proof is complete!

If you encountered any issues while following along, or if you've noticed a mistake, please feel free to open an issue on the GitHub repository associated with this tutorial.

Alternative solutions are also welcome, so feel free to show off your creativity by forking the repo & adding your solution to the mix.

Until next time!

QED

Back to top