Aidan Ewart

Updated 04/06/2020


The $\lambda$-Calculus, in Set Theory, in Coq - Extras

A collection of extra, weighty, vaguely rambling content that didn't make it into the final version.

Addendum I - Coq Encoding of Intuitionistic Set Theory

Coq, which implements a type theory not dissimilar to MLTT, the 'Calculus of Inductive Constructions' (CIC), provides the perfect environment in which to interpret set theory à la Aczel; 'inductive types', the types created with the $\mathbb{W}$ axiom in MLTT, come 'free' in Coq, meaning you needn't worry about extensionality when using them. Moreover, Coq already has an impredicative universe of propositions, Prop, which corresponds to Aczel's universe $\mathbb{P}$. Aside from theoretical concerns, Coq's handling of custom notations is definitely something to envy.

Aczel created a type $\mathbb{V}$ to represent 'iterative small classes' - small sets. $\mathbb{V}$ is defined as $(\mathbb{W} t : (\Sigma A : \mathbb{U})\, A \to \mathbb{P})\ \mathrm{fst}(t)$, meaning that an element of $\mathbb{V}$ is comprised of a (small) type $A$, a predicate $P : A \to \mathbb{P}$, alongside an indexing function $f : A \to \mathbb{V}$. The set represented by any inhabitant of $\mathbb{V}$ can be thought of as the image of the values $x : A$ through $f$, provided there is a proof of $P\,x$.

$\mathbb{V}$ roughly corresponds to a Coq inductive type:

Inductive V : Type :=
  mkV : forall (a : Type) (P: a -> Prop), ({x | P x} -> V) -> V.

However, it makes sense to generalize this definition by discarding P,

Inductive set : Type :=
  mk : forall (a : Type), (a -> set) -> set.

as any mkV A P f can be equivalently represented by mk {x: A | P x} f. From a set-theoretic standpoint, you can think of the set created by mk a f as the indexed set $(f_i)_{i \in a}$.

The 'equivalence' of two sets is defined via structural induction; the proposition $A \equiv B$ pretty much states 'for every $x \in A$, there is an element $y \in B$ such that $x \equiv y$', and vice-versa:

Fixpoint equiv (x y: set) : Prop :=
  match x, y with
  | mk a f, mk b g =>
    (forall x, exists y, equiv (f x) (g y))
    /\ (forall y, exists x, equiv (f x) (g y))
  end.

Notation "x ≡ y" := (equiv x y) (at level 70) : zf_scope.

The fact that this equivalence relation is in Prop (as opposed to Type) will have an impact on the axioms that are definable for this interpretation further down the line; most notably, the impredicativity of Prop means that axioms like that of the powerset are provable in this interpretation - having equiv A B : Type makes this impossible. This has the effect of making this set theory more similar to 'Intuitionistic Zermelo-Fraenkel Set Theory' (IZF) than to CZF.

We can already prove various lemmas about this definition of equivalence (namely the fact that it actually is an equivalence - that $\equiv$ is reflexive, symmetric, and transitive):

Lemma equiv_refl : forall {x: set}, x ≡ x.
Lemma equiv_sym : forall {x y: set}, x ≡ y -> y ≡ x.
Lemma equiv_trans : forall {x y: set}, x ≡ y -> forall {z: set}, y ≡ z -> x ≡ z.
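
For instance, reflexivity follows by structural induction on the set. The original proofs aren't shown here; a minimal proof sketch, assuming only the definitions above, might go like this:

Lemma equiv_refl : forall {x: set}, x ≡ x.
Proof.
  (* structural induction: the hypothesis IH gives reflexivity for every f x *)
  induction x as [a f IH].
  simpl. split; intro i; exists i; apply IH.
Qed.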

The proposition $x \in (y_i)$ for an indexed set $(y_i)_{i \in I}$ simply means that there is an index $z \in I$ such that $y_z \equiv x$. As mk a f represents a set indexed by a, this directly translates to:

Definition member (x y: set) :=
  match y with
  | mk a f => exists i, f i ≡ x
  end.

Notation "x ∈ y" := (member x y) (at level 70) : zf_scope.

Similarly, the proposition $(x_i) \subseteq (y_i)$ for indexed sets $(x_i)_{i \in I}$ and $(y_j)_{j \in J}$ means that for all indexes $a \in I$, there is an index $b \in J$ such that $x_a \equiv y_b$:

Definition subset (x y: set) :=
  match x,y with
  | mk a f, mk b g => forall x, exists y, f x ≡ g y
  end.

Notation "x ⊆ y" := (subset x y) (at level 70) : zf_scope.

In order to prove that our definitions match those in traditional set theory, we need to prove that these relations are sound with respect to equivalence (meaning that equivalent terms can be substituted),

Lemma member_substs_left : forall {s a b}, a ≡ b -> a ∈ s -> b ∈ s.
Lemma member_substs_right : forall {s a b}, a ≡ b -> s ∈ a -> s ∈ b.
Lemma subset_substs_left : forall {x y z}, x ≡ y -> x ⊆ z -> y ⊆ z.
Lemma subset_substs_right : forall {x y z}, x ≡ y -> z ⊆ x -> z ⊆ y.

and that these relations have the same relationship with each other as in traditional set theory:

Lemma subset_ext : forall {x y}, (forall z, z ∈ x -> z ∈ y) -> x ⊆ y.
Lemma equiv_ext : forall {x y}, (forall z, z ∈ x <-> z ∈ y) -> x ≡ y.
(* etc... *)

Proving these properties for all subsequent definitions, while entirely possible, is rather painful and involves a lot of 'proof-passing', so I just add the following axiom, stating that $A \equiv B$ implies $A = B$, which greatly simplifies matters later on:

Axiom equiv_is_eq : forall {A B}, A ≡ B -> A = B.

We can now go ahead and construct, and prove properties of, the various ways of manipulating sets in IZF; first up, the axiom of the union set. The definition of the union $\bigcup s$ of a set $s$ relies on a helper function index that extracts the indexing type from a set.
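
The original doesn't show index, but a minimal sketch consistent with its use below might look like this (my reconstruction):

Definition index (s: set) : Type :=
  match s with
  | mk a _ => a
  end.

Having defined index, $\bigcup s$ can be defined as: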

Definition union (s: set) :=
  match s with
  | mk a f =>
    mk {x & index (f x)}
       (fun p => let (x,i) := p in
              let xi := f x in
              match xi return index xi -> set with
              | mk b g => fun i => g i
              end i)
  end.

Notation "⋃ x" := (union x).

Here the dependent pattern-matching syntax of Coq clouds the meaning of the definition somewhat - essentially, $\bigcup (s_i)_{i \in I}$ is indexed by dependent pairs consisting of an index $j \in I$ together with an index of $s_j$. This means that a member of a member of $s$ is simply a member of $\bigcup s$, so that the property $\forall x, \forall y \in x, \forall z \in y, z \in \bigcup x$ holds:

Lemma in_member_union : forall {x y z}, y ∈ x -> z ∈ y -> z ∈ ⋃x.

The axiom of the empty set $\emptyset$ states that there is a set with no elements - this can be encoded as a set indexed by False or $\bot$, representing falsity or contradiction in intuitionistic logic. False has no inhabitants, and so any set indexed by False is also uninhabited; hence $\forall x, x \notin \emptyset$.

Definition empty_set := mk False (False_rect set).

Notation "∅" := empty_set : zf_scope.
Notation "x ∉ y" := (x ∈ y -> False) (at level 70) : zf_scope.
Lemma empty : forall x, x ∉ ∅.
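
A possible proof (mine, the original's isn't shown): any purported member of $\emptyset$ comes with an index of type False, which we can simply eliminate.

Lemma empty : forall x, x ∉ ∅.
Proof.
  intros x H. simpl in H.
  destruct H as [i _]. destruct i. (* the index i : False cannot exist *)
Qed.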

Complementary to the axiom of the empty set is the axiom of infinity; this is a set $\omega$ for which the property $\forall x \in \omega, \mathrm{Suc}(x) \in \omega$ holds, where $\mathrm{Suc}(x) = x \cup \{x\}$. Usually the $\mathrm{Suc}$ operation is defined in terms of unions and 'pairs' (another axiom), but instead I will define it in terms of insertion, with an axiom $\forall x\, y, \exists z, x \in z \wedge y \subseteq z$:

Definition insert (e: set) (s: set) :=
  match s with
  | mk a f => mk (True + a) (fun i =>
      match i with
      | inl _ => e
      | inr i => f i
      end)
  end.
Definition Suc s := insert s s.
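
As an aside, the usual pairing construction mentioned above is easy to recover from insert and $\emptyset$; these helpers are hypothetical (they don't appear in the original development):

Definition singleton (x: set) := insert x ∅.
Definition pair (x y: set) := insert x (singleton y).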

For every natural number $n$, we can iterate the $\mathrm{Suc}$ operation $n$ times on the empty set to generate the $n$th element of $\omega$:

Fixpoint fin n :=
  match n with
  | S n => Suc (fin n)
  | O => ∅
  end.

$\omega$ is then defined as the set indexed by the natural numbers, with fin as the indexing function. Given an element of $\omega$, we know that that element has an index $n \in \mathbb{N}$ (by our definition of membership above), and so its successor is simply the set with index $n + 1$:

Definition infinity := mk nat fin.
Notation "ω" := infinity.

Lemma Suc_member_infinity : forall {x}, x ∈ ω -> Suc x ∈ ω.
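
As a quick sanity check (my own example, not from the original), a membership proof for $\omega$ just exhibits an index into fin:

Example zero_in_ω : ∅ ∈ ω.
Proof.
  simpl. exists 0. (* fin 0 computes to ∅ *)
  apply equiv_refl.
Qed.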

We can also transform the type-theoretic induction principle for the natural numbers to a set-theoretic one, formalizing that our model of $\omega$ corresponds to the one in the standard model for arithmetic - this extra definition turns our axiom of infinity into the axiom of strong infinity:

Definition ω_induction
  {P: set -> Prop} (H: P ∅)
  (H0: forall x, P x -> P (Suc x))
  (x: set) (H1: x ∈ ω) : P x.

The next axiom is that of set 'separation' $\{x \in A \mid \phi(x)\}$ (i.e. comprehension with a specified domain), the subset of $A$ for which the property $\phi$ holds. This has a very obvious implementation; for any set $A$ with an indexing function s : a -> set, the set $\{x \in A \mid \phi(x)\}$ should be indexed by those x : a for which $\phi(s\,x)$ holds:

Definition seperation (P: set -> Prop) (x: set) :=
  match x with
  | mk a s => mk {x | P (s x)} (fun p => let (x,_) := p in s x)
  end.
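
One property we would expect of this construction (my own statement and proof, not from the original) is that the separated set is always a subset of the original:

Lemma seperation_subset : forall P x, seperation P x ⊆ x.
Proof.
  intros P [a f]. simpl.
  intros [i Hi]. exists i. (* the witnessing index is the same one *)
  cbn. apply equiv_refl.
Qed.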

The axiom of the powerset states that for every set $x$, there exists a set $\mathcal{P}(x)$ containing every subset of $x$: $\forall y, y \subseteq x \to y \in \mathcal{P}(x)$. The powerset axiom is generally considered 'non-constructive', as there is no general way to construct the powerset of an infinite set, differentiating IZF from the constructive CZF.

Definition powerset (s: set) :=
  match s with
  | mk a f => mk (a -> Prop)
    (fun P => mk {x: a & P x}
      (fun i => let (x,_) := i in f x))
  end.

Lemma subset_in_powerset : forall {x y}, x ⊆ y -> x ∈ powerset y.

For any mk a f : set, the powerset of that set is indexed by a -> Prop - translated from Coq, this means $\mathcal{P}(s)$ is a family of sets $x_i$, where $i \in 2^s$ is the characteristic function of a subset.

The axiom of replacement essentially states that given a set $S$ and a function $f : S \to U$, the image of $S$ under $f$ is also a set. I also strengthen the axiom of replacement, stating that every element in the replacement of $S$ using $f$ has a nonempty fibre (inverse image) in $S$:

Definition replacement {s} (f: forall x, x ∈ s -> set) : set.

Lemma image_in_replacement :
  forall {s f}, forall x (p: x ∈ s), f x p ∈ replacement f.
  
Lemma nonempty_fibre :
  forall {s f z}, z ∈ replacement f -> exists x (p: x ∈ s), f x p ≡ z.

It is important to note that the 'function' $f$ is not a normal set-constructed ('object-level') one; it is instead a binary relation $\phi$, expressed in the underlying logic of IZF, with the property that $\forall x \in S, \exists! y \in U, \phi(x,y)$.

Actually, (this phrasing of) the axiom of replacement can be expressed in terms of a more general axiom, the axiom of strong collection. The axiom of strong collection is essentially an extended axiom of replacement that also operates over non-functional (but still total!) relations - that is, relations $\phi$ for which $\forall x, \exists y, \phi(x,y)$ holds (note the lack of uniqueness of $y$). This is, regrettably, not definable using the 'membership in Prop' mantra, and so I must assert its existence using a (Coq) Axiom:

Axiom collection : forall {D P}, (forall x, x ∈ D -> exists y, P x y) -> set.
Axiom collection_prop : forall {D P} (H: forall x, x ∈ D -> exists y, P x y) {x},
  x ∈ collection H <-> exists i, i ∈ D /\ P i x.

The next axiom is slightly more involved; it describes structural induction on the $\in$ relation. Basically it states that, in order to prove that $\phi(x)$ holds for every set $x$, it suffices to prove that whenever $\phi$ holds for all members of a set $x$, it also holds for $x$ itself. Expressed concisely, $(\forall x, [\forall y \in x, \phi(y)] \to \phi(x)) \to \forall z, \phi(z)$ for any $\phi$.
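
The statement below also mentions prop_resp_equiv, which isn't defined in this addendum; presumably it says that the property respects set equivalence, something like the following (an assumption on my part):

(* Hypothetical reconstruction: P cannot distinguish equivalent sets. *)
Definition prop_resp_equiv (P: set -> Prop) :=
  forall x y, x ≡ y -> P x -> P y.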

Definition ϵ_induction
  {P: set -> Prop} (H: prop_resp_equiv P)
  (H0: forall x, (forall y, y ∈ x -> P y) -> P x)
  (s: set) : P s.

Properties of Naturals

I needed some proofs about $\omega$ for when I constructed more complex sets through induction. First, I proved that $\omega$, and all elements of $\omega$, are von Neumann ordinals, using $\omega$-induction:

Lemma ω_ordinal : forall x, x ∈ ω -> x ⊆ ω.
Lemma nat_ordinal : forall x, x ∈ ω -> forall y, y ∈ x -> y ⊆ x.

Therefore, the relations $x \in y$ and $x \subsetneq y$ correspond to $x < y$, and $x \subseteq y$ corresponds to $x \leq y$. By von Neumann's definition, the ordinal $x$ is the interval $[0,x)$, and so every member of $x$ is itself in $\omega$. A direct consequence of this is that the predecessor of any element of $\omega$ is also an element of $\omega$ (as $\forall x, x \in \mathrm{Suc}(x)$):

Lemma fin_contains_nats : forall {x y}, x ∈ ω -> y ∈ x -> y ∈ ω.
Lemma pred_is_nat : forall {x}, Suc x ∈ ω -> x ∈ ω.

We can use $\omega$-induction to prove that comparison for the naturals is decidable, and derive some related proofs:

Lemma nat_cmp_dec :
  forall {x y}, x ∈ ω -> y ∈ ω -> x ∈ y \/ x ≡ y \/ y ∈ x.
Lemma least_nat :
  forall {x}, x ∈ ω -> ∅ ∈ x \/ x ≡ ∅.
Lemma ω_nojunk :
  forall {x}, x ∈ ω -> x ≡ ∅ \/ exists n, n ∈ ω /\ x ≡ Suc n.

Finally, we can use these comparisons to prove the existence of an upper bound (maximum) for two naturals, which is itself a natural:

Lemma nat_cmp_leq :
  forall {x y}, x ∈ ω -> y ∈ ω -> x ⊆ y \/ y ⊆ x.
Lemma maximum :
  forall {x y}, x ∈ ω -> y ∈ ω -> exists m, m ∈ ω /\ x ⊆ m /\ y ⊆ m.

Addendum II - The $\lambda$-Calculus

The $\lambda$-calculus (LC) is a system used to express computation. Much like in arithmetic, computing the value of an expression involves applying a series of transformations upon the expression until it reaches a 'normal form' (consider the transformations used to reduce $2 + 5 + 8$ to $13$). An expression, or term, of the $\lambda$-calculus can be created with the following rules:

- any variable $v$ is a term;
- if $v$ is a variable and $M$ is a term, then the abstraction $(\lambda v. M)$ is a term;
- if $M$ and $N$ are terms, then the application $(M\ N)$ is a term.
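
Since the main post works in Coq, here is a minimal sketch of how such terms might be represented there, using de Bruijn indices instead of named variables; this is an illustration, not necessarily the encoding used in the main post:

Inductive term : Type :=
  | Var : nat -> term           (* a variable, as a de Bruijn index       *)
  | Abs : term -> term          (* an abstraction (λ. M), binding index 0 *)
  | App : term -> term -> term. (* an application (M N)                   *)

(* The example term ((λx. x) (λy. y)) discussed below. *)
Definition example_term := App (Abs (Var 0)) (Abs (Var 0)).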

In an abstraction $(\lambda v. M)$, $M$ is called the body of the abstraction, and the $\lambda$ serves as a kind of quantifier for $v$; inside the body of $(\lambda v. M)$, any occurrence of $v$ is considered bound, while in a term where $v$ is not quantified (for example $(N\ v)$), any occurrence of $v$ is considered free.

The $\lambda$-calculus has one rule for the reduction of expressions, $\beta$-reduction, which states that you can reduce any term of the form $((\lambda v. M)\ N)$ by substituting all (free) occurrences of $v$ in $M$ with $N$. You can think of this as simplifying an application of a function $f : v \mapsto M$ in the expression $f(N)$ - indeed, the $\lambda$-calculus is just a calculus of function creation and application.

To normalise (or evaluate) a $\lambda$-term, one simply repeats $\beta$-reductions until the resulting term is in a 'normal form'; different presentations often use different normal forms, but in this post I will use 'weak head normal form', meaning that the resulting term is an abstraction $(\lambda v. M)$ - note that the body of the abstraction, $M$, does not need to be in normal form.

As an example, the expression $((\lambda x. x)\ (\lambda y. y))$ is not an abstraction, so we apply $\beta$-reduction to simplify it. Substituting all occurrences of $x$ with $(\lambda y. y)$ in the expression $x$, we arrive at the expression $(\lambda y. y)$, which is in normal form, and so we can stop simplification; the normalised version of $((\lambda x. x)\ (\lambda y. y))$ is $(\lambda y. y)$.

Not all terms actually have a normal form; consider the expression $((\lambda x. (x\ x))\ (\lambda x. (x\ x)))$. Applying $\beta$-reduction on this means we substitute all free occurrences of $x$ with $(\lambda x. (x\ x))$ in the expression $(x\ x)$. This is simply $((\lambda x. (x\ x))\ (\lambda x. (x\ x)))$, which is exactly the expression we started with, meaning that no amount of $\beta$-reduction will succeed in producing a normal form for this expression. Therefore, evaluation (simplifying an expression to a normal form) is a partial function.

Addendum III - Inductively Defined Functions

Consider the Ackermann-Péter function:

$$\begin{gathered} \mathrm{ack}(0,n) = \mathrm{Suc}(n) \\ \mathrm{ack}(\mathrm{Suc}(m),0) = \mathrm{ack}(m,1) \\ \mathrm{ack}(\mathrm{Suc}(m),\mathrm{Suc}(n)) = \mathrm{ack}(m,\mathrm{ack}(\mathrm{Suc}(m),n)) \end{gathered}$$

The base cases of this function are $\mathrm{ack}(0, n) = \mathrm{Suc}(n)$ for all $n \in \omega$. For the first recursive case, for all $m, x \in \omega$, if $x = \mathrm{ack}(m,1)$, then $\mathrm{ack}(\mathrm{Suc}(m),0) = x$. Similarly, for the second recursive case, for all $m, n, x, y \in \omega$, if $x = \mathrm{ack}(\mathrm{Suc}(m),n)$ and $y = \mathrm{ack}(m,x)$, then $\mathrm{ack}(\mathrm{Suc}(m),\mathrm{Suc}(n)) = y$. We can express this as a description like so:

$$\begin{aligned} \Gamma_{\mathrm{ack}}(A) = &\{ \langle \langle \emptyset, n \rangle, \mathrm{Suc}(n) \rangle : \forall n \in \omega \} \\ \cup &\{ \langle \langle \mathrm{Suc}(m), \emptyset \rangle, x \rangle : \forall m, x \in \omega, \langle \langle m, \mathrm{Suc}(\emptyset) \rangle, x \rangle \in A \} \\ \cup &\{ \langle \langle \mathrm{Suc}(m), \mathrm{Suc}(n) \rangle, y \rangle : \forall m, n, x, y \in \omega, \langle \langle \mathrm{Suc}(m), n \rangle, x \rangle \in A \wedge \langle \langle m, x \rangle, y \rangle \in A \} \end{aligned}$$

This is a rather complex example, but it shows how recursive functions can be constructed as inductively defined sets:

$$\mathrm{ack} = \bigcup_{i = 0}^\infty \Gamma_{\mathrm{ack}}^i(\emptyset)$$
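
For comparison, the same three clauses can also be phrased directly as a Coq inductive relation over nat; this is a sketch of mine, the set-theoretic construction above being the one actually used in the post:

Inductive ack_rel : nat -> nat -> nat -> Prop :=
  | ack_base  : forall n, ack_rel 0 n (S n)
  | ack_step0 : forall m x, ack_rel m 1 x -> ack_rel (S m) 0 x
  | ack_stepS : forall m n x y,
      ack_rel (S m) n x -> ack_rel m x y -> ack_rel (S m) (S n) y.

By construction, this inductive relation is the least relation closed under the three clauses, mirroring $\bigcup_{i} \Gamma_{\mathrm{ack}}^i(\emptyset)$ above.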