Ernest Schimmerling ; Basic and Intermediate Logic ; Chapter 3

© Ernest Schimmerling

Online textbook for Basic and Intermediate Logic

Chapter 3 : First-order logic

Warning: Much of the terminology we used to discuss propositional logic will be used differently here. When we really want to refer to notions from Chapter 2, we will add the word "propositional" to make it clear. For example, here we will define structures, sentences and theories which will be different from those we discussed before.

To motivate first-order logic, we begin by looking at one of the most basic objects of mathematics, the set of rational numbers

ℚ = { m / n ∣ m and n are integers and n ≠ 0 }.

Let + and x be the usual addition and multiplication operations on ℚ. Let < be the usual ordering of ℚ. Collecting together ℚ with specific members, functions and the relation we care about into a 6-tuple, we obtain an example of a structure:

A = ( ℚ , 0 , 1 , + , x , < ).

Certainly, a lot of mathematics is concerned with which "sentences" are "true" about this "structure"! To make concrete sense of the words in quotes, we need an appropriate language. Let us denote the first coordinate of A by |A|. In our example, |A| = ℚ. Notice that 0 and 1 are particular members of ℚ. Let us introduce constant symbols c and d which are interpreted by A as 0 and 1 respectively. To indicate this, we will write c^A = 0 and d^A = 1. Next, notice that + and × are binary functions on ℚ. Let us introduce binary function symbols F and G which are interpreted by A as + and × respectively. We will write F^A and G^A to indicate this. Finally, notice that < is a binary relation on ℚ. Let us introduce a binary function symbol R which is interpreted as < by A. We will write R^A to indicate this. Now, we have

A = ( | A | , c^A , d^A , F^A , G^A , R^A )

and A is a structure in the language with two constant symbols c and d, two binary function symbols F and G and one binary relation symbol, R.

Now let us think about the kinds of sentences that are relevant to A and how to formalize them. For example, we certainly care about the fact that no rational number is a square root of 2. In terms of what we have above and standard mathematical symbolism, another way to express this fact is

¬ ∃ v ∈ | A | ( G^A ( v , v ) = F^A ( d^A , d^A ) ).

Based on this, it is reasonable to say that A satisfies the sentence

¬ ∃ v ( G ( v , v ) ≈ F ( d , d ) ).

where ≈ is a symbol which is interpreted as = by A. But now we start to see the kind of languages we care about and how to formalize them!

§ 3.1 Syntax

Symbols

First we list the logical symbols that are part of every first-order language:

⊤ ⊥ ¬ ∧ ∨ → ↔ ∃ ∀

(∃ and ∀ are the existential and universal quantifier symbols respectively.)

We always have these infinitely many symbols for variables too:

v₀ v₁ v₂ v₃ etc.

We also include a symbol for equality:

≈

In addition to the logical symbols above, a first-order language can have any number of the following kinds of non-logical symbols:

constant symbols

n-ary function symbols for n ≥ 1

n-ary relation symbols for n ≥ 1

When we talk about a language we mean the collection of constant, function and relation symbols together with the information about what kind of symbols they are. For each function or relation symbol, the information must include its "arity".

Terms, formulas and sentences

Let us assume we have specified constant, function and relation symbols, and explain how to build up a language from them.

Terms of the language are built up recursively:

variables are terms

constant symbols are terms

if F is an n-ary function symbol and t₀ , … , t_n-1 are terms,
then F ( t₀ , … , t_n-1 ) is also a term.

We have added parentheses, commas and spaces to make it easier to parse terms. Officially, they are not part of the language. Moreover, they are not needed. For example, if F is a binary relation symbol, c is a constant and v is a variable, then the only reasonable way to read

FFvcFcc

is as

F ( F ( v , c ) , F ( c , c ) ) .

Formulas of the language are built up recursively:

The atomic formulas are:
- ⊤ and ⊥ are formulas.
- If s and t are terms, then s ≈ t is a formula.
- If R is an n-ary relation symbol and t₀ , … , t_n-1 are terms,
  then R( t₀ , … , t_n-1 ) is a formula.
If φ and ψ are formulas,
then so are ¬ φ , φ ∧ ψ , φ ∨ ψ , φ → ψ , φ ↔ ψ , ∃ v_n φ and ∀ v_n φ .

In the notation we have used, we need parentheses to ensure there is only one way to parse each formula. The official definition, which we will not give, uses Polish notation to avoid parentheses altogether. For example,

( R ( u , v ) ∧ ( F ( c , d ) ≈ w ) ) → S ( d )

in Polish notation becomes

→ ∧ R u v ≈ F c d w S d

Let us move on from the topic of parentheses and commas because it is boring!

The variables that occur in terms are defined by recursion:

v_i is the only variable that occurs in v_i .

No variables occur in constants.

v occurs in F ( t₀ , … , t_n-1 ) iff v occurs in t_m for some m < n.

Saying v is a variable of t is the same as saying v occurs in t.

A term is closed iff it has no variables.

For example, if F and G are binary function symbols, c and d are constants, and u is a variable, then u is a variable of the term

F ( G ( c , u ) , d )

while

F ( G ( c , d ) , d )

is a closed term because it has no variables.

The variables that occur freely in formulas are defined by recursion:

The formulas ⊤ and ⊥ have no free variables.
u occurs freely in s ≈ t iff u occurs in s or t.

u occurs freely in R ( t₀ , … , t_n-1 ) iff u occurs in t_m for some m < n.

u occurs freely in ¬ φ iff u occurs freely in φ.

u occurs freely in φ ∧ ψ , φ ∨ ψ , φ → ψ or φ ↔ ψ iff u occurs freely in φ or ψ.

u occurs freely in ∃ v φ or ∀ v φ iff u occurs freely in φ and u is not v.

Saying that u is a free variable of φ is the same as saying that u occurs freely in φ.

A formula is a sentence iff it has no free variables.

We will want to substitute a term r for variable u in a term t or formula φ and write t ( u / r ) or φ ( u / r ) .

The definition of substitution into terms is by recursion:

If u does not occur in t, then t ( u / r ) is t.
If t is a variable, v, then t ( v / r ) is r.
If t is F ( s₀ , … , s_n-1 ) , then t ( u / r ) is F ( s₀ ( u / r ) , … , s_n-1 ( u / r ) ) .

The definition of substitution into formulas is also by recursion:

If φ is ⊤ or ⊥ , then φ ( u / r ) is φ .

If φ is s ≈ t , then φ ( u / r ) is s ( u / r ) ≈ t ( u / r ) .

If φ is R ( s₀ , … , s_n-1 ) , then φ ( u / r ) is R ( s₀ ( u / r ) , … , s_n-1 ( u / r ) ) .

If φ is ¬ ψ , then φ ( u / r ) is ¬ ψ ( u / r ) .

If φ is ψ ∧ χ , then φ ( u / r ) is ψ ( u / r ) ∧ χ ( u / r ) .

Similar clauses when φ is ψ ∨ χ , ψ → χ or ψ ↔ χ .

If φ is ∃ v ψ, then
- if u is v, then φ ( u / r ) is φ .
- if u is not v, then φ ( u / r ) is ∃ v ψ ( u / r ) .
If φ is ∀ v ψ, then
- if u is v, then φ ( u / r ) is φ.
- if u is not v, then φ ( u / r ) is ∀ v ψ ( u / r ) .

Examples to illustrate syntax

Let φ be the formula R ( u , v ) where u and v are distinct variables and R is a binary relation symbol.

Then:

The free variables of φ are u and v.

φ ( u / v ) is R ( v , v ) whose only free variable is v.

φ ( u / v ) ( v / u ) is R ( u , u ) whose only free variable is u.

φ ( v / u ) is R ( u , u ) whose only free variable is u.

φ ( v / u ) ( u / v ) is R ( v , v ) whose only free variable is v.

φ ( u / c ) is R ( c , v ) whose only free variable is v.

φ ( u / c ) ( v / d ) is R ( c , d ) which has no free variables so it is a sentence.

φ ( v / d ) is R ( u , d ) whose only free variable is u.

φ ( v / d ) ( u / c ) is R ( c , d ) which has no free variables so it is a sentence.

In particular, note here that

φ ( u / v ) ( v / u ) and φ ( v / u ) ( u / v ) are different formulas while

φ ( u / c ) ( v / d ) and φ ( v / d ) ( u / c ) are the same sentence.

As another example, let ψ be the formula ( ∀ u R ( u , v ) ) ∨ ( ∃ v S ( u , v ) ) where u and v are distinct variables and R and S are distinct binary relation symbols.

Then:

ψ has free variables u and v.

ψ ( u / v ) is ( ∀ u R ( u , v ) ) ∨ ( ∃ v S ( v , v ) ) whose only free variable is v.

ψ ( u / v ) ( v / u ) is ( ∀ u R ( u , u ) ) ∨ ( ∃ v S ( v , v ) ) which has no free variables so it is a sentence.

Note that substituting v for u does not mean "change u to v everywhere you see it". Nor the other way around.

§ 3.2 Structures

A structure A consists of the following.

A nonempty set called the universe, denoted by |A|.

For each constant symbol c, a member of |A|, denoted by c^A.

For each n-ary function symbol F, an n-ary function on |A|, denoted by F^A.

For each n-ary relation symbol R, an n-ary relation on |A|, denoted by R^A.

Technically, a structure A is the complicated looking 4-tuple ( | A | , c ↦ c^A , F ↦ F^A , R ↦ R^A ) .

But sometimes we simplify the notation. For example, if the language has no constant or function symbols and only one relation symbol, R, then we would just write A = ( | A | , R^A ) .

Examples of structures

From the beginning of the chapter,

A = ( | A | , c^A , d^A , F^A , G^A , R^A ) = ( ℚ , 0 , 1 , + , x , < ).

is a structure in the language with two constant symbols c and d, two function symbols F and G, and one binary relation symbol, R.

(So, anyone studying ordered fields might use this language.)

Let T the set of all finite strings of zeros and ones. Let < be the usual extension relation on strings. Then

A = ( | A | , R^A ) = ( T , < )

is a structure in the language with no constant or function symbols and one binary relation symbol, R.

(So, anyone studying trees or partial orderings might use this language.)

Let G be a graph with a nonempty vertex set V and an edge set E. Then

A = ( | A | , R^A ) = ( V , E )

is a structure in the language with no constant or function symbols and one binary relation symbol, R.

(So, anyone studying graphs might use this language.)

Let ℝ be the set of real numbers and d(x,y) = | x - y |. Then

A = ( | A | , F^A ) = ( ℝ , d )

is a structure in the language with no constant or relation symbols and one binary function symbol, F.

This last example could be deceptive to those interested in metric spaces. To see why, let e be the usual metric on ℝ²,

e( x̅ , y̅ ) = ( ( y₀ - x₀ )² + ( y₁ - x₁ )² )^1/2.

Then ( ℝ² , e ) is a metric space but it is not a structure because ℝ is not a subset of ℝ², so e is not a binary function on ℝ².

§ 3.3 Semantics

Let A be a structure and t be a closed term. The evaluation of t in A, which is written t^A, is defined by recursion:

If t is a constant symbol c, then t^A is c^A.
If t is F ( s₀ , … , s_n-1 ), then t^A is F^A ( s₀^A , … , s_n-1^A ) .

Consider a structure A and a member x of | A |. Pick a new constant symbol c_x that is not a symbol of the original language. Expand the original language to include c_x. Expand A to a structure in the expanded language by keeping the same universe and interpretations as A and interpreting c_x as x. We will use the notation ( A , x ) to refer to this expanded structure. Repeating what we just said:

c^{( A , x )} = c^A for constant symbols c of the original language.

F^{( A , x )} = F^A for function symbols F of the original language.

R^{( A , x )} = R^A for relation symbols R of the original language.

c_x^{( A , x )} = x.

The truth function of a structure A,

Truth_A : { φ ∣ φ is a sentence } → { 0 , 1 } ,

is defined by recursion on the complexity of sentences:

Truth_A ( ⊤ ) = 1 and Truth_A ( ⊥ ) = 0

If φ is s ≈ t, then Truth_A ( φ ) = 1 iff s^A = t^A.

If φ is R ( s₀ , … , s_n-1 ), then Truth_A ( φ ) = 1 iff R^A ( s₀^A , … , s_n-1^A ).

The clauses for ¬ ψ , ψ ∧ χ , ψ ∨ χ , ψ → χ or ψ ↔ χ are the same as for propositional logic.

If φ is ∃ v ψ, then Truth_A ( φ ) = 1 iff there exists a member x of A such that Truth_{( A , x )} ( ψ ( v / c_x ) ) = 1.

If φ is ∀ v ψ, then Truth_A ( φ ) = 1 iff for every member x of A, Truth_{( A , x )} ( ψ ( v / c_x ) ) = 1.

Notice that the last two clauses of the recursion depend on ψ ( v / c_x ) having strictly lower complexity as a sentence in the expanded language than ∃ v ψ and ∀ v ψ have in the original language. This is legitimate! The best way to think about our recursion is that it handles all languages simultaneously.

The following list of definitions for first-order logic is analogous to a list we had in Chapter 2 for propositional logic.

Σ is a theory iff Σ ⊆ { φ ∣ φ is a sentence }.

(The sentences in Σ are called axioms of Σ.)

A is a model of φ and write A ⊨ φ iff Truth_A ( φ ) = 1.
(We also say A satisfies φ.)

A is a model of Σ and write A ⊨ Σ iff A is a model of every member of Σ.

(We also say A satisfies Σ.)

ψ is a consequence of φ and write φ ⊨ ψ iff every model of φ is a model of ψ.
φ is a consequence of Σ and write Σ ⊨ φ iff every model of Σ is a model of φ.

φ and ψ are logically equivalent and write φ ╞╡ ψ iff they have exactly the same models.

φ is valid and write ⊨ φ iff every structure is a model of φ.

φ is satisfiable iff φ has at least one model.

Σ is satisfiable iff Σ has at least one model.

Example to illustrate semantics

Recall ω = { 0 , 1 , 2 , 3 , … } is the set of natural numbers.

Consider the language with two constant symbols c and d and two binary function symbols F and G, and a binary relation symbol R.

Let A = ( |A| , c^A , d^A , F^A , G^A , R^A ) = ( ω , 0 , 1 , + , × , < ) .

Define terms s₀ , s₁ , s₂ etc. by letting s₀ be the constant symbol c, and s_n+1 be the term F( s_n , d ).

For every natural number n, if c_n is a new constant which is interpreted as n in the expansion ( A , n ), then ( A , n ) ⊨ c_n ≈ s_n.

Define terms t₀ , t₁ , t₂ etc. by letting t₀ be the constant symbol d, t₁ be the variable v and t_{m + 1} be the term G ( t_m , v ).

For all natural numbers m , n and p, if p = n^m, then ( A , p ) ⊨ c_p ≈ t_m ( v / s_n ).

Let ψ_m be the formula F ( t_m ( v / u ) , t_m ( v / v ) ) ≈ t_m ( v / w ) .

Let φ_m be the sentence ∃ u ∃ v ∃ w ( ¬ ( u ≈ c ) ∧ ¬ ( v ≈ c ) ∧ ¬ ( w ≈ c ) ∧ ψ_m ) .

The fact that 3² + 4² = 5² tells us that ( A , 3 , 4 , 5 ) ⊨ ψ₂ ( u / c₃ , v / c₄ , w / c₅ ).

Therefore A ⊨ φ₂ .

Let Σ = { ¬ φ_m ∣ m = 3 , 4 , 5 etc. }.

Fermat's theorem tells us that A ⊨ Σ .

§ 3.4 Tautologies

First-order logic is different from propositional logic but we will not let the work we did in the previous chapter go to waste. In general, it is not obvious whether a first-order sentence is valid. But sometimes it is obvious!

For example, let ψ be any first-order sentence. Then ψ ∨ ¬ ψ is valid. Why? The direct way to see this is to use the definition of validity: for every structure A,

Truth_A ( ψ ∨ ¬ ψ ) = 1 iff ( Truth_A ( ψ ) = 1 or Truth_A ( ¬ ψ ) = 1 ) iff ( Truth_A ( ψ ) = 1 or Truth_A ( ψ ) = 0 )

but Truth_A ( ψ ) is either 0 or 1 so we are done.

A slicker way to argue would be to "substitute" ψ for the propositional parameter P in the propositional sentence P ∨ ¬ P, then use the fact that P ∨ ¬ P is propositionally valid to conclude that the sentence φ ∨ ¬ φ is valid in first-order logic.

In general, suppose we have first-order sentences ψ₀ , … , ψ_n-1 and a propositional sentence π that involves propositional parameters P₀ , … , P_n-1. Define what it means to "substitute" the first-order sentences for the propositional parameters to obtain a first-order sentence that is naturally denoted π ( P₀ / ψ₀ ) ⋅ ⋅ ⋅ ( P_n-1 / ψ_n-1 ). Abbreviate this π ( P̅ / ψ̅ ). Finally, argue that if π is valid in propositional logic, then π ( P̅ / ψ̅ ) is valid in first-order logic. The missing definition and argument are entirely routine to fill in so we omit the details.

First-order sentences of the form π ( P̅ / ψ̅ ) where π is propositionally valid are called tautologies.

Intuitively, a tautology is a first-order sentence whose validity is obvious because it can be verified using only propositional logic. Remember that we can use truth tables to check for propositional validity and that this can be done on a computer.

The following theorem summarizes what we have sketched.

Theorem 3.1

Tautologies are valid.

Warning: What we call "tautologies" others might call "propositional tautologies of first-order logic".

Examples regarding tautologies

Not all valid first-order sentences are tautologies. An easy example is

∀ v ( v ≈ v ).

On the other hand,

∀ v ( v ≈ v ) ∨ ¬ ∀ v ( v ≈ v ).

is a tautology for reasons already explained.

Even though checking for a tautology is routine by computer, it might not be immediately obvious to a human. For example, let φ be the sentence

( ( ∃ u ∀ v R ( u , v ) ) ∧ ( ( ∃ u ∀ v R ( u , v ) ) → F ( c ) ≈ G ( d ) ) → F ( c ) ≈ G ( d ) .

Then φ is π ( P̅ / ψ̅ ) where

ψ₀ is ∃ u ∀ v R ( u , v ),

ψ₁ is F ( c ) ≈ G ( d ) and

π is ( P₀ ∧ ( P₀ → P₁ ) ) → P₁.

Note that ψ₀ and ψ₁ are first-order sentences, and π is a propositional validity. Therefore, φ is a tautology. It is worth noting that neither ψ₀ nor ψ₁ is valid.

§ 3.5 Basic semantic principles

Here we will record some useful facts about the logical consequence relation ⊨. In the subsequent section, some of these facts will be formalized as deduction rules about ⊢.

Substitution Lemma 3.2

Let s and t be closed terms and φ be a formula with a single free variable v.

Then

s ≈ t ⊨ φ ( v / s ) ↔ φ ( v / t ) .

Sketch of Lemma 3.2

Assume A ⊨ s ≈ t .

Let x ∈ | A | so that s^A = x = t^A.

In the definition of satisfaction, we introduced a new constant c_x and used the notation ( A , x ) to mean the expansion of A that interprets c_x as x.

Thus s^A = c_x^{( A , x )} = t^A.

Claim I: For every term r in the original language such that r does not have any variables other than v,

( r ( v / s ) )^A = ( r ( v / c_x ) )^{( A , x )} = ( r ( v / s ) )^A .

The proof of Claim I is by induction on the complexity of r.

Claim II: A ⊨ φ ( v / s ) iff ( A , x ) ⊨ φ ( v / c_x ) iff A ⊨ φ ( v / t ) .

The proof of Claim II is by induction on the complexity of φ. It uses Claim I.

Lemma 3.2 follows from Claim II since A was an arbitrary model of s ≈ t .

QED

Specialization Lemma 3.3

Let t be a closed term and φ be a formula whose only free variable is v.

Then

∀ v φ ⊨ φ ( v / t )

Proof of Lemma 3.3

Assume A ⊨ ∀ v φ .

By the ∀-clause in the definition of satisfaction, for every x ∈ | A |,

( A , x ) ⊨ φ ( v / c_x ).

Let x = t^A. Then

( A , x ) ⊨ c_x ≈ t.

By Lemma 3.2,

( A , x ) ⊨ φ ( v / c_x ) ↔ φ ( v / t )

By the ↔ clause in the definition of satisfaction and the fact that t is a term in the language without c_x, we have that

( A , x ) ⊨ φ ( v / c_x ) iff ( A , x ) ⊨ φ ( v / t ) iff A ⊨ φ ( v / t ).

Combining facts, we see that A ⊨ φ ( v / t ).

Lemma 3.3 follows because A was an arbitrary model of ∀ φ .

QED

Lemma 3.4

Let A be a structure such that, for every x ∈ | A |, there is a closed term t such that t^A = x.

Let φ be a formula with a single free variable v.

Then A ⊨ ∀ v φ iff for every closed term t, A ⊨ φ ( v / t ).

Proof of Lemma 3.4

The left-to-right direction is by Lemma 3.3.

Assume the left side fails. Then there exists x ∈ | A | such that

( A , x ) ⊭ φ ( v / c_x ) .

Pick a term t such that x = t^A. By Claim II in the sketch of Lemma 3.2,

A ⊭ φ ( v / t ) .

So the right side fails.

QED

Lemma 3.5

Let t be a closed term and φ be a formula whose only free variable is v.

Then

φ ( v / t ) ⊨ ( ∃ v φ )

Proof of Lemma 3.5

Apply the duality between universal and existential quantification to see:

∀ v ¬ φ ⊨ ¬ φ ( v / t ) iff ¬ ( ∃ v φ ) ⊨ ¬ φ ( v / t ) .

The left side is by Lemma 3.3 so the right side holds too.

Take a contrapositive to see:

¬ ( ∃ v φ ) ⊨ ¬ φ ( v / t ) iff φ ( v / t ) ⊨ ( ∃ v φ ) .

The right side is Lemma 3.5.

QED

Generalization Lemma 3.6

Let Σ be a theory and φ be a formula whose only free variable is v .

Let c be a constant symbol that does not occur in φ or any sentence of Σ .

Assume that Σ ⊨ φ ( v / c ) .

Then Σ ⊨ ∀ v φ .

Proof of Lemma 3.6

Let A be an arbitrary model of Σ in the language without c. Let x be an arbitrary member of | A |. Let c_x be a new constant symbol. Let B be the expansion of A which interprets both c_x and c as x. In other words,

c_x^B = x = c^B.

Then

B ⊨ c_x ≈ c.

By Lemma 3.2,

B ⊨ φ ( v / c_x ) ↔ φ ( v / c ) .

In other words,

B ⊨ φ ( v / c_x ) iff B ⊨ φ ( v / c ) .

At the start of the proof, we assumed A ⊨ Σ . Since B is an expansion of A,

B ⊨ Σ.

By the assumption in the statement of Lemma 3.6,

B ⊨ φ ( v / c ) .

Therefore,

B ⊨ φ ( v / c_x ) .

The sentence on the right does not involve the constant symbol c so

( A , x ) ⊨ φ ( v / c_x ).

Since x was arbitrary, by the ∀-clause in the definition of satisfaction, A ⊨ ∀ φ .

This proves Lemma 3.6.

QED

Lemma 3.7

Let Σ be a theory, φ be a formula and ψ be a sentence .

Let c be a constant symbol that does not occur in ψ , φ or any sentence of Σ .

Assume v is the only free variable of φ

Suppose that Σ ∪ { φ ( v / c ) } ⊨ ψ.

Then Σ ∪ { ∃ v φ } ⊨ ψ .

Sketch of Lemma 3.7

Combine two results that appear later in this chapter: Example A and Soundness Theorem 3.8.

QED

It is worth reflecting on how Lemmas 3.6 and 3.7 mirror the kind of reasoning we use in everyday mathematics.

Generalization corresponds to arguments of the form:

Make some background assumptions.

Consider an arbitrary x.

Prove that a certain property holds about x.

∴ The property holds about every x.

To see the correspondence with Lemma 3.6, take Σ to stand for the assumptions and φ to stand for the property. Unfortunately, at the beginning, c stands for x but, at the end, v stands for x. The fact that x plays two roles, first as a constant, then as a variable, might cause confusion. However, in everyday mathematics, we are so used to this that we can usually tell when x is playing which role. Note, too, that it is important not to introduce x until after we made our background assumptions because otherwise x would not be arbitrary. This corresponds to the requirement that c does not occur in any sentence in Σ.

Now let us analyze Lemma 3.7. The corresponding line of reasoning is more complicated but not uncommon.

Make some background assumptions.

Consider an arbitrary x.

Suppose that x has a certain property.

Draw a conclusion that does not mention x.

∴ The conclusion follows from the background assumptions and the mere existence of a witness to the property.

This time, take Σ to stand for the assumptions, φ for the property and ψ for the conclusion. Again, first c stands for x, then v does.

Return to these thoughts while reading the next section.

§ 3.6 A deduction system

We are now ready to define a deduction system for first-order logic. As in Chapter 2, it will consist of some deduction rules and justifications for other deductions. Keep in mind that the turnstile symbol ( ⊢ ) stands for a relation from theories to sentences only.

Deduction rules

Propositional rules

Include all the deduction rules from Chapter 2 as understood for first-order sentences and theories instead of propositional sentences and theories.

≈-in (Equivalence)

If s and t are closed terms, then:

{ } ⊢ t ≈ t

{ } ⊢ s ≈ t ↔ t ≈ s

{ } ⊢ ( r ≈ s ∧ s ≈ t ) → r ≈ t

≈-out (Substitution)

If s and t are closed terms and v is the only free variable of φ,

then { s ≈ t } ⊢ φ ( v / s ) ↔ φ ( v / t ) .

∀-in (Generalization)

If c is a constant symbol that does not occur in any sentence of Σ and does not occur in φ,

then Σ ⊢ φ ( v / c ) implies Σ ⊢ ∀ v φ .

∀-out (Specialization)

If t is a closed term and φ is a formula with at most one free variable, v,

then { ∀ v φ } ⊢ φ ( v / t ) .

(Duality)

∃-in

{ ∀ v ¬ φ } ⊢ ¬ ∃ v φ

∃-out

{ ¬ ∃ v φ } ⊢ ∀ v ¬ φ

Examples of deduction

Example A

Let Σ be a theory, φ be a formula and ψ be a sentence.

Let c be a constant symbol that does not occur in ψ, φ or any sentence in Σ.

Assume v is the only free variable of φ and Σ ∪ { φ ( v / c ) } ⊢ ψ.

From this assumption, skipping some obvious steps in the justification we see that:

Σ ∪ { φ ( v / c ) } ⊢ ψ (assumption)

Σ ⊢ φ ( v / c ) → ψ

Σ ⊢ ( ¬ ψ ) → ( ¬ φ ( v / c ) )

Σ ∪ { ¬ ψ } ⊢ ¬ φ ( v / c )

Σ ∪ { ¬ ψ } ⊢ ∀ v ¬ φ (Generalization)

Σ ∪ { ¬ ψ } ⊢ ¬ ∃ v φ (Duality)

Σ ⊢ ( ¬ ψ ) → ( ¬ ∃ v φ )

Σ ⊢ ( ∃ v φ ) → ψ

Σ ∪ { ∃ v φ } ⊢ ψ

Notice how this corresponds to Lemma 3.7. The only difference is that there we had ⊨ while here we have ⊢. In fact, what we showed above together with Soundness Theorem 3.8 below implies Lemma 3.7.

Example B

Let φ be a formula with exactly two free variables, u and v. Let c and d be distinct constants that do not occur in φ. Then we have the justification:

{ ∀ u ∀ v φ } ⊢ ∀ v φ ( u / c ) (Specialization)

{ ∀ v φ ( u / c ) } ⊢ φ ( u / c ) ( v / d ) (Specialization)

{ ∀ u ∀ v φ } ⊢ φ ( u / c ) ( v / d ) (R2)

{ ∀ u ∀ v φ } ⊢ φ ( v / d ) ( u / c ) (Exercise 3.3)

{ ∀ u ∀ v φ } ⊢ ∀ u φ ( v / d ) (Generalization)

{ ∀ u ∀ v φ } ⊢ ∀ v ∀ u φ (Generalization)

The final deduction is a useful rule on changing the order of universal quantifiers.

Example C

Here is another justification that skips a few obvious steps.

{ ∀ u ( ¬ φ ( v / d ) ) } ⊢ ¬ φ ( u / c ) ( v / d ) (Specialization and Exercise 3.3)

{ } ⊢ φ ( u / c ) ( v / d ) → ¬ ∀ u ¬ φ ( v / d ) (Propositional rules)

{ } ⊢ φ ( u / c ) ( v / d ) → ∃ u φ ( v / d ) (Duality and Propositional rules)

{ ∀ v φ ( u / c ) } ⊢ φ ( u / c ) ( v / d ) (Specialization)

{ ∀ v φ ( u / c ) } ⊢ ∃ u φ ( v / d ) (Modus ponens)

{ ∀ v φ ( u / c ) } ⊢ ∀ v ∃ u φ (Generalization)

{ ∃ u ∀ v φ } ⊢ ∀ v ∃ u φ (Example A)

The final deduction is a useful rule on changing the order of universal and existential quantifiers.

Clarification on justification and deduction

Note that our deduction rules apply to all first-order languages simultaneously. In particular, we do not require that justifications stay within a particular language. This seems reasonable given that in everyday mathematics, we often make statements of the form:

Consider an arbitrary c with a certain property φ.

In a justification of a deduction, this corresponds to adding a new constant symbol c and letting φ ( v / c ) be a member of the theory on the left side of the turnstile.

Examples A, B and C illustrate the power of having available outside constants.

Warning: Many other textbooks do not allow this sort of expansion of the language in justifications of deductions. (Maybe all but I do not know.) Instead, they develop a deduction calculus that involves formulas that might not be sentences and sets of formulas that might not be theories.

§ 3.7 Low hanging fruit

Here is a summary of the main things we know so far.

The Equivalence Rules are sound because equality is an equivalence relation. Clearly the Duality Rules are sound. From Theorems 2.3 and Lemmas 3.2, 3.3 and 3.5, it follows that the other deduction rules are also sound.

Soundness Theorem 3.8

Assume Σ ⊢ ψ . Then Σ ⊨ ψ .

Eventually, we will prove a converse to Theorem 3.8. But we will have to develop some new ideas first.

From Theorem 2.9 and our method for translating propositional validities to tautologies, we obtain the following result. Note that the proof uses only the propositional deduction rules.

Tautology Lemma 3.9

If ψ is a tautology, then { } ⊢ ψ .

The reasoning that gave us Lemma 2.4 easily adapts to give us the following result.

Finiteness of Deductions Lemma 3.10

Suppose Σ ⊢ φ.

Then there is a finite Δ ⊆ Σ such that Δ ⊢ φ.

The following definitions are just like in the propositional case.

Let Σ be a theory. As expected, we define:

Σ is consistent iff ⊥ is not a theorem of Σ.

Σ is complete iff for every sentence φ in the language of Σ, either Σ ⊢ φ or Σ ⊢ ¬ φ.

Σ is closed under deduction iff for every sentence φ in the language of Σ, if Σ ⊢ φ, then φ belongs to Σ.

Using only propositional rules, it is easy to see that, for every theory Σ the following are equivalent:

Σ is consistent, complete and closed under deduction.

For every sentence φ in the language of Σ, either φ belongs to Σ or ¬ φ belongs to Σ but not both.

The proof of Lemma 2.7 trivially adapts to give the following result in the case of countable first-order languages. It also adapts in the case of uncountable languages but for that you must understand certain concepts from set theory which are not prerequisites for the course. (Some indication of how will be given in lecture.)

Complete Extensions Lemma 3.11

Let Σ be a consistent theory.

Then there is a theory Γ that extends Σ such that Γ is consistent, complete and closed under deduction.

(The language of Γ is the same as that of Σ.)

§ 3.8 Gödel completeness theorem

We have adapted the ideas from Section 2 to first-order logic. Now we introduce other ideas, which are due to Leon Henkin.

Lemma 3.12

Let Σ be a consistent theory.

Let φ be a formula whose only free variable is u.

Let c be a constant symbol that does not occur in any sentence in Σ .

Let η be ( ∃ u φ ) → φ ( u / c ) .

Then Σ ∪ { η } is consistent.

Proof of Lemma 3.12

Otherwise, Σ ⊢ ¬ η.

So Σ ⊢ ∃ u φ and Σ ⊢ ¬ φ ( u / c ).

By the ∀-in rule, Σ ⊢ ∀ u ¬ φ ( u ).

By the ∃-in rule, Σ ⊢ ¬ ∃ u φ ( u ).

Thus Σ is inconsistent. Contradiction!

QED

Henkin Extensions Lemma 3.13

Let Σ be a consistent theory.

Then there is a consistent extension Σ^H of Σ in an expanded language with new constant symbols
such that, for every formula φ in the expanded language, if φ has a single free variable, say v,
then there is a constant symbol c such that the sentence

( ∃ v φ ) → φ ( v / c )

belongs to Σ^H .

Moreover, if the language of Σ is countable, then so is the language of Σ^H .

Proof of Lemma 3.13

By recursion on natural numbers n, define an increasing chain of theories

Σ = Σ₀ ⊆ Σ₀ ⊆ ⋅ ⋅ ⋅ ⊆ Σ_n ⊆ ⋅ ⋅ ⋅

For each formula φ in the language of Σ_n such that φ has a single free variable, say v, let c_φ be a distinct new constant and let η_φ be the sentence

( ∃ v φ ) → φ ( v / c_φ ).

Then let Σ_n+1 be Σ_n ∪ { η_φ ∣ φ as above }.

Let Σ^H be the union of all these theories.

For every formula φ in the language of Σ^H, there exists a natural number n such that φ is in the language of Σ_n . If φ has one free variable, then η_φ belongs to Σ_n+1 hence to Σ^H.

It remains to see that Σ^H is consistent. But just like in the proof of the Complete Extension Lemma 3.11, the union of an increasing chain of consistent theories is a consistent theory. (The proof does not depend on staying in the same language.)

QED

Model Existence Theorem 3.14

Assume Σ is a consistent theory. Then Σ has a model.

Proof of Theorem 3.14

Apply the Henkin Extension Lemma 3.13 to Σ to obtain Σ^H.

Apply the Complete Extension Lemma 3.11 to Σ^H to obtain Σ^*.

Note that Σ^H and Σ^* have the same language, so Σ^* also has the Henkin property.

Recap:

Σ^* is a complete and consistent theory.

Σ^* is an expansion of Σ to a language with more constant symbols.

For every sentence φ if Σ^* ⊢ φ , then φ belongs to Σ^* .

For every formula φ with a single free variable, say v, there is a constant symbol c such that the sentence
( ∃ u φ ) → φ ( u / c )
belongs to Σ^*.

Let S be the set of closed terms of the language of Σ^* .

For closed terms s and t in the language of Σ^*, define s ∼ t iff the sentence s ≈ t belongs to Σ^*.

It follows from the Equivalence Rules that ∼ is an equivalence relation on S.

Write [ t ] for the equivalence class { s ∈ S ∣ s ∼ t }.

Attempt to define a structure A in the language of Σ^* as follows.

| A | = { [ t ] ∣ t ∈ S } .

If c is a constant symbol, then c^A = [ c ] .

If F is an n-ary function symbol,
then F^A ( [ s₀ ] , … , [ s_n-1 ] ) = [ t ] iff the sentence F ( s₀ , … , s_n-1 ) ≈ t belongs to Σ^* .

If R is an n-ary function symbol,
then R^A ( [ t₀ ] , … , [ t_n-1 ] ) iff the sentence R ( t₀ , … , t_n-1 ) belongs to Σ^* .

Claim A: This is a legitimate definition of A and A is a structure in the language of Σ^*.

There are two things to prove.

First, assume that s'_i ∼ s_i for all i < n and t' ∼ t.

In other words, the sentences s'_i ≈ s_i for i < n and t' ≈ t all belong to Σ^*.

We use these n+1 axioms of Σ^* and the Substitution Rule (and a few more steps) we justify the deduction

Σ^* ⊢ F ( s₀ , … , s_n-1 ) ≈ t ↔ F ( s'₀ , … , s'_n-1 ) ≈ t' .

Since Σ^* is closed under deduction, the biconditional sentence on the right also belongs to Σ^* .

It follows that F^A ( [ s₀ ] , … , [ s_n-1 ] ) = [ t ] iff F^A ( [ s'₀ ] , … , [ s'_n-1 ] ) = [ t' ] .

So, our attempt to define F^A is legitimate and F^A is an n-ary function on | A |.

Second, assume that t'_i ∼ t_i for all i < n.

Then R^A ( [ s₀ ] , … , [ s_n-1 ] ) iff R^A ( [ s'₀ ] , … , [ s'_n-1 ] ) .

The proof is similar to what we did first.

So, our attempt to define R^A is legitimate and R^A is an n-ary relation on | A |.

Claim B: t^A = [ t ] for every t ∈ S.

The proof is by induction on the complexity of a closed term t.

Suppose that t is a constant symbol c. Then c^A = [ c ] by the definition of A.

Suppose t is F ( s₀ , … , s_n-1 ) . Then:

( F ( s₀ , … , s_n-1 ) )^A = F^A ( s₀^A , … , s_n-1^A ) by the definition of evaluation in a structure

= F^A ( [ s₀ ] , … , [ s_n-1 ] ) by the induction hypothesis

= [ F ( s₀ , … , s_n-1 ) ] by the definition of A since the sentence F ( s₀ , … , s_n-1 ) ≈ F ( s₀ , … , s_n-1 ) belongs to Σ^*.

Claim C: For every sentence φ, A ⊨ φ iff φ belongs to Σ^* .

The proof is by induction on the complexity of φ.

We may assume that φ is built up out of ¬ , ∧ and ∃ without ∨ , → , ↔ or ∀ .

First, we handle the two kinds of atomic sentences.

A ⊨ R ( t₀ , … , t_n-1 ) iff R^A ( t₀^A , … , t_n-1^A ) iff R^A ( [ t₀ ] , … , [ t_n-1 ] ) iff R ( t₀ , … , t_n-1 ) belongs to Σ^* .

A ⊨ s ≈ t iff s^A = t^A iff [ s ] = [ t ] iff s ∼ t iff s ≈ t belongs to Σ^* .

Second, we handle negations.

A ⊨ ¬ φ iff A ⊭ φ iff φ does not belong to Σ^* iff ¬ φ belongs to Σ^* .

Third, we handle conjunctions.

A ⊨ φ ∧ ψ iff ( A ⊨ φ and A ⊨ ψ ) iff both φ and ψ belong to Σ^* iff φ ∧ ψ belongs to Σ^* .

Finally, we handle existential sentences.

First, suppose that the sentence ∃ v φ belongs to Σ^* .

Recall that there is a constant symbol c such that the sentence ( ∃ v φ ) → φ ( v / c ) belongs to Σ^* .

By the →-out Modus Ponens Rule, Σ^* ⊢ φ ( v / c ) .

As Σ^* is closed under deduction, φ ( v / c ) belongs to Σ^* .

By the induction hypothesis, A ⊨ φ ( v / c ) .

By Lemma 3.5, A ⊨ ∃ v φ.

Finally, suppose that ∃ v φ does not belong to Σ^* .

Then, ¬ ∃ v φ belongs to Σ^* because Σ^* is complete and closed under deduction.

Using the ∃-out Duality Rule, we justify the deduction Σ^* ⊢ ∀ v ¬ φ

Since Σ^* is closed under deduction, the sentence ∀ v ¬ φ belongs to Σ^* .

Similarly, using the ∀-out Specialization Rule, we see that for every closed term t in the language of Σ^*, the sentence ¬ φ ( v / t ) belongs to Σ^* .

By the induction hypothesis, for every closed term t in the language of Σ^*, A ⊨ ¬ φ ( v / t ) .

By Claim B and Lemma 3.4, A ⊨ ∀ v ¬ φ

Hence A ⊭ ∃ v φ .

That completes the proof of Claim C, which implies that A is a model of Σ^*.

Since Σ^* contains Σ, the reduction of A to the language of Σ is a model of Σ .

Thus, Theorem 3.14 is proved.

QED

Completeness Theorem 3.15

Assume Σ ⊨ φ. Then Σ ⊢ φ.

Proof of Theorem 3.15

Use the Theorem 3.14 the way Theorem 2.8 is used in the proof of Theorem 2.9.

QED

Theorem 3.15 was originally proved by Kurt Gödel using a different approach.

Compactness Theorem 3.16

Let Σ be a theory. Assume that every finite subtheory of Σ has a model. Then Σ has a model.

Proof of Theorem 3.16

Use the Theorem 3.14 the way Theorem 2.8 is used in the proof of Theorem 2.10.

QED

When we say that a model is countable, we always mean that its universe is countable.

Lemma 3.17

Let Σ be a consistent theory with a countable language.

Then Σ has a countable model.

Sketch of Lemma 3.7

Define Σ^H and Σ^* as in the proof of the Model Existence Theorem 3.14.

Because Σ is countable, by the Henkin Extension Lemma 3.13, Σ^H is also countable.

Clearly Σ^* is countable too since it is a theory in the same language as Σ^H.

There are only countably many terms in a language with countably many symbols.

So S = { t ∣ t is a closed term in the language of Σ^* } is countable too.

In the proof of Theorem 3.14, we found a model A of Σ with | A | = { [ t ]_∼ ∣ t belongs to S } where ∼ was a certain equivalence relation on S.

Since S is countable, so is | A |.

QED