A Mathematical Introduction to Logic

The purpose of these comments is to explain, section by section, what I am trying to do in the book. My hope is that the commentary will add a helpful perspective for the reader.

In addition to being a place for this commentary, the website gives me the chance to post supplementary material. Users of the book might have varying views on this role of the website.

I welcome suggestions for how the commentary can be made more useful.

Jump to   Chapter Zero   Chapter One   Chapter Two   Chapter Three   Chapter Four.

Chapter Zero

Of chickens and eggs:   Students sometimes ask whether they should study set theory first or logic first. My answer is "Yes." That is, whichever subject one learns first will be helpful in learning the other.

Chapter Zero consists largely of standard notation that the reader has already been using. An exception is the notation "A;t" for the result of adding t to the set A. (I picked this up from Quine, as I recall.)

It is possible to get through the book avoiding uncountable cardinal numbers, at the cost of losing some theorems (see the footnote on page 153).

For more on set theory, see the reference on page 307 to my favorite book on the subject.

The end-of-proof symbol ( -| ) introduced on page 1 comes from the Chang and Keisler book. Its origin is linked to the turnstile symbol for provability on page 110. Not only is there some rationale to the symbol, but it looks much better than those ugly black boxes that are often used. (By the way, I had a fight with my Academic Press editor over this symbol.)


Chapter One

Section 1.0.

Some of the remarks on pages 14-15, about what the symbols are, might be better postponed. But pay attention to the induction principle on page 18; it will get a lot of use in the coming pages.

Section 1.1.

A note on typography: the book systematically uses both lightface and boldface parentheses, and it helps if you can see the difference. The boldface parentheses are official symbols in the object language. The lightface parentheses are used (as usual) in the meta-language.

The first edition of the book was set in Monotype, and the difference in the parentheses was easier to see. The second edition is generated by a LaTeX2e file, except for the figures, which are stripped in from the first edition. If you look closely, you can see the differences. For example, on page 17 compare the symbols in the tree (from the first edition) with the symbols in the line displayed above the tree.

Additional exercise:
6. If we know that a wff has a construction sequence of length 5, what is the largest possible length of the wff?

Section 1.2.

This section presents the core concepts of sentential logic. In particular, the concept of tautological implication defined on page 23 is central to Chapter One. (To digress, mathematicians use the word "imply" in a way that is different from the way others use that word. Part of learning mathematics is learning the language mathematicians use.)

In connection with conditional statements, as discussed at the bottom of page 21, here is a link to another discussion of the so-called material conditional (using the traditional horseshoe symbol for the conditional, instead of the arrow).

Page 23 says, "Now consider a set Sigma of wffs (thought of as hypotheses) and another wff tau (thought of as a possible conclusion)." Here by "hypotheses" is meant given assumptions, or axioms, or premisses. In statistics, the word "hypothesis" is used in a very different sense.

By the way, the word "premiss," used to indicate a given assumption, is often spelled "premise." But Bertrand Russell, Charles Peirce, and Alonzo Church all used the spelling "premiss." It has the advantage of distinguishing "premisses" from the legal term, as in "The landlord shall have access to the premises."

Whenever Sigma tautologically implies a formula tau, we also say that tau is a tautological consequence of Sigma.

On page 24, one finds the claim that the compactness theorem follows from Tychonoff's theorem. Here are more details on this point.

Page 26 makes a passing mention of P and NP. The book does not go into either concept, but as you may already have heard, there is a million dollar prize riding on the connection between them. You can collect the million dollars either by finding a polynomial-time decision procedure for the tautologies, or by showing that no such procedure exists.

One fact that is used -- without mention -- at various points is the monotonicity principle: If a set of premisses tautologically implies a conclusion, then automatically any larger set of premisses (including the first set as a subset) tautologically implies the same conclusion. (This is clear if you stop and think about it; the more premisses we have, the fewer truth assignments will satisfy all of them.) In other words, adding new premisses cannot result in a loss of consequences.

An important topic in artificial intelligence is "non-monotonic logic," where this principle is violated. From a set of premisses (Tweety is a bird), one might draw the reasonable conclusion that Tweety can fly. If an additional premiss (Tweety can swim) is added, then we might need to withdraw that conclusion, because penguins cannot fly.

Exercise 7: This is traditional. It can be made more challenging by adding side conditions. For example, suppose the words used in this country for "yes" and "no" are "plink" and "plonk," but unfortunately you don't know which is which. Now what one question should you ask? (It will be such a long question that the poor fellow will need to pull out pencil and paper to figure out his answer.) With one yes-or-no question, you can get one bit of information. You need to make sure it is the right bit.

Exercise 8 (substitution): For example, if phi is (A v B) then phi* is the formula (alpha_1 v alpha_2). The definition of phi* can be usefully recast as a definition by recursion:
First, A_i* is the formula alpha_i, for each i.
Secondly, for negation, (-phi)* is the formula (-phi*);
for disjunction, (phi v theta)* is the formula (phi* v theta*);
similarly for the other connectives.
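The recursion above can be sketched in code. This is a minimal illustration, not anything from the book: it assumes a hypothetical representation of wffs as nested tuples, where a sentence symbol A_i is ("A", i), negation is ("not", phi), and a binary connective is (op, phi, theta).

```python
# Sketch of the substitution phi* from Exercise 8, defined by recursion,
# using a hypothetical tuple representation of wffs.

def substitute(phi, alpha):
    """Replace each sentence symbol ("A", i) by the formula alpha[i]."""
    if phi[0] == "A":                      # base case: A_i* is alpha_i
        return alpha[phi[1]]
    if phi[0] == "not":                    # (-phi)* is (-phi*)
        return ("not", substitute(phi[1], alpha))
    op, left, right = phi                  # (phi op theta)* is (phi* op theta*)
    return (op, substitute(left, alpha), substitute(right, alpha))

# Example from above: if phi is (A_1 v A_2), then phi* is (alpha_1 v alpha_2).
alpha = {1: ("or", ("A", 3), ("A", 4)), 2: ("not", ("A", 5))}
phi = ("or", ("A", 1), ("A", 2))
print(substitute(phi, alpha))
# -> ('or', ('or', ('A', 3), ('A', 4)), ('not', ('A', 5)))
```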

Exercise 10(c): The wording of this problem, "Sigma = {sigma_0, sigma_1, ... }" exploits the fact that we have a countable language, and so the members of Sigma can be indexed by natural numbers. Nonetheless, a stronger fact is true: The result stated in 10(c) holds for languages of any cardinality. A reference for the uncountable case is Iégor Reznikoff, Tout ensemble de formules de la logique classique est équivalent à un ensemble indépendant, Comptes rendus Acad. Sci. Paris, vol. 260 (1965), pp. 2385-2388. [MR vol. 31 #2132.]

Additional exercise:
16. Show that for any formula we can find a tautologically equivalent formula in which negation (if it occurs at all) is applied only to sentence symbols.

Section 1.3.

As the footnote on page 29 indicates, this section can be postponed. There are two things going on in this section.

First, the section shows that formulas are constructed unambiguously. That is, for any given wff, there is only one possible tree like the tree on page 22. (In the first edition, this section was titled Unique Readability.)

Secondly, the section provides an algorithm that, applied to an expression, will determine whether or not it is a wff, and if so, will return its tree as output. Why do we want such an algorithm? We want it as a subroutine of an even better algorithm:

Theorem 13C. There is an algorithm that given a wff and given a truth assignment for its sentence symbols, will calculate the truth value assigned to that wff by that truth assignment.

Section 1.3 gives an algorithm for constructing the wff's tree. Then the procedure continues as on page 22, working from the bottom of the tree up. At the top of the tree, we get the answer.

Here by an algorithm is meant an "effective procedure" (see Section 1.7) -- a procedure we could write a program for -- that returns the right answer. Section 1.7 focuses on exactly such procedures. So there would be some merit to combining Section 1.3 and the relevant part of Section 1.7.

It is a curious phenomenon that formulas written in Polish notation are hard to read. Hard for humans, that is. Humans find Polish notation difficult, and find infixed notation (with parentheses) easy. For machines, it seems to be exactly the opposite. (I have been told that compilers typically change any given algebraic expression into Polish -- or reverse Polish -- notation, this being more amenable to automated processing than infixed notation with parentheses.) To digress, I have heard it argued that chess isn't really all that hard. It is, however, hard for humans. Machines can play chess, and Deep Blue once beat Kasparov. In contrast, Go is really hard, but humans are good at it. There is no Go program so far that can play any better than human rank amateurs.

Section 1.4.

Like the preceding section, we have here a footnote (on page 34) authorizing postponement. That is not to say that the section is unimportant.

The idea of constructing a class C by starting with an initial set B and then "closing" it under some operations is an idea that comes up not only in logic, but throughout mathematics. Whenever a class C is constructed in this way, there is a natural principle of "structural induction" (page 37) for proving statements about it. And if C is freely generated, then there is a useful method of defining functions on it by "structural recursion" (page 39).

On the one hand, the material in Section 1.4 is stuff that everyone in mathematics should know. On the other hand, typically a logic course does not have the luxury of spending much (if any) time on this section.

Section 1.5.

The material in this section is rather combinatorial in nature. The subject was investigated by the American logician Emil Post, in the years 1920-1940.

Digression:   It is not hard to see that the set {T, +, &} is complete. Here is a way to show that, not using Theorem 15B (and which yields an independent proof of Theorem 15B). Identify {F, T} with the two-element field {0, 1}. Then any polynomial function (in several variables) over this field can be realized by using these connectives (which correspond to 1, addition, and multiplication). But over a finite field, every function is a polynomial, by the Lagrange interpolation theorem.
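The correspondence in the digression can be checked concretely. Below is a small sketch (my own illustration, not from the book): identifying F, T with 0, 1 in the two-element field, the connectives T, + (exclusive or), and & become the constant 1, field addition, and field multiplication, and negation and disjunction are then polynomials.

```python
# Over the two-element field {0, 1}:
#   not x   =  1 + x          (arithmetic mod 2)
#   x or y  =  x + y + x*y
# so {T, +, &} suffices to realize these connectives.

def NOT(x):
    return (1 + x) % 2

def OR(x, y):
    return (x + y + x * y) % 2

for x in (0, 1):
    assert NOT(x) == 1 - x
    for y in (0, 1):
        assert OR(x, y) == max(x, y)
print("negation and disjunction realized as polynomials over {0, 1}")
```

By Lagrange interpolation, every function on a finite field is a polynomial, which is what makes the completeness argument go through.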

Exercise 7. This exercise gives a four-element set of connectives that is complete (part (a)) and independent (part (b)). Post showed that it is not possible to obtain a five-element (or larger) set with these properties.

Exercise 9. For a DNF formula, it is easy to see if it is satisfiable, but hard to see if it is a tautology. With a CNF formula, this situation is reversed.

In part (a), the formula has been written without parentheses. We have a convention (page 33) saying where the parentheses go. But it doesn't matter; the biconditional is associative. The formula does not say that A, B, and C all have the same truth value; cf. Exercise 1 on page 27.

Exercise 10. For example, if phi is (A & B), then resolution on A yields (T & B) v (_|_ & B), which is tautologically equivalent to B.

Suggestion for parts (a) and (b): Substitution (Exercise 8 in Section 1.2) can be useful here, because we are replacing the sentence symbol A by another formula (either T or _|_).

In part (c), saying a formula is satisfiable means that there exists some truth assignment that satisfies it. The point is that resolution on A produces an equally satisfiable A-free formula.

One possible way to test a formula for satisfiability is to apply resolution to one sentence symbol after another, eventually obtaining an equally satisfiable formula with no sentence symbols at all. Then we can easily test the latter for satisfiability. This procedure, like truth tables, still requires exponential time; that is, it does not escape the "2^n" difficulty described on page 26. The problem is that each time we perform resolution on a sentence symbol, the length of the formula can double.
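The satisfiability procedure just described can be sketched in a few lines. This is only an illustration under an assumed representation: a wff is taken to be a Python function from a truth assignment (a dict of sentence symbols) to True/False.

```python
# Sketch of the resolution-based satisfiability test described above.
# Resolving on A replaces phi by the A-free disjunction
#   phi[A := T] v phi[A := F],
# which is satisfiable iff phi is.

def resolve(phi, A):
    """Return the A-free formula  phi[A:=True] v phi[A:=False]."""
    return lambda v: phi({**v, A: True}) or phi({**v, A: False})

def satisfiable(phi, symbols):
    """Resolve on each symbol in turn; evaluate the symbol-free result."""
    for A in symbols:
        phi = resolve(phi, A)
    return phi({})

# (A & B) is satisfiable; (A & -A) is not.
print(satisfiable(lambda v: v["A"] and v["B"], ["A", "B"]))   # True
print(satisfiable(lambda v: v["A"] and not v["A"], ["A"]))    # False
```

Note that each call to `resolve` doubles the work of evaluating the formula, which is exactly the exponential difficulty mentioned above.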

Exercise 11. The interpolation theorem for first-order logic is not -- unfortunately -- covered in this book.

Additional exercise:
13. The ternary connective "conditioned disjunction" (written with brackets [ [ ] ]) is defined as follows:
[alpha[beta]gamma], which can be read "alpha or gamma, according as beta or not beta," is tautologically equivalent to
   ((beta --> alpha) & (-beta --> gamma)).
Show that {[ [ ] ], T, _|_} is a complete set of connectives.

Section 1.6.

The circuit viewpoint gives a somewhat more concrete way of looking at formulas. And Boolean circuits arise in certain ways of specifying the complexity of functions.

But for actually constructing computer hardware, the picture given in this section is old-fashioned. There was a day when a plug-in NAND gate was something you could hold in your hand. Integrated circuits have made such objects obsolete.

Exercises 2 and 3 touch on a method, due to Quine, for minimizing circuit size. I think the method no longer has any practical applications in the real world.

Section 1.7.

We want to claim (see Exercises 5, 6, and 7 in this section for one formulation) that whenever a set of premisses tautologically implies a conclusion, then there is a finite, utterly convincing argument -- a proof -- to that effect. If the set of premisses is infinite, that finite argument might be unable to make use of all of the premisses. Luckily, compactness comes to our rescue here. Only finitely many of the premisses are really needed. So there is hope that the finite, utterly convincing argument exists (as Exercise 7 -- or the truth-table method -- verifies). Moreover, that argument is "mechanically verifiable," provided we have a way to tell a premiss from a non-premiss.

On page 63, as the proof to Theorem 17C, we find: "The truth-table procedure (Section 1.2) meets the requirement." True, but one might ask for more detailed instructions for the procedure.

Step I is to run the parsing algorithm on each premiss in Sigma and on the alleged conclusion tau. This produces the tree (like the tree on page 22) for each of these formulas. The sentence symbols are leaves in the tree; say there are n of them.

Step II is to start listing all 2^n truth assignments on those sentence symbols. For each, we work our way up each tree that we constructed in step I. (Cf. "Theorem 13C" above, under Section 1.3.) If any premiss comes out false, we go on to the next truth assignment. If all the premisses are true and tau is false, we return the answer "No" and halt.

Finally, if after checking all 2^n truth assignments, we have not found one satisfying all the premisses but not the conclusion, then we return the answer "Yes" and halt.

In other words, the truth-table procedure meets the requirement.
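Steps I and II can be sketched as a short program. As before, this is only an illustration under an assumed representation: each wff is a Python function from a truth-assignment dict to True/False (standing in for the evaluation up the tree of "Theorem 13C").

```python
# Sketch of the truth-table procedure: try all 2^n truth assignments;
# answer "No" on the first assignment satisfying every premiss but not
# tau, and "Yes" if no such assignment exists.

from itertools import product

def implies(premisses, tau, symbols):
    """Return True iff every assignment satisfying all premisses satisfies tau."""
    for values in product([True, False], repeat=len(symbols)):
        v = dict(zip(symbols, values))
        if all(p(v) for p in premisses) and not tau(v):
            return False          # counterexample found: answer "No"
    return True                   # all 2^n assignments checked: answer "Yes"

# {A, A -> B} tautologically implies B (modus ponens):
print(implies([lambda v: v["A"], lambda v: (not v["A"]) or v["B"]],
              lambda v: v["B"], ["A", "B"]))    # True
```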

This section talks about, for an arbitrary set Sigma of formulas, the set -- call it Cn Sigma -- of all tautological consequences of Sigma. That is, a formula phi is in Cn Sigma iff Sigma tautologically implies phi.

Thus Cn Sigma is a set of formulas. It clearly includes Sigma as a subset. And in some cases it might even be the same as Sigma (that is the smallest it could be). The largest Cn Sigma could be is the set of all formulas (this happens iff Sigma is unsatisfiable). As a special case, where 0 is the empty set, we can say that Cn 0 is the set of tautologies. Corollary 17D tells us that Cn 0 is decidable.

Theorem 17G tells us that whenever we have a decidable (or at least semidecidable) set Sigma, then Cn Sigma is sure to be semidecidable. For example, suppose Sigma is the set of all sentence symbols. What can we say about this set? It is satisfiable (but only by one truth assignment). It is decidable. So Theorem 17G tells us that Cn Sigma is semidecidable. But in this particular case, Cn Sigma is even decidable.

Exercise 2. There is also a way to do the half of this exercise we need (namely that whenever phi is in Delta then v satisfies phi) without using induction on phi. Look at the finite subset of Delta consisting of
(i) phi,
(ii) each sentence symbol in phi, if that symbol is in Delta, and
(iii) the negation of each sentence symbol in phi, if that negation is in Delta.
That finite subset of Delta must be satisfiable, say by the truth assignment u. But on phi's sentence symbols, u must agree with v. So v satisfies phi.

This way of doing Exercise 2 was pointed out to me by Vladimir Lifschitz. Although it is simpler than the induction proof, the induction proof will nevertheless be needed for the completeness theorem in Section 2.5. So doing Exercise 2 by induction now will prepare for that. (Another point: If we use this shortcut for Exercise 2, then the construction of Delta can be simplified: Instead of going through all formulas, it is enough to go through all sentence symbols.)

Exercises 5, 6, and 7 are here to make the following point (as mentioned above): Whenever a (possibly infinite) decidable set Sigma of premisses tautologically implies a formula tau, then there is a finite, verifiable, convincing piece of evidence -- that is, a proof -- that tau tautologically follows from Sigma. (Truth tables also can be viewed as making this point.)

Exercise 12. Parts (a), (b), and (c) apply to subsets of size one, two, and three, respectively. The sequence could be extended: four, five, six, ... . And the solution to any part is automatically a solution to all earlier parts.
What the compactness theorem says is that there is no one solution that covers all the parts in this extended version.

Additional exercise:
13. Assume that A is an effectively enumerable set of expressions, and moreover that we have an effective procedure that lists exactly the members of A in such a way that each expression on the list is longer than the expressions occurring earlier on the list. Explain why A is decidable.

Jump to   Chapter Zero   Chapter One   Chapter Three   Chapter Four.

Chapter Two

Section 2.0.

First-order logic is what real logic is. There is indeed something called "second-order logic" (Chapter 4 is devoted to that topic). But as I said, real logic is first-order logic.

Section 2.1.

There are two topics in this section. One is the purely syntactical matter of saying which strings of symbols qualify as genuine formulas. And the other topic is translation between two languages: the formal first-order language and English. The business of translation will continue in the next section.

At the top of page 76 there is a passing reference to the "axiom of regularity for set theory" -- this is also called the "foundation axiom."

Section 2.2.

This section is the longest one in the book, but it presents core concepts of first-order logic. In particular, the concept of logical implication (page 88), which is analogous to tautological implication in Chapter 1, is defined here.

The concept of logical implication is sometimes defined only for sentences (as in Corollary 22C on page 88), but it can be defined also for formulas with free variables. In fact this can be done in two inequivalent ways. If the deductive calculus in Section 2.4 had an unrestricted universal generalization rule (as some calculi do), then we would need the other definition of logical implication for formulas.

To expand on that point: There are two ways in which one might read (or interpret) a formula phi(x,y) with free variables. This book opts for the "particular" reading: phi(x,y) asserts that phi is true of whatever particular objects (unknown to us) that x and y might name. That is, the free variables x and y function like constant symbols. The other approach is the "universal" reading: the commutative law of addition,
    x + y = y + x
is understood as asserting an equation that holds of every x and y. Either approach is possible; it is not a good idea to mix them. And of course when it comes to sentences, it no longer matters which approach to free variables has been used.

A contrast can be drawn between Section 2.2 (about semantics) and Section 2.4 (about formal deductions, a syntactical topic). That is, Section 2.2 studies the interplay between formulas and structures, and deals with such concepts as truth in a structure. The theorems here are of course proved in the meta-language.

By contrast, Section 2.4 deals with deductions, which are "proofs" within the object language. Many logic books start with this topic.

The role of Section 2.5 is to bring the semantical approach of Section 2.2 and the syntactical approach of Section 2.4 together.

Digression: Pages 83-84 define truth and satisfaction in the "natural" way. The universe of a structure can be any non-empty set. But it cannot be a proper class. Why not? One answer is that the definition -- by recursion -- as described in the middle of page 84 simply is not possible for proper classes. Another answer is related to Tarski's theorem (in Section 3.5). Suppose we are doing logic within set theory (ZFC, to be technical). And take the language for set theory. Try to define "truth in the universe," that is, take the universe to be the "real" universe of all sets. The critical clause says that
   forall x phi is satisfied by s iff for every set d, phi is satisfied by s(x|d).
If this definition worked, you could define truth. But if you can define truth, then you can define falsity. And with one more self-reference trick, you can make a sentence that says, "this sentence is false." Now you are in big trouble.

Another note on the typography: The book uses two different equal signs, boldface (the symbol in the object language) and lightface (the symbol in the meta-language). It is not easy to tell them apart at first, but see page 83, line 4 from below. The first edition used a wavy equal sign for the object-language symbol. It looked a bit odd, but at least it was easy to distinguish from the meta-language symbol.

In definability results, as on page 91, the positive results (showing something is definable) are usually easy, and the negative results are usually hard. There are exceptions to this rule. Julia Robinson got a Ph.D. largely for showing that the integers are definable in the rational field. (For more about this result, see the paper by Flath and Wagon, American Mathematical Monthly, vol. 98 (1991), pp. 812-823.) As page 98 points out, we can sometimes use an automorphism argument to show non-definability. But sometimes we can't. Some structures have no non-trivial automorphisms; see the discussion below of Exercise 12(c).

The "EC_Delta" notation (page 92) follows the pattern of the "G_delta" notation in analysis. (Upper-case Delta means arbitrary intersection, lower-case delta means countable intersection.) An EC_Delta class is an intersection of ECs. In analysis one uses similarly Sigma and sigma for arbitrary and countable union, respectively, as in "F_sigma-delta." But unlike the situation in analysis, here EC_DeltaSigma = EC_SigmaDelta; these are the classes that are closed under elementary equivalence.

Many readers will be familiar with the concepts of isomorphism and homomorphism from algebra courses. In Section 2.2, we make the connection to logic.

On page 94, the concept of "weak homomorphism" is what you get by replacing "iff" in clause (a) by "==>." Think of the example of partially ordered sets; the weak homomorphisms are order-preserving (in one direction). (Here is another example: A finite graph G is n-colorable if and only if there is a weak homomorphism from G into the complete n-element graph K_n.)

Here is another example of two isomorphic structures. Assume that the parameters are the quantifier symbol, a two-place function symbol o (where we write "x o y" for the value of the function at (x, y)), and a constant symbol. For the first structure for this language, take the additive group of real numbers: (R; +, 0), consisting of the real numbers with o interpreted as addition and the constant interpreted as 0. For the second structure for this language, take ((0, +infinity); x, 1), consisting of the positive real numbers, with o interpreted as multiplication and the constant interpreted as 1. These two structures are isomorphic. The function h(x) = e^x is an isomorphism from the first onto the second.
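The homomorphism property of h(x) = e^x can be spot-checked numerically. This small sketch (my own illustration) verifies that h carries addition to multiplication and the constant 0 to the constant 1, up to floating-point rounding.

```python
# Numerical spot-check that h(x) = e^x is a homomorphism from
# (R; +, 0) to ((0, +infinity); x, 1).

import math
import random

h = math.exp
assert h(0) == 1.0                       # the constants correspond
for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    # h(x + y) = h(x) * h(y), up to rounding error
    assert math.isclose(h(x + y), h(x) * h(y), rel_tol=1e-9)
print("h(x) = e^x respects the operations")
```

Of course, a spot-check is not a proof that h is an isomorphism (for that one also needs h to be a bijection onto the positive reals), but it makes the claim concrete.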

On page 96, the displayed line in part (b) of the Homomorphism Theorem can be (usefully) rewritten using the notation of page 86: Where the free variables of alpha are among v1, ... , vk, the condition becomes
|= (sub-A) alpha[[a1, ... , ak]]   iff   |= (sub-B) alpha[[h(a1), ... , h(ak)]]
for every a1, ... ak.

In part (a) of the Homomorphism Theorem (and in Exercise 13), the situation is this: We have two functions,
  s : V --> |A|, and h o s : V --> |B|
(where V is the set of variables). For a term t, we can either evaluate it in A, using the first function, or we can evaluate it in B, using the second function. Part (a) of the theorem states that these two alternatives match up; that is,
h(the first option) = the second option.

An interesting "list of exercises" together with the "theory behind the exercises" can be found (in pdf format) on a Stanford webpage. Also there are movies you can watch, if you have a good connection and a soundcard.

Exercise 2. To make this problem more challenging, use in each case finite structures of the smallest possible size.

Exercise 12(c). The unproved converse follows from Tarski's theorem discussed below under Section 2.6: the theory of real-closed ordered fields admits elimination of quantifiers. It follows that any relation definable in the real ordered field has a quantifier-free definition. The real field is rigid (i.e., it has no non-trivial automorphisms), so it is not possible to use an automorphism argument to show non-definability here. (In contrast, the complex field has uncountably many -- in fact 2^c many -- automorphisms, but only the two continuous ones -- the identity and the complex conjugate -- are well known.)

Exercise 13. See the comments above on part (a) of the Homomorphism Theorem.

Exercise 15. The same prime-switching argument shows that in N with multiplication, the only definable elements are 0 and 1. By the way, Skolem showed that the theory of this structure is decidable.

Exercise 17(a). For example, consider the finite structure pictured on page 82 (and page 85). We can characterize this structure up to isomorphism by a sentence that says there are four objects in the universe, all different, that exhaust the universe and are related in certain ways, and only those ways.

Exercise 18(a). For a particular example, consider the structure (N; <) consisting of the natural numbers with their usual order and the structure (R; <) consisting of the real numbers with their usual order. Then (N; <) is a substructure of (R; <).

Now take the existential formula "exists t (xPt & tPy)." If this is satisfied in (N; <) with s then we know that there is a natural number greater than s(x) and less than s(y). In that case, the formula is certainly satisfied in the larger structure (R; <) with s -- simply use the same number.

But the argument of the preceding paragraph goes only one way. There is indeed a real number between 2 and 3, but no natural number. In other words, the formula "exists t (xPt & tPy)" is satisfied in (R; <) by a function s having s(x) = 2 and s(y) = 3. Not so with (N; <).

Additional exercise:
29. Assume the only parameters in the language are the quantifier symbol and a two-place predicate symbol P. And assume the language does not have the equality symbol.
(a) Find a sentence that is satisfiable, but has no models of size 3 or less.
(b) Does part (a) generalize to numbers other than 3?
(c) In part (a), can we replace "or less" by "or more" and still find a sentence?

Section 2.3.

The parsing algorithm for terms is actually applicable to Polish notation in general, and shows that Polish notation is unambiguous. The algorithm builds the tree from the top down, which means it scans the term from left to right. The opposite approach is to build the tree from the bottom up, which involves scanning the term from right to left. Both approaches work.

Exercise 2. Jáskowski's algorithm is a right-to-left algorithm. It gives a linear-time (i.e., fast) test for being a well-formed expression in Polish notation.

Section 2.4.

The deductive calculus presented here is called a "Hilbert-style" system, in contrast to what are called "natural deduction" or "Gentzen-style" systems.

Neither a Hilbert-style system nor a Gentzen-style system is well suited for computer implementation. But in recent years, substantial progress has been made in the field of "automated deduction," that is, computer programs that find deductions (in a suitable system) of given formulas. Here is a link to information about one such program, Otter, which is readily available, and is well documented.

Exercise 2: We will argue in the next section (see p. 143) that the set Lambda of logical axioms is a decidable set of formulas. This exercise is an experiment in carrying out the decision procedure.

Exercise 7(a): According to page 89 (Example 7), "This is a strange -- but valid -- sentence." It has the format
   exists x ... --> ...
which page 73 says never to do.

Additional exercise, to show that our deductive calculus is consistent (even before we have the soundness theorem of Section 2.5):
18. For each formula phi, define phi* to be the result of erasing all quantifiers and terms (leaving the predicate symbols and connectives), and replacing = by T. For example, if phi is
  (forall x Px --> (Pgy v x = y))
then phi* is
  (P --> (P v T)).
(a) Show that for each logical axiom alpha in Lambda, the formula alpha* is a tautology of sentential logic.
(b) Show that for any formula phi that is deducible (from the empty set), phi* is a tautology of sentential logic.
(c) Show that it is not possible for both a formula and its negation to be deducible (from the empty set).

Section 2.5.

There is a sense in which the soundness theorem is inevitable: No sensible author would knowingly write a logic book with an unsound deductive system! That is, if our deductive system had failed to satisfy the soundness theorem (by allowing deduction of formulas that were not logical consequences of the premisses) then we would change Section 2.4 to block such deductions.

By contrast, the completeness theorem enjoys no corresponding inevitability. As discussed back on page 109, if either the compactness theorem or the enumerability theorem had been false, then it would have been impossible to make a complete deductive system of the type desired.

The completeness theorem is the central result in Chapter 2. It tells us non-trivial information (i) about the proof concept in mathematics, (ii) about compactness in first-order logic, and (iii) about effective enumerability (i.e., semidecidability) in first-order logic.

First, about the proof concept: Suppose we start (perhaps on the first day of an algebra class) with a given set Sigma of sentences (referred to as "axioms"). For example, Sigma might be the axioms for group theory, or field theory, or such. And then the course proceeds to study Mod Sigma, the class of all models of the axioms (e.g. all groups, or all fields, or whatever).

Further suppose that tau is a sentence that is true in every model of Sigma (e.g. true in every group, or in every field, etc.). Could we ever hope to know that fact? Can we, in some sense, prove tau from the axioms?

YES! The fact that tau is true in every model of Sigma tells us that Sigma logically implies tau. Hence, by the completeness theorem, there is a deduction of tau from Sigma. That is, "modus ponens + Lambda" is a sufficient set of methods to prove tau from Sigma. Another sufficient list of methods can be roughly described as
   "modus ponens + generalization + tautologies + axiom group 2 + equality."
Having more methods available (RAA or EI) might be nice, but we have an upper bound on what is required: no more than modus ponens + Lambda is needed.

Secondly, the compactness theorem is an easy consequence (as on page 142) of the completeness theorem. And the compactness theorem will have a number of interesting applications (e.g. Exercise 8 in this section, and the construction of non-standard models in the next section).

And thirdly, the completeness theorem tells us (pp. 142-144) that for a decidable set of axioms (or premisses) in a finite language, the set of provable formulas is semidecidable. This point will play a significant role in Chapter 3. In Chapter 3 we want to compare what is provable from axioms with what is true in a structure. In particular, we will find that the set of sentences of number theory true in the standard structure (N; 0, S, +, x) is not semidecidable. This will rule out any possibility of finding a decidable set of axioms that will give us all of number theory -- the Gödel incompleteness theorem.
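The semidecidability here can be pictured as a brute-force proof search: systematically generate candidate deductions and halt if one proves the goal. Here is a minimal sketch, where `is_proof_of` is a hypothetical stand-in for the (decidable) check that a number encodes a deduction of the goal from the axioms:

```python
from itertools import count

def semidecide_provable(is_proof_of, goal, max_steps=None):
    """Search candidate proof codes 0, 1, 2, ... in order, halting with
    True as soon as one encodes a deduction of `goal`.  If no proof
    exists the search runs forever -- which is exactly what
    semidecidability allows.  `max_steps` exists only for demonstration."""
    candidates = count() if max_steps is None else range(max_steps)
    for p in candidates:
        if is_proof_of(p, goal):
            return True
    return None  # undetermined within the bound

# Toy stand-in for the proof-checking predicate: p "proves" n iff p == n*n.
toy_check = lambda p, n: p == n * n
assert semidecide_provable(toy_check, 7, max_steps=100) is True
```

Note that a negative answer never arrives: if the goal is unprovable, the search simply never halts. That is the gap between semidecidable and decidable.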

There is an interesting 1996 article by Leon Henkin, "The discovery of my completeness proofs," available as a postscript file.

Exercise 8: This problem shows that we cannot express "The graph is connected" in this language. (We can certainly say that any model is a graph: P is symmetric and irreflexive. It is the connectivity that we cannot express.) There is a more interesting fact: We cannot express connectivity even when we restrict attention to finite graphs. But the compactness theorem is useless for working with finite structures. Instead, there is a method called "Ehrenfeucht-Fraïssé games." Unfortunately, this technique is not developed in this book.

Additional exercise:
10. Assume the language (with equality) has just the parameters forall, P, c, and d, where P is a two-place predicate symbol and c and d are constant symbols. Show that there is no set Sigma of sentences with the property that a structure A is a model of Sigma iff the pair (c^A, d^A) belongs to the transitive closure of the relation P^A. Suggestion: Consider the set consisting of Sigma together with sentences saying that there is no short P-path from c to d.

Section 2.6.

The concept of a complete theory is defined on page 156. There is the issue of what to do about the inconsistent theory (the set of all sentences) -- Should that be called complete? I find it more natural to take the "yes" choice. But some books (e.g. Chang and Keisler) opt for the other choice.

Page 159 mentions briefly Tarski's theorem that the theory of the real field is decidable. This result has an interesting history. Tarski's paper on this topic was accepted by a French journal, and was to be published (in Paris) in 1940. Of course, 1940 proved to be a chaotic year for Paris, because of the German invasion. The journal issue that was to contain Tarski's paper never appeared; all that remained were the page proofs in Tarski's possession.

The first actual publication of the result was in a 1948 booklet from the Rand Corporation, in Santa Monica, California, as part of "U. S. Air Force Project Rand R-109." (The booklet was reprinted in 1951 by the University of California Press.) One might ask how the military got mixed up in this.

The theorem implies that all of high school algebra and geometry is included in a certain decidable theory. In other words, we have a decidable theory that includes all of mathematics up to the tenth-grade level. Might it not therefore be useful to implement this decision procedure by means of a computer program?

The initial attempts in this direction were disappointing; the program was too slow to decide anything beyond utter trivialities. There was at the time a feeling that if computers were 100 times faster, then the procedure would be useful.

A year or two later, computers were a hundred times faster. And not long after that, their speed improved by another factor of a hundred. The decision program still gave nothing of interest.

Finally, when the subject of "computational complexity" was developed, the situation became clearer. A 1974 paper by Fischer and Rabin showed that the time required for the decision procedure grows exponentially with the length of the formula. That is bad news for anyone planning to run the procedure. 2^{80} microseconds is a long time; cf. page 26.
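To see just how long 2^{80} microseconds is, here is the arithmetic (the figures below are just this computation, not anything from the book):

```python
# How long is 2^80 microseconds?  A sanity check of why an
# exponential-time decision procedure is hopeless in practice.
microseconds = 2 ** 80
seconds = microseconds / 10 ** 6
years = seconds / (60 * 60 * 24 * 365.25)
print(f"2^80 microseconds is roughly {years:.1e} years")
# On the order of 10^10 years -- longer than the age of the universe.
```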

A modern development of Tarski's theorem can be extracted from Shoenfield's book, pages 87-88. One shows that the theory of "real-closed ordered fields" admits elimination of quantifiers (see p. 190) and is complete.

Exercise 1: For part (a), in any model of the negation, P must be irreflexive and transitive. In part (b), it might be helpful to consider, for each element of the universe, the set of points Q-related to it.

Exercise 2: This exercise is related to Exercise 8 in Section 2.2 (page 100).

Exercise 6: If a set of natural numbers can be effectively enumerated in increasing order, then the set is decidable. Something similar holds for sets of sentences.
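The idea behind the first sentence of that hint can be made concrete (this sketch addresses only the numerical fact, not the exercise itself): to decide whether n belongs to the set, run the increasing enumeration until it reaches or passes n.

```python
def member(n, increasing_enum):
    """Decide membership in a set given by a strictly increasing
    enumeration: once the enumeration passes n without producing it,
    n can never appear later, so we may safely answer False."""
    for x in increasing_enum:
        if x == n:
            return True
        if x > n:
            return False
    return False  # finite toy enumerations only

# Toy example: the perfect squares, enumerated in increasing order.
assert member(49, (k * k for k in range(100))) is True
assert member(50, (k * k for k in range(100))) is False
```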

Exercise 9: An example of a set with the finite model property is the set of E_2 sentences in a language without function symbols, as in Exercise 10.

Additional exercise:
11.   Assume that T is a finitely axiomatizable theory (in a finite language). Assume that every sentence not in T fails in some finite model of T. Show that the theory T is decidable.

Additional exercise:
12.   Assume that Sigma is a set of sentences in a finite language (with equality), and that all models of Sigma are finite. Show that Sigma has only finitely many models, up to isomorphism.

Section 2.7.

The purpose of this section is to develop methods for comparing two theories, even when those theories are in different languages. As the footnote on page 164 indicates, these methods will be used in Section 3.7 (see page 270) to compare a system of number theory to a system of set theory.

Section 2.8.

This section presents an initial exposure to the concepts of non-standard analysis, with infinitesimals and such. Keisler wrote (and used) a freshman calculus book, based on non-standard analysis.

But Section 2.8 gives only an initial exposure to the ideas. For serious work in analysis, a more expansive framework is needed than the one given here.

Jump to   Chapter Zero   Chapter One   Chapter Two   Chapter Four.

Chapter Three

Section 3.0.

A major goal of Chapter 3 is to prove that the theory of the natural numbers with addition and multiplication (for short, call this the theory of "true arithmetic") is undecidable. It is convenient to add additional parameters for zero, successor, ordering, and exponentiation. But these are all definable from addition and multiplication, so we could get along without them if we had to.

The undecidability of true arithmetic has the following three immediate consequences.

1. The theory of true arithmetic is not effectively enumerable. This is because the theory is complete; if it were effectively enumerable, then it would be decidable.

2. The theory of true arithmetic is not axiomatizable. This is because axiomatizable theories are effectively enumerable.

3. For any decidable set of axioms, all of which are true in the natural numbers with addition and multiplication (and who would want false axioms?), there is a true sentence that is not deducible from those axioms (the Gödel incompleteness theorem). Of course the negation of that sentence is not deducible from the axioms, either, because from true axioms one gets only true consequences.

This holds because the set of deducible sentences is an effectively enumerable subset of the theory of true arithmetic. So it must be a proper subset.

Why is the theory of true arithmetic undecidable? One answer is that concepts of computability are definable in arithmetic. In particular, the "halting problem" in computability theory can be reduced (in a way that can be made precise) to the theory of true arithmetic. So a decision procedure for true arithmetic would yield a decision procedure for the halting problem.

In fact, even more is true. There is a simple finitely axiomatizable subtheory that is strong enough to represent (in a sense) concepts of computability theory. (We will give this subtheory in Section 3.3.) It will then follow that even this subtheory, and all of its true extensions, are undecidable.

Section 3.1.

By way of contrast to the theory of the natural numbers with addition and multiplication, we find here that the theory of the natural numbers with just zero and successor is fairly simple. That is, the theory is decidable, and we are able to get a good picture of what its models look like. (The downside is that the theory is boring; there are no interesting sentences in the theory.)

This section also presents the method of quantifier elimination. For those theories that admit elimination of quantifiers (and most theories do not), the method often yields an informative analysis.

Section 3.2.

As in the preceding section, the theory of the natural numbers with addition is decidable, and we are able to get a good picture of what its models look like.

On the one hand, we have a more interesting theory here than we had in Section 3.1. On the other hand, the proofs here are more involved.

Although the theory of the natural numbers with 0, S, <, and + is decidable, as Section 3.2 shows, we will see later that once we add multiplication to this list, the theory becomes undecidable. So here is the boundary. Why here? One answer is that with addition and multiplication one can code long strings of numbers by a single number (and then decode that single number to retrieve the originals). That enables one to do long complicated things, one little step at a time. (See Exercise 1 on page 281.)
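The coding idea can be illustrated concretely. Gödel's actual device (the beta function) uses only addition, multiplication, and the Chinese remainder theorem; the prime-power coding sketched below is less economical but easier to see:

```python
def nth_primes(k):
    """Return the first k primes, by trial division against earlier primes."""
    primes, cand = [], 2
    while len(primes) < k:
        if all(cand % p for p in primes):
            primes.append(cand)
        cand += 1
    return primes

def encode(seq):
    """Code a finite sequence of naturals as a single number: raise the
    i-th prime to the power (seq[i] + 1); the +1 makes zeros recoverable."""
    code = 1
    for p, a in zip(nth_primes(len(seq)), seq):
        code *= p ** (a + 1)
    return code

def decode(code, i):
    """Recover the i-th term: the exponent of the i-th prime, minus 1."""
    p = nth_primes(i + 1)[i]
    e = 0
    while code % p == 0:
        code //= p
        e += 1
    return e - 1

c = encode([5, 0, 3])   # 2^6 * 3^1 * 5^4 = 120000
assert [decode(c, i) for i in range(3)] == [5, 0, 3]
```

Exponentiation is not available from + and x directly, which is why Gödel needed the subtler beta-function trick; but the encode/decode picture is the same.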

Historical note: To quote from the Feferman biography of Tarski: "As an `exercise,' Tarski suggested to one of the students, Mojzesz Presburger, that he find an elimination-of-quantifiers procedure for the additive theory of the natural numbers.... Presburger succeeded in arriving at this in the spring of 1928 by a suitable augmentation of the language.... This result served as Presburger's thesis for a master's degree; published two years later, it was his sole paper in logic.... He left the university soon after to work in the Polish insurance industry throughout the 1930s. A sad coda to this story is that Presburger, a Jew, perished in the Holocaust around 1943."

Section 3.3.

I said above that a major goal of Chapter 3 is to prove that the theory of the natural numbers with addition and multiplication (for short, call this the theory of "true arithmetic") is undecidable. But this needs clarification. As I said back on page 62, we can show that a set is decidable without having a formal definition of the concept of decidability. But now we want the opposite direction: showing undecidability. Section 3.3 formalizes (page 207) the concept of a recursive set. Then the undecidability of the theory of true arithmetic factors into two parts: For one, we will see that the theory of true arithmetic is not recursive. (Or more precisely, the set of Gödel numbers of the sentences of true arithmetic is not a recursive set of natural numbers.) And secondly, this section introduces Church's Thesis (also called the Church-Turing Thesis): that recursiveness is the correct formalization of the decidability concept.

There are a lot of other things going on in this section; we need to build up some machinery for showing that various relations are recursive. But the goal remains the one stated: showing that the theory of true arithmetic is undecidable.

About the choice of the word "recursive": As mentioned on page 208, there are many equivalent ways to define the concept of recursiveness. The one we adopt officially (page 207) uses representability in consistent finitely axiomatizable theories. There is nothing here that involves "recursion" as that word is normally used. Of the other equivalent definitions of recursiveness, some involve recursion (in the sense of procedures that call themselves) and some do not.

In his 1931 paper, Gödel (writing in German) used the term "rekursiv" for a class of functions now called the primitive recursive functions. A key feature of his definition allowed defining a function at n + 1 in terms of its value at n -- i.e., recursion. So the name made sense.

Shortly thereafter, Kleene and others used the term "general recursive" for an extended class of functions. In time this name became simply "recursive." And the recursive relations are those for which the characteristic function is a recursive function. This is how, historically, the name "recursive" arose.

But because some of the equivalent definitions of recursiveness don't really involve recursion, it might be considered a historical accident that the name "recursive" was applied. Recently there has been a movement to change the name from "recursive" to "computable." And the study of recursive functions has gone from being called "recursive function theory" to being called "recursion theory" and now to being (sometimes) called "computability theory." Here is a paper by Bob Soare (in postscript format) dealing with what the terminology has been and should be.

Whatever the choice of terminology, it is necessary to have separate adjectives for the informal concept of what can be calculated by an intuitively effective procedure, and for the formal concept that is here called recursiveness. (The word "calculable" is sometimes used for the informal concept, thereby freeing "computable" to be reassigned to the formal concept.)

By the way, don't you think that "Primitive Recursion" would be a good name for a rock band? As a name, it ranks up there with "Twas Brillig and the Slithey Toves." There already is a musical group called Axiom of Choice; I have one of their CDs.

What can we say about the non-standard models of Th(N; 0, S, <, +, x, E)? Back on pages 188-189, we had a complete picture of the non-standard models of Th(N; 0, S). On pages 195 and 197, we had a partial picture of the non-standard models of Th(N; 0, S, <, +). All we can say here is that the non-standard models of Th(N; 0, S, <, +, x, E) look like that, together with some multiplication and exponentiation operations.

Like parentheses, two sorts of angle brackets are used: lightface and boldface. On page 220, item 9, both sorts are used. It would help if the difference were more noticeable. Lightface angle brackets are used for ordered pairs and tuples; boldface angle brackets are for the decoding function introduced by item 9.

Exercise 2: The theory axiomatized by A_E is not omega-complete, as will be seen. That is, there exist formulas phi such that
    A_E |- phi(0),   A_E |- phi(S0),   A_E |- phi(SS0),...
and yet A_E does not prove   forall v phi(v). There is an interesting open problem here. Suppose that for every n, we have a proof from A_E of phi(S^n0) and moreover we have a uniform bound on the lengths of the proofs. (A deduction is a finite sequence of formulas; its length is simply the length of the sequence.) Now does it follow that A_E proves   forall v phi(v)? (The converse is certainly true. That is, if A_E proves   forall v phi(v), then there is a uniform bound on the lengths of the shortest proofs of phi(S^n0), as n varies.)   Kreisel's conjecture states that for first-order Peano arithmetic PA (which is axiomatized by A_E plus all induction axioms; see page 269), whenever PA proves phi(S^n0) for every n and there is a uniform bound on the lengths of the proofs, then PA proves   forall v phi(v). The status of Kreisel's conjecture remains open; it has been the subject of recent research.

Section 3.4.

While the preceding section showed that various general-purpose relations are recursive, in Section 3.4 these tools are applied to one situation: syntactical properties in the language of arithmetic (such as being a formula, or being a deduction).

One consequence of the work here is the fact (page 232) that any recursive relation (i.e., a relation representable in some consistent finitely axiomatizable theory) is actually representable in Cn A_E. (Of course, the axioms of A_E were selected to make this happen.) This is tied to the fact that the theory given by A_E is strong enough to represent computability concepts. Section 3.6 will pursue this point.

It follows that every recursive relation is definable in (N; +, x). But as we will see, the converse does not hold; many non-recursive relations are also definable in this structure.

The technique of Gödel numbering is used so that the concept of recursiveness (defined for sets of numbers and such) can be applied to sets of words over our alphabet, such as the set of formulas. An attractive alternative is to formulate the concept of recursiveness directly for sets of words.

Theorem 34A on page 232 tells us that every recursive relation (i.e., a relation representable in some finitely axiomatizable consistent theory) is in fact representable in the theory given by A_E. This fact, together with work in Section 3.3, yields closure results for the class of recursive relations. Using item 0 on page 217, we can now see that the class of recursive relations is closed under unions, intersections, and complements. Also it is closed under bounded quantification, by items 0 and 3 (page 218).

As on page 210 and page 247, define a function f : N^k --> N to be a recursive function if its graph is a recursive (k + 1)-ary relation. (This concept is not to be confused with recursive partial functions, to be considered later. For emphasis, we can speak of "total" functions, to clarify that the domain is all of N^k.) From item 1 on page 217 we can now conclude that a relation is recursive if and only if its characteristic function is a recursive function. From item 2, we obtain closure of the class of recursive relations under the substitution of (total) recursive functions.

Moreover, in Theorem 33P on page 222, we can say that if g is recursive, then so is f. Similarly in Exercise 8 on page 224, if g and h are recursive, then so is f (that is, the class of recursive functions is closed under primitive recursion). From Exercise 10 on that page, we obtain closure under definition-by-cases.

Additional exercise:
5. Show that there can be no recursive binary relation P on the natural numbers that is "universal" for recursive sets, in the sense that each recursive set A of natural numbers is, for some number a, the "vertical section"
P_a = {b in N | (a, b) is in P}.
Suggestion: Diagonalize out.

Section 3.5.

This section, in 12 pages, proves the incompleteness theorem by what Section 3.0 calls the self-reference approach. In fact, it proves the theorem twice. First, on page 236, there is
   "undefinability of truth" --> "truth is not recursive" --> "the theory of N is not axiomatizable."
Secondly, on page 237, there is the argument that no sufficiently strong consistent theory is recursive; here the sentence constructed does, in a sense, say "I am unprovable."

What then of the other two approaches described in Section 3.0? The diagonalization approach is Exercise 3 on page 245. And the computability approach arises in Section 3.6, pages 256-258: the theory of N is not recursively enumerable.

Much has been written about the Gödel incompleteness proof, and its construction of a sentence G asserting its own unprovability from the axioms in question. A little more accurately, G asserts that there is no number p with certain properties. When we analyze those properties, we find they hold iff p codes a proof of G from the axioms.

G is unprovable from the axioms. So there must be a nonstandard model M of the axioms in which G is false. Thus in M, there is an "infinite number" p that M thinks codes a proof from the axioms of G (although it doesn't really).

G is true in the standard model of arithmetic. That is, there isn't really any number p that codes a proof of G from the axioms.

About the concept of recursive enumerability: Back in Section 1.8, we met the informal concept of a decidable set (now formalized by the concept of a recursive set), and the informal concepts of a semi-decidable set and an effectively enumerable set (which turned out to be equivalent). How should we now formalize this?

By analogy to our definition of recursiveness, we might consider the following as a formal way of saying that R is a semi-decidable set of numbers:
(1)   For some consistent finitely axiomatizable theory T and some formula phi, the following hold for each number a:
If a is in R, then T |- phi(S^a0).
If a is not in R, then T does not prove phi(S^a0).

Similarly, we might consider the following as a formal way of saying that R is an effectively enumerable set of numbers:
(2)   Either R is empty, or for some recursive function f,
R = ran f = {f(0), f(1), f(2), ... }.

Finally, we want to add to this list the following concept:
(3)   R is the domain of some recursive binary relation Q. That is, for each number a,
a is in R iff <a, b> is in Q for some b.

Theorem. (1), (2), and (3) are equivalent.

As our definition of recursive enumerability, we adopt (3), which is easy to work with and which generalizes readily to the situation where R is a k-ary relation on the natural numbers. Church's thesis tells us that we have the correct formalization of the concepts of semi-decidability and effective enumerability. Theorem 35I connects the concept to logic: Any recursively axiomatizable theory is r.e.
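The equivalence of (2) and (3) rests on dovetailing: search all pairs (a, b) in stages, so that no single b is waited on forever. A small sketch (the relation Q here is invented for illustration):

```python
def enumerate_domain(Q, n_stages):
    """List the domain of a decidable binary relation Q, as in (3):
    stage s checks every pair (a, b) with a + b == s, so each member of
    the domain appears once its smallest witness comes into range."""
    seen, out = set(), []
    for s in range(n_stages):
        for a in range(s + 1):
            if Q(a, s - a) and a not in seen:
                seen.add(a)
                out.append(a)
    return out

# Toy Q: (a, b) is in Q iff b witnesses that a is a perfect square.
Q = lambda a, b: b * b == a
assert enumerate_domain(Q, 20)[:4] == [0, 1, 4, 9]
```

The point of the staging is that a naive search "for each a, try b = 0, 1, 2, ..." could hang forever on an a outside the domain; dovetailing interleaves all the searches.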

Additional exercise:
11. Assume that T is a complete theory (in the language of A_E) that includes A_E. Show that if T is omega-consistent, then T is true arithmetic (i.e., T is the theory of the standard model).

Section 3.6.

On page 248, there is informal discussion of programs and computations and such. But then the formalization (e.g. the definition of the Kleene T predicate on page 249) uses formulas and deductions. That is, for us the programs for computing functions are given as formulas. And we can draw the moral: a computation is a proof and a proof is a computation.

Here is a link to further discussion of recursive partial functions, in pdf format.

On page 257 the Gödel incompleteness theorem arises again. It is not that the book has many different proofs of this theorem; rather, it has several ways of packaging the same proof.

The bottom line is that the (set of Gödel numbers of the) theory of true arithmetic is not recursive and hence not r.e., and so any recursively axiomatizable subtheory is incomplete. And the reason why the theory of true arithmetic is not recursive is that it is strong enough to encode computability concepts: in particular, all recursive relations are definable in the structure for arithmetic. For example, back on page 236 (in the "self-reference" approach), Corollary 35A (that the theory of true arithmetic is not recursive) has the one-line proof that "Any recursive set is ... definable in N." In the "diagonalization approach" in Exercise 3 on page 245 (with T equal to the theory of true arithmetic), the key fact is that recursive sets are weakly representable in the theory, which amounts to definability in the structure. Finally, the computability approach on page 257 uses the definability of the complement of K, which follows from the definability of recursive sets.

Page 256 makes passing reference to magic oracles. The idea is immediately discarded, in favor of the concept of "many-one reducibility." Nonetheless, these oracles are perfectly respectable objects, and they lead to the study of the so-called degrees of unsolvability.

For sets A and B of natural numbers, we say that A is Turing reducible to B iff there is an effective procedure that, when augmented by an oracle for B, correctly decides questions of membership in A. (This can be made precise, of course.)

At first blush, the concept of Turing reducibility might seem like a strange marriage of effectiveness and its opposite extreme. It is to Turing's credit that he saw that the marriage could be fruitful. All the non-recursive sets are undecidable, but some are more undecidable than others. Turing reducibility gives a useful tool for saying that one set is no more undecidable than another.
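In programming terms, a Turing reduction is an algorithm that takes the oracle as a parameter and may query it finitely many times. A toy sketch (the sets A and B below are invented for illustration):

```python
def in_A(n, oracle_B):
    """Decide membership in A = {n : n in B or n + 1 in B}, asking the
    oracle at most two membership questions about B.  The procedure is
    uniform -- it works whatever set B the oracle answers for -- and
    that uniformity is what makes A Turing reducible to B."""
    return oracle_B(n) or oracle_B(n + 1)

# With B = the multiples of 4:
B = lambda n: n % 4 == 0
assert in_A(3, B) is True    # because 4 is in B
assert in_A(1, B) is False   # neither 1 nor 2 is in B
```

If B were undecidable, A would be "no more undecidable" than B: any oracle for B settles membership in A.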

It is not hard to see that both many-one reducibility and Turing reducibility (and for that matter, "one-one reducibility" which is exactly like many-one reducibility except that one requires in addition that the reducing function f be one-to-one) are "pre-order relations," that is, they are both reflexive (on the power-set of N) and transitive. Consequently we obtain three equivalence relations by adding symmetry:

  • A is one-one equivalent to B iff each is one-one reducible to the other;
  • A is many-one equivalent to B iff each is many-one reducible to the other;
  • A is Turing equivalent to B iff each is Turing reducible to the other.

In each case, the set of equivalence classes is partially ordered by the corresponding reducibility. In the case of Turing equivalence, the equivalence classes are called Turing degrees, or degrees of unsolvability. (See the book by H. Rogers listed on page 307 for much more about degrees.)

Mike Sipser has suggested that the term "many-one reducible" is uninformative, and that "mapping reducible" might be better.

The register machines described on pages 261-263 have the advantage of being very simple. But they have the disadvantage that the programs are not "structured"; they allow "goto" commands. Here is a description (in pdf format) of a more structured programming language: the loop and while programs.

Exercise 1(b). No knowledge of pi is required.

Exercise 4(a). Suggestion: If f(<a, b>) = a whenever <a, b> is in Q, then the domain of Q is included in the range of f.

Exercise 10. Suppose we get tired of having students write programs that run forever, and so we set out to design a "safe" programming language: one in which programs can be written only for total functions. Of course, the set of all programs should be decidable (or at least r.e.). Exercise 10 tells us that no such language can calculate all total recursive functions. Suggestion: By Exercise 4(a), we can represent A (if non-empty) as the range of a total recursive function. Diagonalize out.

Additional exercise:
18. Let f be a recursive partial function.
(a) Show that if the domain of f is a recursive set, then the graph of f is a recursive binary relation.
(b) Does the converse of part (a) hold in general?

Section 3.7.

Here for a sentence (or any formula) phi, we construct the sentence "Prb phi" saying that there exists a deduction of phi from the axioms in question. Here "Prb" acts as a modal operator, and the methods of modal logic can be utilized. This approach is developed in the book The Unprovability of Consistency, by George Boolos (Cambridge, 1979).

In 1952, Leon Henkin raised the question whether the sentence saying "I am provable" is provable (for a system such as PA). (See The Journal of Symbolic Logic (JSL), vol. 17, p. 160.) The problem was solved in the affirmative by Löb, who submitted his solution to JSL for publication. The referee -- surely it was Henkin -- pointed out that Löb's argument could be applied to do more than merely answer Henkin's question: it could be applied to prove what is now called Löb's theorem. Löb's paper was then published in JSL, vol. 20 (1955), pp. 115-118.

Section 3.8.

The "Chinese remainder theorem" seems to have been given this name because combinatorial problems solved by this theorem were worked out by ancient Chinese mathematicians.

Jump to   Chapter Zero   Chapter One   Chapter Two   Chapter Three.

Chapter Four

Section 4.1.

The ideas in this chapter deserve to be better known. Sections 4.1-4.2 deal with what is often called the "standard semantics" for second-order logic.

There is an alternate proof of Theorem 41B, using Exercises 2(a) and 4.

Section 4.2.

The material on Herbrand expansions is new to the second edition.

Section 4.3.

This section does not deal with second-order logic per se, but it is needed for Section 4.4.

Section 4.4.

This section deals with what is often called the "Henkin semantics" for second-order logic. "General structures," as defined on page 302, are also known as "Henkin structures."

For more about models of second-order number theory, see the book by Steve Simpson, Subsystems of Second Order Arithmetic.

Jump to   Chapter Zero   Chapter One   Chapter Two   Chapter Three   Chapter Four.

Back matter.

The reading list on page 307 will, over time, need updating.

The Barwise and Etchemendy book, The Language of First-Order Logic, has been replaced by Language, Proof and Logic.

The Boolos and Jeffrey book is now in its fourth edition, the fourth edition having been prepared by John Burgess. (I was looking at a copy of this recently, and I looked myself up in the index. It said "Enderton, Herbert, 348" so I looked at page 348 -- and it was blank! John, are you trying to tell me something?)

There are three interesting biographies of logicians mentioned in the book:

  • John Dawson, Jr., Logical Dilemmas: The Life and Work of Kurt Gödel, A K Peters, 1997.
  • Anita Burdman Feferman and Solomon Feferman, Alfred Tarski: Life and Logic, Cambridge, 2004.
  • Andrew Hodges, Alan Turing: The Enigma, Simon and Schuster, 1983. Hugh Whitemore wrote a play, based on this book, called "Breaking the Code."


And here is a link to part of an article about Alonzo Church in pdf format.

Be sure to look up "self-reference" in the index.
