Archive:
In this section:
Subtopics:
Comments disabled |
Thu, 13 Feb 2020
Gentzen's rules for natural deduction
Here is Gerhard Gentzen's original statement of the rules of Natural Deduction (“ein Kalkül für ‘natürliche’, intuitionistische Herleitungen”): Natural deduction looks pretty much exactly the same as it does today, although the symbols are a little different. But only a little! Gentzen has not yet invented !!\land!! for logical and, and is still using !!\&!!. But he has invented !!\forall!!. The style of the !!\lnot!! symbol is a little different from what we use now, and he has that tent thingy !!⋏!! where we would now use !!\bot!!. I suppose !!⋏!! didn't catch on because it looks too much like !!\land!!. (He similarly used !!⋎!! to mean !!\top!!, but as usual, that doesn't appear in the deduction rules.) We still use Gentzen's system for naming the rules. The notations “UE” and “OB” for example, stand for “und-Einführung” and “oder-Beseitigung”, which mean “and-introduction” and “or-elimination”. Gentzen says (footnote 4, page 178) that he got the !!\lor, \supset, \exists!! signs from Russell, but he didn't want to use Russell's signs !!\cdot, \equiv, \sim, ()!! because they already had other meanings in mathematics. He took the !!\&!! from Hilbert, but Gentzen disliked his other symbols. Gentzen objected especially to the “uncomfortable” overbar that Hilbert used to indicate negation (“[Es] stellt eine Abweichung von der linearen Anordnung der Zeichen dar”). He attributes his symbols for logical equivalence (!!\supset\subset!!) and negation to Heyting, and explains that his new !!\forall!! symbol is analogous to !!\exists!!. I find it remarkable how quickly this caught on. Gentzen also later replaced !!\&!! with !!\land!!. Of the rest, the only one that didn't stick was !!\supset\subset!! in place of !!\equiv!!. But !!\equiv!! is much less important than the others, being merely an abbreviation. Gentzen died at age 35, a casualty of the World War. Source: Gerhard Gentzen,
“Untersuchungen über das logische Schließen I”,
pp. 176–210 [ Addendum 20200214: Thanks to Andreas Fuchs for correcting my German grammar. ]
The rubber duck explains coherence spaces
I've spent a chunk of the past week, at least, trying to understand
the idea of a The definition is one of those simple but bafflingly abstract ones
that you often meet in mathematics: There is a set !!\lvert
\mathcal{A}\rvert!! of - If !!a!! is a clique, and !!a'\subset a!!, then !!a'!! is also a clique.
- Suppose !!\mathcal M!! is some family of cliques such that !!a\cup a'!! is a clique for each !!a, a'\in \mathcal M!!. Then !!\bigcup {\mathcal M}!! is also a clique.
To beginning math students it often seems like these sorts of definitions are generated at random. Okay, today we're going to study Eulerian preorders with no maximum element that are closed under finite unions; tomorrow we're going to study semispatulated coalgebras with countably infinite signatures and the weak Cosell property. Whatever, man. I have a long article about this in progress, but I'll summarize: they
are So when I see something like this I must stop immediately and ask ‘wat’. I can try to come up with the explanation myself, or I can read on hoping for an explanation, or I can do some of each, but I am not going to progress very far until I understand what it is about. And I'm not sure anyone short of Alexander Grothendieck would have any more success trying to move on with the definition itself and nothing else. Girard explains shortly after:
Okay, fine. I understand the point of the project, although not why the definition is what it is. I know a fair amount about types. And Girard has given two examples: booleans and integers. But these examples are unusually simple, because none of the cliques has more than one element, and so the examples are not as illuminating as they might be. Some of the ways I tried to press onward were: Read ahead and see if there is more explanation. I tried this but I still wasn't getting it. The next section seemed clear: the cliques define a “coherence” relation on the tokens, from which the cliques can be recovered. Consider a graph, where the vertices are tokens and there is an edge !!a—b!! exactly when !!\{a, a'\}!! is a clique; we say that !!a!! and !!a'!! are *coherent*. Then the cliques of the coherence space are exactly the cliques of the graph; hence the name. The graph is called the*web*of the space, and from the web one can recover the original space.But after that part came *stable functions*, which I couldn't figure out, and I got stuck again.Read ahead and see if there is a more complicated specific example. There wasn't. Read ahead and see if any of the derived concepts are familiar, and if so then work backward. For instance, if I had been able to recognize that I already knew what stable functions were, I might have been able to leverage tha tinto an understanding of what was going on with the coherence spaces. But for me they were just another problem of the same sort: what is a stable function supposed to be modeling? Read someone else's explanation instead. I tried several without much success. They all seemed to be written for someone who already had a clue what was going on. (That is a large part of the reason I have written up this long and clueless explanation.) Try to construct some examples and see if they make sense in the context of what comes later. For example, I know what the coherence space of booleans looks like because Girard showed me. Can I figure out the structure of the coherence space for the type of “wrapped booleans”? `-- (Haskell) data WrappedBoolean = W Bool` Can I figure it out for the type of pairs of booleans? `-- (Haskell) type BooleanPair = (Bool, Bool)`
None of this was working. I had several different ideas about what the coherence spaces might look like for other types, but none of them seemed to fit with what Girard was doing. I couldn't come up with any consistent story. So I prepared to ask on StackExchange, and I spent about an hour writing up my question, explaining all the things I had tried and what the problems were with each one. And as I drew near to the end of this, the clouds parted! I never had to post the question. I was in the middle of composing this paragraph:
I decided I hadn't made enough of an effort to understand the direct product. So even though I couldn't see how it could possibly give me anything like what I wanted, I followed its definition for !!{{\mathcal B}ool}^2!! — and the light came on. Here's the puzzling coproduct-like definition of the product of two coherence spaces, from page 61:
That is, the tokens in the product space are literally the disjoint union of the tokens in the component spaces. And the edges in the product's web are whatever they were in !!{\mathcal A}!!'s
web (except lifted from
!!|{\mathcal A}|!! to !!\{1\}×|{\mathcal A}|!!), whatever they were in !!{\mathcal B}!!'s web (similarly),
There is no edge between !!\langle 1, \text{True}\rangle!! and !!\langle 1, \text{False}\rangle!! because in !!{{\mathcal B}ool}!! there is no edge between !!\text{True}!! and !!\text{False}!!. This graph has nine cliques. Here they are ordered by set inclusion: (In this second diagram I have abbreviated the pair !!\langle1, \text{True}\rangle!! to just !!1T!!. The top nodes in the diagram are each labeled with a set of two ordered pairs.) What does this mean?
The ordered pairs of booleans are being represented by The nodes in the bottom diagram represent functions. The top row are fully-defined functions. For example, !!\{1F, 2T\}!! is the function with !!f(1) = \text{False}!! and !!f(2) = \text{True}!!, representing the boolean pair !!\langle\text{False}, \text{True}\rangle!!. Similarly if we were looking at a space of infinite lists, we could consider it a function from !!\Bbb N!! to whatever the type of the lists elements was. Then the top row of nodes in the coherence space would be infinite sets of pairs of the form !!\langle n, \text{(list element)}\rangle!!. The lower nodes are still functions, but they are functions about
which we have only incomplete information. The node !!\{2T\}!! is a
function for which !!f(2) = \text{True}!!. But we don't yet know what
!!f(1)!! is because we haven't yet tried to compute it. And the
bottommost node !!\varnothing!! is a function where we don't know
anything at all — yet. As we test the function on various arguments,
we move up the graph, always following the edges. The lower nodes are
Now the importance of I had been thinking that the !!\varnothing!! node was somehow like the
!!\bot!! element in a Scott domain, and then I struggled to
identify anything like !!\langle \text{False}, \bot\rangle!!.
It looks at first
like you can do it somehow, because there are the right number of
nodes at the middle level.
Trouble arises in other coherence
spaces. Presented with a value from I think there isn't anything like !!\bot!! or !!W\ \bot!! in the coherence space. Or maybe they they are there but sharing the !!\varnothing!! node. But I think more likely partial objects will appear in some other way. Whew! Now I can move along. (If you don't understand why “rubber duck”, Wikipedia explains:
I spent a week on this but didn't figure it out until I tried formulating my question for StackExchange. The draft question, never completed, is here if for some reason you want to see what it looked like.)
The Blind Spot and the cut rule
[ The Atom and RSS feeds have done an unusually poor job of preserving the mathematical symbols in this article. It will be much more legible if you read it on my blog. ] Lately I've been enjoying
He continues a little later:
As you can see, it is not written in the usual dry mathematical-text
style, presenting the material as a perfect and aseptic distillation
of absolute truth. Instead, one sees the history of logic, the rise
and fall of different theories over time, the interaction and relation
of many mathematical and philosophical ideas, and Girard's reflections
about it all. It is a transcription of a lecture series, and reads
like one, including all of the speaker's incidental remarks and
offhand musings, but written down so that each can be weighed and
pondered at length. Instead of wondering in the moment what he meant
by some intriguing remark, then having to abandon the thought to keep
up with the lecture, I can pause and ponder the significance. Girard
is really, really smart, and knows way more about logic than I ever
will, and his offhand remarks reward this pondering. The book is
The book really gets going with its discussion of Gentzen's sequent calculus in chapter 3. Between around 1890 (when Peano and Frege began to liberate logic from its
medieval encrustations) and 1935 when the sequent calculus was
invented, logical proofs were mainly in the “Hilbert style”.
Typically there were some axioms, and some rules of deduction by which
the axioms could be transformed into other formulas. A typical
example consists of the axioms $$A\to(B\to A)\\
(A \to (B \to C)) \to ((A \to B) \to (A \to C)) $$
(where !!A, B, C!! are understood to be placeholders that can be
replaced by any well-formed formulas) and the deduction rule In contrast, sequent calculus has few axioms and many deduction
rules. It deals with A typical deductive rule in sequent calculus is: $$ \begin{array}{c} Γ ⊢ A, Δ \qquad Γ ⊢ B, Δ \\ \hline Γ ⊢ A ∧ B, Δ \end{array} $$ Here !!Γ!! and !!Δ!! represent any lists of formulas, possibly empty. The premises of the rule are: !!Γ ⊢ A, Δ!!: If we can prove *all*of the formulas in !!Γ!!, then we can conclude either the formula !!A!! or*some*of the formulas in !!Δ!!.!!Γ ⊢ B, Δ!!: If we can prove *all*of the formulas in !!Γ!!, then we can conclude either the formula !!B!! or*some*of the formulas in !!Δ!!.
From these premises, the rule allows us to deduce:
The only axioms of sequent calculus are utterly trivial: $$ \begin{array}{c} \phantom{A} \\ \hline A ⊢ A \end{array} $$ There are no premises; we get this deduction for free: If can prove !!A!!, we can prove !!A!!. (!!A!! here is a metavariable that can be replaced with any well-formed formula.) One important point that Girard brings up, which I had never realized
despite long familiarity with sequent calculus, is the symmetry
between the left and right sides of the turnstile ⊢. As I mentioned,
the interpretation of !!Γ ⊢ Δ!! I had been taught was that it
means that if every formula in !!Γ!! is provable, then some formula in
!!Δ!! is provable. But instead let's focus on just one of the
formulas !!A!! on the right-hand side, hiding in the list !!Δ!!. The sequent
!!Γ ⊢ Δ, A!! can be understood to mean that to prove
!!A!!, it suffices to prove
all of the formulas in !!Γ!!, and to The all-some correspondence, which had previously caused me to wonder why it was that way and not something else, perhaps the other way around, has turned into a simple relationship about logical negation: the formulas on the left are positive, and the ones on the right are negative.[2]) With this insight, the sequent calculus negation laws become not merely simple but trivial: $$ \begin{array}{cc} \begin{array}{c} Γ, A ⊢ Δ \\ \hline Γ ⊢ \lnot A, Δ \end{array} & \qquad \begin{array}{c} Γ ⊢ A, Δ \\ \hline Γ, \lnot A ⊢ Δ \end{array} \end{array} $$ For example, in the right-hand deduction: what is sufficient to prove !!A!! is also sufficient to disprove !!¬A!!. (Compare also the rule I showed above for ∧: It now says that if proving everything in !!Γ!! and disproving everything in !!Δ!! is sufficient for proving !!A!!, and likewise sufficient for proving !!B!!, then it is also sufficient for proving !!A\land B!!.) But none of that was what I planned to discuss; this article is (intended to be) about sequent calculus's “cut rule”. I never really appreciated the cut rule before. Most of the deductive rules in the sequent calculus are intuitively plausible and so simple and obvious that it is easy to imagine coming up with them oneself. But the cut rule is more complicated than the rules I have already shown. I don't think I would have thought of it easily: $$ \begin{array}{c} Γ ⊢ A, Δ \qquad Λ, A ⊢ Π \\ \hline Γ, Λ ⊢ Δ, Π \end{array} $$ (Here !!A!! is a formula and !!Γ, Δ, Λ, Π!! are lists of formulas, possibly empty lists.) Girard points out that the cut rule is a generalization of modus ponens: taking !!Γ, Δ, Λ!! to be empty and !!Π = \{B\}!! we obtain: $$ \begin{array}{c} ⊢ A \qquad A ⊢ B \\ \hline ⊢ B \end{array} $$ The cut rule is also a generalization of the transitivity of implication: $$ \begin{array}{c} X ⊢ A \qquad A ⊢ Y \\ \hline X ⊢ Y \end{array} $$ Here we took !!Γ = \{X\}, Π = \{Y\}!!, and !!Δ!! and !!Λ!! empty. This all has given me a much better idea of where the cut rule came from and why we have it. In sequent calculus, the deduction rules all come in pairs. There is a rule about introducing ∧, which I showed before. It allows us to construct a sequent involving a formula with an ∧, where perhaps we had no ∧ before. (In fact, it is the only way to do this.) There is a corresponding rule (actually two rules) for getting rid of ∧ when we have it and we don't want it: $$ \begin{array}{cc} \begin{array}{c} Γ ⊢ A\land B, Δ \\ \hline Γ ⊢ A, Δ \end{array} & \qquad \begin{array}{c} Γ ⊢ A\land B, Δ \\ \hline Γ ⊢ B, Δ \end{array} \end{array} $$ Similarly there is a rule (actually two rules) about introducing !!\lor!! and a corresponding rule about eliminating it. The cut rule seems to lie outside this classification. It is not paired. But Girard showed me that it $$ \begin{array}{c} \phantom{A} \\ \hline A ⊢ A \end{array} $$ can be seen as an introduction rule for a pair of !!A!!s, one on each side of the turnstile. The cut rule is the corresponding rule for eliminating !!A!! from both sides. Sequent calculus proofs are much easier to construct than Hilbert-style proofs. Suppose one wants to prove !!B!!. In a Hilbert system the only deduction rule is modus ponens, which requires that we first prove !!A\to B!! and !!A!! for some !!A!!. But what !!A!! should we choose? It could be anything, and we have no idea where to start or how big it could be. (If you enjoy suffering, try to prove the simple theorem !!A\to A!! in the Hilbert system I described at the beginning of the article. (Solution) In sequent calculus, there is only one way to prove each kind of thing, and the premises in each rule are simply related to the consequent we want. Constructing the proof is mostly a matter of pushing the symbols around by following the rules to their conclusions. (Or, if this is impossible, one can conclude that there is no proof, and why.[3]) Construction of proofs can now be done entirely mechanically! Except! The cut rule The good news is that Gentzen, the inventor of sequent calculus, showed that one can dispense with the cut rule: it is unnecessary:
Gentzen's demonstration of this shows how one can take any proof that involves the cut rule, and algorithmically eliminate the cut rule from it to obtain a proof of the same result that does not use cut. Gentzen called this the “Hauptsatz” (“principal theorem”) and rightly so, because it reduces construction of logical proofs to an algorithm and is therefore the ultimate basis for algorithmic proof theory. The bad news is that the cut-elimination process can
super-exponentially increase the size of the proof, so it does not
lead to a $$ \begin{array}{cc} \begin{array}{c} Γ, A, A ⊢ Δ \\ \hline Γ, A ⊢ Δ \end{array} & \qquad \begin{array}{c} Γ ⊢ A, A, Δ \\ \hline Γ ⊢ A, Δ \end{array} \end{array} $$ And suddenly Girard's invention of linear logic made sense to me. In linear logic, contraction is forbidden; one must use each formula in one and only one deduction. Previously it had seemed to me that this was a pointless restriction.
Now I realized that it was no more of a useless hair shirt than the
intuitionistic rejection of the law of the proof by contradiction:
not a stubborn refusal to use an obvious tool of
reasoning, but a
restriction of proofs to produce
The book is going to get into linear logic later in the next chapter. I have read descriptions of linear logic before, but never understood what it was up to. (It has two logical and operators, and two logical or operators; why?) But I am sure Girard will explain it marvelously. - In place of a profound excerpt, I will present
the following, which isn't especially profound but struck a chord for
me: “By the way, nothing is more arbitrary than a
modal logic « I am done with this logic, may I have another one ? »
seems to be the
*motto*of these guys.” (p. 24) - Compare my alternative representation of arithmetic expressions that collapses addition and subtraction, and multiplication and division.
- A typically Girardian remark is that analytic tableaux are “the poor man's sequents”.
A brief but interesting discussion of
The annoying boxes puzzle: solution
There are two boxes on a table, one red and one green. One contains a treasure. The red box is labelled "exactly one of the labels is true". The green box is labelled "the treasure is in this box."It's not too late to try to solve this before reading on. If you want, you can submit your answer here:
## Results
66.52% 300 red 25.72 116 not-enough-info 3.55 16 green 2.00 9 other 1.55 7 spam 0.44 2 red-with-qualification 0.22 1 attack 100.00 451 TOTAL One-quarter of respondents got
the right answer, that there is not enough information
given to solve the problem, Two-thirds of respondents said the
treasure was in the red box.
This is wrong. The treasure
is in the green box.
## What?Let me show you. I stated:
The labels are as I said. Everything I told you was literally true. The treasure is definitely not in the red box. No, it is actually in the green box. (It's hard to see, but one of the items in the green box is the gold and diamond ring made in Vienna by my great-grandfather, which is unquestionably a real treasure.) So if you said the treasure must be in the red box, you were simply mistaken. If you had a logical argument why the treasure had to be in the red box, your argument was fallacious, and you should pause and try to figure out what was wrong with it. I will discuss it in detail below.
## SolutionThe treasure is undeniably in the green box. However, correct answer to the puzzle is "no, you cannot figure out which box contains the treasure". There is not enough information given. (Notice that the question was not “Where is the treasure?” but “Can you figure out…?”)
## (Fallacious) Argument
Many people erroneously conclude that the treasure is in the red box,
using reasoning something like the following: |

Gold | Silver | Lead |
---|---|---|

THE PORTRAIT IS IN THIS CASKET | THE PORTRAIT IS NOT IN THIS CASKET | THE PORTRAIT IS NOT IN THE GOLD CASKET |

Which casket should the suitor choose [to find the portrait]?

A well-constructed puzzle will always contain such a certification, something like “one label is true and one is false” or “on this island, each person always lies, or always tells the truth”. I went to What is the Name of this Book? to get the example above, and found more than I had bargained for: problem 70 is exactly the annoying boxes problem! Smullyan says:

Good heavens, I can take any number of caskets that I please and put an object in one of them and then write any inscriptions at all on the lids; these sentences won't convey any information whatsoever.(Page 65)

Had I known ahead of time that Smullyan had treated the exact same topic with the exact same example, I doubt I would have written this post at all.

One respondent referred me to a similar post on lesswrong.

I did warn you all that the puzzle was annoying.

I started writing this post in October 2007, and then it sat on the shelf until I got around to finding and photographing the boxes. A triumph of procrastination!

[ Addendum 20150911: Steven Mazie has written a blog article about this topic, A Logic Puzzle That Teaches a Life Lesson. ]

*[Other articles in category /math/logic]
permanent link*

The annoying boxes puzzle

Here is a logic puzzle. I will present the solution on Friday.

There are two boxes on a table, one red and one green. One contains a treasure. The red box is labelled "exactly one of the labels is true". The green box is labelled "the treasure is in this box."Starting on 2015-07-03, the solution will be here.Can you figure out which box contains the treasure?

*[Other articles in category /math/logic]
permanent link*

What to make of this?

- Sentence 2 is false.
- Sentence 1 is true.

Many answers are possible. The point of this note is to refute one particular common answer, which is that the whole thing is just meaningless.

This view is espoused by many people who, it seems, ought to know better. There are two problems with this view.

The first problem is that it involves a theory of meaning that appears to
have nothing whatsoever to do with pragmatics. You can certainly
*say* that something is meaningless, but that doesn't make it so.
I can claim all I want to that "jqgc ihzu kenwgeihjmbyfvnlufoxvjc sndaye"
is a meaningful utterance, but that does not avail me much, since
nobody can understand it. And conversely, I can say as loudly and as
often as I want to that the utterance "Snow is white" is meaningless,
but that doesn't make it so; the utterance still means that snow is
white, at least to some people in some contexts.

Similarly, asserting that the sentences are meaningless is all very well, but the evidence is against this assertion. The meaning of the utterance "sentence 2 is false" seems quite plain, and so does the meaning of the utterance "sentence 1 is true". A theory of meaning in which these simple and plain-seeming sentences are actually meaningless would seem to be at odds with the evidence: People do believe they understand them, do ascribe meaning to them, and, for the most part, agree on what the meaning is. Saying that "snow is white" is meaningless, contrary to the fact that many people agree that it means that snow is white, is foolish; saying that the example sentences above are meaningless is similarly foolish.

I have heard people argue that although the sentences are individually
meaningful, they are meaningless in conjunction. This position is
even more problematic. Let us refer to a person who holds this
position as *P*.
Suppose
sentence 1 is presented to you in isolation. You think you understand
its meaning, and since *P* agrees that it is meaningful, he
presumably would agree that you do. But then, a week later, someone
presents you with sentence 2; according to *P*'s theory, sentence
1 now becomes meaningless. It was meaningful on February 1, but not
on February 8, even though the speaker and the listener both think it
is meaningful and both have the same idea of what it means. But
according to *P*, as midnight of February 8, they are suddenly
mistaken.

The second problem with the notion that the sentences are meaningless comes when you ask what makes them meaningless, and how one can distinguish meaningful sentences from sentences like these that are apparently meaningful but (according to the theory) actually meaningless.

The answer is usually something along the lines that sentences that contain self-reference are meaningless. This answer is totally inadequate, as has been demonstrated many times by many people, notably W.V.O. Quine. In the example above, the self-reference objection is refuted simply by observing that neither sentence is self-referent. One might try to construct an argument about reference loops, or something of the sort, but none of this will avail, because of Quine's example:

"snow is white" is false when you change "is" to "is not".Or similarly:

If a sentence is false, then its negation is true.Nevertheless, Quine's sentence is an antinomy of the same sort as the example sentences at the top of the article.

But all of this is peripheral to the main problem with the argument
that sentences that contain self-reference are meaningless. The main
problem with this argument is that it *cannot* be true. The
sentence "sentences that contain self-reference are meaningless" is
itself a sentence, and therefore refers to itself, and is therefore
meaningless under its own theory. If the assertion is true, then the
sentence asserting it is meaningless under the assertion itself; the
theory deconstructs itself. So anyone espousing this theory has
clearly not thought through the consequences. (Graham Priest says that
people advancing this theory are subject to a devastating *ad
hominem* attack. He doesn't give it specifically, but many such
come to mind.)

In fact, the self-reference-implies-meaninglessness theory obliterates not only itself, but almost all useful statements of logic. Consider for example "The negation of a true sentence is false and the negation of a false sentence is true." This sentence, or a variation of it, is probably found in every logic textbook ever written. Such a sentence refers to itself, and so, in the self-reference-implies-meaninglessness theory, is meaningless. So too with most of the other substantive assertions of our logic textbooks, which are principally composed of such self-referent sentences about properties of sentences; so much for logic.

The problems with ascribing meaninglessness to self-referent sentences run deeper still. If a sentence is meaningless, it cannot be self-referent, because, being meaningless, it cannot refer to anything at all. Is "jqgc ihzu kenwgeihjmbyfvnlufoxvjc sndaye" self-referent? No, because it is meaningless. In order to conclude that it was self-referent, we would have to understand it well enough to ascribe a meaning to it, and this would prove that it was meaningful.

So the position that the example sentences 1 and 2 are "meaningless" has no logical or pragmatic validity at all; it is totally indefensible. It is the philosophical equivalent of putting one's fingers in one's ears and shouting "LA LA LA I CAN'T HEAR YOU!"

There are better positions. Priest's position is that the sentences are both true and false. This would seem to be just as defensible as the position that they are neither true nor false, but in fact the two positions are neither equivalent nor symmetric. For fuller details, see the article on "dialetheism" in

*[Other articles in category /math/logic]
permanent link*