User:Mpagano/Sequent Calculus
In proof theory and mathematical logic, the sequent calculus is a widely known deduction system for first-order logic (and propositional logic as a special case of it). The system is also known under the name LK, distinguishing it from various other systems of similar fashion that have been created later and that are sometimes also called sequent calculi. Another term for such systems in general is Gentzen systems.
Since sequent calculi and the general concepts relating to them are of major importance to the whole field of proof theory and mathematical logic, the system LK will be explained in greater detail below. Some familiarity with the basic notions of predicate logic (especially its syntactic structure) is assumed.
The system LK
A (formal) proof in this calculus is a sequence of sequents in which each element is derivable from sequents appearing earlier in the sequence by one of the rules below. An intuitive explanation of these rules is given thereafter. See also sequent and rule of inference if you are not familiar with these concepts.
History
The sequent calculus LK was introduced by Gerhard Gentzen as a tool for studying natural deduction (which had existed earlier, although in a less formal form). It subsequently turned out to be a much easier calculus to handle when constructing logical derivations. The name itself is derived from the German term Logischer Kalkül, meaning logical calculus. Sequent calculi are the method of choice for many proof-theoretic investigations.
The inference-rules for LK
The following notation will be used:
- A and B denote formulae of first-order predicate logic (one may also restrict this to propositional logic),
- Γ, Δ, Σ, and Π are finite (possibly empty) sequences of formulae, called contexts,
- t denotes an arbitrary term,
- A[t] denotes a formula A, in which some occurrences of a term t are of interest
- A[s/t] denotes the formula that is obtained by substituting the term s for the specified occurrences of t in A[t],
- x and y denote variables,
- an occurrence of a variable within a formula is said to be free if it is not within the scope of a quantifier ∀ or ∃ binding that variable; a variable occurs free in a formula if at least one of its occurrences is free.
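The notions of free variable and substitution from the list above can be made concrete with a small sketch. The tuple encoding of terms and formulae and the names `term_vars`, `free_vars`, `subst_term`, and `subst` are assumptions chosen for illustration, not part of LK. As a simplification, the sketch replaces all free occurrences of a variable rather than "specified occurrences of a term", and it does not guard against variable capture.

```python
# Hypothetical encoding (an assumption of this sketch):
#   terms:    a variable is a str; a function application is ("fn", name, (args...))
#   formulae: ("pred", name, (terms...)), ("not", A),
#             ("and" | "or" | "imp", A, B), ("forall" | "exists", x, A)

def term_vars(t):
    """Return the set of variables occurring in a term."""
    if isinstance(t, str):
        return {t}
    _, _, args = t
    return set().union(*(term_vars(a) for a in args))

def free_vars(f):
    """Return the set of variables that occur free in a formula."""
    tag = f[0]
    if tag == "pred":
        return set().union(*(term_vars(t) for t in f[2]))
    if tag == "not":
        return free_vars(f[1])
    if tag in ("and", "or", "imp"):
        return free_vars(f[1]) | free_vars(f[2])
    if tag in ("forall", "exists"):
        return free_vars(f[2]) - {f[1]}   # the quantifier binds f[1]
    raise ValueError(f"unknown formula: {f!r}")

def subst_term(t, x, s):
    """Replace the variable x by the term s inside a term."""
    if isinstance(t, str):
        return s if t == x else t
    tag, name, args = t
    return (tag, name, tuple(subst_term(a, x, s) for a in args))

def subst(f, x, s):
    """Replace every free occurrence of x in f by s (no capture check)."""
    tag = f[0]
    if tag == "pred":
        return ("pred", f[1], tuple(subst_term(t, x, s) for t in f[2]))
    if tag == "not":
        return ("not", subst(f[1], x, s))
    if tag in ("and", "or", "imp"):
        return (tag, subst(f[1], x, s), subst(f[2], x, s))
    if tag in ("forall", "exists"):
        if f[1] == x:                     # x is bound below this point
            return f
        return (tag, f[1], subst(f[2], x, s))
    raise ValueError(f"unknown formula: {f!r}")

# P(x) ∧ ∀x Q(x, y): the first x is free, the second is bound, y is free.
f = ("and", ("pred", "P", ("x",)), ("forall", "x", ("pred", "Q", ("x", "y"))))
print(sorted(free_vars(f)))               # ['x', 'y']
print(subst(f, "x", ("fn", "c", ())))     # only the free x is replaced
```

Only the occurrence of x in P(x) is replaced; the occurrence under ∀x is bound and stays untouched.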
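Each rule below is stated in the form "from the premise sequent(s), infer the conclusion sequent".
Axiom:
- (I): A |- A
Cut:
- (Cut): from Γ |- Δ, A and A, Σ |- Π infer Γ, Σ |- Δ, Π
Left logical rules:
- (∧L1): from Γ, A |- Δ infer Γ, A∧B |- Δ
- (∧L2): from Γ, B |- Δ infer Γ, A∧B |- Δ
- (∨L): from Γ, A |- Δ and Σ, B |- Π infer Γ, Σ, A∨B |- Δ, Π
- (→L): from Γ |- A, Δ and Σ, B |- Π infer Γ, Σ, A→B |- Δ, Π
- (¬L): from Γ |- A, Δ infer Γ, ¬A |- Δ
- (∀L): from Γ, A[t] |- Δ infer Γ, ∀x A[x/t] |- Δ
- (∃L): from Γ, A[y] |- Δ infer Γ, ∃x A[x/y] |- Δ
Right logical rules:
- (∨R1): from Γ |- A, Δ infer Γ |- A∨B, Δ
- (∨R2): from Γ |- B, Δ infer Γ |- A∨B, Δ
- (∧R): from Γ |- A, Δ and Σ |- B, Π infer Γ, Σ |- A∧B, Δ, Π
- (→R): from Γ, A |- B, Δ infer Γ |- A→B, Δ
- (¬R): from Γ, A |- Δ infer Γ |- ¬A, Δ
- (∀R): from Γ |- A[y], Δ infer Γ |- ∀x A[x/y], Δ
- (∃R): from Γ |- A[t], Δ infer Γ |- ∃x A[x/t], Δ
Left structural rules:
- (WL): from Γ |- Δ infer Γ, A |- Δ
- (CL): from Γ, A, A |- Δ infer Γ, A |- Δ
- (PL): from Γ1, A, B, Γ2 |- Δ infer Γ1, B, A, Γ2 |- Δ
Right structural rules:
- (WR): from Γ |- Δ infer Γ |- A, Δ
- (CR): from Γ |- A, A, Δ infer Γ |- A, Δ
- (PR): from Γ |- Δ1, A, B, Δ2 infer Γ |- Δ1, B, A, Δ2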
Note: In the rules (∀R) and (∃L), the variable y must not be free within Γ, A[x/y], or Δ.
An intuitive explanation
The above rules can be divided into two major groups: logical and structural ones. Each of the logical rules introduces a new logical formula either on the left or on the right of the turnstile |-. In contrast, the structural rules operate on the structure of the sequents, ignoring the exact shape of the formulae. The two exceptions to this general scheme are the axiom of identity (I) and the rule of (Cut).
Although stated in a formal way, the above rules allow for a very intuitive reading in terms of classical logic. Consider, for example, the rule (∧L1). It says that, whenever one can prove that Δ can be concluded from some sequence of formulae that contains A, then one can also conclude Δ from the (stronger) assumption that A∧B holds. Likewise, the rule (¬R) states that, if Γ and A suffice to conclude Δ, then from Γ alone one can either still conclude Δ or A must be false, i.e. ¬A holds. All the rules can be interpreted in this way.
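For propositional formulae, this classical reading can be checked by brute force: a sequent Γ |- Δ is valid if every truth assignment that makes all formulae in Γ true makes at least one formula in Δ true. The sketch below is an illustration under that standard reading; the tuple encoding and the names `atoms`, `holds`, and `valid` are assumptions, not part of LK.

```python
from itertools import product

# Hypothetical encoding: an atom is a str; compound formulae are tuples
# ("not", A), ("and", A, B), ("or", A, B), ("imp", A, B).

def atoms(f):
    """Return the set of atoms occurring in a formula."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def holds(f, v):
    """Evaluate formula f under the truth assignment v (a dict)."""
    if isinstance(f, str):
        return v[f]
    op = f[0]
    if op == "not":
        return not holds(f[1], v)
    if op == "and":
        return holds(f[1], v) and holds(f[2], v)
    if op == "or":
        return holds(f[1], v) or holds(f[2], v)
    if op == "imp":
        return (not holds(f[1], v)) or holds(f[2], v)
    raise ValueError(f"unknown connective: {op!r}")

def valid(gamma, delta):
    """True if every assignment satisfying all of gamma satisfies some formula in delta."""
    names = sorted(set().union(*(atoms(f) for f in gamma + delta)))
    for values in product([False, True], repeat=len(names)):
        v = dict(zip(names, values))
        if all(holds(f, v) for f in gamma) and not any(holds(f, v) for f in delta):
            return False                  # found a counterexample
    return True

# (∧L1): the premise  A |- A  is valid, and so is the conclusion  A∧B |- A.
print(valid(["A"], ["A"]), valid([("and", "A", "B")], ["A"]))
# (¬R): from the valid premise  A |- A  one obtains the valid sequent  |- ¬A, A.
print(valid([], [("not", "A"), "A"]))
# Not every sequent is valid:  A |- B  fails when A is true and B is false.
print(valid(["A"], ["B"]))
```

Checking individual instances like this does not prove the rules sound, but it shows the reading of the comma as "and" on the left and "or" on the right.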
For an intuition about the quantifier rules, consider the rule (∀R). Of course, concluding that ∀x A[x/y] holds just from the fact that A[y] is true is not possible in general. If, however, the variable y is not mentioned elsewhere (i.e. it can still be chosen freely, without influencing the other formulae), then one may assume that A[y] holds for any value of y. The other rules should then be fairly straightforward.
Instead of viewing the rules as descriptions for legal derivations in predicate logic, one may also consider them as instructions for the construction of a proof for a given statement. In this case the rules can be read bottom-up. For example, (∧R) says that, in order to prove that A∧B follows from the assumptions Γ and Σ, it suffices to prove that A can be concluded from Γ and B can be concluded from Σ, respectively. Note that, given some antecedent, it is not clear how this is to be split into Γ and Σ. However, there are only finitely many possibilities to be checked since the antecedent by assumption is finite. This also illustrates how proof theory can be viewed as operating on proofs in a combinatorial fashion: given proofs for both A and B, one can construct a proof for A∧B.
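The finitely many ways of splitting an antecedent into Γ and Σ can be enumerated directly. The following sketch is an illustration only; the function name `splits` is hypothetical. A context of n formulae has 2^n such splits, since each formula goes either to Γ or to Σ.

```python
from itertools import product

def splits(context):
    """Yield every way to distribute the formulae of `context` over two
    sub-contexts (gamma, sigma), keeping their relative order."""
    for choice in product((0, 1), repeat=len(context)):
        gamma = [f for f, c in zip(context, choice) if c == 0]
        sigma = [f for f, c in zip(context, choice) if c == 1]
        yield gamma, sigma

# A three-formula antecedent has 2**3 = 8 candidate splits.
for gamma, sigma in splits(["A", "B", "C"]):
    print(gamma, sigma)
```

A bottom-up proof search for (∧R) would try each split in turn, which is correct but exponential in the size of the context.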
When looking for some proof, most of the rules offer more or less direct recipes of how to do this. The rule of cut is different: It states that, when a formula A can be concluded and this formula may also serve as a premise for concluding other statements, then the formula A can be "cut out" and the respective derivations are joined. When constructing a proof bottom-up, this creates the problem of guessing A (since it does not appear at all below). This issue is addressed in the theorem of cut-elimination.
The second rule that is somewhat special is the axiom of identity (I). The intuitive reading of this is obvious: A proves A.
An example derivation
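As an example, this is the sequent derivation of |- A∨¬A, known as the law of excluded middle (tertium non datur in Latin). The steps are listed from top to bottom; each sequent is obtained from the one before it by the rule named on its line.
- (I): A |- A
- (¬R): |- ¬A, A
- (∨R2): |- A∨¬A, A
- (PR): |- A, A∨¬A
- (∨R1): |- A∨¬A, A∨¬A
- (CR): |- A∨¬A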
This derivation also emphasizes the strictly formal structure of a syntactic calculus. For example, the right logical rules as defined above always act on the first formula on the right-hand side of the sequent, so that the application of (PR) is formally required. This very rigid reasoning may at first be difficult to understand, but it forms the very core of the difference between syntax and semantics in formal logics. Although we know that we mean the same with the formulae A∨¬A and ¬A∨A, a derivation of the latter would not be equivalent to the one that is given above. However, one can make syntactic reasoning more convenient by introducing lemmas, i.e. predefined schemes for achieving certain standard derivations. As an example one could show that the following is a legal transformation:
from Γ |- A∨B, Δ infer Γ |- B∨A, Δ
Once a general sequence of rules is known for establishing this derivation, one can use it as an abbreviation within proofs. However, while proofs become more readable when using good lemmas, lemmas can also make the process of derivation more complicated, since there are more possible choices to be taken into account. This is especially important when using proof theory (as often desired) for automated deduction.
Structural rules
The structural rules deserve some additional discussion. The names of the rules are Weakening (W), Contraction (C), and Permutation (P). Contraction and Permutation ensure that neither the order (P) nor the multiplicity of occurrences (C) of elements of the sequences matters. Thus, instead of sequences, one could also consider sets.
The extra effort of using sequences, however, is justified since part or all of the structural rules may be omitted. Doing so, one obtains the so-called substructural logics.
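The effect of the structural rules on contexts can be sketched by comparing three views of the same formulae: as sequences, as multisets (order ignored, as licensed by permutation), and as sets (order and multiplicity ignored). The helper names below are hypothetical. Note that contraction only removes duplicates; re-introducing a duplicate uses weakening.

```python
from collections import Counter

def equal_as_multisets(xs, ys):
    """Equal up to permutation: same formulae with the same multiplicities."""
    return Counter(xs) == Counter(ys)

def equal_as_sets(xs, ys):
    """Equal up to permutation and multiplicity."""
    return set(xs) == set(ys)

gamma1 = ["A", ("and", "A", "B"), "A"]
gamma2 = [("and", "A", "B"), "A", "A"]   # a permutation of gamma1
gamma3 = ["A", ("and", "A", "B")]        # gamma1 with one copy of A contracted

print(gamma1 == gamma2)                       # False: different as sequences
print(equal_as_multisets(gamma1, gamma2))     # True: permutation suffices
print(equal_as_multisets(gamma1, gamma3))     # False: multiplicities differ
print(equal_as_sets(gamma1, gamma3))          # True: contraction bridges the gap
```

In a substructural logic without contraction, for instance, the multiset view remains appropriate while the set view does not.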
Properties of the system LK
This system of rules can be shown to be both sound and complete with respect to first-order logic, i.e. a statement A follows semantically from a set of premisses Γ (Γ |= A) iff the sequent Γ |- A can be derived by the above rules.
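In the sequent calculus, the rule of cut is admissible: every sequent that can be derived using (Cut) can also be derived without it. This result is known as the cut-elimination theorem and is also referred to as Gentzen's Hauptsatz ("Main Theorem").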
Modifications of the system
The above rules can be modified in various ways without changing the essence of the system LK. All of these modifications may still be called LK.
First of all, as mentioned above, the sequents can be viewed as consisting of sets or multisets. In this case, the rules for permuting and (when using sets) contracting formulae are obsolete.
The rule of weakening becomes admissible when the axiom (I) is changed so that any sequent of the form Γ, A |- A, Δ can be concluded. This means that A proves A in any context. Any weakening that appears in a derivation can then be performed right at the start. This may be a convenient change when constructing proofs bottom-up.
Independently of these, one may also change the way in which contexts are split within the rules: in the cases (∧R), (∨L), and (→L), the left context is somehow split into Γ and Σ when going upwards. Since contraction allows for the duplication of these, one may assume that the full context is used in both branches of the derivation. By doing this, one ensures that no important premisses are lost in the wrong branch. Using weakening, the irrelevant parts of the context can be eliminated later.
All of these changes yield equivalent systems in the sense that every derivation in LK can effectively be transformed into a derivation using the alternative rules and vice versa.
The system LJ
Surprisingly, some small changes in the rules of LK suffice to turn it into a proof system for intuitionistic logic. To this end, one has to restrict to intuitionistic sequents (i.e. the right contexts are eliminated) and modify the rule (∨L) as follows:
(∨L): from Γ, A |- C and Σ, B |- C infer Γ, Σ, A∨B |- C
where C is an arbitrary formula.
The resulting system is called LJ. It is sound and complete with respect to intuitionistic logic and admits a similar cut-elimination proof.