1 Introduction
For the logical inferentialist, logical expressions must at least have harmonious introduction and elimination rules. Canonically, rules are harmonious when the elimination rule “follows from” the introduction rule. This notion has met our expectations for first-order logic (FOL), where (at least) the quantifier and binary connective rules are harmonious and, thereby, logical. Nevertheless, identity remains contentious, with competing claims about the possibility of harmonious rules. While Read [Reference Read23] (see also [Reference Read24]) and Klev [Reference Klev13] have argued that their provided rules for identity are harmonious, Griffiths and Ahmed [Reference Griffiths5, Reference Griffiths and Ahmed7] argue that the rules fail appropriate harmony tests. While the latter cast doubt on the possibility of harmonious rules for identity, the more pressing question is why the inferentialist should expect such rules in the first place.
I argue that the inferentialist has no good reason to expect identity to be harmonious. I begin by reviewing the literature on the harmony of identity (§2). In §3 I distinguish three inferential roles played by identity in the (putatively logical) language of first-order logic with identity ($\text{FOL}^{=}$): variable coordination, definitional substitutability, and co-reference.Footnote 1
In §4.1 I introduce a Gentzen-style natural deduction system for Wittgensteinian predicate logic ($\text{N}^{W}$) based on the sequent calculus given by Wehmeier [Reference Wehmeier32] ($\text{S}^{W}$). $\text{N}^{W}$ is important here because it tracks variable coordination without using identity. By showing that the rules for $\text{N}^{W}$ are harmonious (indeed, that $\text{N}^{W}$ is normalizable) (§4.2), I establish that variable coordination is harmonious but that identity qua variable coordination is inferentially superfluous. For this reason, we cannot infer from identity’s use in variable coordination that identity per se is a logical expression. Finally (§5), I argue that recent attempts to demonstrate the harmony of definitional substitution and co-reference actually serve as evidence that they are not logical on the inferentialist account. In sum, then, there is no reason for the inferentialist to expect identity to admit of harmonious rules, let alone be logical.
2 The current status of identity
The status of identity has long been a thorn in the side of logical inferentialists. In following Wittgenstein’s edict that meaning is use, inferentialists have hoped to account for the meaning of expressions by the role that they play in inferential use. A natural place to mete out inferential use is proof theory, specifically Gentzen-style natural deduction, where expressions are characterized by rules for their introduction and elimination in derivations. Insofar as they characterize the inferential use of expressions, these rules thereby serve to characterize their meaning. Yet some of these expressions we believe to be special, in the sense that we can make arbitrary, but uniform, substitutions for the other particles in a derivation while still preserving its validity; we call these particles logical (specifically, inferentially logical). Of course, this requires specification of a notion of validity, for which purpose a notion of harmony has been developed: intuitively, introduction and elimination rules are harmonious when the elimination rules “draw no more and no less from an assertion than the introduction rules warrant” [Reference Read21, p. 115].Footnote 2
This notion has borne out intuitions about (at least intuitionistic) FOL. Following Griffiths [Reference Griffiths and Ahmed7, pp. 1450–1451], the harmony of an expression $\#$ requires the following of the introduction ($\#$I) and elimination rules ($\#$E):Footnote 3
- #-reduction The $\#$E license us to draw a $\#$-free conclusion Q from a $\#$-involving assertion P together with any side-premises, only if we could already have inferred Q from those side-premises together with any $\#$-free ground from which $\#$I licensed us in inferring P in the first place.
- #-expansion The $\#$E license us to draw a $\#$-free conclusion Q from a $\#$-involving assertion P together with any side-premises, if we could already have inferred Q from those side-premises together with any $\#$-free ground from which $\#$I licensed us in inferring P in the first place.
The standard rules for conjunction, for example, meet this demand. If we can derive Q from $A\wedge B$, we can derive Q from the grounds for $A\wedge B$, i.e., from A and B ($\wedge$-reduction). Conversely, we can derive Q from $A\wedge B$ if we could have derived it from A and B ($\wedge$-expansion). At least in this case, harmony is easy to see just from the presentation of the rules in Gentzen-style natural deduction:
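In Prawitz-style notation, the conjunction rules and the $\wedge$-reduction they induce can be sketched as follows (my reconstruction, not the paper’s own display; $\mathcal{D}_1$ and $\mathcal{D}_2$ are derivations of A and B):

```latex
% Conjunction rules:
\[
\frac{A \qquad B}{A \wedge B}\;{\wedge}\text{I}
\qquad
\frac{A \wedge B}{A}\;{\wedge}\text{E}_1
\qquad
\frac{A \wedge B}{B}\;{\wedge}\text{E}_2
\]
% ∧-reduction: an introduction immediately followed by an elimination
% collapses to the derivation of the relevant conjunct:
\[
\frac{\dfrac{\begin{array}{c}\mathcal{D}_1\\ A\end{array}
      \qquad
      \begin{array}{c}\mathcal{D}_2\\ B\end{array}}
      {A \wedge B}\;{\wedge}\text{I}}
     {A}\;{\wedge}\text{E}_1
\quad \rightsquigarrow \quad
\begin{array}{c}\mathcal{D}_1\\ A\end{array}
\]
```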

That is, to generate $\wedge$-reductions we need only excise the sub-derivation sitting above Q, and to generate $\wedge$-expansions we need only place the original derivation in that same slot. Similarly, the standard rules for the universal and existential quantifiers, disjunction, and the conditional satisfy $\#$-reduction and $\#$-expansion.
But now consider the standard rules for identity:
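As standardly presented (my reconstruction of the display), these are the reflexivity rule and Leibniz’s Law:

```latex
% (Refl): identity is introduced only in the homonymous form a = a.
% (LL): from a = b and F, infer the result of substituting b for a in F.
\[
\frac{}{a = a}\;\text{(Refl)}
\qquad\qquad
\frac{a = b \qquad F}{F^{a}_{b}}\;\text{(LL)}
\]
```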

where F is a formula in which a is free and $F^a_b$ is the formula obtained by replacing in F any occurrence of a with b. Are these rules for identity harmonious? It seems not. Suppose (LL) licenses us to draw the conclusion $F^a_b$ from $a=b$ (the $=$-involving assertion P) and F (the $=$-free side premise). $=$-reduction and $=$-expansion require that there is some way to derive $a=b$ from (Refl)—but this is obviously not possible. This suggested to Read [Reference Read23, p. 115] and Klev [Reference Klev13, p. 869] that (Refl) and (LL) fail $\#$-reduction, since we can draw $F^a_b$ from the latter without its occurrence as a ground for $a=b$. Intuitively, (Refl) needed to be replaced with a rule licensing the introduction of heteronymous identity statements like $a=b$, and the substitutability property embodied in (LL) should serve as a ground for their introduction.
Read [Reference Read23] therefore proposed new I-rules which aimed to derive such mixed-identity statements from substitutability. Yet these were problematic. Read proposed the following new introduction rules, called Congruence and Congruence$'$:

where P is a predicate variable ranging over monadic predicates. Intuitively, this solved the problem: (Refl) did not allow for the introduction of mixed-identity statements, specifically on substitutability grounds, but (Congr) and (Congr$'$) do precisely this. Yet this ‘solution’ is odd because, in fact, it is impossible in FOL to derive $Pb$ from $Pa$, for a and b distinct. This implies that (Congr) and (Congr$'$) are inferentially equivalent to (Refl) [Reference Griffiths5, Reference Griffiths and Ahmed7].Footnote 4 The equivalence of these introduction rules is a problem for harmony because it means we still don’t have grounds corresponding to the (non-trivial) substitutability observed in the elimination rule.Footnote 5
Likewise, Klev [Reference Klev13] proposed new introduction rules meant to capture the substitutability property we observe in the E-rule. The core of Klev’s new proposal is the observation that identity statements are often justified by theoretical definitions. For example, in the language of set theory, we define arithmetical expressions, like ‘$1$’, using set-theoretically familiar terms (e.g., $\emptyset$, $\{\}$) so that identities like ‘$1=\{\emptyset\}$’ are licensed. Klev claims that these definitional equivalences suffice to justify (Refl) and (LL), and he characterizes a collection of introduction rules for definitional equivalence that suffice for the derivation of mixed-identity statements. Insofar as these warrant the introduction of mixed-identity statements on the grounds of definitional substitutability, Klev’s rules appear to satisfy $=$-reduction.
Nevertheless, Klev’s new I-rules for definitional equivalence create a new (see fn. 5) problem for the harmony of identity, as Griffiths and Ahmed [Reference Griffiths and Ahmed7, pp. 1466–1467] argue. Roughly, the problem is this: with mixed-identity statements deriving canonically (or “deriving”—see [Reference Griffiths and Ahmed7, fn. 26]) from definitional identities, we can now ask whether the definitional equivalence of a and b can be derived from $a=b$, i.e., whether (LL) now suffices for $=$-expansion. It does not, and informally this is because substitution of definitionally equivalent terms in a formula is only allowed when that formula has not been derived by invocation of definitional equivalence. More bluntly, as Griffiths and Ahmed [Reference Griffiths and Ahmed7, p. 1467] point out, if we could derive the definitional equivalence of a and b from $a=b$, then identity just is definitional equivalence. At least on the usual understanding of identity as context-free—where the meaning does not vary with the domain of discourse—this can’t be correct.
With the failure of Read’s and Klev’s rules in the rearview mirror, Griffiths and Ahmed see little prospect for a harmonious account of identity. As they sum up the situation, it is unclear why we should expect identity to be harmonious [Reference Griffiths and Ahmed7, p. 1468]:
There are many grounds on which we assert identities: consider the grounds on which we assert, e.g., the identity of sets, of numbers, of rivers and of persons. Why think these can be captured in simple rules like those governing conjunction? And why think that grasp of such rules is necessary for understanding identity? If anything is necessary for that, we suspect that it is Leibniz’s Law; and that the only thing uniting the open-ended set of grounds for identity statements is not that they all instantiate some schema but rather that they justify that elimination-rule. For instance, our grounds for ‘A is the same person as B’ ought to justify the inference from ‘A wore a hat at t’ to ‘B wore a hat at t’. But then there may be no tidy introduction rules for identity, and no prospect of establishing harmony between them and the elimination rule; so by inferentialist criteria identity is not a logical constant.
But, as they note, this is “little more than a gesture” toward where more inquiry is needed. This is, of course, not enough to justify calling off the search for harmonious rules.
3 The uses of identity
But the search should nevertheless be called off. We can deliver this verdict by considering what, if anything, unites the various uses of identity rather than what unites the grounds for identity statements, as Griffiths and Ahmed suggest. As a frame, let us consider identity in $\text{FOL}^{=}$. If we assume the language has bound variables (x, y, etc.), terms containing functional terms (f, $g$, etc.), and constants (c, $d$, etc.), we can see identity is flanked in six ways, i.e.:

- (1) $x=y;$
- (2) $x=c;$
- (3) $x=f;$
- (4) $c=d;$
- (5) $c=f;$
- (6) $f=g$.
Do we have reason to believe each of these uses is logical? Let’s consider them in turn.
The first two uses of identity are often considered logical because identity “allows us to capture a large range of valid arguments” that are not otherwise expressible [Reference Griffiths5, p. 506]. These valid arguments include expressions like ‘there are at most…’, ‘there are at least…’, and numerically definite quantification. To express ‘there are at least two objects’ in $\text{FOL}^{=}$, for example, we may write $\exists x \exists y\, x\neq y$; likewise, to express ‘there is something other than c’ we may write $\exists x\, x\neq c$. Thus, insofar as we believe there are valid arguments involving expressions such as these, identity’s use suggests it is logical. Call this use of identity variable coordination for the way it coordinates variable assignment (extensionally speaking).
But while variable coordination may, in fact, be logical, it does not require identity. Hintikka [Reference Hintikka9] showed this a half-century ago by translating $\text{FOL}^{=}$ into an equivalent, identity-free notation.Footnote 6 The characteristic feature of this alternative notation—call it Wittgensteinian logic, or W-logic for short—is that variables are no longer interpreted inclusively but rather exclusively. Consider, for example, the sentence ‘Any two points of a straight line completely determine that line’, which we take to be a truth of geometry [Reference Hintikka9, p. 225]. Intuitively, the logical structure of this sentence is something like ‘$\forall x \forall y ((Px \wedge Py) \rightarrow Lxy)$’. On the inclusive interpretation x and y can each refer to the same point p; but since two distinct points are required to determine a line, our translation is not true on every interpretation. It is true on the exclusive interpretation, however, because x and y cannot refer to the same point. Thus, W-logic handles variable coordination via the quantifier semantics rather than by the use of identity in the syntax, as in $\text{FOL}^{=}$.
Variable coordination is the only essential use $\text{FOL}^{=}$ makes of identity when the language has only variables and constants that do not co-refer (see [Reference Wehmeier34]). Since W-logic is co-expressive with $\text{FOL}^{=}$ in this setting, identity is expressively superfluous. Nevertheless, this does not imply that identity is superfluous for the logical inferentialist. Were W-logic to fail the inferentialist’s tests for logicality, but $\text{FOL}^{=}$ to pass (where $=$ is restricted to its variable coordination role in $\text{FOL}^{=}$), this would be evidence that identity is necessary to capture variable coordination’s (presumed) logicality.

Unfortunately for identity, W-logic meets the logical inferentialist’s conditions for logicality, as the next section shows. Thus while variable coordination may be logical, this does not count in favor of identity’s logicality.
4 The harmony of W-logic
The guiding semantic idea behind W-logic is to interpret variables exclusively, i.e., interpret distinct variables as picking out distinct objects. This contrasts with the (standard) inclusive interpretation of variables, where distinct variables can pick out the same object. What follows makes this exclusive interpretation of variables precise by adapting Wehmeier’s formulation of W-logic to the setting of Gentzen-style natural deduction.
We begin with the language $\mathcal{L}$ of W-logic, which coincides with the language of FOL. Let $\rightarrow$, $\vee$, $\wedge$ be the usual binary propositional connectives and $\forall$, $\exists$ be the usual quantifier symbols. (I leave the reader to substitute their preferred rules for $\neg$.) Further, let $\mathcal{X}=\{x_i \mid i \in \mathbb{N}\}\cup \{x,y\}$ be the set of countably-many bound variables, $\mathcal{A}=\{a_i \mid i \in \mathbb{N}\}\cup \{a,b\}$ be the set of countably-many individual constants, and let $\mathcal{P}=\{P^n_{i} \mid i \in \mathbb{N}\}$ be the set of countably-many predicate symbols for every arity $n \geq 1$. (Note that the language does not contain function symbols or the identity symbol.) We then inductively define the formulas of W-logic as follows:
- $P^n_{i}(a_0, \dotsc , a_{n-1})$ are formulas, for $P^n_{i} \in \mathcal{P}$ and $a_0, \dotsc , a_{n-1} \in \mathcal{A}$;
- whenever F and G are formulas, then $\neg F$, $(F \rightarrow G)$, $(F \vee G)$, and $(F \wedge G)$ are formulas;
- whenever F is a formula, $a \in \mathcal{A}$, and $x \in \mathcal{X}$ doesn’t occur in F, then $\forall xF_a[x]$ and $\exists xF_a[x]$ are formulas (where for any formula F, $F_s[t]$ is the result of replacing all occurrences of the term s in F with the term t).
With a few more definitions we can define the W-logical correlate of the usual inclusive-variable, model-theoretic semantics for FOL. As usual, a structure $\mathcal{U}=\langle U, \langle P^{n}_{\mathcal{U}}\rangle \rangle$, where the domain, U, is a nonempty set, and for each n-ary predicate symbol $P^n$ of $\mathcal{L}$, $P^{n}_{\mathcal{U}}$ is an n-ary relation over U. A $\mathcal{U}$-assignment maps the individual constants into U. W-logic departs from FOL with its definition of satisfaction, where only $\mathcal{U}$-assignments 1–1 on the individual constants of a formula are considered. Let A, F, and G be formulas in $\mathcal{L}$, and let IC(A) be the set of individual constants occurring in a formula A. Let $\sigma$ be a $\mathcal{U}$-assignment, and $\sigma \{a:=u\}$ be the $\mathcal{U}$-assignment generated from $\sigma$ by additionally mapping a to $u \in U$. Recursively define W-satisfaction of A by $\sigma$ on $\mathcal{U}$ (write $\mathcal{U} \Vdash A [\sigma]$) as follows, where $\sigma$ is 1–1 on IC(A):
- $\mathcal{U} \Vdash P^{n}(a_0, \dotsc , a_{n-1}) [\sigma]$ iff $\langle \sigma (a_0), \dotsc , \sigma (a_{n-1}) \rangle \in P^{n}_{\mathcal{U}}$;
- $\mathcal{U} \nVdash \bot [\sigma]$;
- $\mathcal{U} \Vdash F \vee G [\sigma]$ iff $\mathcal{U} \Vdash F [\sigma]$ or $\mathcal{U} \Vdash G [\sigma]$;
- $\mathcal{U} \Vdash F \rightarrow G [\sigma]$ iff $\mathcal{U} \nVdash F [\sigma]$ or $\mathcal{U} \Vdash G [\sigma]$;
- $\mathcal{U} \Vdash F \wedge G [\sigma]$ iff $\mathcal{U} \Vdash F [\sigma]$ and $\mathcal{U} \Vdash G [\sigma]$;
- $\mathcal{U} \Vdash \forall x F_a[x] [\sigma]$ iff $\mathcal{U} \Vdash F [\sigma \{a:=u\}]$ for all $u \notin \sigma [IC(\forall x F_a[x])]$;
- $\mathcal{U} \Vdash \exists x F_a[x] [\sigma]$ iff $\mathcal{U} \Vdash F [\sigma \{a:=u\}]$ for some $u \notin \sigma [IC(\exists x F_a[x])]$.
We can now use W-satisfaction to define W-logical validity, truth, and logical consequence in the usual way, where A is an $\mathcal{L}$-formula and $\Gamma$ is a set of $\mathcal{L}$-formulas:

- A is W-valid in $\mathcal{U}$, $\mathcal{U} \Vdash A$, if $\mathcal{U} \Vdash A [\sigma]$ for every $\mathcal{U}$-assignment $\sigma$ 1–1 on IC(A);
- A is W-true in $\mathcal{U}$ if A is W-valid in $\mathcal{U}$ and A is a sentence (i.e., $\text{IC}(A)=\emptyset$);
- A is W-valid, $\Vdash A$, if $\mathcal{U} \Vdash A$ for all $\mathcal{U}$;
- and A is a logical consequence of $\Gamma$, $\Gamma \Vdash A$, if for every structure $\mathcal{U}$, if $\mathcal{U} \Vdash \Gamma [\sigma]$ for $\sigma$ 1–1 on IC($\Gamma ,A$), then $\mathcal{U} \Vdash A[\sigma]$.
It can be shown (see the Appendix, Theorem A.1), as is usual, that individual constants are schematic in these definitions.
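The satisfaction clauses above can be checked mechanically on finite structures. The sketch below is my own encoding, not the paper’s: formulas are nested tuples, a quantifier is evaluated by substituting a fresh individual constant for its bound variable, and the names (`w_sat`, `IC`, `subst`) are illustrative.

```python
from itertools import count

# Formulas as nested tuples:
#   ('atom', 'P', ('a', 'b'))                        atomic P(a, b)
#   ('not', F) / ('and', F, G) / ('or', F, G) / ('imp', F, G)
#   ('all', 'x', F) / ('ex', 'x', F)                 quantifiers
# Terms bound by a quantifier are variables; all other atom terms
# count as individual constants.  Distinct quantifiers are assumed to
# bind distinct variables, matching the formation rules.

def IC(phi, bound=frozenset()):
    """Individual constants of phi (atom terms not bound within phi)."""
    k = phi[0]
    if k == 'atom':
        return {t for t in phi[2] if t not in bound}
    if k == 'not':
        return IC(phi[1], bound)
    if k in ('and', 'or', 'imp'):
        return IC(phi[1], bound) | IC(phi[2], bound)
    return IC(phi[2], bound | {phi[1]})              # 'all' / 'ex'

def subst(phi, x, c):
    """Replace the variable x by the constant c throughout phi."""
    k = phi[0]
    if k == 'atom':
        return ('atom', phi[1], tuple(c if t == x else t for t in phi[2]))
    if k == 'not':
        return ('not', subst(phi[1], x, c))
    if k in ('and', 'or', 'imp'):
        return (k, subst(phi[1], x, c), subst(phi[2], x, c))
    return (k, phi[1], subst(phi[2], x, c))

_fresh = count()

def w_sat(U, interp, sigma, phi):
    """U ⊩ phi [sigma]: the W-satisfaction clauses.  A quantified
    variable ranges only over elements not already named by one of
    phi's individual constants (the exclusive interpretation)."""
    k = phi[0]
    if k == 'atom':
        return tuple(sigma[t] for t in phi[2]) in interp[phi[1]]
    if k == 'not':
        return not w_sat(U, interp, sigma, phi[1])
    if k == 'and':
        return w_sat(U, interp, sigma, phi[1]) and w_sat(U, interp, sigma, phi[2])
    if k == 'or':
        return w_sat(U, interp, sigma, phi[1]) or w_sat(U, interp, sigma, phi[2])
    if k == 'imp':
        return (not w_sat(U, interp, sigma, phi[1])) or w_sat(U, interp, sigma, phi[2])
    x, body = phi[1], phi[2]
    used = {sigma[c] for c in IC(phi)}               # image of phi's constants
    a = f'_c{next(_fresh)}'                          # fresh constant for x
    vals = (w_sat(U, interp, {**sigma, a: u}, subst(body, x, a))
            for u in U - used)
    return all(vals) if k == 'all' else any(vals)
```

On this encoding, ‘there are at least two objects’ needs no identity: $\exists x \exists y (Rxy \vee \neg Rxy)$ comes out W-satisfied exactly on domains with at least two elements, since the inner quantifier has no admissible witness otherwise.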
4.1 Natural deduction rules of W-logic
It may surprise the reader that the requirement that variable assignments be 1–1 changes very little about the natural-deduction system for W-logic ($\text{N}^{W}$) compared to that for FOL ($\text{N}^{FOL}$). The only potential barrier to the W-soundness of the I and E rules is that the newly generated derivations might not preserve 1–1-ness of variable assignments. This could happen by (i) insufficient restrictions on the individual constants in the quantifier rules or (ii) losing at least one individual constant in the transformation from the old derivation to the new derivation by use of the propositional rules (when the set of individual constants of the new derivation is non-empty and there is more than one individual constant in the old derivation; 1–1-ness is trivial when the old derivation contains no individual constants and the new derivation exactly one individual constant). But (i) and (ii) do not happen, as we will see.Footnote 7
The propositional rules for $\text{N}^{W}$ are just the same as those for $\text{N}^{FOL}$. This is because the usual propositional rules turn out to have the property of variable containment. That is, given derivations of F from $\Gamma$ and C from F and $\Delta$, the composite proof of C from $\Gamma$ and $\Delta$ is such that $IC(F)\subseteq IC(\Gamma , \Delta , C)$.Footnote 8 This rules out (ii). With this established, the W-soundness of the propositional rules is obvious; indeed, only the elimination rules need to be considered since the conclusion of an introduction rule contains its premise(s) as (a) subformula(e). Consider, for instance, $\vee$-E:
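A standard rendering of the rule (my reconstruction of the display):

```latex
% ∨-E: C is derived from each disjunct in turn.
\[
\frac{A \vee B \qquad
      \begin{array}{c} [A] \\ \vdots \\ C \end{array} \qquad
      \begin{array}{c} [B] \\ \vdots \\ C \end{array}}
     {C}\;\vee\text{E}
\]
```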

Assume $\Gamma \Vdash A\vee B$, $\Delta , A \Vdash C$, and $\Delta , B \Vdash C$. Suppose $\Gamma , \Delta \nVdash C$, i.e., $\mathcal{U} \Vdash \Gamma [\sigma]$, $\mathcal{U} \Vdash \Delta [\sigma]$, and $\mathcal{U} \nVdash C [\sigma]$ for some $\mathcal{U}$-assignment $\sigma$ that is 1–1 on $IC(\Gamma ,\Delta , C)$. Then since $IC(A,B) \subseteq IC(\Gamma , \Delta , C)$ (by variable containment), $\sigma$ is 1–1 on $IC(\Gamma , \Delta , A, B, C)$. But then either $\mathcal{U} \nVdash \Gamma [\sigma]$ or $\mathcal{U} \nVdash \Delta [\sigma]$, contradicting our assumptions. So there is no such $\sigma$, hence $\Gamma , \Delta \Vdash C$.
It remains only to show that the $\text{N}^{W}$ quantifier rules preserve 1–1-ness of variable assignments. First, consider the universal quantifier. We define $\forall$-I as the rule that transforms derivations of $F(a)$ from $\Gamma$, where $a \notin IC(\Gamma)$, together with derivations of $F_a[b]$ from $\Gamma$ for each $b \in IC(\Gamma)\setminus IC(\forall x F_a[x])$, into a derivation of $\forall x F_a[x]$ from $\Gamma$, i.e.:
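Spelled out schematically (my reconstruction from the description, not the paper’s own display), the rule looks like this:

```latex
% ∀-I in N^W: besides the usual premise F(a) with a ∉ IC(Γ), a side
% derivation of F_a[b] from Γ is required for each constant
% b ∈ IC(Γ) \ IC(∀xF_a[x]).
\[
\frac{F(a) \qquad
      \bigl\{\, F_a[b] \;:\; b \in IC(\Gamma)\setminus IC(\forall x\, F_a[x]) \,\bigr\}}
     {\forall x\, F_a[x]}\;\forall\text{I}
\qquad (a \notin IC(\Gamma))
\]
```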

This requirement of derivations of $F_a[b]$ from $\Gamma$ for each $b \in IC(\Gamma)\setminus IC(\forall x F_a[x])$ is peculiar to $\text{N}^{W}$, but it is required to ensure that the rule is W-sound (see also [Reference Wehmeier32, pp. 4–5]). In essence, the additional derivations mirror the fact that the bound variable x in $\forall x F(x)$ ranges over all elements u of the domain not already mapped to by one of the individual constants in $\forall x F(x)$. This is unlike FOL, where x ranges over all elements of the domain. Since W-logic tracks distinct elements by distinct individual constants, we must consider whether u has already been mapped to by some individual constant $b\in IC(\Gamma)$, and, in such a case, we must independently establish $F(b)$ for each b.
More explicitly: suppose that $\mathcal{U} \nVdash \forall x F(x) [\sigma]$ for some $\sigma$ that is 1–1 on $IC(\Gamma , \forall x F(x))$. This means there is an element u of U not in the image of $\sigma$ on the individual constants of $\forall x F(x)$ such that, were we to extend $\sigma$ by mapping $c \notin IC(\forall x F(x))$ to this u, i.e., $\sigma \{c:=u\}$, then $\mathcal{U} \nVdash F(c) [\sigma \{c:=u\}]$. That is, one of the semantic prerequisites must be violated. For the introduction rule to be sound, we thus must lack a prerequisite derivation. If u is not in the image of $\sigma$ restricted to $IC(\Gamma)$, then we will lack a derivation of $F(a)$, the condition familiar from $\text{N}^{FOL}$. This matches the semantics: $\sigma \{a:=u\}$ is 1–1 on $IC(\Gamma , F(a))$ since $a\notin IC(\Gamma , \forall x F(x))$, so we have $\mathcal{U} \nVdash F(a) [\sigma \{a:=u\}]$. Otherwise, $\sigma$ maps some $b\in IC(\Gamma)\setminus IC(\forall x F(x))$ to u, so that $\mathcal{U} \nVdash F(b) [\sigma]$. But then we lack a derivation of $F(b)$, i.e., one of the additional prerequisite derivations in $\text{N}^{W}$.
The definition of $\forall$-E is nearly the same as for $\text{N}^{FOL}$: given derivations of $\forall x F$ from $\Gamma$ and of C from $F_x[a'], \Delta$, where $a' \in IC(\Gamma ,\Delta)\setminus IC(\forall x F)$, or $IC(\Gamma , \Delta)=\emptyset$ and $\|IC(C)\|\leq 1$, the following is a derivation of C from $\Gamma , \Delta$:
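Schematically (my reconstruction from the description, not the paper’s display):

```latex
% ∀-E in N^W, in general-elimination form: the minor premise C is
% derived from F_x[a'] together with side assumptions Δ.
\[
\frac{\forall x\, F \qquad
      \begin{array}{c} [F_x[a']] \\ \vdots \\ C \end{array}}
     {C}\;\forall\text{E}
\]
% side condition: a' ∈ IC(Γ, Δ) \ IC(∀xF), or
% IC(Γ, Δ) = ∅ and ‖IC(C)‖ ≤ 1.
```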

The added condition is the natural one for the exclusive interpretation of variables, namely, that the individual constant we substitute does not already occur in $\forall x F$. The left disjunct captures this in the case that there are already individual constants available in the deduction, while the right disjunct captures this in the case where there are none.
It is worth pausing for a moment to reflect on the kind of inferences ruled out by the new condition on $\forall$-E in $\text{N}^{W}$. As noted, these restrictions highlight the difference in meaning of quantified statements. What I want to explicitly note is that they rule out seemingly natural inferential transitions, such as from $\forall xy (Rxy \wedge \neg Rxy)$ to $\exists xy(Rxy \wedge \neg Rxy)$ by repeated $\forall$-E (where $\Delta =\emptyset$). Semantically speaking, such transitions are not sound because there is no way to extend the variable assignment in a 1–1 way; to put it another way, W-validity has a close relationship to domain cardinality. In this case, while $\forall xy (Rxy \wedge \neg Rxy)$ is W-valid in cardinality-1 domains, thereby licensing the inference to $\forall y(Ray \wedge \neg Ray)$ because it, too, is W-valid in cardinality-1 domains, there is no individual constant substitution, call it b, for y that would license the transition to $Rab \wedge \neg Rab$—either the variable assignment maps b to the same element as a, i.e., fails to be 1–1 because the domain has cardinality 1, or we extend the domain to cardinality $>1$ to get a 1–1 variable assignment and, in the process, make $\forall xy (Rxy \wedge \neg Rxy)$ W-invalid.Footnote 9 Proof-theoretically speaking, we prevent such inferences by ensuring that the individual constant already occurs on the branch in which it is instantiated, or else that it is the only individual constant on that branch. This also suffices for variable containment, hence the soundness of $\forall$-E.
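The cardinality point can be checked directly. A minimal sketch (my own encoding, not the paper’s): evaluate both quantified formulas over a one-element domain, reading the nested distinct variables exclusively, with R empty.

```python
# Exclusive (W-logic) evaluation of ∀x∀y(Rxy ∧ ¬Rxy) and
# ∃x∃y(Rxy ∧ ¬Rxy) over a one-element domain with R empty.  Under the
# exclusive reading, the inner variable ranges over the domain minus
# the value of the outer variable.
U = {0}
R = set()

def body(x, y):
    return (x, y) in R and (x, y) not in R       # a contradiction: always False

univ = all(all(body(x, y) for y in U - {x}) for x in U)   # inner range empty
exis = any(any(body(x, y) for y in U - {x}) for x in U)   # no 1-1 witness

print(univ, exis)
```

So the universal claim is W-valid on the one-element domain (vacuously, since the inner variable has an empty range) while its existential counterpart is not W-satisfiable there, which is exactly why $\forall$-E may not instantiate y with a fresh constant.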
Now consider the existential quantifier. $\exists$-I is the transformation taking a derivation of $Fa$ from $\Gamma$ to a derivation of $\exists x F_a[x]$ provided either that $a\in IC(\Gamma)$ or that $IC(\Gamma , \exists x F_a[x])=\emptyset$, i.e.:

Likewise, $\exists$-E is the transformation taking a derivation of $\exists x F$ from $\Gamma$ and a derivation of C from $\Delta$ and $F_x[a]$ to a derivation of C from $\Gamma$ and $\Delta$ when $a\notin IC(\exists x F, \Delta , C)$ and either (1) $IC(\exists x F)\subseteq IC(\Gamma , \Delta , C)$ and $a\in IC(\Gamma)$ or (2) $IC(\Gamma ,\Delta ,\exists x F, C)=\emptyset$, i.e.:
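Schematically (again my reconstruction from the descriptions, not the paper’s displays):

```latex
% ∃-I and ∃-E in N^W.
\[
\frac{F(a)}{\exists x\, F_a[x]}\;\exists\text{I}
\qquad\qquad
\frac{\exists x\, F \qquad
      \begin{array}{c} [F_x[a]] \\ \vdots \\ C \end{array}}
     {C}\;\exists\text{E}
\]
% ∃-I side condition: a ∈ IC(Γ) or IC(Γ, ∃xF_a[x]) = ∅.
% ∃-E side conditions: a ∉ IC(∃xF, Δ, C) and either
% (1) IC(∃xF) ⊆ IC(Γ, Δ, C) with a ∈ IC(Γ), or (2) IC(Γ, Δ, ∃xF, C) = ∅.
```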

As with $\forall$-E, the additional condition on $\exists$-E in $\text{N}^{W}$ is owed to the exclusive interpretation of variables. The condition also suffices for variable containment, and hence the soundness of $\exists$-E. Thus, since neither the quantifier rules (barrier (i)) nor the propositional rules (barrier (ii)) can fail to preserve 1–1-ness of variable assignments, the rules for $\text{N}^{W}$ are sound.
We are now in a position to determine whether the introduction and elimination rules for W-logic are harmonious.
4.2 Harmony of W-logic
The rules for W-logic are harmonious. More precisely, we can show that the elimination rules draw “no more” ($\#$-reduction) and “no less” ($\#$-expansion) than the introduction rules license. To formally establish $\#$-reduction, we need to show that we can’t use an elimination rule to draw out more than was put in by the introduction rule. It suffices, then, to show that any proof where a constant is introduced and subsequently eliminated can be converted into a proof without this detour.Footnote 10 We call proof systems exhibiting this property normalizable.
W-logic is normalizable. Normalizability is typically established in three steps.Footnote 11 First, we show that derivations with local detours—immediate applications of $\#$-E following $\#$-I—can be converted into derivations without them. Since the propositional rules for $\text{N}^{W}$ are the same as for $\text{N}^{FOL}$, the usual detour conversion procedures apply (see, e.g., [Reference Negri and Von Plato16]). We now verify that the usual detour conversion procedures also apply for the quantifiers. Suppose we are given the following derivation:

where, recall, $F_x[b]$ is derived for each b in the open assumptions above $F_x[a]$ not in $\forall x F$ and, as in $\text{N}^{FOL}$, $a'$ does not occur in the open assumptions above C. But then $a'$ can be substituted for a in the derivation of $F_x[a]$ so that the resulting derivation of $F_x[a']$ can be composed with the derivation of C from $F_x[a']$.Footnote 12
Now suppose we are given the following derivation:

where, recall, $\exists$I requires that a is in the open assumptions $\Gamma$ above $F_x[a]$ or that $IC(\Gamma , \exists x F_a[x])=\emptyset$; and $\exists$E requires that $a'\notin IC(\exists x F, \Delta , C)$ and, where $\Delta$ are the open assumptions above C, either (1) $IC(\exists x F)\subseteq IC(\Gamma , \Delta , C)$ and $a'\in IC(\Gamma)$ or (2) $IC(\Gamma ,\Delta ,C, \exists x F)=\emptyset$. But then $a'$ can be substituted for a in the derivation of $F_x[a]$ so that the resulting derivation of $F_x[a']$ can be composed with the derivation of C from $F_x[a']$. Thus, the usual local detour conversion procedures also work for $\text{N}^{W}$.
However, since natural deduction is non-local, some detours may be “hidden” by intervening rule applications. This is a problem especially with $\vee$E and $\exists$E, each of which can be used to separate introductions from eliminations. For this reason, normalization requires that such derivations admit of a permutation conversion, i.e., that we can “permute” the end of the detour (an elimination rule) up in the derivation, past the intervening use of $\vee$E or $\exists$E. Since $\vee$E is the same in $\text{N}^{FOL}$ and $\text{N}^{W}$, we need only consider $\exists$E. Suppose we are given the derivation:

It is possible that $\#$E, which has premises C and Q, corresponds to some $\#$I in the derivation of $\exists x F$, so we want to convert this into a derivation with $\#$E permuted above $\exists$E. The usual permutation conversion delivers

To verify that this is an $\text{N}^{W}$ derivation, we only need to check that the individual constants in C and Q are among the individual constants in the open assumptions above $\exists x F$ or C, or those in D (or that D is a sentence and only one individual constant occurs in the open assumptions). (I ignore relabeling of closed assumptions, though this may be required.) But if the tree given was a derivation, this is guaranteed. Thus, the usual permutation conversions also work for $\text{N}^{W}$.
This establishes the normalizability of $\text{N}^{W}$, hence its satisfaction of $\#$-reduction. It remains only to show that $\text{N}^{W}$ satisfies $\#$-expansion. Recall that $\#$-expansion captures the sense in which an elimination rule draws “no less” from its major premise than the corresponding introduction rule puts in. Again, we need only consider the quantifiers. To show this, we only need to establish that the following are derivations:

It is obvious that the left is a derivation, since $a'$ and $a''$ can be chosen to play the role of a and b in the rule schema above for $\forall$I, i.e., $a'$ is not, and $a''$ is, among the individual constants in the open assumptions of the derivation of $\forall x F$, and $a''$ is not among $IC(\forall x F)$. Similarly, the right is a derivation because $a'$ does not occur in $\exists x F$ and no individual constants are lost. This establishes the existence of expansion procedures for the $\text{N}^{W}$ quantifiers.
Thus, we have established that the rules of $\text{N}^{W}$ are harmonious in the sense that they satisfy $\#$-reduction and $\#$-expansion.
5 On what remains of identity’s use
While the last section established the logicality of variable coordination, it also undermined the use of this fact in any case for identity’s logicality. Since $\text{N}^{W}$, whose language is identity-free, suffices for variable coordination, identity is unnecessary for variable coordination. If we assume that only parts of canonical languages are apt to be logical (following, e.g., Quine [Reference Van Orman and Quine20]), then identity is not logical qua variable coordination. But perhaps the remaining uses of identity are logical. Our remaining candidates are:

- (3) $x=f;$
- (4) $a=b;$
- (5) $a=f;$
- (6) $f=g$.
We consider these in what remains.
Recall from §2 that, for the inferentialist, an expression is logical when the validity of a deduction is preserved through arbitrary but uniform substitutions for the other particles occurring in it. This puts the emphasis on deductions featuring these formulas rather than on the formulas themselves. Thus, to determine whether the remaining uses of identity are logical, we should ask of each: which deductions require that use and remain valid under arbitrary but uniform substitution for the other particles occurring in them? I will argue that each remaining use either does not require identity or else fails to remain valid under all arbitrary but uniform substitutions for the other particles occurring in a derivation. This latter failure will be especially prominent for use (4).
Consider uses (3), (5), and (6), where identity relates a function term to a variable, an individual constant, and another function term, respectively. As above, it is easy to show that these uses are expressively superfluous, if not deductively (see, e.g., [Reference Boolos, Burgess and Jeffrey1, pp. 255–256]). This is done by first observing that any sentence of
$\text {FOL}^{=}$
is logically equivalent to one in which the function symbols occur only as above, i.e., all function terms f (i) flank identity and (ii) are of the form
$f_n(x_1,\dots ,x_n),$
where
$f_n$
is an n-place function symbol. But we can simply add an $(n+1)$-place predicate, with n places corresponding to the n argument places of these function symbols and an additional place for the term to which they evaluate. For example, the commutative law of addition,
$\forall x \forall y x+y=y+x$
, can be rewritten in this way as
$\forall x \forall y \exists z(\text {Sum}(x,y,z) \wedge \text {Sum}(y,x,z))$
.Footnote
13
We then observe that sentences in the new language (with predicates instead of function symbols) are satisfiable iff the corresponding sentences in the original language are. We could call this use of identity function evaluation, since the $(n+1)$-place predicate
$\rightarrow (x_1,\dots , x_n, x)$
corresponding to
$x=f$
reads as ‘arguments
$x_1,\dots , x_n$
evaluate to x’; however, I instead call this definitional substitution, following Klev.
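The satisfiability-preserving trade of function symbols for predicates can be spot-checked on a toy finite model. The following sketch is my illustration, not from the literature: the mod-5 domain and the name `Sum` are stipulated, and the conjunctive form of the translation is used.

```python
# Illustrative finite-model check (not from the paper): a function symbol
# for addition mod 5 is traded for a hypothetical 3-place predicate Sum,
# and the commutative law holds under either formulation.
DOMAIN = range(5)

def plus(x, y):
    # The original binary function symbol, interpreted as addition mod 5.
    return (x + y) % 5

def Sum(x, y, z):
    # Sum(x, y, z) holds iff the function evaluates arguments x, y to z.
    return plus(x, y) == z

# Functional form: forall x forall y. x + y = y + x
functional_form = all(plus(x, y) == plus(y, x) for x in DOMAIN for y in DOMAIN)

# Predicate form: forall x forall y exists z. Sum(x, y, z) and Sum(y, x, z)
predicate_form = all(
    any(Sum(x, y, z) and Sum(y, x, z) for z in DOMAIN)
    for x in DOMAIN for y in DOMAIN
)

assert functional_form and predicate_form
```

Since `Sum` is total and functional by construction, the two formulations agree, which is the point of the equisatisfiability observation above.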
Of course, this will not convince the inferentialist of the non-logicality of definitional substitution. For one, it feels non-inferentialist insofar as it seems to appeal to model-theoretic semantics for its justification. For another, such reasoning does nothing to assuage the worry that transferring definitional substitution’s inferential use onto predicates will result in rules that are harmonious. However, it’s worth pausing to ask why function evaluation should be considered logical. That is, which presumed-valid inferences require definitional substitution? The literature on the harmony of identity does not directly address this question. Nevertheless, Klev’s adoption of type-theoretic thinking is suggestive on this point [Reference Klev12, sec. 5]. As motivation for his rules, Klev observes that, in mathematics, we frequently prove theorems of the form ‘
$t=u$
’ for syntactically distinct t and u [Reference Klev13, p. 868]. While Klev himself does not go so far as to say inferences to theorems of this form are valid, they are nevertheless plausible candidates. Indeed, insofar as such inferences often rely on definitions specific to the mathematical theory—say, in Peano arithmetic, that
$1$
is the successor of
$0$
,
$1=s(0)$
—it is sensible to claim that given those definitions, inference to
$t=u$
is valid. This domain-specificity can be captured through type assumptions keyed to the context of that mathematical theory. Let us provisionally accept these (type-limited) inferences to
$t=u$
as valid. Does this imply that definitional-substitution uses (write
$t=_T u$
to mean ‘t is definitionally substitutable for u in the context of theory T’) of identity should count as logical?
Inferences to formulas containing subformulas like
$t=u$
do seem valid in some sense. In particular, inferring such subformulas is unproblematic when the background theory licenses the necessary substitutions. For example, deriving
$1+1=2$
from the definitions
$s(s(0))=2$
and
$s(0)=1$
, as well as the recursive definition of addition, is obviously valid in a Peano Arithmetic that includes these definitions as part of the theory (type context). However, it is unclear how this demonstrates the inferential logicality of definitional substitution. The approach faces two related problems. First, it is unclear from the start why identity per se—that is, a context-free equivalence relation, where the meaning does not vary with the domain of discourse—should be desirable, let alone necessary, for theorems of this sort. After all, if we only require definitional substitutability claims specific to the theory (type context) as premises for deriving a conclusion of the form
$t=u$
, theorems of the form
$t=u$
(with identity per se) must be derived via an inferentially-equivalent definitional identity
$t=_T u$
constituting part of the background theory T.Footnote
14
If we are committed to the meaning of identity being captured by its inferential use, the claim should be that theorems involving expressions of definitional substitutability, not identity per se, are valid and hence definitional substitutability is inferentially logical. Indeed, it is commonly observed that statements like
$1=\{\emptyset \}$
are context-sensitive. That is, we are not motivated to believe the statement is true because we believe
$1$
is identical to
$\{\emptyset \}$
; rather, we are so motivated because we want to mimic the operations of arithmetic in set theory, and one way of doing so is to define
$1$
as
$\{\emptyset \}$
(see, e.g., [Reference Kitcher11]). At best, then, deductions featuring substitution of
$1$
for
$\{\emptyset \}$
remain valid under arbitrary but uniform substitution of the other expressions—are a logical use of identity—only when the expressions substituted for have been identified as the relevant set-theoretic correlate.
But this leads us directly to the second problem: if definitional substitutability is inferentially logical, it is not so in the same manner as the propositional connectives or quantifiers. For the latter, their (inferential) meaning is given by rules for introduction which are written in that language. In particular, the grounds for their assertion are part of the broader inferential system. Suppose that the rule(s) for introducing
$t=_T u$
with t and u distinct are part of the formal system governing the propositional connectives and quantifiers. Either the premises for
$t=_T u$
-I are written in the language, hence statable in the system, or they are not. If they are, then every derivation using
$t=_T u$
-E to derive
$Ft$
from
$Fu$
contains premises using
$=_T$
non-trivially, i.e.,
$=_T$
-I and
$=_T$
-E fail
$\#$
-reduction. (
$=_T$
-E must be structurally equivalent to LL; see Klev’s bridge principle [Reference Klev13, p. 878].) So the premises are not part of the language of the system. But this implies that the rule for distinct terms is not in the formal system itself, hence its meaning is not captured by its introduction rule. In particular, any attempted justification of
$=_T$
-E on the basis of the I rules will be circular.Footnote
15
This means that there are no harmonious rules for
$=_T$
because there are no genuine introductions. Incidentally, this may explain why Klev discusses theorems of the form
$t=u$
and not
$t=_T u$
. If the theorems were of the latter form,
$t=_T u$
would meet disharmony in the same way Read’s updated rules for identity per se did. However, by using a bridge law to effectively separate the two languages, and thereby derivations in the two corresponding formal systems, identity appears to be harmonious. Nevertheless, its meaning is not given in the system of concern, and this prevents satisfaction of
$\#$
-expansion.Footnote
16
Given that we were motivated to justify
$=$
E by some
$=$
I precisely because we expect
$=$
I to exhaust its meaning, this bridge strategy seems inadmissible. For it to be acceptable, the inferentialist must admit as inferentially meaningful operators whose meanings cannot be captured by introduction rules in a single formal system, and must concede that logicality does not require harmony (in the sense discussed here). It is unclear how many inferentialists would be comfortable with this move.
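Klev’s type-theoretic motivation can be made concrete in a modern proof assistant. As an illustration (mine, not Klev’s), in Lean 4 the theorem $1+1=2$ is proved by `rfl`:

```lean
-- Illustrative sketch (Lean 4). `rfl` succeeds because `1 + 1` and `2`
-- unfold, by the definitions of the numerals and of `Nat.add`, to the
-- same normal form: the proof leans on definitional substitutability
-- in the background theory, not on an introduction rule for `=` per se.
example : 1 + 1 = 2 := rfl
```

The work here is done by definitional unfolding within the ambient theory, which is precisely the separation of systems that the bridge strategy trades on.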
The only use remaining of identity is
$a=b$
, i.e., co-reference for syntactically distinct constants a and b. Following [Reference Wehmeier33], I introduce ‘
$\equiv $
’ as the predicate symbol for co-reference so that the intended reading of ‘
$a\equiv b$
’ is ‘a and b co-refer’. Presumably, we expect co-reference to be inferentially logical because it is necessary to capture the validity of certain inferences. While there is no discussion of this use in isolation, one can be generated from [Reference Read21]. Read suggests that inferences from the property-wise indiscernibility of two constants to their being identical are valid. There is an obvious candidate for such a rule: if there is a derivation of
$Fb$
from
$Fa$
for arbitrary F, then
$a=b$
. Yet this rule is not viable because it requires rules governing predicate variables [Reference Klev13, p. 874] [Reference Griffiths5]. But if we are helping ourselves to these resources from second-order logic, then we may as well explicitly define identity. This actually constitutes evidence internal to the inferentialist program that the inferences Read wants to capture as valid are not so within a unified first-order system of inference. In this way, Read faces the same problem as Klev.
But the evidence against the inferential logicality of co-reference doesn’t stop with the home-grown variety. While logical inferentialists have worked to find harmonious rules for co-reference to justify its meaning as inferentially logical, other philosophers have called attention to the strangeness of its meaning. As Fiengo and May note, co-reference statements seem to be of inferential use only when they are informative (see [Reference Fiengo and May2]). In extensional terms, co-reference tells us something about how our language hooks up to the domain, namely, that distinct constants pick out the same element of the domain. This is transparently about the connection between our expressions and the world. The meta-linguistic flavor of co-reference has been noted as far back as Frege (but see also [Reference Geach and Munitz3, Reference Quine19]), and it has featured prominently in recent discussions of identity’s logicality (see, e.g., [Reference Wehmeier33] and [Reference Pardey and Wehmeier18, Reference Vlasáková31]).
The meta-linguistic flavor of co-reference should make inferentialists question whether identity deserves to be called logical. First, the meaning of co-reference statements is comparatively plain when we help ourselves to non-inferential (extensional) resources:
$a \equiv b$
means that ‘a’ and ‘b’ each refer to the same object u in the domain. Further, we know what is required to enforce the context-free interpretation (a metatheoretical uniqueness assumption), and the semantic need for this distinguishes identity from other operators and predicates (see, e.g., [Reference Vlasáková31, pp. 162–164]). Second, the meta-linguistic flavor lingers under an obvious translation into the inferential framework—say,
$a \equiv b$
means that ‘a’ and ‘b’ occur indistinguishably in derivations—because we have to refer to the “slots” in derivations to make sense of this.Footnote
17
We should therefore expect that ensuring co-reference’s interpretation is context-free amounts to a restriction on collections of derivations, not derivations themselves, i.e., it involves an importantly different definition of canonical derivation.
Third, the meta-linguistic flavor of co-reference is at the heart of the difficulties faced by the canonical neo-logicist approach to arithmetic (see, e.g., [Reference Hale and Wright8]). This approach drew heavily on Dummett’s inferentialism, particularly his interpretation of Frege and his want for ontological parsimony. The hope had been to avoid hypostatizing objects when drawing the (crucial) distinction between terms and predicates, and for this reason a separation in terms of inferential relations dominated. But this didn’t work because explaining criteria for identity—where an ability to do so in at least some cases is the test for being a singular term—inevitably turns on an account of what the objects picked out by singular terms are like (see [Reference Rumfitt25] and references therein). Indeed, more recent neo-logicist accounts of number make existence claims up front (e.g., the existence of a progression) and, by dint of being extensional and relational, the occurrence of identity in the account’s corresponding constructions is properly understood as co-reference (see, e.g., [Reference Tennant29]).
Finally, the home-grown evidence against the logicality of identity is, in a sense, telling us just what the extensional semantics does: the meaning of co-reference is not like the meaning of the logical operators or the other predicates. If we try to capture the inferential indistinguishability directly, as Read’s rules do, rules governing the behavior of predicates are required. If, on the other hand, we try to capture inferential indistinguishability indirectly via definitional substitutability, (canonical) derivations must be defined as equivalence classes of (canonical) derivations with respect to definitional substitution. In either case, the resources required to fix the meaning of co-reference are not available in the (standard) proof theory of FOL. And, in either case, the elimination rule remains the inferentially operative rule: in the former case as the only non-trivial instances of a rule, and in the latter as axioms for sub-derivations of definitional equivalence (in typographical disguise as
$=_T$
, of course).
The evidence to date therefore suggests calling off the search for harmonious rules for the definitional substitutability and co-reference uses of identity. Even if a notion of harmony were constructed to capture these uses as such, the meanings they implicate are distinct enough from, e.g., the propositional connectives, quantifiers, and variable coordination that the sufficiency of such a notion of harmony for inferential logicality is dubious. Along the same lines, it appears to be the elimination rule that confers meaning on these uses of identity (within a closed system of inference). Yet since the entire purpose of harmonious rules is to bring symmetry, harmony would obscure the inferential asymmetry of definitional substitutability and co-reference.
Given that inferentialism aims to capture meaning by use, to bring harmony where none is apparent would run counter to the intuitive appeal of inferentialist semantics. At present, a healthier attitude toward (what remains of) the concept of identity would seem to be the one Wilson espouses more generally [Reference Wilson36, pp. 134–135] (original emphasis):
the wisest policy, in my opinion, is to resist the impulse to consider “concepts” as well-defined entities at all, and instead confine our attention to the shifting manners in which our everyday standards of conceptual evaluation operate over the lifetime of an evolving predicate (I believe that “concept” represents a term like “Napoleon’s personality”—it manifests a certain continuity over time but doesn’t stay precisely fixed). We must guard against our ur-philosophical predilections to espy a hazy invariance within these evolving opinions, rather than appreciating the natural alteration of standards that actually emerge.
Indeed, if we can easily describe a situation in which ‘being a water molecule’ and ‘being a molecule of
$\text {H}_2\text {O}$
’ come apart, suggesting identity assertions such as ‘
$\text {water}=\text {H}_2\text {O}$
’ are not context-free, why should we think there is a concept of identity precisely fixed across all uses? In effect, what I am suggesting is that Leibniz’s Law, too, is not generally valid—that is, even with our most trustworthy identity statements we do not expect every substitution for the other expressions in a derivation to preserve the derivation’s validity.Footnote
18
6 Conclusion
The central question of the literature to date has been whether a harmonious account of identity is possible. This has been viewed as the primary barrier to an account of identity as logical. This paper analyzed identity’s prospects by addressing its various uses separately. The paper began with identity’s most plausible claim to logicality—its use in variable coordination. Yet while it was established that variable coordination is, indeed, harmonious, identity was unnecessary for demonstrating this fact. This eliminated the strongest case for identity’s logicality. We then assessed the plausibility of identity’s logicality for its remaining uses, namely, definitional substitutability and co-reference. This assessment led us to an interesting dichotomy concerning what remained of identity: either it is not logical (by dint of not being harmonious) but its natural inferential use is transparent, or it is harmonious (hence, putatively logical) but its natural inferential use is obscured. In light of this dichotomy, whether identity can be given a harmonious account is the wrong question. Rather, the question for the inferentialist is: why should identity be logical?
Appendix
Theorem A.1. Let F be a formula and
$a,a'$
be individual constants. Then for all structures
$\mathcal {U}$, $\mathcal {U}$-assignments $\sigma $, and $u$ with $\sigma (a') = u$: $\mathcal {U} \Vdash F [\sigma \{a:=u\}]$ iff $\mathcal {U} \Vdash F_a[a'] [\sigma ]$.
Proof. Let F be a formula and
$a,a'$
be individual constants,
$\mathcal {U}$
a W-logical structure, and
$\sigma $
a
$\mathcal {U}$
-assignment. By induction on the complexity of F.
-
Base Case Let F be an atomic formula of the form
$P(a, a_1, \ldots , a_n)$
for
$a_1, \ldots , a_n \in \mathcal {A}$
and P a predicate of arity n+1. (We thus assume, without loss of generality, that a occurs only in the first position; repeated occurrences are handled in the same way.) By definition
$\mathcal {U} \Vdash P(a, a_1, \ldots , a_n) [\sigma \{a:=u\}]$
iff
$\langle \sigma \{a:=u\}(a), \sigma \{a:=u\}(a_1), \ldots , \sigma \{a:=u\}(a_n)\rangle \in P^{\mathcal {U}}$
. Since
$\sigma \{a:=u\}(a)=u$
, this is just
$\langle u, \sigma (a_1), \ldots , \sigma (a_n)\rangle \in P^{\mathcal {U}}$
. But by assumption
$\sigma (a')=u$
, so this is equivalent to
$\langle \sigma (a'), \sigma (a_1), \ldots , \sigma (a_n)\rangle \in P^{\mathcal {U}}$
. Again by Definition 2, this is the case iff
$\mathcal {U} \Vdash P(a', a_1, \ldots , a_n) [\sigma ]$
. Hence
$\mathcal {U} \Vdash P(a, a_1, \ldots , a_n) [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash P(a', a_1, \ldots , a_n) [\sigma ]$
, as required to prove.
-
Ind. Hyp. Assume this holds for formulas A, B.
-
Ind. Step
-
→ By the
$\rightarrow $
clause of Definition 2
$\mathcal {U} \Vdash A \rightarrow B [\sigma \{a:=u\}]$
iff
$\mathcal {U} \nVdash A [\sigma \{a:=u\}]$
or
$\mathcal {U} \Vdash B [\sigma \{a:=u\}]$
. By application of the inductive hypothesis, this is equivalent to
$\mathcal {U} \nVdash A_a[a'] [\sigma ]$
or
$\mathcal {U} \Vdash B_a[a'] [\sigma ]$
. But by the
$\rightarrow $
clause of Definition 2, this is the case iff
$\mathcal {U} \Vdash (A \rightarrow B)_a[a'] [\sigma ]$
. Hence
$\mathcal {U} \Vdash A \rightarrow B [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash (A \rightarrow B)_a[a'] [\sigma ]$
, as required to prove.
-
∨ By the
$\vee $
clause of Definition 2
$\mathcal {U} \Vdash A \vee B [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash A [\sigma \{a:=u\}]$
or
$\mathcal {U} \Vdash B [\sigma \{a:=u\}]$
. By application of the inductive hypothesis, this is equivalent to
$\mathcal {U} \Vdash A_a[a'] [\sigma ]$
or
$\mathcal {U} \Vdash B_a[a'] [\sigma ]$
. But by the
$\vee $
clause of Definition 2, this is the case iff
$\mathcal {U} \Vdash (A \vee B)_a[a'] [\sigma ]$
. Hence
$\mathcal {U} \Vdash A \vee B [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash (A \vee B)_a[a'] [\sigma ]$
, as required to prove.
-
∧ By the
$\wedge $
clause of Definition 2
$\mathcal {U} \Vdash A \wedge B [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash A [\sigma \{a:=u\}]$
and
$\mathcal {U} \Vdash B [\sigma \{a:=u\}]$
. By application of the inductive hypothesis, this is equivalent to
$\mathcal {U} \Vdash A_a[a'] [\sigma ]$
and
$\mathcal {U} \Vdash B_a[a'] [\sigma ]$
. But by the
$\wedge $
clause of Definition 2, this is the case iff
$\mathcal {U} \Vdash (A \wedge B)_a[a'] [\sigma ]$
. Hence
$\mathcal {U} \Vdash A \wedge B [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash (A \wedge B)_a[a'] [\sigma ]$
, as required to prove.
-
∀ We split into two cases, depending on whether
$a=b$
.
-
∗ Assume
$a=b$
. Then
$\mathcal {U} \Vdash \forall x A_b[x] [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash \forall x A_a[x] [\sigma \{a:=u\}]$
. But
$a \notin IC(\forall x A_a[x])$
, so this happens iff
$\mathcal {U} \Vdash \forall x A_a[x] [\sigma ]$
. Observe that
$(\forall x A_a[x])_a[a']$
is just
$\forall x A_a[x]$
, hence
$\mathcal {U} \Vdash \forall x A_b[x] [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash (\forall x A_a[x])_a[a'] [\sigma ]$
.
-
∗ Assume
$a \neq b$
. Then by the
$\forall $
clause of Definition 2,
$\mathcal {U} \Vdash \forall x A_b[x] [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash A [\sigma \{a:=u\}\{b:=u'\}]$
, for all
$u' \notin \sigma [IC(\forall x A_b[x])]$
. Since
$a \neq b$
, this is the same as
$\mathcal {U} \Vdash A [\sigma \{b:=u'\}\{a:=u\}]$
. By application of the inductive hypothesis, this is the case iff
$\mathcal {U} \Vdash A_a[a'] [\sigma \{b:=u'\}]$
. Hence by the
$\forall $
clause of Definition 2, this is the case iff
$\mathcal {U} \Vdash \forall x A_a[x] [\sigma ]$
, which as observed above is the same as
$\mathcal {U} \Vdash (\forall x A_a[x])_a[a'] [\sigma ]$
.
Hence
$\mathcal {U} \Vdash \forall x A_b[x] [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash (\forall x A_a[x])_a[a'] [\sigma ]$
, as required to prove.
-
∃ Again we split into two cases, depending on whether
$a=b$
.-
∗ Assume
$a=b$
. Then
$\mathcal {U} \Vdash \exists x A_b[x] [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash \exists x A_a[x] [\sigma \{a:=u\}]$
. But
$a \notin IC(\exists x A_a[x])$
, so this happens iff
$\mathcal {U} \Vdash \exists x A_a[x] [\sigma ]$
. Observe that
$(\exists x A_a[x])_a[a']$
is just
$\exists x A_a[x]$
, hence
$\mathcal {U} \Vdash (\exists x A_a[x])_a[a'] [\sigma ]$
.
-
∗ Assume
$a \neq b$
. Then by the
$\exists $
clause of Definition 2,
$\mathcal {U} \Vdash \exists x A_b[x] [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash A [\sigma \{a:=u\}\{b:=u'\}]$
, for some
$u' \notin \sigma [IC(\exists x A_b[x])]$
. Since
$a \neq b$
, this is the same as
$\mathcal {U} \Vdash A [\sigma \{b:=u'\}\{a:=u\}]$
. By application of the inductive hypothesis, this is the case iff
$\mathcal {U} \Vdash A_a[a'] [\sigma \{b:=u'\}]$
. Hence by the
$\exists $
clause of Definition 2, this happens iff
$\mathcal {U} \Vdash \exists x A_a[x] [\sigma ]$
, which as observed above is the same as
$\mathcal {U} \Vdash (\exists x A_a[x])_a[a'] [\sigma ]$
. So
$\mathcal {U} \Vdash (\exists x A_a[x])_a[a'] [\sigma ]$
.
Hence
$\mathcal {U} \Vdash \exists x A_b[x] [\sigma \{a:=u\}]$
iff
$\mathcal {U} \Vdash (\exists x A_a[x])_a[a'] [\sigma ]$
, as required to prove.
-
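The atomic case of Theorem A.1 can be spot-checked computationally on a small hypothetical structure. The sketch below is illustrative only: the domain and the extension of a binary predicate P are stipulated, and it confirms that $\mathcal {U} \Vdash P(a,c)[\sigma \{a:=u\}]$ iff $\mathcal {U} \Vdash P(a',c)[\sigma ]$ whenever $\sigma (a')=u$.

```python
# Illustrative finite-model check of the atomic case of Theorem A.1.
# The domain and the extension of the binary predicate P are stipulated.
DOMAIN = {0, 1, 2}
P = {(0, 1), (2, 2)}  # extension of a 2-place predicate

def satisfies_P(sigma, t1, t2):
    # U ⊩ P(t1, t2)[sigma] iff the pair of assigned values lies in P.
    return (sigma[t1], sigma[t2]) in P

def update(sigma, const, u):
    # The modified assignment sigma{const := u}.
    new = dict(sigma)
    new[const] = u
    return new

# Check: U ⊩ P(a, c)[sigma{a := u}] iff U ⊩ P(a', c)[sigma], given sigma(a') = u.
ok = all(
    satisfies_P(update({"a": w, "a'": u, "c": v}, "a", u), "a", "c")
    == satisfies_P({"a": w, "a'": u, "c": v}, "a'", "c")
    for u in DOMAIN for v in DOMAIN for w in DOMAIN
)
assert ok
```

The check ranges over every assignment to a, a', and c with $\sigma (a') = u$, mirroring the quantification over assignments in the theorem.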
Theorem A.2. Let
$a,b \in \mathcal {A}$
and
$\Sigma _a[b]$
be the result of replacing each occurrence of a in each formula of a derivation with b. Then given any derivation
$\Sigma $
of F from open assumptions $\Gamma $,
$\Sigma _a[b]$
is a derivation of
$F_a[b]$
from
$\Gamma _a[b]$
.Footnote
19
Proof. By induction on the length n of a derivation
$\Sigma $
of F from open assumptions
$\Gamma $
.
-
Base Let
$n=1$
. Then trivially
$\Sigma _a[b]$
constitutes a derivation. -
Ind. Hyp. Assume this holds for proofs of length
$\leq m-1$
. -
Ind. Step Let
$\Sigma $
be a proof of length m of F from open assumptions
$\Gamma $
. I will prove this by cases, according to the rule applied to generate the last line of the proof. Note that the propositional connectives are trivial. I will provide proofs of the cases for
$\rightarrow $
to demonstrate this.-
→ I Suppose
$\Sigma $
is Then
$\Sigma _a[b]$
is But clearly
$(A \rightarrow B)_a[b]$
is
$A_a[b] \rightarrow B_a[b]$
. So
$\Sigma _a[b]$
is By application of the inductive hypothesis to
$\Sigma _{0a}[b]$
it is a derivation. But the final application of
$\rightarrow $
I is correct, so that
$\Sigma _a[b]$
is a derivation.
-
→ E Suppose
$\Sigma $
is Then
$\Sigma _a[b]$
is But clearly
$(A \rightarrow B)_a[b]$
is
$A_a[b] \rightarrow B_a[b]$
, so
$\Sigma _a[b]$
is By the inductive hypothesis applied to
$\Sigma _{0a}[b]$
and
$\Sigma _{1a}[b]$
each is a derivation. The final application of
$\rightarrow $
E is correct, so
$\Sigma _a[b]$
is obviously a derivation.
-
∀ I Suppose
$\Sigma $
is Note that the final application of
$\forall $
I closes d in
$\Sigma _0$
. Thus
$\Sigma _{0d}[e]$
is where
$e \neq b$
. (We make this extra substitution for d to ensure that b is not unintentionally closed in
$\Sigma _a[b]$
by
$b=d$
.) Hence
$\Sigma _a[b]$
is But then by the inductive hypothesis applied to
$\Sigma _{0d}[e]_a[b]$
, and since
$e \notin IC(\Gamma _{0a}[b])$
and
$e \notin IC(\forall x A_d[e]_e[x]_a[b])$
by
$e \neq b$
, the final application of
$\forall $
I is correct and
$\Sigma _a[b]$
is a derivation.
-
∀ E Suppose
$\Sigma $
is (the general-elimination form follows from this): then
$\Sigma _a[b]$
is Now
$(\forall x A)_a[b]$
is
$\forall x (A_a[b])$
, and
$A_x[d]_a[b]$
is
$A_a[b]_x[d]_a[b]$
. So
$\Sigma _a[b]$
is just: Hence by the inductive hypothesis applied to
$\Sigma _0$
,
$\Sigma _0$
is a derivation. But clearly
$d \notin IC(\forall x (A_a[b]))$
since b would have to be d, which isn’t the case because b occurs free in
$\Sigma _a[b]$
. Hence the final application of
$\forall $
E is correct, so that
$\Sigma _a[b]$
is a derivation.
-
∃ I Suppose
$\Sigma $
is Then
$\Sigma _a[b]$
is Now
$(\exists x A)_a[b]$
is
$\exists x(A_a[b])$
and
$A_x[d]_a[b]$
is
$A_a[b]_x[d]_a[b]$
Footnote
20
. Thus
$\Sigma _a[b]$
is By the inductive hypothesis applied to
$\Sigma _{0a}[b]$
,
$\Sigma _{0a}[b]$
is a derivation. But since
$d \notin IC(\exists x A_a[b])$
and either
$d \in IC(\Gamma _{0a}[b])$
or
$IC(\Gamma _{0a}[b], \exists x (A_a[b]))=\emptyset $
(according as
$d \in IC(\Gamma _{0})$
or
$IC(\Gamma _{0}, \exists x A)=\emptyset $
), the final application of
$\exists $
I is correct, so that
$\Sigma _a[b]$
is a derivation.
-
∃ E Suppose
$\Sigma $
is Note that the final application of
$\exists $
E closes d in
$\Sigma _1$
. Thus
$\Sigma $
is equivalently: where
$e\neq b$
and e doesn’t occur in
$\Sigma $
. Hence
$\Sigma _a[b]$
is But clearly
$(\exists x A)_a[b]$
is
$\exists x (A_a[b])$
, and since
$e \neq b$
and e doesn’t occur in
$\Sigma $
,
$A_x[e]_a[b]$
is
$A_a[b]_x[e]$
. Thus
$\Sigma _a[b]$
is By the inductive hypothesis applied to
$\Sigma _{0a}[b]$
and
$\Sigma _{1a}[b]$
, each is a derivation. But since
$e \notin IC(\exists x (A_a[b]), C_a[b], \Gamma _{1a}[b])$
and either
$IC(\exists x (A_a[b])) \subseteq IC(\Gamma _{0a}[b], \Gamma _{1a}[b], C_a[b])$
and
$e \in IC(\Gamma _{0a}[b])$
, or
$IC(\Gamma _{0a}[b], \Gamma _{1a}[b], C)=\emptyset $
and
$|IC(\exists x (A_a[b]))|\leq 1$
(according as either
$IC(\exists x A) \subseteq IC(\Gamma _{0}, \Gamma _{1}, C)$
and
$d \in IC(\Gamma _{0})$
, or
$IC(\Gamma _{0}, \Gamma _{1}, C)=\emptyset $
and
$|IC(\exists x A)|\leq 1$
),
$\Sigma _a[b]$
is a derivation.
-
Acknowledgments
Thanks to Kai Wehmeier for his suggestion of, and instrumental guidance on, an early version of this paper. Thanks to the members of the 2016 UC Irvine Logic Seminar for their feedback on the same. Thanks to Tim Button, Jeffrey Schatz, and especially Will Stafford for discussions that shaped the argument seen here. Finally, thanks to two referees from this journal and two from another, whose careful reading improved the argument’s presentation and caught multiple typos.
Funding
No funding to report.