1. Introduction
A remarkable result by Scarf [Reference Scarf11] provides an explicit solution to the problem of minimizing
${\mathbb{E}}(X \wedge c)$
, with c a positive constant, over the set of all positive random variables X with given mean and variance (see Theorem 4.1 below). The infimum is shown to have two different expressions depending on whether the parameter c is above or below a certain threshold and to be attained by a random variable taking only two values. About thirty years after the publication of [Reference Scarf11], Lo [Reference Lo7] noticed that Scarf’s result immediately implies an upper bound for
${\mathbb{E}}(X - c)^+$
, with X and c as before. This has an obvious financial interpretation as an upper bound for the price at time zero of a European call option with strike c on an asset with value at maturity equal to X (in discounted terms, assuming that expectation is meant with respect to a pricing measure). More recently, de la Peña, Ibragimov, and Jordan [Reference de la Peña, Ibragimov and Jordan5] obtained, among other things, a sharp upper bound for
${\mathbb{E}}(X - c)^+$
over the set of random variables X for which mean and variance as well as the probability
$\mathbb{P}(X \leqslant c)$
are known (see Theorem 3.1 below).
Our goal is to prove that the estimate by de la Peña, Ibragimov, and Jordan is stronger than Scarf’s in the sense that the former implies the latter. This may appear somewhat counterintuitive, as the former estimate requires the extra input
$\mathbb{P}(X \leqslant c)$
, while the latter has an extra positivity constraint.
The proof by Scarf, while relatively elementary, is quite ingenious. A different proof, also covering substantial generalizations, has been obtained in [Reference Bertsimas and Popescu2] using duality methods in semi-definite optimization. The arguments used in [Reference de la Peña, Ibragimov and Jordan5] are instead based on classical probabilistic inequalities and a ‘toy’ version of decoupling. The proof of Scarf’s estimate given here is entirely elementary and self-contained. Starting from an alternative proof of the relevant estimates in [Reference de la Peña, Ibragimov and Jordan5], another proof of Scarf’s result is obtained that is certainly not as deft as the original, but that would hopefully seem more natural to anyone who, like the author, would hardly ever come up with the ingenious idea used in [Reference Scarf11]. Our proof is based, roughly speaking, on a representation of the set of random variables with given mean and variance as a union of subsets of equivalent random variables, where two random variables
$X_1, X_2$
are equivalent if
$\mathbb{P}(X_1 \leqslant c) = \mathbb{P}(X_2 \leqslant c)$
. This allows us to establish a link between the two inequalities and to reduce the problem of proving a version of Scarf’s result without the positivity constraint to the minimization of a function of one real variable. Finally, the positivity constraint is taken into account, thus establishing the full version of Scarf’s result. Moreover, the fact that optimizers exist and are given by two-point distributed random variables appears in a natural way and plays an important role in the proof.
The result proved in this article may also clarify, or at least complement, several qualitative remarks made in [Reference de la Peña, Ibragimov and Jordan5] about the relation between the two abovementioned inequalities. For instance, the authors note that their inequality is simpler than Lo’s in the sense that the right-hand side does not depend on the value of c. Here we show that this is only due to the positivity constraint in [Reference Scarf11], with very explicit calculations showing how the threshold value for c arises. Moreover, the sharpness of their inequality is proved in a much more natural way, i.e. showing that a two-point distributed random variable always attains the bound.
Apart from the application to bounds for option prices, estimates of (functions of)
$X \wedge c$
, sometimes called the Winsorization of X, are important in several areas of applied probability and statistics (see, e.g., [Reference Pinelis and Molzon10]). For results in this direction, as well as for an informative discussion with references to the literature, we refer to [Reference Pinelis9].
2. Preliminaries
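Let
$(\Omega,\mathscr{F},\mathbb{P})$
be a probability space, on which all random elements will be defined. We shall write, for simplicity,
$L^2$
to denote
$L^2(\Omega,\mathscr{F},\mathbb{P})$
, and
$\lVert{\cdot}\rVert_2$
for its norm. For any
$m \in \mathbb{R}$
and
$\sigma \in \mathbb{R}_+$
, the sphere of
$L^2$
of radius
$\sigma$
centered in m will be denoted by
$\mathscr{X}_{m,\sigma}$
, and just by
$\mathscr{X}$
when
$m=0$
and
$\sigma=1$
. In other words,
$\mathscr{X}_{m,\sigma}$
stands for the set of random variables X such that
${\mathbb{E}} X=m$
and
$\operatorname{Var}(X) = {\mathbb{E}}(X-m)^2 = \sigma^2$
. It is clear that
$\mathscr{X}_{m,\sigma} = m + \sigma \mathscr{X}$
. The intersection of
$\mathscr{X}_{m,\sigma}$
with the cone of random variables bounded below by
$\alpha \in \mathbb{R}$
will be denoted by
$\mathscr{X}^\alpha_{m,\sigma}$
. It is easily verified that, for any
$m \in \mathbb{R}$
and
$\sigma > 0$
,
$ \mathscr{X}^\alpha_{m,\sigma} = m + \sigma \mathscr{X}^{(\alpha-m)/\sigma}. \qquad \text{(2.1)} $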
Recall that a random variable is said to be two-point distributed if it takes only two (distinct) values. The set of two-point distributed random variables belonging to
$\mathscr{X}$
can be parametrized by the open interval
$\mathopen]0,1\mathclose[$
: let X take the values
$x,y \in \mathbb{R}$
, with
$x < y$
, and set
$p \,:\!=\, \mathbb{P}(X=x)$
, so that
$1-p = \mathbb{P}(X=y)$
. Then
$X \in \mathscr{X}$
if and only if
$px + (1-p)y = 0$
and
$px^2 + (1-p)y^2 = 1$
, which implies
$ x = -\biggl( \dfrac{1-p}{p} \biggr)^{1/2}, \qquad y = \biggl( \dfrac{p}{1-p} \biggr)^{1/2}. \qquad \text{(2.2)} $
Note that
$p=0$
and
$p=1$
are not allowed, and hence
$p \in \mathopen]0,1\mathclose[$
. This is also obvious a priori, as there is no degenerate random variable with mean zero and variance one. The following simple observations, the proofs of which are immediate consequences of (2.2) and elementary algebra, will be useful.
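The parametrization can be checked numerically. The sketch below assumes the solution of the two moment equations is $x = -((1-p)/p)^{1/2}$, $y = (p/(1-p))^{1/2}$ (obtained by eliminating y from the first equation); the function names `two_point` and `moments` are hypothetical helpers introduced only for this illustration.

```python
import math

def two_point(p):
    """Values (x, y), x < y, of the two-point variable with P(X = x) = p,
    mean zero and variance one (assumed form of the parametrization)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in the open interval (0, 1)")
    x = -math.sqrt((1.0 - p) / p)
    y = math.sqrt(p / (1.0 - p))
    return x, y

def moments(p):
    """First and second moments of the two-point variable with parameter p."""
    x, y = two_point(p)
    mean = p * x + (1.0 - p) * y
    second = p * x ** 2 + (1.0 - p) * y ** 2
    return mean, second

for p in (0.1, 0.5, 0.9):
    print(p, two_point(p), moments(p))
```

Up to rounding, the printed moments should be 0 and 1 for every admissible p.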
Lemma 2.1. Let
$X \in \mathscr{X}$
be the
$\{x,y\}$
-valued random variable identified by the parameter
$p \in \mathopen]0,1\mathclose[$
, and
$c \in \mathbb{R}$
.
(i) If
$c \geqslant 0$ , then
$x < c \leqslant y$ if and only if
$p \geqslant {c^2}/({1+c^2})$ .
(ii) If
$c < 0$ , then
$x \leqslant c < y$ if and only if
$p \leqslant {1}/({1+c^2})$ .
We shall also need the following elementary lattice identities.
Lemma 2.2. Let
$a,b,c \in \mathbb{R}$
. The following hold:
(i)
$a + (b \wedge c) = (a+b) \wedge (a+c)$ ;
(ii) if
$a \geqslant 0$ , then
$a (b \wedge c) = (ab) \wedge (ac)$ ;
(iii)
$(a-b)^+ = a - (a \wedge b)$ .
Proof. The identities in (i) and (ii) are clear. The identity in (iii) can be verified ‘case by case’, but it can also be deduced from the identity
$(a-b)^+ = (a-b) + (a-b)^-$
, where, using (i),
$ (a-b)^- = -\bigl( (a-b) \wedge 0 \bigr) = -\bigl( (a \wedge b) - b \bigr) = b - (a \wedge b), $
from which the claim follows immediately.
3. de la Peña–Ibragimov–Jordan bound
The following sharp estimates are proved in [Reference de la Peña, Ibragimov and Jordan5].
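Theorem 3.1. (de la Peña, Ibragimov, and Jordan) Let
$X \in \mathscr{X}_{m,\sigma}$
and
$c \in \mathbb{R}$
. Setting
$p_0 \,:\!=\, \mathbb{P}(X>c)$
, we have
$ (m-c)p_0 \leqslant {\mathbb{E}}(X-c)^+ \leqslant \sigma \bigl( p_0(1-p_0) \bigr)^{1/2} + (m-c)p_0. \qquad \text{(3.1)} $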
The proof of (3.1) in [Reference de la Peña, Ibragimov and Jordan5] is very elegant, the main idea being the introduction of an independent copy of the random variable X. Here we give an alternative, entirely elementary, proof.
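Proof. Let us start with the lower bound. We can assume, without loss of generality, that
$m \geqslant c$
, otherwise there is nothing to prove. Since
$ {\mathbb{E}}(X-c)^+ = {\mathbb{E}}(X-c)\mathbf{1}_{\{X>c\}} = {\mathbb{E}} X \mathbf{1}_{\{X>c\}} - c\,\mathbb{P}(X>c), $
it suffices to show that
${\mathbb{E}} X \mathbf{1}_{\{X>c\}} \geqslant m \mathbb{P}(X>c)$
. To this purpose, note that
$ {\mathbb{E}} X \mathbf{1}_{\{X>c\}} = m - {\mathbb{E}} X \mathbf{1}_{\{X \leqslant c\}}, $
where, thanks to the assumption
$m \geqslant c$
,
$ {\mathbb{E}} X \mathbf{1}_{\{X \leqslant c\}} \leqslant c\,\mathbb{P}(X \leqslant c) \leqslant m \mathbb{P}(X \leqslant c), $
and hence
$ {\mathbb{E}} X \mathbf{1}_{\{X>c\}} \geqslant m - m\mathbb{P}(X \leqslant c) = m \mathbb{P}(X>c) $
.
To prove the upper bound, note that
$ {\mathbb{E}}(X-c)^+ = {\mathbb{E}} X \mathbf{1}_{\{X>c\}} - cp_0; $
hence, adding and subtracting
${\mathbb{E}} Xp_0 = mp_0$
on the right-hand side, the Cauchy–Schwarz inequality yields
$ {\mathbb{E}}(X-c)^+ = {\mathbb{E}} (X-m)\bigl( \mathbf{1}_{\{X>c\}} - p_0 \bigr) + (m-c)p_0 \leqslant \sigma \bigl( p_0(1-p_0) \bigr)^{1/2} + (m-c)p_0, $
where we have used
${\mathbb{E}}\bigl( \mathbf{1}_{\{X>c\}} - p_0 \bigr) = 0$
and
${\mathbb{E}}\bigl( \mathbf{1}_{\{X>c\}} - p_0 \bigr)^2 = p_0(1-p_0)$
, thus completing the proof.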
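As a numerical illustration of the two-sided estimate, the following sketch evaluates both bounds for a discrete random variable. It assumes the bounds read $(m-c)p_0 \leqslant {\mathbb{E}}(X-c)^+ \leqslant \sigma (p_0(1-p_0))^{1/2} + (m-c)p_0$, which is what the proof above establishes; `call_bounds` is a hypothetical helper name.

```python
import math

def call_bounds(values, probs, c):
    """Return (lower, exact, upper) for E(X - c)^+ when X is discrete.

    The lower and upper expressions are the bounds assumed in the lead-in.
    """
    m = sum(p * v for v, p in zip(values, probs))
    sigma = math.sqrt(sum(p * (v - m) ** 2 for v, p in zip(values, probs)))
    p0 = sum(p for v, p in zip(values, probs) if v > c)  # P(X > c)
    exact = sum(p * max(v - c, 0.0) for v, p in zip(values, probs))
    lower = (m - c) * p0
    # max(...) guards against tiny negative values caused by rounding
    upper = sigma * math.sqrt(max(p0 * (1.0 - p0), 0.0)) + (m - c) * p0
    return lower, exact, upper

# A three-point example, and a symmetric two-point example with c = 0
print(call_bounds([-1.0, 0.0, 2.0], [0.3, 0.4, 0.3], 0.5))
print(call_bounds([-1.0, 1.0], [0.5, 0.5], 0.0))
```

In the second example the exact value coincides with the upper bound, in line with the sharpness of the estimate for two-point distributions.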
Remark 3.1. Even if
$m < c$
, the inequality
${\mathbb{E}} X \mathbf{1}_{\{X>c\}} \geqslant m \mathbb{P}(X>c)$
is still true. In fact,
$ m \mathbb{P}(X>c) \leqslant c \mathbb{P}(X>c) = {\mathbb{E}} c \mathbf{1}_{\{X>c\}} \leqslant {\mathbb{E}} X \mathbf{1}_{\{X>c\}} $
Theorem 3.1 implies useful one-sided Chebyshev-like bounds.
Corollary 3.1. Let
$X \in \mathscr{X}$
and
$c \in \mathbb{R}$
. The following hold:
(i) if
$c \geqslant 0$ , then
$\mathbb{P}(X \leqslant c) \geqslant \mathbb{P}(X < c) \geqslant {c^2}/({1+c^2})$ ;
(ii) if
$c<0$ , then
$\mathbb{P}(X \leqslant c) \leqslant {1}/({1+c^2})$ .
Proof. The proof of Theorem 3.1 remains valid with
$p_1 \,:\!=\, \mathbb{P}(X \geqslant c)$
in place of
$p_0$
; hence, as the right-hand side of (3.1) must be positive,
$ \sqrt{p_1(1-p_1)} \geqslant cp_1 $
. If
$c \geqslant 0$
, squaring both sides yields a linear inequality that is satisfied if and only if
$p_1 \leqslant 1/(1+c^2)$
. This proves (i).
For (ii), if
$c<0$
, then
$ \mathbb{P}(X \leqslant c) = \mathbb{P}({-}X \geqslant -c) = 1 - \mathbb{P}({-}X < -c) $
. Since
$-X \in \mathscr{X}$
, (i) implies that
$ \mathbb{P}({-}X < -c) \geqslant {c^2}/({1+c^2}) $
, and hence
$ \mathbb{P}(X \leqslant c) \leqslant 1 - {c^2}/({1+c^2}) = {1}/({1+c^2}) $
.
Remark 3.2. Let
$X \in \mathscr{X}$
and
$c \in \mathbb{R}_+$
. By reasoning entirely analogous to the proof of Corollary 3.1(ii), both
$\mathbb{P}(X > c)$
and
$\mathbb{P}(X < -c)$
are bounded above by
$1/(1+c^2)$
; hence
$\mathbb{P}(|{X}|>c) \leqslant 2/(1+c^2)$
, which is sharper than Chebyshev’s inequality
$\mathbb{P}(|{X}|>c) \leqslant 1/c^2$
only for
$c < 1$
, because
$2/(1+c^2) < 1/c^2$
if and only if
$c^2 < 1$
.
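To compare the bounds of this remark with Chebyshev's inequality, the sketch below tabulates the raw (untruncated) values. The function names are hypothetical, and the one-sided bound $1/(1+c^2)$ for $\mathbb{P}(X>c)$ is the one implied by Corollary 3.1(i).

```python
def two_sided_bound(c):
    """Bound 2/(1 + c^2) on P(|X| > c) for standardized X and c > 0."""
    return 2.0 / (1.0 + c * c)

def chebyshev_bound(c):
    """Classical Chebyshev bound 1/c^2 on P(|X| > c)."""
    return 1.0 / (c * c)

def one_sided_bound(c):
    """Bound 1/(1 + c^2) on P(X > c) for standardized X and c >= 0."""
    return 1.0 / (1.0 + c * c)

for c in (0.5, 1.0, 2.0, 3.0):
    print(c, two_sided_bound(c), chebyshev_bound(c), one_sided_bound(c))
```

The two-sided bound falls below $1/c^2$ only for $c<1$, whereas the one-sided bound is below $1/c^2$ for every $c>0$.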
4. Scarf–Lo bound
The following estimate is obtained in [Reference Scarf11].
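Theorem 4.1. (Scarf) Let
$c, m, \sigma$
be strictly positive real numbers. The infimum of the function
$X \mapsto {\mathbb{E}}(X \wedge c)$
on the set
$\mathscr{X}^0_{m,\sigma}$
is attained, i.e. it is a minimum, and is given by
$ \min_{X \in \mathscr{X}^0_{m,\sigma}} {\mathbb{E}}(X \wedge c) = \begin{cases} \dfrac{c m^2}{m^2+\sigma^2}, & \text{if } c < \dfrac{m^2+\sigma^2}{2m}, \\[8pt] \dfrac12 (c+m) - \dfrac12 \bigl( \sigma^2 + (c-m)^2 \bigr)^{1/2}, & \text{if } c \geqslant \dfrac{m^2+\sigma^2}{2m}. \end{cases} $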
Note that the constraint
$X \geqslant 0$
is dictated by the structure of the practical inventory problem considered by Scarf. It is not needed, however, to avoid the infimum being minus infinity. In fact,
$ \bigl({\mathbb{E}} X^2 \bigr)^{1/2} = \lVert{X}\rVert_2 = \lVert{X-m+m}\rVert_2 \leqslant \sigma + |{m}|$
, which implies
$ {\mathbb{E}}(X \wedge c) \geqslant -{\mathbb{E}}\lvert{X}\rvert \geqslant -\lVert{X}\rVert_2 \geqslant -(\sigma + |{m}|) $
for every
$X \in \mathscr{X}_{m,\sigma}$
.
As observed in [Reference Lo7], Lemma 2.2(iii) immediately yields the following result.
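Corollary 4.1. (Lo) Let
$c, m, \sigma$
be strictly positive real numbers. The supremum of the function
$X \mapsto {\mathbb{E}}(X - c)^+$
on the set
$\mathscr{X}^0_{m,\sigma}$
is attained, i.e. it is a maximum, and is given by
$ \max_{X \in \mathscr{X}^0_{m,\sigma}} {\mathbb{E}}(X - c)^+ = \begin{cases} m - \dfrac{c m^2}{m^2+\sigma^2}, & \text{if } c < \dfrac{m^2+\sigma^2}{2m}, \\[8pt] \dfrac12 (m-c) + \dfrac12 \bigl( \sigma^2 + (c-m)^2 \bigr)^{1/2}, & \text{if } c \geqslant \dfrac{m^2+\sigma^2}{2m}. \end{cases} $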
The remainder of this section is dedicated to showing that Theorem 4.1, and hence also its corollary, are consequences of Theorem 3.1. We shall argue by a sequence of elementary lemmas and propositions. The first is a reduction step that, in spite of its simplicity, considerably reduces the burden of symbolic calculations. Throughout the section we assume that
$c,m,\sigma \in \mathbb{R}$
, with
$\sigma > 0$
. Further constraints (that do not imply any loss of generality) will be introduced as needed.
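Lemma 4.1. Let
$ \widetilde{c} \,:\!=\, ({c-m})/{\sigma} $
. Then
$ \inf_{X \in \mathscr{X}^0_{m,\sigma}} {\mathbb{E}}(X \wedge c) = m + \sigma \inf_{X \in \mathscr{X}^{-m/\sigma}} {\mathbb{E}}(X \wedge \widetilde{c}). $
Proof. Since
$\mathscr{X}^0_{m,\sigma} = m + \sigma \mathscr{X}^{-m/\sigma}$
by (2.1), we have
$ \inf_{X \in \mathscr{X}^0_{m,\sigma}} {\mathbb{E}}(X \wedge c) = \inf_{X \in \mathscr{X}^{-m/\sigma}} {\mathbb{E}}\bigl( (m + \sigma X) \wedge c \bigr), $
where, by Lemma 2.2,
$ (m + \sigma X) \wedge c = m + (\sigma X) \wedge (c-m) = m + \sigma (X \wedge \widetilde{c}), $
which immediately yields the conclusion.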
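The reduction step rests on the pointwise identity $(m+\sigma X) \wedge c = m + \sigma (X \wedge \widetilde{c})$ for $\sigma > 0$, which follows from Lemma 2.2(i)–(ii). A minimal sketch checking the resulting identity between expectations for a discrete X; the helper names are hypothetical.

```python
def expected_min(values, probs, c):
    """E(min(X, c)) for a discrete random variable X."""
    return sum(p * min(v, c) for v, p in zip(values, probs))

def standardized_identity(values, probs, m, sigma, c):
    """Both sides of E(min(m + sigma X, c)) = m + sigma E(min(X, c_tilde)),
    where c_tilde = (c - m) / sigma and sigma > 0."""
    c_tilde = (c - m) / sigma
    lhs = expected_min([m + sigma * v for v in values], probs, c)
    rhs = m + sigma * expected_min(values, probs, c_tilde)
    return lhs, rhs

print(standardized_identity([-1.0, 1.0], [0.5, 0.5], 2.0, 3.0, 4.0))
```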
The lemma implies that it suffices to study the problem of minimizing the function
$X \mapsto {\mathbb{E}}(X \wedge c)$
over the set
$\mathscr{X}^{-m}$
. We shall first study the minimization problem without the lower-boundedness constraint, i.e. on
$\mathscr{X}$
rather than on
$\mathscr{X}^{-m}$
. We shall need some more notation: the subset of
$\mathscr{X}$
consisting of the random variables X such that
$\mathbb{P}(X \leqslant c) = p$
is denoted by
$\mathscr{X}(p;c)$
. Note that, in view of Corollary 3.1, these sets are nonempty only for certain combinations of the parameters p and c. Let
$L_c \colon \mathopen]0,1\mathclose[ \to \mathbb{R}$
be the function defined by
$ L_c\colon p \longmapsto - (p-p^2)^{1/2} + c(1-p)$
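Lemma 4.2.
$ \inf_{X \in \mathscr{X}(p;c)} {\mathbb{E}}(X \wedge c) \geqslant L_c(p) $
.
Proof. Lemma 2.2(iii) implies
$ {\mathbb{E}} (X-c)^+ = -{\mathbb{E}}(X \wedge c) $
for any X with mean zero; hence, by Theorem 3.1 with
$p_0 = 1-p$
,
$ {\mathbb{E}}(X \wedge c) = -{\mathbb{E}}(X-c)^+ \geqslant -\bigl( p(1-p) \bigr)^{1/2} + c(1-p) = L_c(p) $
for every
$X \in \mathscr{X}(p;c)$
.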
We are now going to show that the infimum in Lemma 4.2 is achieved, and that the minimizer is a two-point distributed random variable. We shall only consider, without loss of generality, those values of p such that
$\mathscr{X}(p;c)$
is nonempty, that is, by Corollary 3.1, setting
$ \Pi_c \,:\!=\, \begin{cases} \bigl[ c^2/(1+c^2), 1 \bigr[, & \text{if } c \geqslant 0, \\ \bigl] 0, 1/(1+c^2) \bigr], & \text{if } c < 0, \end{cases} $
those values of p such that
$p \in \Pi_c$
Lemma 4.3. Let
$p \in \Pi_c$
, and
$X_0 \in \mathscr{X}$
be the two-point distributed random variable with parameter p. Then
$ {\mathbb{E}}(X_0 \wedge c) = L_c(p) $
, and hence
$ \inf_{X \in \mathscr{X}(p;c)} {\mathbb{E}}(X \wedge c) = L_c(p) $
.
Proof. Since
$p \in \Pi_c$
, Lemma 2.1 implies that the random variable
$X_0$
, taking the values x and y as defined in (2.2), is such that
$x \leqslant c \leqslant y$
. In particular,
$\mathbb{P}(X_0 \leqslant c) = p$
, that is,
$X_0 \in \mathscr{X}(p;c)$
, and an elementary computation finally shows that
${\mathbb{E}}(X_0 \wedge c) = L_c(p)$
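The "elementary computation" can be reproduced numerically. The sketch below assumes the two-point values $x = -((1-p)/p)^{1/2}$ and $y = (p/(1-p))^{1/2}$, and uses the hypothetical helper names `L` and `two_point_expected_min`; when $x \leqslant c \leqslant y$ one has ${\mathbb{E}}(X_0 \wedge c) = px + (1-p)c$, which equals $L_c(p)$.

```python
import math

def L(c, p):
    """L_c(p) = -(p - p^2)^{1/2} + c (1 - p), for 0 < p < 1."""
    return -math.sqrt(p - p * p) + c * (1.0 - p)

def two_point_expected_min(c, p):
    """E(min(X0, c)) for the standardized two-point variable with parameter p."""
    x = -math.sqrt((1.0 - p) / p)
    y = math.sqrt(p / (1.0 - p))
    return p * min(x, c) + (1.0 - p) * min(y, c)

# Parameters with x <= c <= y: the two quantities agree
print(two_point_expected_min(1.0, 0.6), L(1.0, 0.6))
print(two_point_expected_min(-1.0, 0.3), L(-1.0, 0.3))
# Parameter with y < c: the identity fails
print(two_point_expected_min(1.0, 0.2), L(1.0, 0.2))
```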
The following result essentially shows that Theorem 3.1 implies Theorem 4.1 in the unconstrained case (i.e. without assuming that the minimizer should be bounded from below).
Proposition 4.1.
$ \inf_{X \in \mathscr{X}} {\mathbb{E}}(X \wedge c) = \inf_{p \in \Pi_c} L_c(p) $
Proof. The decomposition
$ \mathscr{X} = \bigcup_{p \in \Pi_c} \mathscr{X}(p;c) $
implies (see, e.g., [Reference Bourbaki3, p. III.11])
$ \inf_{X \in \mathscr{X}} {\mathbb{E}}(X \wedge c) = \inf_{p \in \Pi_c} \, \inf_{X \in \mathscr{X}(p;c)} {\mathbb{E}}(X \wedge c), $
so that Lemma 4.3 implies the claim.
This clearly indicates that the next step should be to find the minimum of the function
$L_c$
.
Lemma 4.4. The function
$L_c$
satisfies the following properties:
(i) it is decreasing on the interval
$ \biggl] 0, \dfrac12 + \dfrac12 \dfrac{c}{(1+c^2)^{1/2}}\biggr] $ ;
(ii) it is increasing on the interval
$ \biggl[ \dfrac12 + \dfrac12 \dfrac{c}{(1+c^2)^{1/2}}, 1 \biggr[ $ ;
(iii) it admits a unique minimum point
$p_\ast$ defined by
$ p_\ast \,:\!=\, \dfrac12 + \dfrac12 \dfrac{c}{(1+c^2)^{1/2}} $ .
Moreover,
$p_\ast$
belongs to
$\Pi_c$
and
$ L_c(p_\ast) = \frac12 c - \frac12 (1+c^2)^{1/2} $
.
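Proof. The argument is elementary, so it will only be sketched (the details can be found in [Reference Marinelli8]). The function
$L_c$
is smooth and its derivative is the function
$ p \longmapsto \dfrac{2p-1}{2(p-p^2)^{1/2}} - c; $
hence, standard calculus yields the claims (i)–(iii). It only remains to show that
$p_\ast \in \Pi_c$
; if
$c \geqslant 0$
, this is equivalent to
$ \dfrac12 + \dfrac12 \dfrac{c}{(1+c^2)^{1/2}} \geqslant \dfrac{c^2}{1+c^2} $
. Setting
$x \,:\!=\, c/(1+c^2)^{1/2} \in [0,1\mathclose[$
, this reduces to
$1+x \geqslant 2x^2$
, which is satisfied if
$x \in [0,1]$
. If
$c < 0$
,
$p_\ast$
belongs to
$\Pi_c$
if and only if
$ \dfrac12 + \dfrac12 \dfrac{c}{(1+c^2)^{1/2}} \leqslant \dfrac{1}{1+c^2} $
, that is, setting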
$\langle c \rangle \,:\!=\, (1+c^2)^{1/2}$
for convenience, if and only if
$\langle c \rangle \lvert{c}\rvert \geqslant \langle c \rangle^2 - 2$
. This inequality is easily seen to be satisfied for every
$c \in [-1,0]$
. If
$c \leqslant -1$
, the inequality is equivalent to
$ \langle c \rangle^2 c^2 \geqslant (c^2-1)^2 $
, which simplifies to
$3c^2 \geqslant 1$
, i.e. it is verified for every
$\lvert{c}\rvert \geqslant 1/\sqrt{3} \vee 1$
, that is for every
$c \leqslant -1$
. Finally, the expression for
$L_c(p_\ast)$
follows by elementary algebra.
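The claims of the lemma lend themselves to a quick numerical check. In the sketch below, `in_Pi` encodes an assumed reading of the admissible range of p suggested by Corollary 3.1 ($p \geqslant c^2/(1+c^2)$ if $c \geqslant 0$, $p \leqslant 1/(1+c^2)$ if $c<0$); all function names are hypothetical.

```python
import math

def L(c, p):
    """L_c(p) = -(p - p^2)^{1/2} + c (1 - p), for 0 < p < 1."""
    return -math.sqrt(p - p * p) + c * (1.0 - p)

def p_star(c):
    """Claimed minimum point of L_c."""
    return 0.5 + 0.5 * c / math.sqrt(1.0 + c * c)

def L_min(c):
    """Claimed minimum value L_c(p_*)."""
    return 0.5 * c - 0.5 * math.sqrt(1.0 + c * c)

def in_Pi(c, p):
    """Assumed admissible range for p = P(X <= c), read off Corollary 3.1."""
    if c >= 0:
        return p >= c * c / (1.0 + c * c)
    return p <= 1.0 / (1.0 + c * c)

def grid_min(c, n=2000):
    """Brute-force minimum of L_c over the grid k/n, k = 1, ..., n - 1."""
    return min(L(c, k / n) for k in range(1, n))

for c in (-2.0, -0.5, 0.0, 1.0, 3.0):
    print(c, p_star(c), in_Pi(c, p_star(c)), L_min(c), grid_min(c))
```

The grid minimum should agree with the closed-form value up to the grid resolution.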
We have thus solved the problem of minimizing the function
$X \mapsto {\mathbb{E}}(X \wedge c)$
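Proposition 4.2. Let
$p_\ast$
be defined as in Lemma 4.4, and
$ x_\ast \,:\!=\, -\biggl( \dfrac{1-p_\ast}{p_\ast} \biggr)^{1/2}, \qquad y_\ast \,:\!=\, \biggl( \dfrac{p_\ast}{1-p_\ast} \biggr)^{1/2}. $
The two-point distributed random variable
$X_0$
with
$\mathbb{P}(X_0 = x_\ast) = p_\ast$
and
$\mathbb{P}(X_0 = y_\ast) = 1-p_\ast$
is a minimizer of
$\inf_{X \in \mathscr{X}} {\mathbb{E}}(X \wedge c)$
, i.e.
$ {\mathbb{E}}(X_0 \wedge c) = \inf_{X \in \mathscr{X}} {\mathbb{E}}(X \wedge c) = \frac12 c - \frac12 (1+c^2)^{1/2}. $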
If the parameters of the problem are such that
$X_0$
, as defined in Proposition 4.2, is bounded below by
$-m$
, the original minimization problem is clearly solved. We shall assume, until further notice, that
$m\geqslant 0$
. This comes at no loss of generality, as
$\mathscr{X}^{-m}$
is empty if
$m < 0$
, and the problem degenerates if
$c \leqslant -m$
, in the sense that
${\mathbb{E}}(X \wedge c) = c$
for every
$X \geqslant -m$
.
Corollary 4.2. Let
$X_0$
be defined as in Proposition 4.2. Then
$X_0 \in \mathscr{X}^{-m}$
if and only if
$ c \geqslant ({1-m^2})/{2m} $
.
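Proof. By definition of
$X_0$
, the lower bound
$X_0 \geqslant -m $
holds if and only if
$p_\ast \geqslant 1/(1+m^2)$
, i.e. if and only if
$ \dfrac12 + \dfrac12 \dfrac{c}{(1+c^2)^{1/2}} \geqslant \dfrac{1}{1+m^2} $
or, equivalently,
$ \dfrac{c}{(1+c^2)^{1/2}} \geqslant \dfrac{1-m^2}{1+m^2}, $
where the left-hand side takes values in the interval
$\mathopen]-1,1\mathclose[$
. Elementary algebra shows that the inequality
$c/(1+c^2)^{1/2} \geqslant a$
, with
$a \in \mathopen]-1,1\mathclose[$
, is verified if and only if
$c \geqslant a/(1-a^2)^{1/2}$
. Taking
$a = (1-m^2)/(1+m^2)$
, for which
$(1-a^2)^{1/2} = 2m/(1+m^2)$
, finally implies that
$X_0 \geqslant -m$
if and only if
$ c \geqslant \dfrac{1-m^2}{2m}, $
from which the claim follows immediately.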
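The threshold in the corollary can be checked directly. The sketch assumes $x_\ast = -((1-p_\ast)/p_\ast)^{1/2}$ with $p_\ast$ as in Lemma 4.4 (one can verify that this simplifies to $x_\ast = c - (1+c^2)^{1/2}$) and takes $m > 0$; the function names are hypothetical.

```python
import math

def x_star(c):
    """Smaller value of the unconstrained two-point minimizer (assumed form)."""
    p = 0.5 + 0.5 * c / math.sqrt(1.0 + c * c)
    return -math.sqrt((1.0 - p) / p)

def threshold(m):
    """Threshold (1 - m^2) / (2 m) for c, with m > 0."""
    return (1.0 - m * m) / (2.0 * m)

def bounded_below(c, m):
    """True if the unconstrained minimizer satisfies X0 >= -m."""
    return x_star(c) >= -m

for m, c in ((0.5, 0.8), (0.5, 0.7), (2.0, -0.7), (2.0, -0.8)):
    print(m, c, threshold(m), bounded_below(c, m), c >= threshold(m))
```

For each pair the last two printed values should coincide.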
In view of the corollary, we only need to consider the problem under the condition
$ c < (1-m^2)/(2m). $
Note that this implies
$c < 1/m$
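Let us start by observing that, for any
$X \geqslant -m$
,
$ {\mathbb{E}}(X \wedge c) = {\mathbb{E}} X \mathbf{1}_{\{X \leqslant c\}} + c\,\mathbb{P}(X>c) \geqslant -m \mathbb{P}(X \leqslant c) + c\,\mathbb{P}(X>c) = {\mathbb{E}}(Y \wedge c), \qquad \text{(4.1)} $
where Y is a random variable taking values in
$\{-m,y\}$
,
$y \geqslant c$
, with
$\mathbb{P}(Y = -m) = \mathbb{P}(X \leqslant c)$
.
In order for the random variable Y to belong to
$\mathscr{X}$
, it is sufficient and necessary, in view of (2.2) and Lemma 2.1, that
$\mathbb{P}(Y = -m) = 1/(1+m^2)$
and either
$c \leqslant 0$
or
$c \geqslant 0$
and
$1/(1+m^2) \geqslant c^2/(1+c^2)$
.
In other words, Y belongs to
$\mathscr{X}$
and takes values in
$\{-m,y\}$
,
$y \geqslant c$
, if and only if
$\mathbb{P}(X \leqslant c) = 1/(1+m^2)$
and either
$c \leqslant 0$
or
$1/(1+m^2) \geqslant c^2/(1+c^2)$
.
As this inequality is satisfied if and only if
$cm \leqslant 1$
, which holds by assumption, Y satisfies the abovementioned conditions if and only if
$\mathbb{P}(X \leqslant c) = 1/(1+m^2)$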
. Let us then define the random variable
$Y_0 \in \mathscr{X}^{-m}$
as the (unique) two-point distributed random variable in
$\mathscr{X}$
identified by the parameter
$p_m = {1}/({1+m^2})$
. We are going to show that
$Y_0$
is in fact the minimizer of the problem.
Proposition 4.3. If
$c < (1-m^2)/(2m)$
, then
$ \inf_{X \in \mathscr{X}^{-m}} {\mathbb{E}}(X \wedge c) = {\mathbb{E}}(Y_0 \wedge c) = \dfrac{cm^2 - m}{1+m^2}. $
Proof. Let us rewrite (4.1) as
$ {\mathbb{E}}(X \wedge c) \geqslant c - (m+c) \mathbb{P}(X \leqslant c) $
, which holds for every
$X \in \mathscr{X}^{-m}$
. Since
$m + c > 0$
by assumption, the function
$p \mapsto c - (m+c)p$
is decreasing. Therefore, for every
$X \in \mathscr{X}^{-m}$
such that
$\mathbb{P}(X \leqslant c) \leqslant p_m$
, we have
$ {\mathbb{E}}(X \wedge c) \geqslant c - (m+c)p_m = {\mathbb{E}}(Y_0 \wedge c). $
Let now
$X \in \mathscr{X}^{-m}$
be such that
$p\,:\!=\,\mathbb{P}(X \leqslant c)>p_m$
. Then Lemma 4.2 yields
$ {\mathbb{E}}(X \wedge c) \geqslant L_c(p) $
. Since
$p_* < p_m$
by assumption and the function
$L_c$
is increasing on
$[p_\ast, 1\mathclose[$
by Lemma 4.4, it follows that
$ {\mathbb{E}}(X \wedge c) \geqslant L_c(p) \geqslant L_c(p_m) = {\mathbb{E}}(Y_0 \wedge c) $
, which concludes the proof.
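A small numerical check of the constrained case, under the assumption that the candidate minimizer takes the value $-m$ with probability $p_m = 1/(1+m^2)$ and the value $1/m$ otherwise (this is the two-point variable in $\mathscr{X}$ with parameter $p_m$). The sketch compares ${\mathbb{E}}(Y_0 \wedge c)$ with $L_c(p_m)$, with $c - (m+c)p_m$, and with the unconstrained minimum; the function names are hypothetical.

```python
import math

def L(c, p):
    """L_c(p) = -(p - p^2)^{1/2} + c (1 - p), for 0 < p < 1."""
    return -math.sqrt(p - p * p) + c * (1.0 - p)

def constrained_value(c, m):
    """E(min(Y0, c)) for Y0 = -m w.p. p_m = 1/(1 + m^2) and 1/m otherwise."""
    p_m = 1.0 / (1.0 + m * m)
    return p_m * min(-m, c) + (1.0 - p_m) * min(1.0 / m, c)

def unconstrained_value(c):
    """Minimum of E(min(X, c)) without the lower bound: c/2 - (1 + c^2)^{1/2}/2."""
    return 0.5 * c - 0.5 * math.sqrt(1.0 + c * c)

m, c = 0.5, 0.2                      # here -m < c < (1 - m^2) / (2 m) = 0.75
p_m = 1.0 / (1.0 + m * m)
print(constrained_value(c, m), L(c, p_m), c - (m + c) * p_m)
print(unconstrained_value(c))        # smaller: its minimizer violates X >= -m
```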
We have therefore proved the main result, which reads as follows.
Theorem 4.2. Let
$c,m,\sigma \in \mathbb{R}$
$m \geqslant 0$
, and
. Then
Using the notation X(p) to denote a two-point distributed random variable in
with parameter p, we could write, more concisely,
The bounds by Scarf and by Lo, i.e. Theorem 4.1 and its corollary, follow immediately from the previous theorem and Lemma 4.1.
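The passage from the standardized problem back to Scarf's setting can be sketched in code. The function `scarf_min` below combines the rescaling of Lemma 4.1 with the two regimes obtained in this section (values $\frac12 \widetilde{c} - \frac12(1+\widetilde{c}^2)^{1/2}$ and $(\widetilde{c}\widetilde{m}^2 - \widetilde{m})/(1+\widetilde{m}^2)$, with $\widetilde{m} = m/\sigma$); `scarf_min_closed_form` is the classical two-regime expression of Scarf's bound, stated here as an assumption; `two_point_search` is a brute-force search over nonnegative two-point distributions. All names are hypothetical.

```python
import math

def scarf_min(m, sigma, c):
    """Minimum of E(min(X, c)) over X >= 0 with mean m > 0, std sigma > 0."""
    mt = m / sigma                   # rescaled lower-bound parameter
    ct = (c - m) / sigma             # rescaled c, as in Lemma 4.1
    if ct <= -mt:                    # c <= 0: min(X, c) = c for every X >= 0
        return c
    if ct >= (1.0 - mt * mt) / (2.0 * mt):
        std = 0.5 * ct - 0.5 * math.sqrt(1.0 + ct * ct)
    else:
        std = (ct * mt * mt - mt) / (1.0 + mt * mt)
    return m + sigma * std

def scarf_min_closed_form(m, sigma, c):
    """Classical two-regime expression (assumed form), for c > 0."""
    s2 = m * m + sigma * sigma
    if c >= s2 / (2.0 * m):
        return 0.5 * (c + m) - 0.5 * math.sqrt(sigma * sigma + (c - m) ** 2)
    return c * m * m / s2

def two_point_search(m, sigma, c, n=2000):
    """Grid search over two-point X >= 0 with mean m and std sigma."""
    mt = m / sigma
    p_m = 1.0 / (1.0 + mt * mt)      # smallest p compatible with X >= 0
    best = float("inf")
    for k in range(n):
        p = p_m + (1.0 - p_m) * k / n
        x = m - sigma * math.sqrt((1.0 - p) / p)
        y = m + sigma * math.sqrt(p / (1.0 - p))
        best = min(best, p * min(x, c) + (1.0 - p) * min(y, c))
    return best

for args in ((1.0, 1.0, 2.0), (1.0, 1.0, 0.5), (0.5, 2.0, 5.0)):
    print(args, scarf_min(*args), scarf_min_closed_form(*args), two_point_search(*args))
```

The three printed values should agree up to the grid resolution of the search.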
5. Applications to option prices
In order to also consider bounds for put options, we record an easy consequence of Theorem 3.1.
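Corollary 5.1. Under the hypotheses of Theorem 3.1, let
$p \,:\!=\, \mathbb{P}(X \leqslant c)$
. Then
$ (c-m)p \leqslant {\mathbb{E}}(c-X)^+ \leqslant \sigma \bigl( p(1-p) \bigr)^{1/2} + (c-m)p. $
Proof. It immediately follows from the identities
$p_0 = 1-p$
and
${\mathbb{E}}(c-X)^+ = {\mathbb{E}}(X-c)^+ - (m - c)$
.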
Let us also note that a simple but useful sharpening of the lower bound in Theorem 3.1 can be given: Jensen’s inequality implies that
${\mathbb{E}}(X-c)^+ \geqslant (m-c)^+$
as well as
${\mathbb{E}}(c-X)^+ \geqslant (c-m)^+$
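The put-side estimates can be illustrated in the same way as the call-side ones. The sketch assumes the bounds $(c-m)p \leqslant {\mathbb{E}}(c-X)^+ \leqslant \sigma (p(1-p))^{1/2} + (c-m)p$ with $p = \mathbb{P}(X \leqslant c)$, which follow from Theorem 3.1 via the identity ${\mathbb{E}}(c-X)^+ = {\mathbb{E}}(X-c)^+ - (m-c)$, and sharpens the lower bound with the Jensen estimate $(c-m)^+$; `put_bounds` is a hypothetical helper name.

```python
import math

def put_bounds(values, probs, c):
    """Return (lower, exact, upper) for E(c - X)^+ when X is discrete."""
    m = sum(q * v for v, q in zip(values, probs))
    sigma = math.sqrt(sum(q * (v - m) ** 2 for v, q in zip(values, probs)))
    p = sum(q for v, q in zip(values, probs) if v <= c)   # P(X <= c)
    exact = sum(q * max(c - v, 0.0) for v, q in zip(values, probs))
    # Lower bound: the larger of (c - m) p and the Jensen bound (c - m)^+
    lower = max((c - m) * p, max(c - m, 0.0))
    # max(...) guards against tiny negative values caused by rounding
    upper = sigma * math.sqrt(max(p * (1.0 - p), 0.0)) + (c - m) * p
    return lower, exact, upper

print(put_bounds([-1.0, 0.0, 2.0], [0.3, 0.4, 0.3], 0.5))
```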
Assume that the probability space
$(\Omega,\mathscr{F},\mathbb{P})$
is equipped with a filtration
$(\mathscr{F}_t)_{t \in [0,T]}$
satisfying the so-called usual conditions. Let
$\widehat{S}$
and
$\beta$
be the price processes of two traded assets, the latter of which is strictly positive and is used as numéraire, so that
$S \,:\!=\, \beta^{-1}\widehat{S}$
is the discounted price process of the former asset. We assume that no asset pays dividends. A classical result [Reference Delbaen and Schachermayer6] asserts that a suitable version of no-arbitrage holds (precisely, no free lunch with vanishing risk) if and only if there exists a probability measure
$\mathbb{Q}$
equivalent to
$\mathbb{P}$
such that S is a
$\sigma$
-martingale with respect to
$\mathbb{Q}$
. For simplicity of notation, let us assume that
$\mathbb{P}$
is already an equivalent
$\sigma$
-martingale measure that is used for pricing. We are then interested in upper and lower bounds for
$ \pi_c \,:\!=\, {\mathbb{E}} \beta_T^{-1} \bigl( \widehat{S}_T - K \bigr)^+ $
and
$ \pi_p \,:\!=\, {\mathbb{E}} \beta_T^{-1} \bigl( K - \widehat{S}_T \bigr)^+ $
.
We first establish an easy consequence of the
$\sigma$
-martingale property of S.
Lemma 5.1.
(i) If S is a supermartingale (in particular, if S is bounded from below), then
$ \pi_p \geqslant \bigl( K {\mathbb{E}}\beta_T^{-1} - S_0 \bigr)^+ $ .
(ii) If S is a martingale, then
$ \pi_c \geqslant \bigl( S_0 - K {\mathbb{E}}\beta_T^{-1} \bigr)^+ $ .
Proof. Recall first that a
$\sigma$
-martingale bounded from below is a local martingale thanks to the Ansel–Stricker lemma [Reference Ansel and Stricker1, p. 309], thus also a supermartingale by an application of Fatou’s lemma. This implies that
${\mathbb{E}} S_T \leqslant S_0$
, and hence, by Jensen’s inequality,
$ \pi_p = {\mathbb{E}} \bigl( K\beta_T^{-1} - S_T \bigr)^+ \geqslant \bigl( K {\mathbb{E}}\beta_T^{-1} - {\mathbb{E}} S_T \bigr)^+ \geqslant \bigl( K {\mathbb{E}}\beta_T^{-1} - S_0 \bigr)^+, $
which proves (i). The proof of (ii) is entirely analogous.
Remark 5.1. The lower bound for
$\pi_c$
is in general not true without the hypothesis that S is a martingale. This is essentially equivalent to the failure of put–call parity for asset prices with so-called bubbles (cf., e.g., [Reference Cox and Hobson4]).
It is clear that the estimates of Theorem 3.1 and Corollary 4.1 cannot be directly applied, as
$\beta_T$
is a random variable. In some cases, however, this is indeed possible. For instance, apart from the trivial case where the price process
$\beta$
of the numéraire is non-random, if the random variables
belong to
, and
are uncorrelated, so that
estimates on
can be obtained by Corollary 4.1 in terms of the mean of
and the mean and variance of
. In order to apply Theorem 3.1, the value of the distribution function of
at K is also needed. In the following we make the stronger assumption that the random variables
are independent. Furthermore, we assume that S is a supermartingale. By independence,
and hence
${\mathbb{E}}\widehat{S}_T \leqslant S_0 / {\mathbb{E}}\beta_T^{-1}$
. Setting
$k \,:\!=\, K {\mathbb{E}}\beta_T^{-1}$
$\widehat{\sigma} \,:\!=\, \lVert{\widehat{S}_T - {\mathbb{E}}\widehat{S}_T}\rVert_2$
, we thus have the bounds
If S is a martingale, then
${\mathbb{E}} S_T=S_0$
, so the bounds for
assume a more symmetric form.
Adapting Lo’s estimate to option prices can be done along the same lines. It does not seem possible, though, to exploit the supermartingale property of S to obtain bounds involving
$S_0$
rather than
${\mathbb{E}} S_T$
. On the other hand, if S is a martingale, and if the constraint
$\widehat{S}_T \geqslant 0$
is not enforced, it follows from the proofs in Section 4 and elementary computations that
from which a corresponding upper bound for put options can be obtained by put–call parity.