
Rough multi-factor volatility for SPX and VIX options

Published online by Cambridge University Press:  16 December 2024

Antoine Jacquier*
Affiliation:
Imperial College London and the Alan Turing Institute
Aitor Muguruza*
Affiliation:
Imperial College London and Kaiju Capital Management
Alexandre Pannier*
Affiliation:
Université Paris Cité, Laboratoire de Probabilités, Statistique et Modélisation
*Postal address: Department of Mathematics, Imperial College London, South Kensington, London SW7 2AZ, United Kingdom.
**Postal address: 15 rue Hélène Brion, 75013 Paris, France. Email address: pannier@lpsm.paris

Abstract

We provide explicit small-time formulae for the at-the-money implied volatility, skew, and curvature in a large class of models, including rough volatility models and their multi-factor versions. Our general setup encompasses both European options on a stock and VIX options, thereby providing new insights on their joint calibration. The tools used are essentially based on Malliavin calculus for Gaussian processes. We develop a detailed theoretical and numerical analysis of the two-factor rough Bergomi model and provide insights on the interplay between the different parameters for joint SPX–VIX smile calibration.

Type
Original Article
Copyright
© The Author(s), 2024. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

Exposure to the uncertain dynamics of volatility is a desirable feature of most trading strategies and has naturally generated wide interest in volatility derivatives. From a theoretical viewpoint, an adequate financial model should reproduce the volatility dynamics accurately and consistently with those of the asset price; any discrepancy may otherwise lead to arbitrage opportunities. Despite extensive research, implied volatility surfaces from options on the VIX and the S&P 500 index still display discrepancies, betraying the lack of a proper modelling framework. This issue is well known as the SPX–VIX joint calibration problem and has motivated a number of creative modelling innovations in the past fifteen years. Reconciling both implied volatilities requires additional factors to enrich the variance curve dynamics, as argued by Bergomi [Reference Bergomi13, Reference Bergomi14], who proposed the multi-factor model

(1) \begin{align}\frac{\mathrm{d} \xi_t(T)}{\xi_t(T)}= \sum_{i=1}^{N}c_i \mathrm{e}^{-\kappa_i(T-t)} \mathrm{d} W_t^i,\qquad 0\leq t\leq T,\end{align}

for the forward variance, with $W^1,\ldots, W^N$ correlated Brownian motions and coefficients $c_1,\kappa_1,\ldots, c_N, \kappa_N \gt 0$ . Gatheral [Reference Gatheral26] recognised the importance of the additional factor to disentangle different aspects of the implied volatility and to allow humps in the variance curve, and introduced a mean-reverting version—the double CEV model—where the instantaneous mean of the variance follows a CEV model itself. Although promising, these attempts fell short of reproducing jointly the short-time behaviour of the SPX and VIX implied volatilities. A variety of new models were suggested to tackle this issue, both with continuous paths [Reference Barletta, Nicolato and Pagliarani7, Reference Fouque and Saporito24, Reference Goutte, Ismail and Pham29] and with jumps [Reference Baldeaux and Badran6, Reference Carr and Madan18, Reference Cont and Kokholm19, Reference Kokholm and Stisen40, Reference Pacati, Pompa and Reno43, Reference Papanicolaou and Sircar47], incorporating novel ideas and increased complexity such as regime-switching volatility dynamics. Model-free bounds were also obtained in [Reference De Marco and Henry-Labordère21, Reference Guyon31, Reference Guyon, Menegaux and Nutz32, Reference Papanicolaou46], shedding light on the links between VIX and SPX and the difficulty of capturing them both simultaneously.
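As a quick illustration (not part of the original analysis), the dynamics (1) can be simulated directly. The following Python sketch uses a log-Euler step, which is exact here since the loadings $c_i \mathrm{e}^{-\kappa_i(T-t)}$ are deterministic; the function name and all parameter values are illustrative.

```python
import numpy as np

def simulate_forward_variance(xi0, c, kappa, corr, T, n_steps, n_paths, seed=0):
    """Log-Euler scheme for the N-factor Bergomi forward variance (1); the step
    is exact here because the factor loadings are deterministic in time."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    chol = np.linalg.cholesky(corr)             # correlate the N Brownian drivers
    log_xi = np.full(n_paths, np.log(xi0))
    for step in range(n_steps):
        t = step * dt
        sig = c * np.exp(-kappa * (T - t))      # factor loadings at time t
        dW = rng.standard_normal((n_paths, len(c))) @ chol.T * np.sqrt(dt)
        log_xi += dW @ sig - 0.5 * (sig @ corr @ sig) * dt
    return np.exp(log_xi)                       # samples of xi_T(T), i.e. v_T

# two-factor example: fast factor kappa_1 = 8, slow factor kappa_2 = 0.3
xi = simulate_forward_variance(
    xi0=0.04, c=np.array([1.0, 0.5]), kappa=np.array([8.0, 0.3]),
    corr=np.array([[1.0, 0.3], [0.3, 1.0]]), T=0.5, n_steps=200, n_paths=10_000)
```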

Getting rid of the restraining Markovian assumption that burdens classical stochastic volatility models has permitted the emergence of rough volatility models, which consistently agree with stylised facts under both the historical and the pricing measures [Reference Alòs, León and Vives4, Reference Bayer8, Reference Bayer, Friz and Gatheral9, Reference Bennedsen, Lunde and Pakkanen11, Reference Fukasawa25, Reference Gatheral, Jaisson and Rosenbaum27]. A large portion of the toolbox developed for Markovian diffusion models is not available any longer, and asymptotic methods [Reference Guennoun, Jacquier, Roome and Shi30, Reference Horvath, Jacquier and Lacombe33, Reference Jacquier, Pakkanen and Stone36, Reference Jacquier and Pannier37]—and more recently path-dependent PDE methods [Reference Bayer, Qiu and Yao10, Reference Bonesini, Jacquier and Pannier16, Reference Jacquier and Zuric38, Reference Pannier44, Reference Pannier and Salvi45, Reference Viens and Zhang50]—thus play a prominent role in understanding the theoretical properties and numerical aspects of these models. Since the fit of the spot implied volatility skew is extremely accurate under this class of models [Reference Gatheral, Jaisson and Rosenbaum27], it seems reasonable to expect good results when calibrating VIX options. Moreover, the newly established hedging formula by Viens and Zhang [Reference Viens and Zhang50] shows that a rough volatility market is complete if it also contains a proxy of the volatility of the asset; this acts as an additional motivation for our work. Still, [Reference Jacquier, Martini and Muguruza35] showed that the rough Bergomi model is too close to lognormal to jointly calibrate both markets. Its younger sister [Reference Horvath, Jacquier and Tankov34] added a stochastic volatility-of-volatility component, generating a smile sandwiched between the bid–ask prices when calibrating VIX, but the joint calibration is not provided. By incorporating a Zumbach effect, the quadratic rough Heston model [Reference Gatheral, Jaisson and Rosenbaum28] achieves good results for the joint calibration at one given date. Further numerical methods were developed in [Reference Bonesini, Callegaro and Jacquier15, Reference Bourgey and De Marco17, Reference Rosenbaum and Zhang49]. However, the lack of analytical tractability of rough volatility models is holding back the progress of theoretical results on the VIX, with the notable exception of large deviations results from [Reference Forde, Gerhold and Smith23, Reference Lacombe, Muguruza and Stone41] and the small-time asymptotics of [Reference Alòs, García-Lorite and Muguruza2].

In the latter, $\mathcal{F}_T$ -measurable random variables (with volatility derivatives in mind) are written in the form of exponential martingales thanks to the Clark–Ocone formula, allowing the application of established asymptotic methods from [Reference Alòs, León and Vives4]. An expression for the short-time limit at-the-money (ATM) implied volatility skew is derived, yielding an analytical criterion that a model should satisfy to reproduce the correct short-time behaviour. The proposed mixed rough Bergomi model does meet the requirement of positive skew of the VIX implied volatility, backing its implementation with theoretical evidence. And indeed, the fits are rather satisfying. This model is built by replacing the exponential kernels of the Bergomi model (1) ( $t\mapsto \mathrm{e}^{-\kappa t}$ ) with fractional kernels of the type $t\mapsto t^{H-\frac{1}{2}}$ with $H\in(0,\frac{1}{2})$ , but is limited to a single factor, i.e. $W^1=W^2$ . As a result, numerical computations under this model induce a linear smile, or equivalently a null curvature, unfortunately inconsistent with market observations. To remedy this, we incorporate Bergomi’s and Gatheral’s insights on multi-factor models (integrated by [Reference De Marco20, Reference Lacombe, Muguruza and Stone41] into rough volatility models) and extend [Reference Alòs, García-Lorite and Muguruza2] to the multi-factor case; we also compute the short-time ATM implied volatility curvature, deriving a second criterion for a more accurate model choice. In summary, the present paper goes beyond [Reference Alòs, García-Lorite and Muguruza2] in three ways:

  • We consider multi-factor models, far more efficient for VIX calibration, which complicate the analysis.

  • We compute the second derivative of the implied volatility to discriminate better between models; this turns out to be considerably more technical than the skew.

  • We provide detailed proofs of all of our results at three levels: abstract model, generic rough volatility model for the VIX, and two-factor rough Bergomi model, checking carefully that all the assumptions are satisfied, proving technical lemmas applicable to our setting, and exhibiting definite formulas at all three levels of generality.

We gather in Section 2 our abstract framework and assumptions. The main results, which provide the short-time limits of the implied volatility level, skew, and curvature, are contained in Section 3. Our framework covers a wide range of underlying assets, including VIX (Section 4) and stock options (Section 5); see in particular Propositions 1 and 4. We further provide a detailed analysis of the two-factor rough Bergomi model. Closed-form expressions that depend explicitly on the parameters of the model are provided in Proposition 3 for the VIX and Corollary 1 for the stock. These expressions give insight into the interplay between the different parameters, and make the calibration task easier by allowing us to fit some stylised facts before performing numerical computations. For instance, different combinations of parameters can yield positive or negative curvature. All the proofs are gathered in Section 6, starting with useful lemmas and then following the order of the sections.

Notation. For an integer $N\in\mathbb{N}$ and a vector $\mathbf{x}\in\mathbb{R}^N$ , we define $|\mathbf{x}| \,:\!=\, \sum_{i=1}^{N}x_i$ and ${\left\|{\mathbf{x}}\right\|}^2 \,:\!=\, \sum_{i=1}^{N}x_i^2$ . We fix a finite time horizon $T \gt 0$ and let $\mathbb{T}\,:\!=\,[0,T]$ . For all $p\ge1$ , $L^p$ stands for the space $L^p(\Omega)$ for some reference sample space $\Omega$ . As we consider rough volatility models, the Hurst parameter $H\in (0,\frac{1}{2})$ is a fundamental quantity and we shall write $H_+\,:\!=\,H+\frac{1}{2}$ and $H_-\,:\!=\,H-\frac{1}{2}$ .

2. Framework

We consider a square-integrable strictly positive process $ (A_t)_{t\in\mathbb{T}}$ , adapted to the natural filtration $ (\mathcal{F}_t)_{t\in\mathbb{T}}$ of an N-dimensional Brownian motion $\mathbf{W}=(W^1,...,W^N)$ defined on a probability space $(\Omega,\mathcal{F},\mathbb{P})$ . We further introduce the true $(\mathcal{F}_t)_{t\in\mathbb{T}}$ -martingale conditional expectation process

$$M_{t}\,:\!=\,\mathbb{E}_t[A_T]\,:\!=\,\mathbb{E}[A_T| \mathcal{F}_t], \qquad\text{for all } t\in\mathbb{T}.$$

The set $\mathbb{D}^{1,2}$ will denote the domain of the Malliavin derivative operator D with respect to the Brownian motion $\mathbf{W}$ , while $\mathrm{D}^i$ indicates the Malliavin derivative operator with respect to $W^i$ . It is well known that $\mathbb{D}^{1,2}$ is a dense subset of $L^{2}(\Omega)$ and that D is a closed and unbounded operator from $L^{2}(\Omega)$ into $L^{2}(\mathbb{T}\times\Omega)$ . Analogously we define the sets of Malliavin differentiable processes $\mathbb{L}^{n,2}\,:\!=\,L^{2}(\mathbb{T};\mathbb{D}^{n,2})$ . We refer to [Reference Nualart42] for more details on Malliavin calculus. Assuming $A_T\in\mathbb{D}^{1,2}$ , the Clark–Ocone formula [Reference Nualart42, Theorem 1.3.14] reads, for each $t\in\mathbb{T}$ ,

(2) \begin{align}M_{t}= \mathbb{E}[M_{t}] + (\mathbf{m}\bullet\mathbf{W})_t\,:\!=\, \mathbb{E}[M_t] + \sum_{i=1}^N \int_0^t m^i_s \mathrm{d} W^i_s,\end{align}

where each component of $\mathbf{m}$ is $m^i_s\,:\!=\,\mathbb{E}\left[\mathrm{D}^{i}_s A_T |\mathcal{F}_s\right]$ . Since M is a martingale, we may rewrite (2) as

(3) \begin{align}M_{t} = M_{0} + (M\boldsymbol\phi\bullet \mathbf{W})_t,\end{align}

where $\boldsymbol\phi_{s} \,:\!=\, \mathbf{m}_s / M_s$ is defined whenever $M_s\neq0$ almost surely. If $\boldsymbol\phi=(\phi^1,...,\phi^N)$ belongs to $\mathbb{L}^{n,2}$ , then the following processes are well defined for all $t \lt T$ :

(4) \begin{align}Y_t&\,:\!=\,\int_t^T {\left\|{\boldsymbol\phi_r}\right\|}^2 \mathrm{d}r, \qquad \mathfrak{u}_{t} \,:\!=\, \sqrt{Y_t}, \qquad u_{t} \,:\!=\, \frac{\mathfrak{u}_t}{\sqrt{T-t}} ; \nonumber \\[3pt] \Theta^i_{t}&\,:\!=\,\left(\displaystyle \int_{t}^{T}\mathrm{D}^{i}_{t} {\left\|{\boldsymbol\phi_r}\right\|}^2 \mathrm{d} r\right)\phi^i_{t}, \qquad \mathrm{and} \qquad |\boldsymbol\Theta| \,:\!=\, \sum_{i=1}^N \Theta^i.\end{align}

Note that all the processes depend implicitly on T, which will be crucial when we study the short-time limit as T tends to zero.

2.1. Level, skew, and curvature

Since M is a strictly positive martingale process, we can use it as an underlying to introduce options. A standard practice is to work with its logarithm $\mathfrak{M} \,:\!=\,\log(M)$ , so that $\mathfrak{M}_T = \log\mathbb{E}_T[A_T] = \log(A_T)$ and $\mathfrak{M}_0 = \log\mathbb{E}[A_T]$ . Under no-arbitrage arguments, the price $\Pi_t$ at time t of a European call option with maturity T and log-strike $k\geq 0$ is equal to

(5) \begin{align}\Pi_t(k) \,:\!=\, \mathbb{E}_t\left[\left(M_T-\mathrm{e}^k\right)^+\right] = \mathbb{E}_t\left[\left(A_T-\mathrm{e}^k\right)^+\right],\end{align}

and the ATM value is denoted by $\Pi_t\,:\!=\,\Pi_t(\mathfrak{M}_0) = \mathbb{E}_t[(A_T - M_t)^+]$ . We adapt the usual definitions of ATM implied volatility level, skew, and curvature to the case where the underlying is a general process (later specified for the VIX and the S&P). Denote by $\mathrm{BS}(t,x,k,\sigma)$ the Black–Scholes price of a European call option at time $t\in\mathbb{T}$ , with maturity T, log-stock x, log-strike k, and volatility $\sigma$ . Its closed-form expression reads

(6) \begin{align}\mathrm{BS}(t,x,k,\sigma )=\left\{\begin{array}{l@{\quad}l}\mathrm{e}^{x}\mathcal{N}(d_{+}(x,k,\sigma ))-\mathrm{e}^{k}\mathcal{N}(d_{-}(x,k,\sigma )), & \text{if }\sigma\sqrt{T-t} \gt 0,\\[3pt] \left(\mathrm{e}^x - \mathrm{e}^k\right)^+, & \text{if }\sigma\sqrt{T-t}=0,\end{array}\right.\end{align}

with $d_{\pm }(x,k,\sigma) \,:\!=\,\frac{x-k}{\sigma \sqrt{T-t}}\pm\frac{\sigma \sqrt{T-t}}{2}$ , where $\mathcal{N}$ denotes the Gaussian cumulative distribution function.
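For later reference, (6) and the implied-volatility inversion of Definition 1 below translate directly into code; here is a minimal sketch (function names are ours, not from the paper):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def bs_call(t, x, k, sigma, T):
    """Black-Scholes call price (6): log-stock x, log-strike k, maturity T."""
    tau = sigma * np.sqrt(T - t)
    if tau <= 0.0:
        return max(np.exp(x) - np.exp(k), 0.0)
    d_plus = (x - k) / tau + tau / 2
    d_minus = d_plus - tau
    return np.exp(x) * norm.cdf(d_plus) - np.exp(k) * norm.cdf(d_minus)

def implied_vol(price, t, x, k, T, hi=5.0):
    """Implied volatility of Definition 1, by root bracketing; assumes the
    price lies strictly between the no-arbitrage bounds so a root exists."""
    return brentq(lambda s: bs_call(t, x, k, s, T) - price, 1e-10, hi)
```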

Definition 1.

  • For any $k\in\mathbb{R}$ , the implied volatility $\mathcal{I}_{T}(k)$ is the unique non-negative solution to $\Pi_0(k)=\mathrm{BS}\big(0,\mathfrak{M}_0, k, \mathcal{I}_{T}(k)\big)$ ; we omit the k-dependence when considering it ATM ( $k=\mathfrak{M}_0$ ).

  • The ATM implied skew $\mathcal{S}$ and curvature $\mathcal{C}$ at time zero are defined as

    $$ \mathcal{S}_{T}\,:\!=\,\left|\partial_{k} \mathcal{I}_{T}(k)\right|_{k=\mathfrak{M}_0} \qquad\text{and}\qquad \mathcal{C}_{T}\,:\!=\, \left|\partial_{k}^2 \mathcal{I}_{T}(k)\right|_{k=\mathfrak{M}_0}. $$
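In practice these derivatives must be estimated from finitely many strikes; below is a minimal central-difference sketch, where the smile function iv and the spacing h are hypothetical inputs (see also Remark 6 on the sensitivity of curvature estimates to the choice of strikes near the money):

```python
def atm_level_skew_curvature(iv, k_atm, h=1e-3):
    """Estimate the ATM level, skew |d_k I|, and curvature |d_k^2 I| of
    Definition 1 by central differences; `iv` maps a log-strike to an
    implied volatility and `h` is a hypothetical strike spacing."""
    i_minus, i_0, i_plus = iv(k_atm - h), iv(k_atm), iv(k_atm + h)
    skew = abs((i_plus - i_minus) / (2 * h))
    curvature = abs((i_plus - 2 * i_0 + i_minus) / h**2)
    return i_0, skew, curvature
```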

2.2. Examples

The framework (3) encompasses a large class of models, including stochastic volatility models ubiquitous in quantitative finance. Consider a stock price process $(S_t)_{t\in\mathbb{T}}$ satisfying

$$\frac{dS_t}{S_t} = \sqrt{v_t}\,\mathrm{d} B_t = \sqrt{v_t}\sum_{i=1}^N \rho_i\,\mathrm{d} W^i_t,$$

where v is a stochastic process adapted to $(\mathcal{F}_t)_{t\in\mathbb{T}}$ , $\boldsymbol\rho\,:\!=\,(\rho_1,\cdots,\rho_N)\in[{-}1,1]^N$ with ${\left\|{\boldsymbol\rho}\right\|} =1$ .

2.2.1. Asset price

For $N=2$ , the model (3) corresponds to a one-dimensional stochastic volatility model under the identification $A=M=S$ , $\phi^1= \rho_1 \sqrt{v}$ , and $\phi^2 = \rho_2 \sqrt{v}$ , and v is a process driven by $W^1$ . Our analysis generalises [Reference Alòs, León and Vives4, Equation (2.1)] to the multi-factor case (in the continuous-path case). We refer to Section 5 for the details in the multi-factor setting and the analysis of the implied volatility.

2.2.2. VIX

The VIX is defined as $\mathrm{VIX}_{T}=\sqrt{\frac{1}{\Delta}\int_T^{T+\Delta}\mathbb{E}_T[v_t ] \mathrm{d}t}$ , where $\Delta$ is one month. The representation (2) yields that the underlying is the VIX future

$$M_{t}^{\mathrm{VIX}}\,:\!=\,\mathbb{E}_t[\mathrm{VIX}_T]=\mathbb{E}[\mathrm{VIX}_T]+(\mathbf{m}\bullet\mathbf{W})_t,\quad\text{with}\quad m^i_s=\frac{1}{2\Delta}\mathbb{E}_s\left[ \frac{1}{\mathrm{VIX}_T}\int_T^{T+\Delta}\mathrm{D}^{i}_s v_{r} \mathrm{d}r\right].$$

2.2.3. Asian options

For Asian options, the process of interest is $\mathcal{A}_T\,:\!=\,\frac{1}{T}\int_0^T S_t \mathrm{d}t$ . Using (2) we find

$$M^{\mathcal{A}}_{t}\,:\!=\,\mathbb{E}_t[\mathcal{A}_T]=\mathbb{E}[\mathcal{A}_T] + (\mathbf{m}\bullet\mathbf{W})_t,\qquad\text{with}\qquad m^i_s=\frac{1}{T} \int_s^T \mathbb{E}_s[\mathrm{D}^{i}_s S_{r}] \mathrm{d}r.$$

2.2.4. Multi-factor rough Bergomi

Rough volatility models can be written as $v_t=f(\mathbf{W}^H_t)$ , where $\mathbf{W}^H$ is an N-dimensional fractional Brownian motion with correlated components and $f\,:\,\mathbb{R}^N\to\mathbb{R}$ . For instance, in the two-factor rough Bergomi model,

\[v_t = v_0 \left( \chi \exp\left\{\nu W^{1,H}_t - \frac{\nu^2}{2} \mathbb{E}\left[\left(W^{1,H}_t\right)^2\right]\right\}+(1- \chi) \exp\left\{\eta W^{2,H}_t - \frac{\eta^2}{2} \mathbb{E}\left[\left(W^{2,H}_t\right)^2\right]\right\}\right),\]

with $\chi \in (0,1)$ , $\nu, \eta, v_0 \gt 0$ . In Example 2.2.2 we set $A=\mathrm{VIX}$ and hence $N=2$ , but in the asset price case we set $A=S$ and therefore $N=3$ even though the variance depends on only two factors.
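As an illustration of Examples 2.2.2 and 2.2.4 combined (anticipating the precise parametrisation (10) of Section 4.2), the conditional expectation $\xi_T(r)=\mathbb{E}_T[v_r]$ is explicit for Wick exponentials, so $\mathrm{VIX}_T$ can be sampled by discretising the stochastic integrals; a minimal sketch with illustrative names and defaults:

```python
import numpy as np

def vix_T_samples(v0, H, chi, nu, eta, rho, T, Delta=1/12,
                  n_grid=50, n_steps=500, n_paths=5_000, seed=0):
    """Monte Carlo samples of VIX_T = sqrt((1/Delta) int_T^{T+Delta} xi_T(r) dr),
    where xi_T(r) = E_T[v_r] is explicit for Wick exponentials of W^{i,H}."""
    rng = np.random.default_rng(seed)
    ds = T / n_steps
    s = (np.arange(n_steps) + 0.5) * ds                  # grid on [0, T]
    r = T + (np.arange(n_grid) + 0.5) * Delta / n_grid   # grid on [T, T+Delta]
    kern = (r[None, :] - s[:, None]) ** (H - 0.5)        # kernel (r - s)^{H-1/2}
    dW1 = rng.standard_normal((n_paths, n_steps)) * np.sqrt(ds)
    dW2 = rng.standard_normal((n_paths, n_steps)) * np.sqrt(ds)
    I1, I2 = dW1 @ kern, dW2 @ kern                      # int_0^T (r-s)^{H-1/2} dW^i_s
    # remaining variance: int_0^T (r-s)^{2H-1} ds = (r^{2H} - (r-T)^{2H}) / (2H)
    var = (r ** (2 * H) - (r - T) ** (2 * H)) / (2 * H)
    rho_bar = np.sqrt(1 - rho ** 2)
    xi = v0 * (chi * np.exp(nu * I1 - 0.5 * nu ** 2 * var)
               + (1 - chi) * np.exp(eta * (rho * I1 + rho_bar * I2)
                                    - 0.5 * eta ** 2 * var))
    return np.sqrt(xi.mean(axis=1))                      # one VIX_T sample per path
```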

2.3. General assumptions

We introduce the following broad assumptions, which are key to our entire analysis; in Section 4 we provide sufficient conditions to simplify them in the VIX case:

  1. (H1) $A\in \mathbb{L}^{4,p}$ .

  2. (H2) $\displaystyle\frac{1}{M_t}\in L^p$ , for all $p \gt 1$ , and all $t\in\mathbb{T}$ .

  3. (H3) The term $\displaystyle \mathbb{E}_t\left[ \int_{t}^{T}\frac{|\boldsymbol\Theta_{s}|}{\mathfrak{u}_{s}^2}\mathrm{d}s\right]$ is well defined for all $t\in\mathbb{T}$ .

  4. (H4) The term $\displaystyle \frac{1}{\sqrt{T}} \mathbb{E}\left[ \int_0^T \frac{|\boldsymbol\Theta_s|}{\mathfrak{u}_s^2} \mathrm{d}s\right]$ tends to zero as T tends to zero.

  5. (H5) There exists $p\ge1$ such that $\sup_{T\in[0,1]}\mathfrak{u}_0^p \lt \infty$ almost surely and, for all random variables $Z\in L^p$ and all $i\in [\![ 1,N ]\!]$ , the following terms are well defined and tend to zero as T tends to zero:

    $$ \int_0^T \mathbb{E}\left[Z \left( \mathbb{E}_s \left[ \frac{1}{u_0} \int_0^T \mathrm{D}^i_s {\left\|{\boldsymbol\phi_r}\right\|}^2 \mathrm{d}r \right]\right)^2\right] \mathrm{d}s. $$

There exists $\lambda\in({-}\frac{1}{2},0]$ such that the following hold:

$\boldsymbol{(\mathrm{H}}_{\boldsymbol{6}}^{\lambda}\boldsymbol{)}$ The following expressions converge to zero as T tends to zero:

$$ \frac{1}{T^{\frac{1}{2}+\lambda}} \mathbb{E}\left[\int_0^T \frac{|\boldsymbol\Theta_s|\int_s^T|\boldsymbol\Theta_r|\mathrm{d}r}{\mathfrak{u}_{s}^{6}}\mathrm{d}s\right] \quad \text{and} \quad \frac{1}{T^{\frac{1}{2}+\lambda}}\mathbb{E}\left[\int_0^T \frac{1}{\mathfrak{u}_{s}^{4}} \sum_{k=1}^N \left\{ \phi^k_s \mathrm{D}^k_s \left( \int_s^T |\boldsymbol\Theta_r| \mathrm{d}r \right) \right\} \mathrm{d}s\right].$$

$\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ The random variable $\mathfrak{K}_T\,:\!=\,\displaystyle \frac{\int_0^T |\boldsymbol\Theta_s|\mathrm{d}s}{T^{\frac{1}{2}+\lambda} \mathfrak{u}_{0}^{3}}$ is such that $\mathbb{E}[\mathfrak{u}_0^2 \mathfrak{K}_T]$ tends to zero and $\mathbb{E}[\mathfrak{K}_T]$ has a finite limit as T tends to zero.

There exists $\gamma\in({-}1,0]$ such that the following hold:

$\boldsymbol{(\mathrm{H}}_{\boldsymbol{8}}^{\gamma}\boldsymbol{)}$ The following expressions converge to zero as T tends to zero:

\begin{align*} & \frac{1}{T^{\frac{1}{2}+\gamma}}\mathbb{E} \left[ \int_0^T \mathfrak{u}_{s}^{-10} |\boldsymbol\Theta_s| \left( \int_s^T |\boldsymbol\Theta_r|\left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \mathrm{d}r \right) \mathrm{d}s \right], \\[3pt] & \frac{1}{T^{\frac{1}{2}+\gamma}} \mathbb{E} \left[ \int_0^T \mathfrak{u}_{s}^{-8} \sum_{j=1}^N \left( \phi^j_s \mathrm{D}^j_s \left( \int_s^T |\boldsymbol\Theta_r| \left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \mathrm{d}r \right) \right) \mathrm{d}s \right], \\[3pt] & \frac{1}{T^{\frac{1}{2}+\gamma}} \mathbb{E} \left[ \int_0^T \mathfrak{u}_{s}^{-8}|\boldsymbol\Theta_s| \int_s^T \sum_{j=1}^N \left\{ \phi^j_r \mathrm{D}^j_r \left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \right\} \mathrm{d}r \mathrm{d}s \right], \\[3pt] & \frac{1}{T^{\frac{1}{2}+\gamma}} \mathbb{E} \left[ \int_0^T \mathfrak{u}_{s}^{-6 } \sum_{k=1}^N \left\{ \phi^k_s \mathrm{D}^k_s \left( \int_s^T \sum_{j=1}^N \left\{ \phi^j_r \, \mathrm{D}^j_r \left( \int_r^T |\boldsymbol\Theta_{y}| \mathrm{d}y \right) \right\} \mathrm{d}r \right) \right\} \mathrm{d}s \right]. \end{align*}

$\boldsymbol{(\mathrm{H}}_{\boldsymbol{9}}^{\gamma}\boldsymbol{)}$ The random variables

\begin{align*} &\mathfrak{H}_T^1 \,:\!=\, \frac{1}{T^{\frac{1}{2}+\gamma} \mathfrak{u}_{0}^{7}}\int_0^T |\boldsymbol\Theta_s| \left(\int_s^T |\boldsymbol\Theta_r|\mathrm{d}r\right)\mathrm{d}s \\[3pt] \text{and} \quad & \mathfrak{H}_T^2 \,:\!=\, \frac{1}{T^{\frac{1}{2}+\gamma} \mathfrak{u}_{0}^{5}} \int_0^T \sum_{j=1}^N \left\{ \phi^j_s \mathrm{D}^j_s \left( \int_s^T |\boldsymbol\Theta_{r}| \mathrm{d}r \right) \right\} \mathrm{d}s \end{align*}

are such that $\mathbb{E}\big[(\mathfrak{u}_0^6+\mathfrak{u}_0^4+\mathfrak{u}_0^2) \mathfrak{H}^1_T + (\mathfrak{u}_0^4+\mathfrak{u}_0^2) \mathfrak{H}_T^2 \big]$ tends to zero and both $\mathbb{E}[\mathfrak{H}^1_T]$ and $\mathbb{E}[\mathfrak{H}^2_T]$ have a finite limit as T tends to zero.

Remark 1.

  • $\boldsymbol{(\mathrm{H}_{1})}$ requires A to be four times Malliavin differentiable. This is needed to prove the curvature formula, via the Clark–Ocone formula (2) and three applications of the anticipative Itô formula.

  • When the underlying is the stock price (as in Section 2.2.1), it satisfies Equation (3) where $\phi$ corresponds to its volatility $\sqrt{v}$ . One can then directly make assumptions on the variance process, as in [Reference Alòs and León3–Reference Alòs and Shiraya5]. We make this explicit in Proposition 4 for example. In the case of the VIX (Section 4.1) we refrain from doing the same, since $\phi$ is much more intricate. Nevertheless, sufficient conditions are given by $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ .

3. Main results

We gather here our main asymptotic results for the general framework above, with the proofs postponed to Section 6.2 to ease the flow. The first theorem states that the small-time limit of the implied volatility is equal to the limit of the forward volatility. This is well known for Markovian stochastic volatility models [Reference Alòs and Shiraya5, Reference Berestycki, Busca and Florent12] and in a one-factor setting [Reference Alòs, García-Lorite and Muguruza2]. To streamline the call to the assumptions, we shall group them using mixed subscript notation; for example $\boldsymbol{(\mathrm{H}_{123})}$ corresponds to $\boldsymbol{(\mathrm{H}_{1})}$ - $\boldsymbol{(\mathrm{H}_{2})}$ - $\boldsymbol{(\mathrm{H}_{3})}$ , and we further write $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda}\boldsymbol{)}$ to mean $\boldsymbol{(\mathrm{H}_{12345})}$ - $\boldsymbol{(\mathrm{H}}_{\boldsymbol{67}}^{\lambda}\boldsymbol{)}$ and $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda\gamma}\boldsymbol{)}$ as short for $\boldsymbol{(\mathrm{H}_{12345})}$ - $\boldsymbol{(\mathrm{H}}_{\boldsymbol{67}}^{\lambda}\boldsymbol{)}$ - $\boldsymbol{(\mathrm{H}}_{\boldsymbol{89}}^{\gamma}\boldsymbol{)}$ .

Theorem 1. If $\boldsymbol{(\mathrm{H}_{12345})}$ hold, then

$$ \lim_{T\downarrow 0} \Big(\mathcal{I}_{T} - \mathbb{E}[u_{0}]\Big)=0. $$

Note that we did not assume the limit of $\mathbb{E}[u_0]$ to be finite. The proof, in Section 6.2.1, builds on arguments from [Reference Alòs and Shiraya5, Proposition 3.1]. We then turn our attention to the ATM skew, defined in Definition 1. This short-time asymptotic is reminiscent of [Reference Alòs, León and Vives4, Proposition 6.2] and [Reference Alòs, García-Lorite and Muguruza2, Theorem 8].

Theorem 2. If there exists $\lambda\in({-}\frac{1}{2},0]$ such that $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda}\boldsymbol{)}$ are satisfied, then

(7) \begin{align} \lim_{T\downarrow 0} \frac{\mathcal{S}_T}{T^{\lambda}} = \frac{1}{2}\lim_{T\downarrow 0 } \mathbb{E}\left[\frac{1}{T^{\frac{1}{2}+\lambda}}\frac{ \int_0^T|\boldsymbol\Theta_s|\mathrm{d}s}{\mathfrak{u}_{0}^3}\right]. \end{align}

Note that (7) still holds without $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ , but in that case both sides are infinite. In the rough-volatility setting of Section 2.2.1 with $v_t=f(\mathbf{W}^H_t)$ , $\lambda$ corresponds to $H-\frac{1}{2}$ so that (7) matches the slope of the observed ATM skew of SPX implied volatility. We prove this theorem in Section 6.2.2. We also provide the short-term curvature, in the following theorem, which is proved in Section 6.2.3.

Theorem 3. If there exist $\lambda\in({-}\frac{1}{2},0]$ and $\gamma\in({-}1,\lambda]$ ensuring $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda\gamma}\boldsymbol{)}$ , then

(8) \begin{align} \lim_{T\downarrow 0}\frac{\mathcal{C}_T}{T^{\gamma}} = \lim_{T\downarrow 0} \frac{1}{T^{\gamma}} \Bigg\{& \mathcal{S}_T -\frac{15}{2\sqrt{T}} \mathbb{E} \left[ \frac{1}{\mathfrak{u}_0^7} \int_0^T |\boldsymbol\Theta_r|\left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \mathrm{d}r \right] \nonumber\\[3pt] &+ \frac{3}{2\sqrt{T}}\mathbb{E} \left[ \frac{1}{\mathfrak{u}_0^5} \int_0^T \sum_{j=1}^N \left\{ \phi^j_s \mathrm{D}^j_s \left( \int_s^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \right\} \mathrm{d}s \right] \Bigg\}. \end{align}

The limit still holds without $\boldsymbol{(\mathrm{H}}_{\boldsymbol{9}}^{\gamma}\boldsymbol{)}$ , but in that case the second and third terms are infinite.

Note that $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ with $\lambda\ge\gamma$ guarantees that $T^{-\gamma} \mathcal{S}_T$ converges. By Theorem 2,

\begin{equation*} \lim_{T\downarrow0} \frac{\mathcal{S}_T}{T^{\gamma}} = \left\{ \begin{array}{l@{\quad}l} 0, \quad & \text{if } \lambda \gt \gamma,\\[3pt] \displaystyle\frac{1}{2}\lim_{T\downarrow 0 } \mathbb{E}\left[\frac{1}{T^{\frac{1}{2}+\lambda}}\frac{ \int_0^T|\boldsymbol\Theta_s|\mathrm{d}s}{\mathfrak{u}_{0}^3}\right], & \text{if } \lambda=\gamma, \\[3pt] +\infty, & \text{if } \lambda \lt \gamma. \end{array} \right. \end{equation*}

4. Asymptotic results in the VIX case

As advertised, our framework includes the VIX case where

$$ A_T =\mathrm{VIX}_T=\sqrt{\frac{1}{\Delta}\int_T^{T+\Delta}\mathbb{E}_T[v_r] \mathrm{d}r},$$

for $v_r\in\mathbb{D}^{3,2}$ for all $r\in[0,T+\Delta]$ , and we provide simple sufficient conditions for $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda\gamma}\boldsymbol{)}$ to hold.

4.1. A generic volatility model

Consider the following four conditions, which we gather under the notation $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ . There exist $H \in (0,\frac{1}{2})$ and $X\in L^p$ for all $p \gt 1$ such that the following hold:

  1. (C1) For all $t\ge0$ , $\frac{1}{M_t} \le X$ almost surely.

  2. (C2) For all $i,j,k\in [\![ 1, N ]\!]$ and $t\le s\le y\le T \le r$ , we have, almost surely

    • $v_r\le X$ ,

    • $\mathrm{D}^i_y v_r \le X (r-y)^{H_{-}}$ ,

    • $\mathrm{D}^j_s \mathrm{D}^i_y v_r \le X (r-s)^{H_{-}} (r-y)^{H_{-}}$ ,

    • $\mathrm{D}^k_t \mathrm{D}^j_s \mathrm{D}^i_y v_r \le X (r-t)^{H_{-}} (r-s)^{H_{-}} (r-y)^{H_{-}}$ .

  3. (C3) For all $p \gt 1$ , $\mathbb{E}[u_s^{-p}]$ is uniformly bounded in s and T, with $s\le T$ .

  4. (C4) For all $ i,j,k\in [\![1, N ]\!]$ and $r\ge0$ , the mappings $y\mapsto\mathrm{D}^i_y v_r$ , $s\mapsto \mathrm{D}_s^j \mathrm{D}^i_y v_r$ , and $t\mapsto \mathrm{D}^k_t \mathrm{D}^j_s \mathrm{D}^i_y v_r$ are almost surely continuous in a neighbourhood of zero.

Recall the notation $H_-$ and $H_+$ from the introduction. We compute the level, skew, and curvature of the VIX implied volatility in a model which satisfies the sufficient conditions. Let us define the following limits:

(9) \begin{align}J_i \,:\!=\, \int_0^\Delta \mathbb{E}[\mathrm{D}^i_0 v_r] \mathrm{d}r ,\qquad G_{ij} \,:\!=\, \int_0^\Delta \mathbb{E}\big[\mathrm{D}^j_0 \mathrm{D}^i_0 v_r \big] \mathrm{d}r, \quad \text{for all } i,j\in [\![1,N ]\!].\end{align}

Proposition 1. Under $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ , the following limits hold:

\begin{equation*}\begin{array}{r@{\quad}l@{\quad}l}\displaystyle\lim_{T\downarrow 0} \mathcal{I}_{T} & = \displaystyle\frac{{\left\|{\boldsymbol J}\right\|}}{2\Delta\mathrm{VIX}_0^2}, & \text{if }H\in\left(0,\frac{1}{2}\right),\\[3pt] \displaystyle\lim_{T\downarrow 0} \mathcal{S}_T & = \displaystyle\frac{\sum_{i,j=1}^N J_i J_j \left(G_{ij} - \frac{J_i J_j}{\Delta\mathrm{VIX}_0^2}\right)}{2{\left\|{\boldsymbol J}\right\|}^3}, & \text{if }H\in\left(0,\frac{1}{2}\right),\\[3pt] \displaystyle\lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}} & = \displaystyle\frac{2\Delta \mathrm{VIX}_0^{2}}{3{\left\|{\boldsymbol J}\right\|}^5} \sum_{i,j,k=1}^N J^i J^j J^k \,\lim_{T\downarrow0} \frac{\int_T^{T+\Delta} \mathbb{E}\left[\mathrm{D}^k_0 \mathrm{D}^j_0\mathrm{D}^i_0 v_r\right]\mathrm{d}r}{T^{3H-\frac{1}{2}}}, & \text{if }H\in\left(0,\frac{1}{6}\right).\end{array}\end{equation*}

Remark 2. Our results stand under the fairly general set of assumptions $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ . If v is a reasonably well-behaved function of an N-dimensional Gaussian Volterra process $(W^{1,H},\cdots, W^{N,H})$ , then these should be relatively easy to check, as Proposition 2 suggests. For other rough stochastic volatility models, such as the rough Heston model [Reference El Euch and Rosenbaum22, Reference Jaisson and Rosenbaum39], it might be harder to verify the assumptions. Indeed, the latter is not even known to be Malliavin differentiable to this day, and thus does not lie within the scope of the present study.

We split the proof into two steps, collected in Section 6.3. First we show that $\boldsymbol{(\mathrm{C}_{1})}$ , $\boldsymbol{(\mathrm{C}_{2})}$ , $\boldsymbol{(\mathrm{C}_{3})}$ are sufficient to apply our main theorems, as they imply $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda\gamma}\boldsymbol{)}$ for any $\lambda\in({-}\frac{1}{2},0]$ and $\gamma\in({-}1, 3H-\frac{1}{2}]$ . Thanks to $\boldsymbol{(\mathrm{C}_{4})}$ we can also compute the limits—after a rigorous statement of convergence results—starting with $\mathcal{I}_T$ and the skew with $\lambda=0$ . Restricting H to $(0,1/6)$ , which is the most relevant regime for rough volatility models, we can set $\gamma=3H-\frac{1}{2} \lt \lambda$ and compute the short-time curvature, with only the second term in $\boldsymbol{(\mathrm{H}}_{\boldsymbol{9}}^{\gamma}\boldsymbol{)}$ contributing to the limit. The curvature limit in Proposition 1 is finite by the last item of $\boldsymbol{(\mathrm{C}_{2})}$ .

Remark 3. In the regime $H\in[1/6,1/2)$ , the rescaling becomes $\gamma=0$ , and many more terms that would just vanish when $H \lt 1/6$ now make a non-trivial contribution in the limit. Informally (that is, without a proof), the limit reads

\begin{align*}\lim_{T\downarrow0} \mathcal{C}_T=& \lim_{T\downarrow0}\mathcal{S}_T - \frac{15\Delta\mathrm{VIX}^2_0}{{\left\|{\boldsymbol J}\right\|}^7}\left(\sum_{i,j=1}^N J_i J_j \left(G_{ij} - \frac{J_i J_j}{\Delta\mathrm{VIX}_0^2}\right)\right)^2 \\[3pt] & + \frac{12\Delta \mathrm{VIX}_0^2}{{\left\|{\boldsymbol J}\right\|}^5}\sum_{i,j,k=1}^N J_i \left( G_{jk}-\frac{J_j J_k}{\Delta \mathrm{VIX}_0^2}\right) \left( G_{ik}-\frac{J_i J_k}{\Delta \mathrm{VIX}_0^2}\right) \\[3pt] & + \frac{1}{{\left\|{\boldsymbol J}\right\|}^5} \sum_{i,j,k=1}^N \left\{ \frac{9 \big( J_i J_j J_k \big)^2}{2 \Delta \mathrm{VIX}_0^2} - 6 J_i J_j J_k \left(G_{ij} J_k + G_{ik} J_j + G_{jk} J_i \right) \right\} \\[3pt] & + \frac{2\Delta \mathrm{VIX}_0^{2}}{3{\left\|{\boldsymbol J}\right\|}^5} \sum_{i,j,k=1}^N J^i J^j J^k \,\int_0^{\Delta} \mathbb{E}\left[\mathrm{D}^k_0 \mathrm{D}^j_0\mathrm{D}^i_0 v_r\right]\mathrm{d}r.\end{align*}

4.2. The two-factor rough Bergomi

We consider the two-factor exponential model

(10) \begin{align}v_t = v_0 \left[ \chi \mathcal{E}\left(\nu W^{1,H}_t\right) + \overline{\chi} \mathcal{E}\left(\eta\left(\rho W^{1,H}_t + \overline{\rho} W^{2,H}_t\right)\right) \right]=: v_0\big( \chi\mathcal{E}_t^1 + \overline{\chi} \mathcal{E}_t^2\big),\end{align}

where $H\in(0,\frac{1}{2}]$ , $W^{i,H}_t= \int_0^t (t-s)^{H_{-}} \mathrm{d} W^i_s$ , $W^1,W^2$ are independent Brownian motions, the Wick exponential is defined as $\mathcal{E}(X)\,:\!=\,\exp\{X-\frac{1}{2} \mathbb{E}[X^2]\}$ for any random variable X, and $\chi\in[0,1]$ , $\overline{\chi}\,:\!=\,1-\chi$ , $v_0,\nu,\eta \gt 0$ , $\rho\in[{-}1,1]$ , $\overline{\rho}=\sqrt{1-\rho^2}$ . This model is an extension of the Bergomi model [Reference Bergomi14], where the exponential kernel is replaced by a fractional one, and an extension of the rough Bergomi model [Reference Bayer, Friz and Gatheral9] to the two-factor case. It combines Bergomi’s insights on the need for several factors with the benefits of rough volatility. As proved in Section 6.4.1, it satisfies our conditions.

Proposition 2. If $\rho\in({-}\sqrt{2}/2,1]$ , the model (10) satisfies $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ .

The restriction on the range of $\rho$ is equivalent to $\rho+\overline{\rho} \gt 0$ , a necessary requirement in the proof. Proposition 1 therefore applies, and we obtain the following limits, as proved in Section 6.4.2.

Proposition 3. Let $\psi(\rho,\nu,\eta,\chi)\,:\!=\,\sqrt{ (\chi\nu+\overline{\chi}\eta\rho)^2+ \overline{\chi}^2\eta^2\overline{\rho}^2}$ . If $H\in(0,\frac{1}{6})$ and $\rho\in({-}\frac{\sqrt{2}}{2},1]$ , then

\begin{align*} \lim_{T\downarrow 0} \mathcal{I}_{T} & = \frac{\Delta^{H_{-}}}{2H+1} \psi(\rho,\nu,\eta,\chi), \nonumber\\[3pt] \lim_{T\downarrow0} \mathcal{S}_T & = \frac{H_+\Delta^{H_{-}}}{2\psi(\rho,\nu,\eta,\chi)^{3}} \Bigg\{(\chi\nu+\overline{\chi}\eta\rho)^2 \left[ \frac{\chi\nu^2+\overline{\chi}\eta^2\rho^2}{2H}-\left(\frac{\chi\nu+\overline{\chi}\eta\rho}{H_{+}}\right)^2\right]\\[3pt] & \quad + 2 (\chi\nu+\overline{\chi}\eta\rho)\overline{\chi}^2\eta^2\overline{\rho}^2 \left[ \frac{\eta\rho}{2H} - \frac{\nu+\eta\rho}{H_+^2}\right] + \overline{\chi}^3 \eta^4\overline{\rho}^4 \left(\frac{1}{2H}-\frac{1}{H_+^2}\right) \Bigg\}, \end{align*}
\begin{align*} \lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}} & = \frac{128 \Delta^{-2H} H_+^2}{3\psi(\rho,\nu,\eta,\chi)^{5}(1-6H)} \Big\{ (\chi\nu+\overline{\chi}\eta\rho)^3 (\chi\nu^3+\overline{\chi}\eta^3\rho^3) \\[3pt] &\quad + 3 (\chi\nu+\overline{\chi}\eta\rho)^2 \overline{\chi}^2 \eta^4\overline{\rho}^2 \rho^2 + 3 (\chi\nu+\overline{\chi}\eta\rho)\overline{\chi}^3 \eta^5\overline{\rho}^4 \rho + \overline{\chi}^4\eta^6\overline{\rho}^6 \Big\}. \end{align*}

The limits depend explicitly on the parameters of the model $(H,\chi,\nu,\eta,\rho)$ and can be used to gain insight on their impact over the quantities of interest.
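For numerical experiments, the three limits of Proposition 3 translate verbatim into code; a direct transcription (the function name is ours):

```python
import numpy as np

def vix_smile_limits(H, Delta, chi, nu, eta, rho):
    """Short-time VIX limits of Proposition 3: level, skew, and the rescaled
    curvature lim C_T / T^{3H-1/2}; requires H in (0,1/6) and rho > -sqrt(2)/2."""
    Hp, Hm = H + 0.5, H - 0.5
    cb, rb = 1 - chi, np.sqrt(1 - rho**2)            # chi-bar and rho-bar
    a = chi * nu + cb * eta * rho
    psi = np.sqrt(a**2 + cb**2 * eta**2 * rb**2)
    level = Delta**Hm * psi / (2 * Hp)
    skew = Hp * Delta**Hm / (2 * psi**3) * (
        a**2 * ((chi * nu**2 + cb * eta**2 * rho**2) / (2 * H) - (a / Hp)**2)
        + 2 * a * cb**2 * eta**2 * rb**2
            * (eta * rho / (2 * H) - (nu + eta * rho) / Hp**2)
        + cb**3 * eta**4 * rb**4 * (1 / (2 * H) - 1 / Hp**2))
    curv = 128 * Delta**(-2 * H) * Hp**2 / (3 * psi**5 * (1 - 6 * H)) * (
        a**3 * (chi * nu**3 + cb * eta**3 * rho**3)
        + 3 * a**2 * cb**2 * eta**4 * rb**2 * rho**2
        + 3 * a * cb**3 * eta**5 * rb**4 * rho
        + cb**4 * eta**6 * rb**6)
    return level, skew, curv
```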

Remark 4.

  • In the case $\rho=1$ (and hence $\overline{\rho}=0$ ), the above limits simplify to

    \begin{align*} \lim_{T\downarrow 0} \mathcal{I}_{T} & = \frac{\Delta^{H_{-}}}{2H_+}\left(\chi\nu+\overline{\chi}\eta\right),\\[3pt] \lim_{T\downarrow 0} \mathcal{S}_T & =\frac{1}{2} \frac{H_+\Delta^{H_{-}}}{\chi\nu+\overline{\chi}\eta} \left[\frac{\chi\nu^2+\overline{\chi}\eta^2}{2H}-\left(\frac{\chi\nu+\overline{\chi}\eta}{H_{+}}\right)^2\right],\\[3pt] \lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}} & = \frac{128 \Delta^{-2H} H_+^2}{3-18H} \frac{\chi\nu^3+\overline{\chi}\eta^3}{(\chi\nu+\overline{\chi}\eta)^2}. \end{align*}
  • If we set $\rho=0$ (and hence $\overline{\rho}=1$ ), we obtain

    \begin{align*} \lim_{T\downarrow 0} \mathcal{I}_{T} & = \frac{\Delta^{H_{-}}}{2H_+}\sqrt{\chi^2\nu^2+\overline{\chi}^2\eta^2},\\[3pt] \lim_{T\downarrow0} \mathcal{S}_T & = \frac{H_+\Delta^{H_{-}}}{2(\chi^2\nu^2+\overline{\chi}^2\eta^2)^{3/2}} \\[3pt] & \qquad \Bigg\{\chi^3\nu^4 \left[ \frac{1}{2H}-\frac{\chi}{H_+^2}\right]- \frac{2 \chi\nu^2\overline{\chi}^2\eta^2}{H_+^2} + \overline{\chi}^3 \eta^4\left(\frac{1}{2H}-\frac{1}{H_+^2}\right) \Bigg\},\\[3pt] \lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}} & = \frac{128 \Delta^{-2H} H_+^2}{3-18H} \frac{\chi^4\nu^6+\overline{\chi}^4\eta^6}{(\chi^2\nu^2+\overline{\chi}^2\eta^2)^{5/2}}. \end{align*}
  • When $\rho=-1$ (not covered per se by the proposition), the above limits simplify to

\begin{align*} \lim_{T\downarrow0} \mathcal{I}_T & = \frac{\Delta^{H_{-}}}{2H+1} \lvert\chi\nu-\overline{\chi}\eta\rvert,\\[3pt] \lim_{T\downarrow0} \mathcal{S}_T &= \frac{H_{+}\Delta^{H_{-}}}{2\lvert\chi\nu-\overline{\chi}\eta\rvert} \left[ \frac{\chi\nu^2+\overline{\chi}\eta^2}{2H} - \left(\frac{\chi\nu-\overline{\chi}\eta}{H_{+}}\right)^2\right], \\[3pt] \lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}} & = \frac{128 \Delta^{-2H} H_+^2}{3-18H} \frac{\chi\nu^3-\overline{\chi}\eta^3}{(\chi\nu-\overline{\chi}\eta)^2} \,{\rm sgn}(\chi\nu-\overline{\chi}\eta). \end{align*}

Some tedious yet straightforward manipulations allow us to obtain some information about the sign of the limiting curvature.

Lemma 1. For any $\eta,\nu \gt 0$ , $\chi \in [0,1]$ , there exists $\rho^*_{\chi,\nu,\eta} \lt 0$ such that $\lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}}$ is strictly positive when $\rho \gt \rho^*_{\chi,\nu,\eta}$ and strictly negative when $\rho \lt \rho^*_{\chi,\nu,\eta}$ . When

$$\left(\chi\nu - \overline{\chi}\eta\right)\left(\chi\nu^3 - \overline{\chi}\eta^3\right) \gt 0,$$

we have $\rho^*_{\chi,\nu,\eta} \lt -1$ , and hence the limiting curvature is strictly positive for all $\rho \in [{-}1,1]$ .

Proof. The expression we are interested in, given in Proposition 3, and ignoring the obviously strictly positive multiplicative factor, reads

\begin{align*}\Phi(\rho) & = [\chi\nu+\overline{\chi}\eta\rho]^3 \left(\chi\nu^3+\overline{\chi}\eta^3\rho^3\right) + 3\{\chi\nu+\overline{\chi}\eta\rho\}^2\overline{\chi}^2\eta^4\overline{\rho}^2\rho^2 \\[3pt] & \qquad + 3(\chi\nu+\overline{\chi}\eta\rho)\overline{\chi}^3\eta^5\overline{\rho}^4\rho + \overline{\chi}^4\eta^6\overline{\rho}^6\\[3pt] & = \Big[\chi^3\nu^3 + 3\chi^2\nu^2 \overline{\chi}\eta\rho + 3\chi\nu\overline{\chi}^2\eta^2\rho^2 + \overline{\chi}^3\eta^3\rho^3\Big]\left(\chi\nu^3+\overline{\chi}\eta^3\rho^3\right) \\[3pt] & \quad + 3\Big\{\chi^2\nu^2 + 2\chi\nu\overline{\chi}\eta\rho +\overline{\chi}^2\eta^2\rho^2\Big\}\overline{\chi}^2\eta^4\overline{\rho}^2\rho^2 + 3(\chi\nu+\overline{\chi}\eta\rho)\overline{\chi}^3\eta^5\overline{\rho}^4\rho + \overline{\chi}^4\eta^6\overline{\rho}^6\\[3pt] & = \Big[\chi^3\nu^3 + 3\chi^2\nu^2 \overline{\chi}\eta\rho + 3\chi\nu\overline{\chi}^2\eta^2\rho^2 + \overline{\chi}^3\eta^3\rho^3\Big]\left(\chi\nu^3+\overline{\chi}\eta^3\rho^3\right)\\[3pt] &\quad + 3\Big\{\chi^2\nu^2 + 2\chi\nu\overline{\chi}\eta\rho +\overline{\chi}^2\eta^2\rho^2\Big\}\overline{\chi}^2\eta^4\left(1-\rho^2\right)\rho^2 \\[3pt] &\quad + 3(\chi\nu+\overline{\chi}\eta\rho)\overline{\chi}^3\eta^5\left(1-2\rho^2+\rho^4\right)\rho + \overline{\chi}^4\eta^6\left(1 - 3\rho^2 + 3\rho^4 - \rho^6\right)\\[3pt] & =: \sum_{i=0}^{6}\alpha_i \rho^i,\end{align*}

where

\begin{equation*}\begin{array}{r@{\quad}l@{\quad}l}\alpha_0 & = \chi^4\nu^6 + \overline{\chi}^4\eta^6, & \\[3pt]\alpha_1 & = 3\chi^3\nu^5\overline{\chi}\eta +3\chi\nu\overline{\chi}^3\eta^5 & = 3\left(\chi^2\nu^4 + \overline{\chi}^2\eta^4\right)\chi\nu\eta\overline{\chi},\\[3pt] \alpha_2 & = 3\chi^2\overline{\chi}^2\eta^2\nu^4 +3\chi^2\nu^2\overline{\chi}^2\eta^4 +3\overline{\chi}^4\eta^6 -3\overline{\chi}^4\eta^6 & = 3\left(\nu^2 + \eta^2\right)\chi^2\overline{\chi}^2\eta^2\nu^2,\\[3pt] \alpha_3 & = \chi^3\nu^3\overline{\chi}\eta^3+\overline{\chi}^3\eta^3\chi\nu^3 +6\chi\nu\overline{\chi}^3\eta^5 -6\chi\nu\overline{\chi}^3\eta^5 & = \left(\chi^2+\overline{\chi}^2\right)\chi\overline{\chi}\nu^3\eta^3,\\[3pt] \alpha_4 & = 3\chi^2\nu^2 \overline{\chi}^2\eta^4 -3\chi^2\nu^2\overline{\chi}^2\eta^4 + 3\overline{\chi}^4\eta^6 -6\overline{\chi}^4\eta^6 +3\overline{\chi}^4\eta^6 & = 0,\\[3pt] \alpha_5 & = 3\overline{\chi}^3\eta^5\chi\nu - 6\overline{\chi}^3\eta^5\chi\nu + 3\chi\nu\overline{\chi}^3\eta^5 & = 0,\\[3pt] \alpha_6 & = \overline{\chi}^4\eta^6- 3\overline{\chi}^4\eta^6 + 3\overline{\chi}^4\eta^6-\overline{\chi}^4\eta^6 & = 0.\\[3pt] \end{array}\end{equation*}

These surprising simplifications show that $\Phi$ is in fact a polynomial in $\rho$ of order three, with a strictly positive leading coefficient $\alpha_3$ , so that $\Phi^{\prime\prime}$ is linear and increasing in $\rho$ and is such that

$$\Phi^{\prime\prime}(\rho) = 0\quad\text{if and only if}\quad\rho = -\frac{(\eta^2+\nu^2)\overline{\chi}\chi}{\nu\eta(\chi^2+\overline{\chi}^2)}=:\rho_{\bullet}(\chi, \nu, \eta).$$

Now,

\begin{align*} \Phi^{\prime}(\rho_{\bullet}(\chi, \nu, \eta)) & = 3\alpha_3\,\rho_{\bullet}(\chi, \nu, \eta)^2 + 2 \alpha_2\,\rho_{\bullet}(\chi, \nu, \eta) + \alpha_1\\[3pt] & = 3\left(\chi^2+\overline{\chi}^2\right)\chi\overline{\chi}\nu^3\eta^3\rho_{\bullet}(\chi, \nu, \eta)^2 + 6\left(\nu^2 + \eta^2\right)\chi^2\overline{\chi}^2\eta^2\nu^2\rho_{\bullet}(\chi, \nu, \eta) + 3\left(\chi^2\nu^4 + \overline{\chi}^2\eta^4\right)\chi\nu\eta\overline{\chi}\\[3pt] & = 3\left(\chi^2+\overline{\chi}^2\right)\chi\overline{\chi}\nu^3\eta^3\left[\frac{(\eta^2+\nu^2)\overline{\chi}\chi}{\nu\eta(\chi^2+\overline{\chi}^2)}\right]^2 - 6\left(\nu^2 + \eta^2\right)\chi^2\overline{\chi}^2\eta^2\nu^2\frac{(\eta^2+\nu^2)\overline{\chi}\chi}{\nu\eta(\chi^2+\overline{\chi}^2)} + 3\left(\chi^2\nu^4 + \overline{\chi}^2\eta^4\right)\chi\nu\eta\overline{\chi}\\[3pt] & = 3\chi^3\overline{\chi}^3\nu\eta\frac{(\eta^2+\nu^2)^2}{\chi^2+\overline{\chi}^2} - 6\chi^3\overline{\chi}^3\eta\nu\frac{(\eta^2+\nu^2)^2}{\chi^2+\overline{\chi}^2} + 3\left(\chi^2\nu^4 + \overline{\chi}^2\eta^4\right)\chi\nu\eta\overline{\chi}\\[3pt] & =\frac{1}{\chi^2+\overline{\chi}^2}\Big({-}3\chi^3\overline{\chi}^3\nu\eta(\eta^2+\nu^2)^2 + 3(\chi^2+\overline{\chi}^2)\left(\chi^2\nu^4 + \overline{\chi}^2\eta^4\right)\chi\nu\eta\overline{\chi}\Big)\\[3pt] & =\frac{3\nu\eta\chi\overline{\chi}}{\chi^2+\overline{\chi}^2}\Big((\chi^2+\overline{\chi}^2)\left(\chi^2\nu^4 + \overline{\chi}^2\eta^4\right)-\chi^2\overline{\chi}^2(\eta^2+\nu^2)^2\Big)\\[3pt] & =\frac{3\nu\eta\chi\overline{\chi}}{\chi^2+\overline{\chi}^2}\Big(\chi^4\nu^4 + \chi^2\overline{\chi}^2\eta^4+ \chi^2\overline{\chi}^2\nu^4 + \overline{\chi}^4\eta^4-\chi^2\overline{\chi}^2\left(\eta^4+2\eta^2\nu^2+\nu^4\right)\Big)\\[3pt] & =\frac{3\nu\eta\chi\overline{\chi}}{\chi^2+\overline{\chi}^2}\Big(\chi^4\nu^4 + \overline{\chi}^4\eta^4-2\chi^2\overline{\chi}^2\eta^2\nu^2\Big) =\frac{3\nu\eta\chi\overline{\chi}}{\chi^2+\overline{\chi}^2}\left(\chi\nu + \overline{\chi}\eta\right)^2\left(\chi\nu - \overline{\chi}\eta\right)^2 \gt 0.\end{align*}

Since $\Phi^{\prime}$ is an upward parabola with strictly positive minimum, it is always strictly positive; hence $\Phi$ is a strictly increasing function (of $\rho$ ), and the lemma follows. Let $\rho^{*}_{\chi, \nu, \eta}$ denote the unique solution to $\Phi(\rho^{*}_{\chi, \nu, \eta})=0$ ; there is an explicit closed-form expression for this solution, but its exact representation is messy and not particularly informative. We can, however, provide an upper bound. Indeed,

\begin{align*} \Phi({-}1) & = \alpha_0 - \alpha_1 + \alpha_2 - \alpha_3\\[3pt] & = \chi^4\nu^6 + \overline{\chi}^4\eta^6 - 3\left(\chi^2\nu^4 + \overline{\chi}^2\eta^4\right)\chi\nu\eta\overline{\chi} + 3\left(\nu^2 + \eta^2\right)\chi^2\overline{\chi}^2\eta^2\nu^2 \\[3pt] & \qquad -\left(\chi^2+\overline{\chi}^2\right)\chi\overline{\chi}\nu^3\eta^3\\[3pt] & = \Big(\chi^3\nu^3 - 3\chi^2\nu^2\eta\overline{\chi} + 3\chi\overline{\chi}^2\eta^2\nu -\overline{\chi}^3\eta^3\Big)\chi\nu^3 \\[3pt] & \qquad + \Big(\overline{\chi}^3\eta^3 - 3\overline{\chi}^2\eta^2\chi\nu + 3\chi^2\overline{\chi}\eta\nu^2 -\chi^3\nu^3\Big)\overline{\chi}\eta^3\\[3pt] & = \left(\chi\nu - \overline{\chi}\eta\right)^3\left(\chi\nu^3 - \overline{\chi}\eta^3\right). \end{align*}

As soon as $\Phi({-}1) \gt 0$ , clearly $\rho^{*}_{\chi, \nu, \eta} \lt -1$ , so the limiting curvature is always strictly positive. The sign of $\Phi({-}1)$ is given by that of $\left(\chi\nu - \overline{\chi}\eta\right)\left(\chi\nu^3 - \overline{\chi}\eta^3\right)$ , which is an upward parabola in $\chi$ .

5. The stock smile under multi-factor models

We use the setting of Section 2.2.1 to apply our results to an asset price of the form

\[S_t = S_0 + \int_0^t S_r \sqrt{v_r} \,\mathrm{d} B_r,\]

where B is correlated with the other N Brownian motions as $B= \sum_{i=1}^{N} \rho_i W^i$ with $\sum_{i=1}^N \rho_i^2=1,\,\rho_i\in[{-}1, 1]$ for all $i\in [\![1,N ]\!]$ . The volatility is a function of $(N-1)$ Brownian motions, so that the stock price features one additional and independent source of randomness. To fit this model into (3) we set $A=S$ and identify $\phi^i$ with $\rho_i \sqrt v$ . We modify the notation slightly to differentiate from the VIX framework: the implied volatility is denoted by $\widehat{\mathcal{I}}_{T}$ and the skew by $\widehat{\mathcal{S}}_{T}$ . We do not consider the curvature in this setting, for lack of an explicit formula. The proofs of the following proposition and of Corollary 1 are postponed to Section 6.5.

Proposition 4. Assume that there exist $H\in(0,\frac{1}{2})$ and a random variable $X\in L^p$ for all $p\ge1$ such that, for all $0\le s\le y$ and $j\in [\![ 1,N ]\!]$ , the following hold:

  1. (i) $v_s\le X$ ;

  2. (ii) $\mathrm{D}_s^j v_y\le X (y-s)^{H_{-}}$ ;

  3. (iii) $\sup_{s\le T} \mathbb{E}[u_s^{-p}] \lt \infty$ ;

  4. (iv) $\limsup_{T\downarrow0} \mathbb{E}\big[(\sqrt{v_T/v_0}-1)^2 \big]=0$ .

Then the short-time limits of the implied volatility and skew are

$$\lim_{T\downarrow0} \widehat{\mathcal{I}}_{T} = \sqrt{v_0} \qquad \text{and} \qquad \lim_{T\downarrow0} \frac{\widehat{\mathcal{S}}_T}{T^{H_{-}}} = \frac{1}{2 v_{0}}\sum_{j=1}^{N} \rho_{j} \lim_{T\downarrow0} \frac{\int_{0}^{T} \int_{s}^{T} \mathbb{E}\left[ \mathrm{D}^{j}_{s} v_{y}\right]\mathrm{d}y\mathrm{d}s}{T^{H+3/2}}.$$

Remark 5.

In the two-factor rough Bergomi model (10) we can compute the short-time skew more explicitly. Recall from Example 2.2.4 that this means setting $N=3$ and defining, for all $t\ge0$ ,

\begin{equation*}\left\{\begin{array}{rl}S_t &= \displaystyle S_0 + \int_0^t S_r \sqrt{v_r} \,\mathrm{d} B_r,\\[3pt] v_t &= v_0 \left[ \chi \mathcal{E}\left(\nu W^{1,H}\right)_{t} + \overline{\chi} \mathcal{E}\left(\eta\left(\rho W^{1,H} + \overline{\rho} W^{2,H}\right)\right)_{t} \right],\end{array}\right.\end{equation*}

where $W^{i,H}_t = \int_0^t (t-s)^{H_{-}} \,\mathrm{d} W^i_s$ , for $i=1,2$ and $B = \sum_{i=1}^3 \rho_i W^i$ , with $W^1,W^2,W^3$ being independent Brownian motions. Hence $W^3$ influences only the asset price, not the variance.

Corollary 1. In the two-factor rough Bergomi model we have the short-time skew limit

(11) \begin{align} \lim_{T\downarrow0} \frac{\widehat{\mathcal{S}}_T}{T^{H_{-}}} = \frac{\rho_{1} \chi\nu + \eta \overline{\chi} (\rho_{1} \rho+\rho_{2}\overline{\rho})}{2H_{+}(1+H_{+})}. \end{align}
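The right-hand side of (11) is equally direct to evaluate; again a plain transcription (the function name is ours):

```python
import numpy as np

def spx_skew_limit(H, chi, nu, eta, rho, rho1, rho2):
    """Short-time SPX ATM skew limit (11), i.e. lim S_T / T^{H-1/2}."""
    Hp = H + 0.5
    rho_bar = np.sqrt(1 - rho**2)
    return (rho1 * chi * nu
            + eta * (1 - chi) * (rho1 * rho + rho2 * rho_bar)) / (2 * Hp * (1 + Hp))
```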

5.1. Tips for joint calibration in the two-factor rough Bergomi model

Assuming we can observe the short-time limit of the spot ATM implied volatility, this grants us $v_0$ for free, while the slope of its skew gives us H by (11). Next, we simplify the expressions from Proposition 3 in the case $\chi=\frac{1}{2}$ . Denote by $\mathcal{I}_0$ , $\mathcal{S}_0$ , and $\mathcal{C}_0$ the three limits of Proposition 3, and let $H_{\pm} \,:\!=\, H\pm\frac{1}{2}$ , $\alpha\,:\!=\,\eta\rho$ , $\beta\,:\!=\,\eta\overline{\rho}$ . Introduce further the normalised parameters

$$\widetilde\alpha \,:\!=\, \frac{\alpha}{\nu},\qquad\widetilde\beta \,:\!=\, \frac{\beta}{\nu},$$

so that, defining $\widetilde\psi(\widetilde\alpha, \widetilde\beta)\,:\!=\,\sqrt{(1+\widetilde\alpha)^2+\widetilde\beta^2}$ , we have, after simplifications,

\begin{equation*}\begin{array}{r@{\quad}l@{\quad}l}\mathcal{I}_0 & = \frac{\nu\Delta^{H_{-}}}{4H_+} \sqrt{(1+\widetilde\alpha)^2+\widetilde\beta^2} & =: \displaystyle \nu C_{I}\widetilde\psi(\widetilde\alpha, \widetilde\beta),\\[3pt] \mathcal{S}_0& = \frac{\nu H_+\Delta^{H_{-}}}{2} \frac{(1+\widetilde\alpha)^2 \left[ \frac{1+\widetilde\alpha^2}{2H}-\left(\frac{1+\widetilde\alpha}{H_{+}}\right)^2\right]+ 2 (1+\widetilde\alpha)\widetilde\beta^2 \left[ \frac{\widetilde\alpha}{2H} - \frac{1+\widetilde\alpha}{H_{+}^2}\right] + \widetilde\beta^4 \left[\frac{1}{2H}-\frac{1}{H_+^2}\right]}{\big( (1+\widetilde\alpha)^2+\widetilde\beta^2 \big)^{3/2}} & =: \displaystyle \nu C_{\mathcal{S}}\frac{\Phi_{\mathcal{S}}(\widetilde\alpha, \widetilde\beta)}{\widetilde\psi(\widetilde\alpha, \widetilde\beta)^3},\\[3pt] \mathcal{C}_0& = \frac{128\nu H_+^2}{3\Delta^{2H}}\frac{\Big\{ (1+\widetilde\alpha)^3 (1+\widetilde\alpha^3) + 3 (1+\widetilde\alpha)^2 \widetilde\alpha^2 \widetilde\beta^2 + 3 (1+\widetilde\alpha) \widetilde\alpha \widetilde\beta^4 + \widetilde\beta^6 \Big\}}{\big( (1+\widetilde\alpha)^2+\widetilde\beta^2 \big)^{5/2}(1-6H)} & =: \displaystyle \nu C_{\mathcal{C}}\frac{\Phi_{\mathcal{C}}(\widetilde\alpha, \widetilde\beta)}{\widetilde\psi(\widetilde\alpha, \widetilde\beta)^5},\end{array}\end{equation*}

where the constants $C_I, C_\mathcal{S}, C_\mathcal{C}$ depend only on $\Delta$ and H. Provided we can observe an approximation of these three limits, we can numerically solve for $\nu,\widetilde\alpha,\widetilde\beta$ in a system of three equations. Alternatively, since all three quantities carry the factor $\nu$ , any quotient of two of them is a function of $\widetilde\alpha,\widetilde\beta$ only, which we can plot and match to observed data. Both methods allow us to deduce $\nu,\widetilde\alpha,\widetilde\beta$ , in turn yielding $\eta$ and $\rho$ . Finally, we are left with $\rho_1$ and $\rho_2$ to play with so that the right-hand side of (11) matches the market observations.
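Below is a minimal sketch of this recipe, solving the three-equation system numerically for $(\nu,\eta,\rho)$ with $\chi=\frac{1}{2}$ (equivalent to solving for $\nu,\widetilde\alpha,\widetilde\beta$ ); it reuses vix_smile_limits from the sketch after Proposition 3, and the observed targets and initial guess are hypothetical inputs:

```python
import numpy as np
from scipy.optimize import least_squares

def calibrate_short_time(targets, H, Delta, chi=0.5, x0=(1.0, 1.0, 0.0)):
    """Back out (nu, eta, rho) from hypothetical observed short-time VIX limits
    (level, skew, rescaled curvature), with chi fixed at 1/2 as in the text;
    `vix_smile_limits` is the transcription given after Proposition 3."""
    def residuals(p):
        nu, eta, rho = p
        return (np.array(vix_smile_limits(H, Delta, chi, nu, eta, rho))
                - np.asarray(targets))
    # keep rho inside the admissible range (-sqrt(2)/2, 1] of Proposition 2
    sol = least_squares(residuals, x0,
                        bounds=([1e-6, 1e-6, -0.70], [10.0, 10.0, 1.0]))
    return sol.x  # estimates of (nu, eta, rho), whence alpha~ and beta~
```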

Remark 6. We are not here—as in fact in most papers related to asymptotics—advocating the use of these formulae for actual direct option pricing, since they are asymptotics. In particular, this raises several calibration issues (shared with most results on the topic): (i) very short-maturity options on the VIX are hardly available, and the computation of the curvature, in particular, is a matter of personal choice (the result will change drastically depending on the number of data points around the ATM), which is left to the trader; (ii) such asymptotic formulae serve to provide some intuition about the roles of the model parameters, in particular on which one helps for each part of the smile. One key message of our result, for example, is that the model is able to disentangle (over short time horizons) the role of H and that of $\nu, \eta,\rho$ , and to a certain extent the role of $\nu$ and that of $\eta,\rho$ . Compared to simpler models (one-factor (rough) Bergomi, classical Heston), we have more parameters here, and our results should be combined with more asymptotics (for smile wings and large expiries) to be fully meaningful. Unfortunately, these are not fully available yet, and we would rather leave a full-scale numerical calibration scheme to future endeavours.

6. Proofs

6.1. Useful results

We start by adapting to the multivariate case a well-known decomposition formula and then prove a lemma which will be used extensively in the rest of the proofs. Both proofs build on the multidimensional anticipative Itô formula [Reference Nualart42, Theorem 3.2.4].

Proposition 5. (Price decomposition.) Under $\boldsymbol{(\mathrm{H}_{123})}$ , the following decomposition formula holds, for all $t\in\mathbb{T}$ , for the price (5), with $u_t$ defined in (4) and $G\,:\!=\,(\partial_{x}^2 - \partial_{x})\mathrm{BS}$ :

$$\Pi_t(k) = \mathbb{E}_{t}\left[\mathrm{BS}(t,\mathfrak{M}_t,k,u_{t})\right]+ \frac{1}{2}\mathbb{E}_{t}\left[\int_{t}^{T}\partial_{x} G (s,\mathfrak{M}_s,k,u_{s})|\boldsymbol\Theta_{s}|\mathrm{d}s\right].$$

Proof. Define $\widehat{\mathrm{BS}}(t,x,k,\sigma^2 (T-t)) \,:\!=\, \mathrm{BS}(t,x,k,\sigma)$ and write for simplicity $\widehat{\mathrm{BS}}_t\,:\!=\,\widehat{\mathrm{BS}}\left(t,\mathfrak{M}_t,k, Y_t \right)=\mathrm{BS}\left(t,\mathfrak{M}_t,k, u_t \right)$ , where we recall that $Y_t=u_t^2(T-t)$ . Note that $\Pi_T= \widehat{\mathrm{BS}}_T$ ; hence $\Pi_t=\mathbb{E}_t\left[\widehat{\mathrm{BS}}_T\right]$ by no-arbitrage arguments. Thanks to $\boldsymbol{(\mathrm{H}_{1})}$ and $\boldsymbol{(\mathrm{H}_{2})}$ , we can then apply the multidimensional anticipative Itô formula [Reference Nualart42, Theorem 3.2.4] with respect to $(t,\mathfrak{M},Y)$ :

\begin{align*} \mathrm{BS}(T, \mathfrak{M}_T,k, u_{T}) = \widehat{\mathrm{BS}}_{T} &= \widehat{\mathrm{BS}}_{t} + \int_t^T \partial_{s} \widehat{\mathrm{BS}}_{s}\mathrm{d}s + \int_t^T \partial_{x} \widehat{\mathrm{BS}}_{s} \left(\mathrm{d}(\boldsymbol\phi\bullet\mathbf{W})_{s}-\frac{1}{2} \|\boldsymbol\phi_s\|^2\mathrm{d}s\right) \\[3pt] &- \int_t^T \partial_{y} \widehat{\mathrm{BS}}_{s}\|\boldsymbol\phi_s\|^2\mathrm{d}s + \frac{1}{2} \int_t^T \partial_{x}^2 \widehat{\mathrm{BS}}_{s}\|\boldsymbol\phi_s\|^2\mathrm{d}s + \int_t^T \partial_{xy} \widehat{\mathrm{BS}}_{s}|\boldsymbol\Theta_s|\mathrm{d}s, \end{align*}

with $\boldsymbol\Theta$ as in (4). The derivatives of the Black–Scholes price read as follows (for simplicity, we omit the argument):

$$ \partial_{s} \widehat{\mathrm{BS}}_s= \partial_{s} \mathrm{BS}_{s} + \frac{u_s \partial_{u}\mathrm{BS}}{2(T-s)}, \qquad \partial_{y} \widehat{\mathrm{BS}}_{s} = \frac{\partial_{u}\mathrm{BS}}{2 u_{s} (T-s)}, \qquad\text{and}\qquad G=\frac{\partial_{u}\mathrm{BS}}{u_{s} (T-s)}. $$

Putting everything together, using the gamma–vega–delta relation

(12) \begin{align} \frac{\partial_{\sigma}\mathrm{BS}(t,x,k,\sigma)}{\sigma(T-t)} = ( \partial_{x}^2 - \partial_{x}) \mathrm{BS}(t,x,k,\sigma), \end{align}

and applying conditional expectation, we obtain

(13) \begin{align} \Pi_t = \mathbb{E}_t\left[\mathrm{BS}(t,\mathfrak{M}_t, k,u_{t})\right] +\mathbb{E}_t \left[\int_{t}^{T}\mathcal{L}_{\mathrm{BS}} (s,u_{s})\mathrm{d}s\right] +\mathbb{E}_{t}\left[ \int_{t}^{T}\frac{\partial_{xu}\mathrm{BS}(s,\mathfrak{M}_s,k,u_{s})}{2u_{s}(T-s)}|\boldsymbol\Theta_{s}|\mathrm{d}s\right], \end{align}

where $\mathcal{L}_{\mathrm{BS}}(s,u_{s}) \,:\!=\, \frac{1}{2}\left[u_{s}^2\left(\partial_{x}^2-\partial_{x}\right)+\partial_{s}\right]\mathrm{BS}(s,\mathfrak{M}_s,k,u_{s})$ is the Black–Scholes operator applied to the Black–Scholes function. Since $\mathcal{L}_{\mathrm{BS}}(s,u_{s})=0$ by construction and

$$ \partial_{x} G(s,x,k,\sigma)=\frac{\mathrm{e}^x \mathcal{N}^{\prime}(d_+(x,k,\sigma))}{\sigma \sqrt{T-s}}\left(1-\frac{d_+(x,k,\sigma)}{\sigma\sqrt{T-s}}\right), $$

the last term in (13) is well defined by $\boldsymbol{(\mathrm{H}_{3})}$ and the proposition follows.

Lemma 2. For all $t\in\mathbb{T}$ , let $J_t \,:\!=\, \int_t^T a_s \mathrm{d}s$ , for some adapted process $a\in \mathbb{L}^{1,2}$ , and let $\mathfrak{L}\,:\!=\,\sum_{i=1}^n c_i \partial_x^i$ be a linear combination of partial derivatives, with weights $c_i\in\mathbb{R}$ . Then, writing for clarity $\mathrm{BS}_t\,:\!=\,\mathrm{BS}(t,\mathfrak{M}_t,\mathfrak{M}_0,u_{t})$ , we have

(14) \begin{align} \mathbb{E} \left[ \int_0^T \mathfrak{L} \mathrm{BS}_s a_s \mathrm{d}s \right] & = \mathbb{E}\left[ \mathfrak L \mathrm{BS}_0 J_0 + \int_0^T \left(\partial_x^3-\partial_x^2\right) \mathfrak L \mathrm{BS}_s \lvert \boldsymbol\Theta_s\lvert J_s \mathrm{d}s \right. \nonumber\\ & \left. + \int_0^T \partial_x \mathfrak L \mathrm{BS}_s \sum_{k=1}^N \left( \phi_s^k \mathrm{D}^k_s J_s\right) \mathrm{d}s \right]. \end{align}

Remark 7. We will use this lemma freely below, with the justification that the condition $a\in \mathbb{L}^{1,2}$ is always satisfied thanks to $\boldsymbol{(\mathrm{H}_{1})}$ .

Proof. As in the proof of Proposition 5, we define $\widehat{\mathrm{BS}}(t,x,k,\sigma^2 (T-t)) \,:\!=\, \mathrm{BS}(t,x,k,\sigma)$ and write for simplicity $\widehat{\mathrm{BS}}_t\,:\!=\,\widehat{\mathrm{BS}}\left(t,\mathfrak{M}_t,\mathfrak{M}_0, Y_t \right)=\mathrm{BS}\left(t,\mathfrak{M}_t,\mathfrak{M}_0, u_t \right)$ . Define $\widehat{\mathrm{P}}(t,x,k,y,j)\,:\!=\,\mathfrak{L} \widehat {\mathrm{BS}}(t,x,k,y)\,j$ and denote $\widehat{\mathrm{P}}\left(t,\mathfrak{M}_t,\mathfrak{M}_0,Y_t, J_t\right)$ by $\widehat{\mathrm{P}}_t$ for simplicity. We then apply the multidimensional anticipative Itô formula [Reference Nualart42, Theorem 3.2.4] with respect to $(t,\mathfrak{M},Y,J)$ :

\begin{align*} \widehat{\mathrm{P}}_{T} =& \,\widehat{\mathrm{P}}_{0} + \int_0^T \partial_{s} \widehat{\mathrm{P}}_{s}\mathrm{d}s + \int_0^T \partial_{x} \widehat{\mathrm{P}}_{s} \left(\mathrm{d}(\boldsymbol\phi\bullet\mathbf{W})_{s}-\frac{1}{2} {\left\|{\boldsymbol\phi_s}\right\|}^2\mathrm{d}s\right) - \int_0^T \partial_{y} \widehat{\mathrm{P}}_{s}{\left\|{\boldsymbol\phi_s}\right\|}^2\mathrm{d}s\\[3pt] & + \frac{1}{2} \int_0^T \partial_{x}^2 \widehat{\mathrm{P}}_{s}{\left\|{\boldsymbol\phi_s}\right\|}^2\mathrm{d}s + \int_0^T \partial_{xy} \widehat{\mathrm{P}}_{s}|\boldsymbol\Theta_s|\mathrm{d}s + \int_0^T \partial_j \widehat{\mathrm{P}}_s \,\mathrm{d} J_s + \int_0^T \partial_{xj} \widehat{\mathrm{P}}_s \sum_{k=1}^N \left(\phi^k_s \mathrm{D}^k_s J_s\right) \mathrm{d}s. \end{align*}

One first notices that $\widehat{\mathrm{P}}_0 = \mathfrak{L} \widehat{\mathrm{BS}}_0 J_0$ and $\widehat{\mathrm{P}}_T=0$ . Moreover we observe that $\int_0^T \partial_j \widehat{\mathrm{P}}_s \,\mathrm{d} J_s = -\int_0^T \mathfrak{L} \widehat{\mathrm{BS}}_s a_s \mathrm{d}s$ , which corresponds, up to sign, to the left-hand side of (14), and

\[ \int_0^T \partial_{xj} \widehat{\mathrm{P}}_s \sum_{k=1}^N \left(\phi^k_s \mathrm{D}^k_s J_s\right) \mathrm{d}s = \int_{0}^{T} \partial_{x} \mathfrak{L}\widehat {\mathrm{BS}}_{s} \sum_{k=1}^N \left( \phi^{k}_{s} \, \mathrm{D}^k_{s} J_{s} \right) \mathrm{d}s. \]

Since $\mathfrak{L}$ is a linear operator, the partial derivatives in s, x, and u cancel as in the proof of Proposition 5. That means we are left with

\begin{align*} \int_{0}^{T} \mathfrak{L} \widehat{\mathrm{BS}}_{s} a_{s} \mathrm{d}s & = \mathfrak L \widehat{\mathrm{BS}}_{0} J_{0} + \int_{0}^T \left(\partial_{x}^{3}-\partial_{x}^2\right) \mathfrak{L} \widehat{\mathrm{BS}}_{s} \lvert \boldsymbol{\Theta}_{s}\lvert J_{s} \mathrm{d}s +\int_0^T \partial_{x} \widehat{\mathrm{P}}_{s} \mathrm{d}(\boldsymbol\phi \bullet{\mathbf{W}})_{s}\\[3pt] & \quad + \int_0^T \partial_{x} \mathfrak{L}\widehat {\mathrm{BS}}_{s} \sum_{k=1}^N \left( \phi^k_{s} \, \mathrm{D}^{k}_{s} J_{s} \right) \mathrm{d}s. \end{align*}

Since $\partial_x^n \mathrm{BS}(s,x,k,u)= \partial_x^n \widehat{\mathrm{BS}}(s,x,k,u^2(T-s))$ for any $n\in\mathbb N$ , summing everything and taking expectations yield the claim.

We adapt and clarify [Reference Alòs, León and Vives4, Lemma 4.1] to obtain a convenient bound for the partial derivatives of G. For notational simplicity, since $\sigma$ and $T-t$ are fixed, we write $\varsigma\,:\!=\,\sigma\sqrt{T-t}$ and $\mathfrak{G}(x,k,\varsigma)\,:\!=\,G(t,x,k,\sigma)$ .

Proposition 6. For any $n\in\mathbb{N}$ and $p\in\mathbb{R}$ , there exists $C_{n,p} \gt 0$ independent of x and $\varsigma$ such that, for all $\varsigma \gt 0$ and $x\in\mathbb{R}\setminus\left\{0, \frac{\varsigma^2}{2}\right\}$ ,

(15) \begin{align} \partial_{x}^{n}\mathfrak{G}(x,k,\varsigma) \leq \frac{C_{n,p}\,\mathrm{e}^k}{\varsigma^{p+1}}. \end{align}

If $x=0$ , then for any $n\in\mathbb{N}$ the bound (15) holds with $p=n$ .

If $x=\frac{1}{2} \varsigma^2$ , there exists a non-zero constant $C_{n}$ independent of $\varsigma$ such that

\begin{equation*} \partial_{x}^{n}\mathfrak{G}\left(\frac{\varsigma^2}{2},k,\varsigma\right) = \left\{ \begin{array}{l@{\quad}l} \displaystyle \frac{C_n\,\mathrm{e}^k}{\varsigma^{n+1}}, & \text{if } n\text{ is even},\\[3pt] 0, & \text{if } n\text{ is odd}. \end{array} \right. \end{equation*}

The following simplification (and extension) will be useful later.

Corollary 2. For any $n\in\mathbb{N}$ , there exists a non-negative constant $C_{n,k}$ independent of x and $\varsigma$ such that, for all $\varsigma \gt 0$ and $x\in\mathbb{R}$ ,

\begin{align*} \left|\partial_{x}^{n}\mathfrak{G}(x,k,\varsigma)\right| \leq \frac{C_{n,k}}{\varsigma^{n+1}}, \qquad \left|\partial_k \partial_{x}^{n}\mathfrak{G}(x,k,\varsigma)\right| \leq \frac{C_{n,k}}{\varsigma^{n+2}},\qquad\text{and}\qquad \left|\partial_k^2 \partial_{x}^{n}\mathfrak{G}(x,k,\varsigma)\right| \leq \frac{C_{n,k}}{\varsigma^{n+3}}. \end{align*}

Proof of Proposition 6. We first consider the case $k=0$ . Since

$$ \mathfrak{G}(x,0,\varsigma) = (\partial_{x}^2 - \partial_{x}) \mathrm{BS}(t, x, 0,\sigma) = \frac{1}{\varsigma\sqrt{2\pi}}\exp\left\{x -\frac{1}{2}d_+(x,\varsigma)^2\right\}, $$

where $d_+(x,\varsigma) \,:\!=\,d_+(x,0,\sigma)= \frac{x}{\varsigma}+\frac{\varsigma}{2}$ , a direct computation (by induction) yields, for any $n\in\mathbb{N}$ ,

(16) \begin{align} \partial_{x}^{n}\mathfrak{G}(x,0,\varsigma) = \exp\left\{-\frac{(\varsigma^2-2x)^2}{8\varsigma^2}\right\}\sum_{j=0}^{n}\alpha_j\frac{P_j(x)}{\varsigma^{2j+1}}, \end{align}

where, for each j, $P_{j}$ is a polynomial of degree j independent of $\varsigma$ .

Since the exponent $h(x)\,:\!=\, x-\frac{1}{2}d_+(x,\varsigma)^2 = -\frac{(\varsigma^2-2x)^2}{8\varsigma^2}$ satisfies $h\big(\frac{\varsigma^2}{2}\big)=\partial_x h\big(\frac{\varsigma^2}{2}\big)=0$ and $\partial_x^2 h \equiv -\frac{1}{\varsigma^{2}}$ , the induction simplifies to

\begin{equation*} \left.\partial_{x}^{n}\mathfrak{G}(x,0,\varsigma)\right|_{x=\frac{\varsigma^2}{2}} = \left\{ \begin{array}{l@{\quad}l} \displaystyle \frac{C_n}{\varsigma^{n+1}}, & \text{if } n\text{ is even},\\[3pt] 0, & \text{if } n\text{ is odd}, \end{array} \right. \end{equation*}

for some non-zero constant $C_n$ independent of $\varsigma$ , proving the third statement in the proposition.

Similarly, if $x=0$ , simplifications occur which yield, for any $n\in\mathbb{N}$ ,

$$ \left.\partial_{x}^{n}\mathfrak{G}(x,0,\varsigma)\right|_{x=0} = \exp\left\{-\frac{\varsigma^2}{8}\right\} \sum_{j=0}^{n}\frac{\alpha_j}{\varsigma^{j+1}} = \frac{1}{\varsigma^{n+1}}\exp\left\{-\frac{\varsigma^2}{8}\right\} \sum_{j=0}^{n}\alpha_j \varsigma^{n-j}, $$

and the second statement in the proposition follows.

Finally, in the general case $x\in\mathbb{R}\setminus\left\{0,\frac{\varsigma^2}{2}\right\}$ , we can rewrite (16) for any $p\in\mathbb{R}$ as

\begin{align*} \partial_{x}^{n}\mathfrak{G}(x,0,\varsigma) & = \frac{1}{\varsigma^{p+1}}\exp\left\{-\frac{(\varsigma^2-2x)^2}{8\varsigma^2}\right\} \sum_{j=0}^{n}\alpha_j P_j(x) \varsigma^{p-2j} \\[3pt] & \qquad =: \frac{1}{\varsigma^{p+1}}\exp\left\{-\frac{(\varsigma^2-2x)^2}{8\varsigma^2}\right\} H_{n,p}(x,\varsigma). \end{align*}

For each $n \in \mathbb{N}$ and $p\in\mathbb{R}$ , $H_{n,p}$ is a function of the two variables x and $\varsigma$ consisting only of powers of $\varsigma^2$ and $x^2/\varsigma^2$ . Since the exponential factor contains these very same terms, there exists a strictly positive constant $C_{n,p}$ , independent of x and $\varsigma$ , such that

$$ \exp\left\{-\frac{(\varsigma^2-2x)^2}{8\varsigma^2}\right\} H_{n,p}(x,\varsigma) \leq C_{n,p},$$

proving the proposition in the case $k=0$ .

The case $k\in\mathbb{R}$ follows directly from the observation that $\mathfrak{G}(x,k,\varsigma) = \mathfrak{G}(x-k,0,\varsigma) \mathrm{e}^k$ . Finally, since $\partial_k d_+(x,k,\sigma) = - \partial_x d_+(x,k,\sigma)$ and $\partial_k^2 d_+(x,k,\sigma) = - \partial_x^2 d_+(x,k,\sigma)$ , the same simplifications occur if we take a partial derivative with respect to k instead of x.
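The structure exploited in this proof can be checked symbolically; the following sympy sketch (an illustration of ours, not part of the argument) confirms that the odd derivatives vanish at $x=\frac{\varsigma^2}{2}$ and that the shift identity reduces a general log-strike k to the case $k=0$ :

```python
import sympy as sp

x, k = sp.symbols('x k', real=True)
vs = sp.symbols('varsigma', positive=True)

# G written directly in terms of the total volatility varsigma
Gk = lambda xx, kk: sp.exp(xx - ((xx - kk) / vs + vs / 2)**2 / 2) / (vs * sp.sqrt(2 * sp.pi))

# odd derivatives vanish at x = varsigma^2/2; even ones scale as varsigma^{-(n+1)}
for n in range(1, 7):
    print(n, sp.simplify(sp.diff(Gk(x, 0), x, n).subs(x, vs**2 / 2)))

# shift identity reducing the case of general log-strike k to k = 0
print(sp.simplify(Gk(x, k) - sp.exp(k) * Gk(x - k, 0)))   # expected: 0
```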

6.2. Proofs of the main results

6.2.1. Proof of Theorem 1: level

To prove this result, we draw insights from the proofs of [Reference Alòs, García-Lorite and Muguruza2, Theorem 8] and [Reference Alòs and Shiraya5, Proposition 3.1]. By definition

$$\mathcal{I}_{T}= \mathrm{BS}^{\leftarrow}(0,\mathfrak{M}_0,\mathfrak{M}_0,\Pi_0) =: \overleftarrow{\mathrm{BS}}(\Pi_0),$$

and we write $\widetilde{\mathrm{BS}}(x) \,:\!=\, \mathrm{BS}(0,x,x,u_0)$ . Using Proposition 5 at time 0, we see that $\Pi_0=\Gamma_T$ , where

$$\Gamma_t \,:\!=\, \mathbb{E}\left[\widetilde{\mathrm{BS}}(\mathfrak{M}_0) + \frac{1}{2} \int_0^t \partial_{x} G(s,\mathfrak{M}_s,\mathfrak{M}_0,u_s)|\boldsymbol\Theta_{s}|\mathrm{d}s\right],\qquad\text{for }t\in\mathbb{T},$$

which is a deterministic function of t. The fundamental theorem of calculus then yields

\begin{align*}\mathcal{I}_{T}=\overleftarrow{\mathrm{BS}}(\Gamma_T)= \overleftarrow{\mathrm{BS}}(\Gamma_0) + \int_0^T \partial_{t} \overleftarrow{\mathrm{BS}}(\Gamma_t) \mathrm{d}t&= \overleftarrow{\mathrm{BS}}(\Gamma_0) + \int_0^T \overleftarrow{\mathrm{BS}}^{\prime}(\Gamma_t) \partial_{t} \Gamma_t \mathrm{d}t\nonumber \\[3pt] & = \overleftarrow{\mathrm{BS}}(\Gamma_0) + \frac{1}{2} \int_0^T \overleftarrow{\mathrm{BS}}^{\prime}(\Gamma_t) \, \mathbb{E}\big[|\boldsymbol\Theta_{t}|\partial_{x} G_t \big] \mathrm{d}t,\end{align*}

where $G_t\,:\!=\, G(t,\mathfrak{M}_t,\mathfrak{M}_0,u_t)$ . We can deal with the integral by computing $\overleftarrow{\mathrm{BS}}^{\prime}$ and $\partial_x G$ explicitly:

\begin{align*}& \overleftarrow{\mathrm{BS}}^{\prime}(\Gamma_t) = \left(\mathrm{e}^{\mathfrak{M}_0} \mathcal{N}^{\prime}\left(d_+\left(\mathfrak{M}_t,\mathfrak{M}_0,\overleftarrow{\mathrm{BS}}(\Gamma_t)\right)\right) \sqrt{T-t} \right)^{-1}, \\[3pt] & \partial_x G (s,x,k,\sigma) = \frac{\mathrm{e}^{x} \mathcal{N}^{\prime}\big(d_+(x,k,\sigma)\big)}{\sigma\sqrt{T-s}} \left( 1 -\frac{d_+(x,k,\sigma)}{\sigma \sqrt{T-s}}\right).\end{align*}

Since $\Gamma:\mathbb{R}_+\to\mathbb{R}$ and $\overleftarrow{\mathrm{BS}}:\mathbb{R}\to\mathbb{R}$ are continuous, the following is uniformly bounded for all $T\le1$ :

\[\frac{\mathcal{N}^{\prime} \big(d_+(\mathfrak{M}_s,\mathfrak{M}_0,u_s)\big)}{\mathcal{N}^{\prime}\left(d_+\left(\mathfrak{M}_s,\mathfrak{M}_0,\overleftarrow{\mathrm{BS}}(\Gamma_s) \right)\right)} = \exp\left\{\frac{1}{8}\left( (T-s) \,\overleftarrow{\mathrm{BS}}(\Gamma_s)^2 - \mathfrak{u}_s^2\right) \right\}.\]

Therefore, by $\boldsymbol{(\mathrm{H}_{4})}$ we obtain

\[\lim_{T\downarrow0} \mathbb{E}\left[\int_0^T \overleftarrow{\mathrm{BS}}^{\prime}(\Gamma_t) |\boldsymbol\Theta_{t}|\partial_{x} G_t \mathrm{d}t \right] = \lim_{T\downarrow0} \mathbb{E}\left[ \int_0^T \frac{\mathcal{N}^{\prime}\left(d_+\left(\mathfrak{M}_t,\mathfrak{M}_0,u_t\right) \right)}{\mathcal{N}^{\prime} \left(d_+\left(\mathfrak{M}_t,\mathfrak{M}_0,\overleftarrow{\mathrm{BS}}(\Gamma_t)\right) \right)} \frac{\lvert \boldsymbol\Theta_t\lvert}{2 \mathfrak{u}_t^2 \sqrt{T}} \mathrm{d}t\right] =0.\]

Since $\Gamma_0=\mathbb{E}\left[\widetilde{\mathrm{BS}}(\mathfrak{M}_0)\right]$ and $u_0=\overleftarrow{\mathrm{BS}}\left(\widetilde{\mathrm{BS}}(\mathfrak{M}_0)\right)$ , adding and subtracting $\mathbb{E}[u_0]$ yields

(17) \begin{align}\overleftarrow{\mathrm{BS}}(\Gamma_0)= \overleftarrow{\mathrm{BS}}\left(\mathbb{E}\left[\widetilde{\mathrm{BS}}(\mathfrak{M}_0)\right]\right)= \mathbb{E}\left[ \overleftarrow{\mathrm{BS}}\Big(\mathbb{E}\left[\widetilde{\mathrm{BS}}(\mathfrak{M}_0)\right] \Big) - \overleftarrow{\mathrm{BS}}\Big(\widetilde{\mathrm{BS}}(\mathfrak{M}_0) \Big)\right] + \mathbb{E}[u_0].\end{align}

The Clark–Ocone formula yields

$$\widetilde{\mathrm{BS}}(\mathfrak{M}_0) =\mathbb{E}\left[\widetilde{\mathrm{BS}}(\mathfrak{M}_0)\right]+\sum_{i=1}^N\int_0^T \mathbb{E}_s\left[\mathrm{D}^{i}_s \widetilde{\mathrm{BS}}(\mathfrak{M}_0)\right] \mathrm{d} W^i_s,$$

and by the gamma–vega–delta relation (12) we have

(18) \begin{align}\partial_\sigma \mathrm{BS} (0,x,x,\sigma) =\exp\left\{x-\frac{\sigma^2}{8}T\right\}\sqrt{\frac{T}{2\pi}},\end{align}

which in turn implies

(19) \begin{align}U^i_s \,:\!=\, \mathbb{E}_s\left[\mathrm{D}^{i}_s \widetilde{\mathrm{BS}}(\mathfrak{M}_0)\right]= \mathbb{E}_s \left[ \frac{\partial_{\sigma} \widetilde{\mathrm{BS}}(\mathfrak{M}_0)}{2 \mathfrak{u}_0 \sqrt T} \int_0^T \mathrm{D}^i_s {\left\|{\boldsymbol\phi_r}\right\|}^2 \mathrm{d}r \right] = \mathbb{E}_s \left[ \frac{\mathrm{e}^{\mathfrak{M}_0 - \frac{1}{8}\mathfrak{u}_0^2}}{2\sqrt{2\pi}\, \mathfrak{u}_0} \int_0^T \mathrm{D}^i_s {\left\|{\boldsymbol\phi_r}\right\|}^2 \mathrm{d}r \right].\end{align}

Define $\Lambda_r\,:\!=\,\mathbb{E}_r \Big[\widetilde{\mathrm{BS}}(\mathfrak{M}_0)\Big]$ , so that the difference we are interested in from (17) reads, after we apply the standard Itô’s formula,

(20) \begin{align}\overleftarrow{\mathrm{BS}}(\Gamma_0)-\mathbb{E}[u_0] & = \mathbb{E}\Big[ \overleftarrow{\mathrm{BS}} ( \Lambda_0 ) - \overleftarrow{\mathrm{BS}} (\Lambda_T ) \Big] \nonumber\\[3pt] & = -\sum_{i=1}^N \mathbb{E}\left[\int_0^T \overleftarrow{\mathrm{BS}}^{\prime}(\Lambda_s) U^i_s \mathrm{d} W^i_s+\frac{1}{2}\int_0^T\overleftarrow{\mathrm{BS}}^{\prime\prime}(\Lambda_s) (U^i_s)^2 \mathrm{d}s\right].\end{align}

The stochastic integral above has zero expectation by the same argument as used for [Reference Alòs and Shiraya5, Proposition 3.1]. Moreover, $\boldsymbol{(\mathrm{H}}_{\boldsymbol{5}}\boldsymbol{)}$ states that $\mathfrak{u}_0$ is dominated almost surely by $Z\in L^p$ , and therefore so are $\Lambda$ and

\[\overleftarrow{\mathrm{BS}}^{\prime\prime}(\Lambda_s) = \frac{\overleftarrow{\mathrm{BS}}(\Lambda_s)}{4 \mathrm{e}^{2\mathfrak{M}_s} \mathcal{N}^{\prime}\Big(d_+\big(\mathfrak{M}_s,\mathfrak{M}_0,\overleftarrow{\mathrm{BS}}(\Lambda_s)\big)\Big)^2},\]

by continuity. Plugging in the expression for $U^i$ from (19), we apply $\boldsymbol{(\mathrm{H}}_{\boldsymbol{5}}\boldsymbol{)}$ to conclude that the second integral of (20) tends to zero.

6.2.2. Proof of Theorem 2: skew

This proof follows from arguments similar to those of [Reference Alòs, León and Vives4, Proposition 5.1]. We recall that $\Pi_0(k)= \mathrm{BS}\big(0,\mathfrak{M}_0,k,\mathcal{I}_{T}(k)\big)$ . On the one hand, by the chain rule we have

(21) \begin{align}\partial_{k} \Pi_0(k) = \partial_{k}\mathrm{BS}\big(0,\mathfrak{M}_0,k,\mathcal{I}_{T}(k)\big)+\partial_{\sigma}\mathrm{BS}\big(0,\mathfrak{M}_0,k,\mathcal{I}_{T}(k)\big) \partial_{k} \mathcal{I}_{T}(k).\end{align}

On the other hand, the decomposition obtained in Proposition 5 yields

(22) \begin{align}\partial_{k} \Pi_0(k)= \mathbb{E}\big[\partial_{k}\mathrm{BS}(0,\mathfrak{M}_0,k,u_{0})\big]+\mathbb{E}\left[ \int_{0}^{T} \frac{1}{2} \partial_{xk} G(s,\mathfrak{M}_s,\mathfrak{M}_0,u_s)|\boldsymbol\Theta_{s}|\mathrm{d}s\right].\end{align}

Equating (21) and (22) gives

(23) \begin{align}\partial_{k} \mathcal{I}_{T}(k) & = \frac{\mathbb{E}\left[\partial_{k} \mathrm{BS}(0,\mathfrak{M}_0,k,u_{0})\right] - \partial_{k} \mathrm{BS}(0,\mathfrak{M}_0,k,\mathcal{I}_{T}(k))}{\partial_{\sigma} \mathrm{BS}(0,\mathfrak{M}_0,k,\mathcal{I}_{T}(k))} \nonumber \\[3pt] & \qquad +\frac{\mathbb{E}\left[ \int_{0}^{T}\partial_{xk} G(s,\mathfrak{M}_s,\mathfrak{M}_0,u_s)|\boldsymbol\Theta_{s}|\mathrm{d}s\right]}{2\partial_{\sigma}\mathrm{BS}(0,\mathfrak{M}_0,k,\mathcal{I}_{T}(k))},\end{align}

which in particular also holds for $k=\mathfrak{M}_0$ . Performing simple algebraic manipulations and using the derivatives of the Black–Scholes function ATM as in [Reference Alòs, León and Vives4, Proposition 5.1], we find the following (remember we drop the k-dependence in $\mathcal{I}_{T}$ when ATM):

$$\mathbb{E}\big[\partial_{k}\mathrm{BS}(0,\mathfrak{M}_0,\mathfrak{M}_0,u_{0}) \big]- \partial_{k}\mathrm{BS}(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T})= \frac{1}{2} \mathbb{E}\left[\int_0^T \frac{1}{2}\partial_{x} G(s,\mathfrak{M}_s,\mathfrak{M}_0,u_s)|\boldsymbol\Theta_{s}|\mathrm{d}s\right].$$

By (23), this in turn yields

(24) \begin{align}\partial_{k} \mathcal{I}_{T}= \frac{\mathbb{E}\left[ \int_{0}^{T} L(s,\mathfrak{M}_s,\mathfrak{M}_0,u_s)|\boldsymbol\Theta_{s}|\mathrm{d}s\right]}{\partial_{\sigma}\mathrm{BS}\Big(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T} \Big) },\end{align}

where $L\,:\!=\,(\frac{1}{2}+\partial_{k})\frac{1}{2}\partial_{x} G$ . We write $L_s\,:\!=\,L(s,\mathfrak{M}_s,\mathfrak{M}_0,u_s)$ for simplicity and apply Lemma 2 to $L_s\, \int_s^T |\boldsymbol\Theta_r|\mathrm{d}r$ , which yields

\begin{align*}\mathbb{E} \left[ \int_0^T L_s |\boldsymbol\Theta_s|\mathrm{d}s \right]& = \mathbb{E} \left[L_0 \int_0^T|\boldsymbol\Theta_s| \mathrm{d}s \right]+ \mathbb{E} \left[ \int_0^T (\partial_{x}^3 - \partial_{x}^2) L_s |\boldsymbol\Theta_s| \left( \int_s^T |\boldsymbol\Theta_r|\mathrm{d}r \right) \mathrm{d}s \right] \\[3pt] & + \mathbb{E} \left[ \int_0^T \partial_{x} L_s \sum_{j=1}^N \left\{ \phi^j_s \, \mathrm{D}^j_s \left( \int_s^T |\boldsymbol\Theta_r|\mathrm{d}r \right) \right\} \mathrm{d}s \right]=: R_1 + R_2 + R_3.\end{align*}

We combine (18) with the bound $\partial_k \partial_x^n G(t,x,k,\sigma) \le C \big(\sigma\sqrt{T-t}\big)^{-n-2}$ from Corollary 2 to obtain

\begin{align*}\frac{R_2}{\partial_\sigma \mathrm{BS}(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T})}&\le \frac{C}{\sqrt{T}}\mathbb{E}\left[\int_0^T \frac{|\boldsymbol\Theta_s|}{\mathfrak{u}_s^{6}} \left( \int_s^T |\boldsymbol\Theta_r|\mathrm{d}r \right) \mathrm{d}s\right], \\[3pt] \frac{R_3}{\partial_\sigma \mathrm{BS}(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T})}&\le\frac{C}{\sqrt{T}} \mathbb{E}\left[\int_0^T \frac{1}{\mathfrak{u}_s^{4}} \sum_{j=1}^N \left\{ \phi^j_s \mathrm{D}^j_s \left( \int_s^T |\boldsymbol\Theta_r|\mathrm{d}r \right) \right\} \mathrm{d}s\right],\end{align*}

and both converge to zero by $\boldsymbol{(\mathrm{H}}_{\boldsymbol{6}}^{\lambda}\boldsymbol{)}$ . We are left with $R_1$ . From Section 6.6, we have

$$L(0,x,x,u) = \frac{\exp\{x-\frac{u^2}{8}T\}}{u\sqrt{2\pi T}}\left( \frac14 + \frac{1}{2u^2 T} \right),$$

and therefore by (18),

$$\frac{L_0}{\partial_\sigma \mathrm{BS}(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T})}= \left( \frac14 + \frac{1}{2u_0^2 T} \right) \frac{1}{u_0 T} \exp\left\{ -\frac{T}{8} \Big(u_0^2 - \mathcal{I}_{T}^2 \Big)\right\}.$$

This yields

\[\displaystyle \frac{R_1}{T^{\lambda}\partial_\sigma \mathrm{BS}(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T})}= \mathbb{E}\left[ \left(\frac{\mathfrak{u}_0^2}{2}+1\right) \exp\left\{ -\frac{T}{8} \Big(u_0^2 - \mathcal{I}_{T}^2 \Big)\right\} \mathfrak{K}_T\right],\]

where $\displaystyle\mathfrak{K}_T\,:\!=\,\frac{\int_0^T |\boldsymbol\Theta_s|\mathrm{d}s}{2 T^{\frac{1}{2}+\lambda} \mathfrak{u}_0^3}$ . Furthermore,

\[\sup_{\omega\in\Omega} \left|\exp\left\{ -\frac{T}{8} \Big(u_0^2 - \mathcal{I}_{T}^2 \Big)\right\} -1\right|= \sup_{\omega\in\Omega} \left| \exp\left\{ \frac18 \left(T \mathcal{I}_{T}^2 - \mathfrak{u}_0^2 \right)\right\} -1\right|\]

not only is finite but converges to zero as T goes to zero. Hence,

\[\lim_{T\downarrow0} \mathbb{E}\left[ \left(\frac{\mathfrak{u}_0^2}{2}+1\right) \exp\left\{ -\frac{T}{8} \Big(u_0^2 - \mathcal{I}_{T}^2 \Big)\right\} \mathfrak{K}_T\right] = \lim_{T\downarrow0} \mathbb{E}\left[ \left(\frac{\mathfrak{u}_0^2}{2}+1\right) \mathfrak{K}_T\right].\]

We can finally conclude by $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ that

\[\displaystyle \lim_{T\downarrow 0} \frac{R_1}{T^{\lambda}\partial_\sigma \mathrm{BS}(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T})}= \lim_{T\downarrow 0} \mathbb{E}[\mathfrak{K}_T],\]

which has a finite limit.
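The closed-form expression for $L(0,x,x,u)$ quoted from Section 6.6, on which the computation of $R_1$ rests, can be reproduced symbolically. A minimal sympy sketch of ours (illustrative only):

```python
import sympy as sp

x, k = sp.symbols('x k', real=True)
u, T = sp.symbols('u T', positive=True)

N = lambda z: (1 + sp.erf(z / sp.sqrt(2))) / 2
vs = u * sp.sqrt(T)
dp = (x - k) / vs + vs / 2
BS = sp.exp(x) * N(dp) - sp.exp(k) * N(dp - vs)   # undiscounted call at t = 0

G = sp.diff(BS, x, 2) - sp.diff(BS, x)
L = (sp.diff(G, x) / 2 + sp.diff(G, x, k)) / 2    # L = (1/2 + d/dk)(1/2) dG/dx

L_atm = L.subs(k, x)                              # at the money
target = sp.exp(x - u**2 * T / 8) / (u * sp.sqrt(2 * sp.pi * T)) \
         * (sp.Rational(1, 4) + 1 / (2 * u**2 * T))
print(sp.simplify(L_atm - target))                # expected: 0
```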

6.2.3. Proof of Theorem 3: curvature

Step 1. Let us start by simply taking a second derivative with respect to k (we write $\mathrm{BS}(\mathfrak{M}_0,\mathcal{I}_{T}(k))$ as short for $\mathrm{BS}(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T}(k))$ ):

\begin{align*}\partial_{k} \Big( \partial_{\sigma}\mathrm{BS}\big(\mathfrak{M}_0,\mathcal{I}_{T}(k)\big) \,\partial_{k} \mathcal{I}_{T}(k) \Big) & = \partial_{\sigma}\mathrm{BS}\big(\mathfrak{M}_0,\mathcal{I}_{T}(k)\big) \partial_{k}^2 \mathcal{I}_{T}(k) \\[3pt] &\quad+\Big[ \partial_{k\sigma}\mathrm{BS}\big(\mathfrak{M}_0, \mathcal{I}_{T}(k)\big)+ \partial_{\sigma}^2\mathrm{BS}\big(\mathfrak{M}_0,\mathcal{I}_{T}(k)\big) \partial_{k} \mathcal{I}_{T}(k) \Big] \partial_{k} \mathcal{I}_{T}(k).\end{align*}

Taking the derivative with respect to k in (24) and equating with the above formula yields

\begin{align*}\partial_{k}^2 \mathcal{I}_{T}(k) \partial_{\sigma}\mathrm{BS}(\mathfrak{M}_0,\mathcal{I}_{T}) & =- \partial_{\sigma}^2\mathrm{BS}\big(\mathfrak{M}_0, \mathcal{I}_{T}(k)\big) \partial_{k} \mathcal{I}_{T}(k)^2 - \partial_{k\sigma}\mathrm{BS}\big(\mathfrak{M}_0, \mathcal{I}_{T}(k)\big) \partial_{k} \mathcal{I}_{T}(k)\\[3pt] &\quad + \mathbb{E} \left[ \int_0^T \partial_{k} L(s,\mathfrak{M}_s,\mathfrak{M}_0,u_{s} ) |\boldsymbol\Theta_s| \mathrm{d}s \right]=: T_1 + T_2 + T_3.\end{align*}

A similar expression is presented in [Reference Alòs and León3], and we notice that $T_1$ and $T_2$ in the expression above, after being multiplied by $T^{-\lambda}$ , are identical to those from [Reference Alòs and León3, Equation (25)] and can therefore be dealt with in the same way: Step 1 of that proof shows that $T^{-\lambda} T_1$ tends to zero as $T\downarrow 0$ , while Step 2 there yields $T_2= - \frac{1}{2} \partial_{k} \mathcal{I}_{T}(k)$ .

Step 2. Recall that $L = \frac{1}{2} \left(\frac{1}{2} + \partial_{k} \right) \partial_{x} G$ . We apply the anticipative Itô formula (Lemma 2) twice to $T_3$ . Indeed, even though the bound on $\partial_{x}^n G$ worsens as n increases, it is more than compensated for by the additional integrations. The terms with more integrals (i.e. more regularity) tend to zero as T goes to zero, by $\boldsymbol{(\mathrm{H}}_{\boldsymbol{8}}^{\gamma}\boldsymbol{)}$ , and we compute the others in closed form. For clarity we write $L_s = L (s,\mathfrak{M}_s,\mathfrak{M}_0,u_s )$ for all $s\ge0$ . By a first application of Lemma 2 on $\partial_{k} L_s \int_s^T |\boldsymbol\Theta_r| \mathrm{d}r$ we obtain

\begin{align*}T_3= & \, \mathbb{E} \left[\partial_{k} L_0 \int_0^T |\boldsymbol\Theta_s| \mathrm{d}s\right]+ \mathbb{E} \left[ \int_0^T (\partial_{x}^3 - \partial_{x}^2) \partial_{k} L_s |\boldsymbol\Theta_s| \left( \int_s^T |\boldsymbol\Theta_r|\mathrm{d}r \right) \mathrm{d}s \right] \\[3pt] & + \mathbb{E} \left[ \int_0^T \partial_{xk} L_s \sum_{j=1}^N \left\{ \phi^j_s \, \mathrm{D}^j_s \left( \int_s^T |\boldsymbol\Theta_r| \mathrm{d}r \right) \right\} \mathrm{d}s \right]=: S_1 + S_2 + S_3.\end{align*}

To deal with $S_2$ , we apply Lemma 2 again on $(\partial_{x}^3 - \partial_{x}^2) \partial_{k} L_s \int_s^T |\boldsymbol\Theta_r|\left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \mathrm{d}r =: H_s Z_s$ , which yields

\[S_2= \mathbb{E} \left[ H_0 Z_0+ \int_0^T (\partial_{x}^3 - \partial_{x}^2) H_s |\boldsymbol\Theta_s| Z_s \mathrm{d}s+ \int_0^T \partial_{x} H_s \sum_{j=1}^N \left( \phi^j_s \mathrm{D}^j_s Z_s \right) \mathrm{d}s\right]=: S_2^a + S_2^b + S_2^c.\]

We will deal with these terms in the last step. For $S_3$ , we apply Lemma 2 once more to

$$\partial_{xk} L_s \int_s^T \sum_{j=1}^N \left\{ \phi^j_r \, \mathrm{D}^j_r \left( \int_r^T |\boldsymbol\Theta_{y}| \mathrm{d}y \right) \right\} \mathrm{d}r=: \widetilde{H}_s \widetilde{Z}_s,$$

and obtain

$$S_3= \mathbb{E} \left[ \widetilde{H}_0 \widetilde{Z}_0+ \int_0^T (\partial_{x}^3 - \partial_{x}^2) \widetilde{H}_s |\boldsymbol\Theta_s| \widetilde{Z}_s \mathrm{d}s+ \int_0^T \partial_{x} \widetilde{H}_s \sum_{j=1}^N \left( \phi^j_s \mathrm{D}^j_s \widetilde{Z}_s \right) \mathrm{d}s \right]=: S_3^a + S_3^b + S_3^c.$$

Step 3. We now evaluate the derivative at $k=\mathfrak{M}_0$ and drop the k-dependence. To summarise,

$$\partial_{k}^2 \mathcal{I}_{T} = \frac{T_1 + T_2 + S_1 + S_2^a + S_2^b + S_2^c + S_3^a + S_3^b + S_3^c }{\partial_\sigma \mathrm{BS} \big(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T}\big)},$$

where

\begin{align*}S_1 & = \mathbb{E} \left[\partial_{k} L_0 \int_0^T |\boldsymbol\Theta_s| \mathrm{d}s \right],\\[3pt] S_2^a & = \mathbb{E} \left[H_0 \int_0^T |\boldsymbol\Theta_r| \left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \mathrm{d}r\right], \\[3pt] S_3^a &= \mathbb{E} \left[\widetilde{H}_0 \int_0^T \sum_{j=1}^N \left\{ \phi^j_r \, \mathrm{D}^j_r \left( \int_r^T |\boldsymbol\Theta_{y}| \mathrm{d}y \right) \right\} \mathrm{d}r\right], \\[3pt] S_2^b &= \mathbb{E}\left[\int_0^T (\partial_{x}^3 - \partial_{x}^2) H_s |\boldsymbol\Theta_s|\left( \int_s^T \lvert \boldsymbol\Theta_r\rvert \left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \mathrm{d}r \right) \mathrm{d}s\right], \\[3pt] S_2^c &= \mathbb{E} \left[\int_0^T \partial_{x} H_s \sum_{j=1}^N \left\{ \phi^j_s \mathrm{D}^j_s \left( \int_s^T |\boldsymbol\Theta_r| \left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \mathrm{d}r \right) \right\} \mathrm{d}s\right],\\[3pt] S_3^b &= \mathbb{E} \left[\int_0^T (\partial_{x}^3 - \partial_{x}^2) \widetilde{H}_s |\boldsymbol\Theta_s|\int_s^T \sum_{j=1}^N \left\{ \phi^j_r \, \mathrm{D}^j_r \left( \int_r^T |\boldsymbol\Theta_{y}| \mathrm{d}y \right) \right\} \mathrm{d}r \mathrm{d}s\right], \\[3pt] S_3^c &= \mathbb{E} \left[\int_0^T \partial_{x} \widetilde{H}_s \sum_{k=1}^N \left\{ \phi^k_s \mathrm{D}^k_s \left( \int_s^T \sum_{j=1}^N \left\{ \phi^j_r \, \mathrm{D}^j_r \left( \int_r^T |\boldsymbol\Theta_{y}| \mathrm{d}y \right) \right\} \mathrm{d}r \right) \right\} \mathrm{d}s\right].\end{align*}

We recall once again the bound $\partial_x^n G(t,x,k,\sigma) \le C \big(\sigma\sqrt{T-t}\big)^{-n-1}$ as $T-t$ goes to zero. We observe that H and $\widetilde H$ consist of derivatives of G up to the sixth and the fourth order, respectively; therefore $S_2^b, S_2^c,S_3^b,S_3^c$ tend to zero by $\boldsymbol{(\mathrm{H}}_{\boldsymbol{8}}^{\gamma}\boldsymbol{)}$ . In order to deal with $S_1$ , $S_2^a$ , and $S_3^a$ , we use the explicit partial derivatives from Section 6.6 and (18); as in the proof of Theorem 2, $\boldsymbol{(\mathrm{H}}_{\boldsymbol{9}}^{\gamma}\boldsymbol{)}$ implies that only the terms carrying the highest power of $\mathfrak{u}_0^{-1}$ remain in the limit:

\begin{align*}\lim_{T\downarrow 0} \frac{S_1}{T^{\lambda}\partial_\sigma \mathrm{BS}\big(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T}\big)}& = \lim_{T\downarrow 0} \frac{1}{T^{\lambda}}\frac{\mathbb{E}\left[ \partial_{k} L_0 \int_0^T |\boldsymbol\Theta_s| \mathrm{d}s \right]}{\partial_\sigma \mathrm{BS}\big(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T}\big)}\\[3pt] & = \lim_{T\downarrow 0} \frac{1}{T^{\lambda}}\mathbb{E} \left[ \frac{1}{u_0 T} \left( \frac18 + \frac{1}{2 u_0^2 T}\right) \int_0^T |\boldsymbol\Theta_s| \mathrm{d}s \right] \\[3pt] & = \lim_{T\downarrow 0} \mathbb{E} \left[ \frac{1}{2 \mathfrak{u}_0^3 \,T^{\frac{1}{2}+\lambda}} \int_0^T |\boldsymbol\Theta_s| \mathrm{d}s \right]= \lim_{T\downarrow 0} \frac{\mathcal{S}_T}{T^{\lambda}},\\[3pt] \lim_{T\downarrow 0} \frac{S^a_2}{T^{\lambda}\partial_\sigma \mathrm{BS}\big(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T}\big)}& = \lim_{T\downarrow 0} \frac{1}{T^{\lambda}} \mathbb{E} \left[ \frac{Z_0}{\mathfrak{u}_{0} \sqrt{T}} \left({-} \frac{15}{2\mathfrak{u}_0^6} - \frac{3}{2\mathfrak{u}_0^4} - \frac{5}{32\mathfrak{u}_0^2} - \frac{1}{64} \right) \right]\\[3pt] & = \lim_{T\downarrow 0} \mathbb{E} \left[ \frac{-15}{2\mathfrak{u}_0^7 T^{\frac{1}{2}+\lambda}} \int_0^T |\boldsymbol\Theta_r|\left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \mathrm{d}r\right],\\[3pt] \lim_{T\downarrow 0} \frac{S^a_3 }{T^{\lambda}\partial_\sigma \mathrm{BS}\big(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T}\big)}& = \lim_{T\downarrow 0} \frac{1}{T^{\lambda}}\mathbb{E} \left[ \frac{\widetilde{Z}_0}{\mathfrak{u}_0 \sqrt T} \left( \frac{3}{2 \mathfrak{u}_0^4} + \frac{3}{8\mathfrak{u}_0^2} + \frac{1}{16} \right) \right]\\[3pt] & = \lim_{T\downarrow 0} \frac32 \mathbb{E} \left[ \frac{1}{\mathfrak{u}_0^5 T^{\frac{1}{2}+\lambda}} \int_0^T \sum_{j=1}^N \left\{ \phi^j_r \, \mathrm{D}^j_r \left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \right\} \mathrm{d}r \right].\end{align*}

Hence, to conclude, the claim follows from

\[\lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{\lambda}}= \lim_{T\downarrow 0} \frac{T_2 + S_1 + S_2^a + S_3^a}{T^{\lambda} \partial_\sigma \mathrm{BS}(\mathfrak{M}_0,\mathcal{I}_{T})}.\]

6.3. Proof of Proposition 1: VIX asymptotics

In this section, we will repeatedly interchange the Malliavin derivative and conditional expectation, which is justified by [Reference Nualart42, Proposition 1.2.8].

Proposition 7. In the case where $A=\mathrm{VIX}$ , the conditions in $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ imply the assumptions $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda\gamma}\boldsymbol{)}$ for any $\lambda\in({-}\frac{1}{2},0]$ and $\gamma\in({-}1, 3H-\frac{1}{2}]$ .

Proof. We write $a\lesssim b$ when there exists $X\in L^p$ such that $a\le Xb$ almost surely, and $a\approx b$ if $a\lesssim b$ and $b\lesssim a$ . The assumption $\boldsymbol{(\mathrm{H}_{1})}$ is given by the first item of $\boldsymbol{(\mathrm{C}_{2})}$ , and $\boldsymbol{(\mathrm{H}_{2})}$ corresponds to $\boldsymbol{(\mathrm{C}_{1})}$ . Since $1/M$ is dominated, so is $1/\mathrm{VIX}$ . We then have, for $i=1,2$ and by Cauchy–Schwarz,

\[ m^i_y = \mathbb{E}_y\left[\frac{\int_T^{T+\Delta} \mathrm{D}^i_y v_r \mathrm{d}r}{2\Delta \mathrm{VIX}_T}\right] \lesssim \int_T^{T+\Delta} (r-y)^{H_{-}} \mathrm{d}r = \frac{(T+\Delta-y)^{H_{+}} - (T-y)^{H_{+}}}{H_{+}}. \]

If $H \lt \frac{1}{2}$ , then the incremental function $x\mapsto (x+\Delta)^{H_{+}} -x^{H_{+}}$ is decreasing by concavity (its derivative $H_+\big[(x+\Delta)^{H_+-1}-x^{H_+-1}\big]$ is negative since $H_+ \lt 1$ ). For $j=1,2$ and $t\le s$ , this implies by domination of $1/M$ that $\phi^i\approx m^i$ is also dominated and

(25) \begin{align} \mathrm{D}^j_s \phi_y^i &= \frac{\mathrm{D}^j_s m^i_y}{M_y} - \frac{m^i_y \mathrm{D}^j_s M_y}{M_y^2} \nonumber \\[3pt] &\lesssim\int_T^{T+\Delta} (r-y)^{H_{-}} (r-s)^{H_{-}} \mathrm{d}r + \int_T^{T+\Delta} (r-y)^{H_{-}} \mathrm{d}r \int_T^{T+\Delta} (r-s)^{H_{-}} \mathrm{d}r \nonumber\\[3pt] &\le \frac{\Delta^{2H}}{2H} + \frac{\Delta^{2H+1}}{H_+^2}. \end{align}

Combining these two estimates, we obtain

\[ \Theta_s^j =2 \phi_s^j \, \int_s^T \left(\sum_{i=1}^N \phi_y^i \mathrm{D}_s^j\phi^i_y \right) \mathrm{d}y \lesssim T-s. \]

It is clear by now that indices and sums do not influence the estimates, so we informally drop them for more clarity and continue with the higher derivatives:

\[ \mathrm{D}_t \Theta_s =\mathrm{D}_t \phi_s \int_s^T \phi_r \mathrm{D}_s\phi_r \mathrm{d}r + \phi_s \int_s^T \mathrm{D}_t \phi_r \mathrm{D}_s \phi_r \mathrm{d}r + \phi_s \int_s^T \phi_r \mathrm{D}_t\mathrm{D}_s \phi_r\mathrm{d}r, \]

where the first and second terms behave like $T-s$ . For $t\le s\le y\le T$ , we deduce from (25) that $\mathrm{D}_t\mathrm{D}_s \phi_y$ consists of five terms, of which four are bounded by constants (hence contribute terms of order $T-s$ after integration), and only one features three derivatives:

\begin{align*} \mathrm{D}_t\mathrm{D}_s m_y \lesssim \int_T^{T+\Delta} (r-s)^{H_{-}} (r-t)^{H_{-}} (r-y)^{H_{-}}\mathrm{d}r &\le \int_T^{T+\Delta} (r-y)^{3H-\frac{3}{2}} \mathrm{d}r\\[3pt] &\approx (T+\Delta-y)^{3H-\frac{1}{2}} -(T-y)^{3H-\frac{1}{2}}. \end{align*}

If $H\ge \frac{1}{6}$ , then concavity implies $\mathrm{D}_t \Theta_s \lesssim (T-s) $ . Otherwise, if $H \lt \frac{1}{6}$ , then

\[ \mathrm{D}_t \Theta_s \lesssim (T-s) + \Big[(T+\Delta-s)^{3H+\frac{1}{2}}-\Delta^{3H+\frac{1}{2}}\Big] + (T-s)^{3H+\frac{1}{2}}\le (T-s) + 2(T-s)^{3H+\frac{1}{2}}. \]

In the second derivative of $\Theta$ , the first and second terms are bounded by $(T-s)$ and by $(T-s)+(T-s)^{(3H+\frac{1}{2})\wedge 1}$ (the estimate just obtained for $\mathrm{D}_t\Theta_s$ ), respectively; hence we focus on $\int_s^T \mathrm{D}_w \mathrm{D}_t \mathrm{D}_s \phi_y \mathrm{d}y$ , where the new term is

\[ \mathrm{D}_w \mathrm{D}_t \mathrm{D}_s m_y \approx \int_T^{T+\Delta}\! (r-w)^{H_{-}} (r\!-\!s)^{H_{-}} (r\!-\!t)^{H_{-}} (r\!-\!y)^{H_{-}}\mathrm{d}r \lesssim (T+\Delta-y)^{4H-1}\!-\!(T\!-\!y)^{4H-1}. \]

If $H\ge \frac{1}{4}$ , then $\mathrm{D}_w \mathrm{D}_t \Theta_s \lesssim (T-s)$ by concavity. Otherwise, when $H \lt \frac{1}{4}$ ,

\begin{align*} \mathrm{D}_w \mathrm{D}_t \Theta_s &\lesssim (T-s) +(T-s)^{(3H+\frac{1}{2})\wedge 1} + \Big[(T+\Delta-s)^{4H}-\Delta^{4H}\Big]+ (T-s)^{4H}\\[3pt] &\le (T-s) + (T-s)^{(3H+\frac{1}{2})\wedge 1}+ 2 (T-s)^{4H}, \end{align*}

where the last inequality holds by yet again the same concavity argument.

This yields a rule for checking that the quantities in our assumptions indeed converge. We summarise the above estimates in the case $H\le\frac{1}{2}$ : there exists $Z\in L^p$ such that for $s\le T$ and T small enough,

\begin{align*} \boldsymbol\Theta_s \le Z (T-s), \qquad \mathrm{D} \boldsymbol\Theta_s \le Z (T-s)^{(3H+\frac{1}{2})\wedge1}, \qquad \mathrm{D} \mathrm{D} \boldsymbol\Theta_s \le Z (T-s)^{(4H)\wedge1} \end{align*}

hold almost surely. Thanks to the Cauchy–Schwarz inequality we can disentangle the numerators (integrals and derivatives of $\boldsymbol\Theta$ ) and denominators (powers of u) of the assumptions, which are both uniformly bounded in $L^p$ . We can easily deduce that $\boldsymbol{(\mathrm{H}_{3})}$ , $\boldsymbol{(\mathrm{H}_{4})}$ , $\boldsymbol{(\mathrm{H}}_{\boldsymbol{5}}\boldsymbol{)}$ , $\boldsymbol{(\mathrm{H}}_{\boldsymbol{6}}^{\lambda}\boldsymbol{)}$ , $\boldsymbol{(\mathrm{H}}_{\boldsymbol{8}}^{\gamma}\boldsymbol{)}$ are satisfied (convergence to zero). In $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ , $\mathbb{E}[\mathfrak{K}_T]$ behaves as $T^{-\lambda}$ , so it converges for any $\lambda\in({-}\frac{1}{2},0]$ , and the uniform $L^2$ bound is satisfied thanks to $\boldsymbol{(\mathrm{C}_{3})}$ . Moreover, in the limit the first term in $\boldsymbol{(\mathrm{H}}_{\boldsymbol{9}}^{\gamma}\boldsymbol{)}$ behaves as $T^{-\gamma}$ and the second behaves as $T^{3H-\frac{1}{2}-\gamma}$ ; therefore both assumptions are satisfied for any $\lambda\in({-}\frac{1}{2},0]$ and $\gamma\in({-}1, 3H-\frac{1}{2}]$ . Similarly, $\boldsymbol{(\mathrm{C}_{3})}$ ensures the uniform $L^2$ bounds.
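For instance, the behaviour claimed for $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ follows in one line from the summary estimates (a worked step of ours, recalling $\mathfrak{u}_0 = u_0\sqrt{T}$ and using the uniform $L^p$ control of $\mathfrak{u}_0^{-1}$ ):

\[ \mathbb{E}[\mathfrak{K}_T] = \mathbb{E}\left[\frac{\int_0^T|\boldsymbol\Theta_s|\,\mathrm{d}s}{2 T^{\frac{1}{2}+\lambda}\, \mathfrak{u}_0^3}\right] \lesssim \frac{\int_0^T (T-s)\,\mathrm{d}s}{T^{\frac{1}{2}+\lambda}\, T^{\frac{3}{2}}} = \frac{T^{-\lambda}}{2}, \]

which indeed behaves as $T^{-\lambda}$ and hence converges for every $\lambda\in({-}\frac{1}{2},0]$ .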

6.3.1. Convergence lemmas

We require some preliminaries before we dive into the computations. We present three versions of integral convergence tailored to our purposes, which are essential for computing the limits in Theorems 1, 2, and 3. The conditions they require hold thanks to the continuity of $\boldsymbol{(\mathrm{C}_{4})}$ . Recall the local Taylor theorem: if a function $g({\cdot})$ is continuous on $[0, \delta]$ for some $\delta \gt 0$ , then there exists a continuous function $\varepsilon({\cdot})$ on $[0, \delta]$ with $\lim_{x\downarrow 0}\varepsilon(x)=0$ such that $g(x) = g(0) + \varepsilon(x)$ for any $x \in [0, \delta]$ .

Lemma 3. If $f\,:\,\mathbb{R}_+^2\to\mathbb{R}$ is such that $f(T,\cdot)$ is continuous on $[0,\delta_0]$ for some $\delta_0 \gt 0$ and $\lim\limits_{T\downarrow0}f(T,0)=f(0,0)$ , then

(26) \begin{align} \lim_{T\downarrow0}\frac1T\int_0^T f(T,y)\,\mathrm{d}y =f(0,0). \end{align}

Proof. For $T \lt \delta_0$ , we can write

\begin{align*} \frac1T \int_0^T f(T,y) \mathrm{d}y = \frac1T \int_0^T [f(T,0) + \varepsilon_0(y) ] \mathrm{d}y = f(T,0) + \frac1T \int_0^T\varepsilon_0(y) \mathrm{d}y, \end{align*}

where the function $\varepsilon_0$ is continuous on $[0,\delta_0]$ and converges to zero at the origin. Hence, for any $\eta_0 \gt 0$ , there exists $\widetilde \delta_0 \gt 0$ such that, for any $y\le\widetilde \delta_0$ , $|\varepsilon_0(y)| \lt \eta_0$ . For all $T \lt \widetilde \delta_0 \wedge \delta_0$ ,

$$ \bigg\lvert\frac1T \int_0^T \varepsilon_0(y) \mathrm{d}y \bigg\lvert \le \eta_0. $$

Since $\eta_0$ can be taken as small as desired, the fact that $\lim_{T\downarrow0} f(T,0)=f(0,0)$ concludes the proof.

Lemma 4. Let $f\,:\,\mathbb{R}^3_+\to\mathbb{R}$ be such that, for each $y\le T$ , $f(T,y,\cdot)$ is continuous on $[0,\delta_0]$ with $\delta_0 \gt 0$ , $f(T,\cdot,0)$ is continuous on $[0,\delta_1]$ with $\delta_1 \gt 0$ , and $\lim_{T\downarrow0}f(T,0,0)=f(0,0,0)$ . Then

(27) \begin{align} \lim_{T\downarrow0} \frac{1}{T^2} \int_0^T \int_0^y f(T,y,s) \mathrm{d}s \mathrm{d}y = \frac{f(0,0,0)}{2}. \end{align}

Proof. For $T \lt \delta_0\wedge\delta_1$ , we can write

\begin{align*} \frac{1}{T^2}\int_{0}^{T}\left\{\int_{0}^{y}f(T, y, s)\mathrm{d} s\right\}\mathrm{d} y & = \frac{1}{T^2}\int_{0}^{T}\left\{\int_{0}^{y}\left[f(T, y, 0) + \varepsilon_{0}(s)\right]\mathrm{d} s\right\}\mathrm{d} y\\[3pt] & = \frac{1}{T^2}\int_{0}^{T}\left\{f(T, y, 0)y + \int_{0}^{y}\varepsilon_{0}(s)\mathrm{d} s\right\}\mathrm{d} y\\[3pt] & = \frac{1}{T^2}\int_{0}^{T}\left\{\Big(f(T, 0, 0)+\varepsilon_{1}(y)\Big)y + \int_{0}^{y}\varepsilon_{0}(s)\mathrm{d} s\right\}\mathrm{d} y\\[3pt] & = \frac{f(T,0,0)}{2} + \frac{1}{T^2}\int_{0}^{T}\left\{\varepsilon_{1}(y)y + \int_{0}^{y}\varepsilon_{0}(s)\mathrm{d} s\right\}\mathrm{d} y, \end{align*}

where $\varepsilon_{1}({\cdot})$ is continuous on $[0,\delta_1]$ and $\varepsilon_{0}({\cdot})$ is continuous on $[0,\delta_0]$ , and both are null at the origin. For any $\eta_{1} \gt 0$ , there exists $\widetilde{\delta}_{1} \gt 0$ such that, for any $y \in [0, \widetilde{\delta}_{1}]$ , we have $|\varepsilon_{1}(y)| \lt \eta_{1}$ . Therefore, for the first integral, we have, for $T \lt \widetilde{\delta}_{1}\wedge\delta_0\wedge\delta_1$ ,

$$ \left|\frac{1}{T^2}\int_{0}^{T}\varepsilon_{1}(y)y\mathrm{d} y\right| \leq \frac{1}{T^2}\int_{0}^{T}\left|\varepsilon_{1}(y)\right|y\mathrm{d} y \leq \frac{1}{T^2}\int_{0}^{T}\eta_1 y\mathrm{d} y \leq \frac{\eta_{1}}{2}. $$

Likewise, since $\varepsilon_{0}({\cdot})$ tends to zero at the origin, for any $\eta_{0} \gt 0$ there exists $\widetilde{\delta}_{0} \gt 0$ such that, for any $y \in [0, \widetilde{\delta}_{0}]$ , we have $|\varepsilon_{0}(y)| \lt \eta_{0}$ . Therefore, for the second integral, we have, for $T \lt \widetilde{\delta}_{0}\wedge\delta_0\wedge\delta_1$ ,

$$ \left|\frac{1}{T^2}\int_{0}^{T}\int_{0}^{y}\varepsilon_{0}(s)\mathrm{d} s\mathrm{d} y\right| \leq \frac{1}{T^2}\int_{0}^{T}\int_{0}^{y}\left|\varepsilon_{0}(s)\right|\mathrm{d} s\mathrm{d} y \leq \frac{1}{T^2}\int_{0}^{T}\int_{0}^{y}\eta_{0} \mathrm{d} s \mathrm{d} y \leq \frac{\eta_{0}}{2}. $$

Since $\eta_1$ and $\eta_0$ can be taken as small as desired, taking the limit of f(T, 0, 0) as T goes to zero concludes the proof.

Lemma 5. Let $f\,:\,\mathbb{R}^4_+\to\mathbb{R}$ be such that, for all $0\le s\le y\le T$ , the functions $f(T,y,s,\cdot)$ , $f(T,y,\cdot,0)$ , $f(T,\cdot,0,0)$ are continuous on $[0,\delta_0]$ , $[0,\delta_1]$ , $[0,\delta_2]$ , respectively, for some $\delta_0,\delta_1,\delta_2 \gt 0$ , and $\lim_{T\downarrow0}f(T,0,0,0)=f(0,0,0,0)$ . Then the following limit holds:

(28) \begin{align} \lim_{T\downarrow0} \frac{1}{T^3} \int_0^T \int_0^y \int_0^s f(T,y,s,t) \mathrm{d}t \mathrm{d}s \mathrm{d}y = \frac{f(0,0,0,0)}{6}. \end{align}

Proof. For $T \lt \delta_0\wedge\delta_1\wedge\delta_2$ , we can write

\begin{align*} \frac{1}{T^3}&\int_{0}^{T}\left\{\int_{0}^{y} \left( \int_0^s f(T, y, s,t)\mathrm{d}t \right) \mathrm{d}s\right\}\mathrm{d}y\\[3pt] &=\frac{1}{T^3}\int_{0}^{T}\left\{\int_{0}^{y} \left( \int_0^s [f(T, y, s,0)+\varepsilon_0(t)]\mathrm{d}t \right) \mathrm{d}s\right\}\mathrm{d}y\\[3pt] & = \frac{1}{T^3}\int_{0}^{T}\left\{\int_{0}^{y} \left( f(T, y, s,0)s + \int_0^s \varepsilon_0(t)\mathrm{d}t \right) \mathrm{d}s\right\}\mathrm{d}y\\[3pt] & = \frac{1}{T^3}\int_{0}^{T}\left\{\int_{0}^{y} \left( \big[f(T, y, 0,0)+\varepsilon_1(s)\big]s + \int_0^s \varepsilon_0(t)\mathrm{d}t \right) \mathrm{d}s\right\}\mathrm{d}y\\[3pt] & = \frac{1}{T^3}\int_{0}^{T} \left\{f(T, y, 0,0) \frac{y^2}{2}+ \int_{0}^{y}\left( \varepsilon_1(s)s + \int_0^s \varepsilon_0(t)\mathrm{d}t \right) \mathrm{d}s\right\}\mathrm{d}y \\[3pt] & = \frac{1}{T^3}\int_{0}^{T} \left\{ \big[f(T, 0, 0,0)+\varepsilon_2(y)\big] \frac{y^2}{2}+ \int_{0}^{y}\left( \varepsilon_1(s)s + \int_0^s \varepsilon_0(t)\mathrm{d}t \right) \mathrm{d}s\right\}\mathrm{d}y\\[3pt] & = \frac{f(T,0,0,0)}{6} + \frac{1}{T^3}\int_{0}^{T} \left\{ \varepsilon_2(y) \frac{y^2}{2} +\int_{0}^{y}\left( \varepsilon_1(s)s + \int_0^s \varepsilon_0(t)\mathrm{d}t \right) \mathrm{d}s\right\}\mathrm{d}y, \end{align*}

where the function $\varepsilon_2$ is continuous on $[0,\delta_2]$ , the function $\varepsilon_{1}$ is continuous on $[0,\delta_1]$ , and the function $\varepsilon_{0}$ is continuous on $[0,\delta_0]$ , all converging to zero at the origin. By the same argument as in the previous proof, for any $\eta_0,\eta_1,\eta_2 \gt 0$ , there exists $\widetilde\delta \gt 0$ such that for all $T\le\widetilde \delta$ , we have $|\varepsilon_0(T)|\le \eta_0$ , $|\varepsilon_1(T)|\le \eta_1$ , and $|\varepsilon_2(T)|\le \eta_2$ . This implies

$$ \bigg\lvert\frac{1}{T^3}\int_{0}^{T} \left\{ \varepsilon_2(y) \frac{y^2}{2} +\int_{0}^{y}\left( \varepsilon_1(s)s + \int_0^s \varepsilon_0(t)\mathrm{d}t \right) \mathrm{d}s\right\}\mathrm{d}y \bigg\lvert \le \frac{\eta_2+\eta_1+\eta_0}{6}. $$

Since $\eta_2$ , $\eta_1$ and $\eta_0$ can be taken as small as desired, taking the limit of f(T,0,0,0) as T goes to zero concludes the proof.
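A quick numerical illustration of Lemmas 3, 4, and 5 (our sketch, with an arbitrary continuous test function): the normalised iterated integrals over the simplex converge to $f(0,\ldots,0)$ times $1$ , $\frac{1}{2}$ , and $\frac{1}{6}$ respectively.

```python
import numpy as np

def f(T, y, s, t):
    # arbitrary continuous test function with f(0, 0, 0, 0) = 1
    return np.exp(T + y) * np.cos(s) / (1.0 + t)

def simplex_means(T, n=40):
    h = T / n
    g = (np.arange(n) + 0.5) * h                 # midpoint grid on [0, T]
    I1 = sum(f(T, y, 0.0, 0.0) for y in g) * h / T
    I2 = sum(f(T, y, s, 0.0) for y in g for s in g[g < y]) * h**2 / T**2
    I3 = sum(f(T, y, s, t) for y in g for s in g[g < y] for t in g[g < s]) * h**3 / T**3
    return I1, I2, I3

for T in (0.5, 0.1, 0.02):
    print(T, simplex_means(T))                   # tends to (1, 1/2, 1/6) as T decreases
```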

To apply these lemmas, we will use a modified version of the martingale convergence theorem, which holds in our setting thanks to domination provided by $\boldsymbol{(\mathrm{C}_{1})}$ and $\boldsymbol{(\mathrm{C}_{2})}$ and the continuity of $\boldsymbol{(\mathrm{C}_{4})}$ .

Lemma 6. Let $(X_t)_{t\ge0}$ be almost surely continuous in a neighbourhood of zero, with $\sup_{t\le 1} |X_t|\le Z\in L^1$ . Then the conditional expectation process $(\mathbb{E}_t[X_t])_{t\ge0}$ is also almost surely continuous in a neighbourhood of zero. In particular,

\[ \lim_{t\downarrow0} \mathbb{E}_t[X_t] =\mathbb{E}[X_0]. \]

Remark 8. The process $(X_t)_{t\ge0}$ is not necessarily adapted.

Proof. All the limits are taken in the almost sure sense. Let $\delta \gt 0$ be such that X is continuous on $[0,\delta]$ , and fix $t \lt \delta$ . We take a sequence $\{t_n\}_{n\in\mathbb{N}}$ in $[0,\delta]$ which converges to t as n goes to infinity. Assume first that $\{t_n\}_{n\in\mathbb{N}}$ is a monotone sequence. Since $\mathcal{F}_{t_n}$ tends monotonically to $\mathcal{F}_t$ and X is dominated, the classical martingale convergence theorem (MCT) asserts that $\lim_{n\uparrow\infty} \mathbb{E}_{t_n}[X_t] = \mathbb{E}_t[X_t]$ . For fixed $n\in\mathbb{N}$ and any $\mathfrak{q}\ge |t_n-t|$ ,

(29) \begin{align} |X_{t_n}-X_t| \le \sup_{|\mathfrak{p}-t|\le \mathfrak{q}} |X_{\mathfrak{p}} -X_t|. \end{align}

Let us fix $\varepsilon \gt 0$ . By the MCT, there exists $n_0\in\mathbb{N}$ such that, if $n\ge n_0$ , then

$$ \bigg\lvert\mathbb{E}_{t_n} \bigg[\sup_{|\mathfrak{p}-t|\le \mathfrak{q}} |X_{\mathfrak{p}} -X_t| \bigg]- \mathbb{E}_t \bigg[\sup_{|\mathfrak{p}-t|\le \mathfrak{q}} |X_{\mathfrak{p}} -X_t| \bigg] \bigg\lvert \lt \varepsilon, $$

and by dominated convergence there exists $\delta^{\prime} \gt 0$ with $\mathbb{E}_t \Big[\sup_{|\mathfrak{p}-t|\le \delta^{\prime}} |X_{\mathfrak{p}} -X_t| \Big] \lt \varepsilon$ . There exists $n_1\in\mathbb{N}$ such that $|t_n-t|\le\delta^{\prime}$ for all $n\ge n_1$ ; thus if $n\ge n_0\vee n_1$ , then (29) yields $\mathbb{E}_{t_n}[|X_{t_n}-X_t|] \lt 2\varepsilon$ and

(30) \begin{align} \lim_{n\uparrow\infty} \mathbb{E}_{t_n}[X_{t_n}] = \mathbb{E}_t[X_t]. \end{align}

Now we consider the general case where $\{t_n\}_{n\in\mathbb{N}}$ is not monotone. From every subsequence of $\{t_n\}_{n\in\mathbb{N}}$ , one can extract a further subsequence which is monotone. Let us call this sub-subsequence $\{t_{n_k}\}_{k\in\mathbb{N}}$ . Therefore, (30) holds with $t_{n_k}$ instead of $t_{n}$ . Since every subsequence of $(\mathbb{E}_{t_n}[X_{t_n}])_{n\in\mathbb{N}}$ has a further subsequence that converges to the same limit, the original sequence also converges to this limit.

For convenience, we use the following definition.

Definition 2. Let $k,n\in\mathbb{N}$ with $k\le n$ . For a function $f\,:\,\mathbb{R}_+^n\to\mathbb{R}$ , we define

$$ \lim_{0\le x_1\le x_2\le\cdots\le x_k\downarrow0} f(x_1,\cdots,x_n)\,:\!=\, \lim_{x_k\downarrow0} \cdots \lim_{x_2\downarrow0} \lim_{x_1\downarrow0} f(x_1,\cdots,x_n). $$

Notice that the right-hand sides of (26), (27), and (28) correspond to

$$\lim\limits_{y\le T\downarrow0} f(T,y), \qquad \displaystyle\frac{1}{2}\lim\limits_{s\le y\le T\downarrow0} f(T,y,s), \qquad\text{and} \quad \displaystyle\frac{1}{6}\lim\limits_{t\le s\le y\le T\downarrow0} f(T,y,s,t),$$

respectively.

6.3.2. Proof of Proposition 1

Let us recall some important quantities:

(31) \begin{align}M_y &= \mathbb{E}_y\left[\mathrm{VIX}_T\right] = \mathbb{E}_y\left[ \sqrt{\frac1\Delta \int_T^{T+\Delta} \mathbb{E}_T v_r \mathrm{d}r}\right],\nonumber\\[3pt] m^i_y &= \mathbb{E}_y[\mathrm{D}^i_y M_y]= \mathbb{E}_y \left[\frac{\int_T^{T+\Delta} \mathrm{D}^i_y \mathbb{E}_T v_r \mathrm{d}r }{2\Delta \mathrm{VIX}_T}\right]= \mathbb{E}_y \left[\frac{\int_T^{T+\Delta} \mathrm{D}^i_y v_r \mathrm{d}r }{2\Delta \mathrm{VIX}_T}\right], \nonumber\\[3pt] \phi_y^i&= \frac{m_y^i}{M_y} = \frac{\mathbb{E}_y\left[\left(\int_T^{T+\Delta}\mathrm{D}_y^i v_r \mathrm{d}r\right)/(2\Delta \mathrm{VIX}_T)\right]}{\mathbb{E}_y[\mathrm{VIX}_T]}.\end{align}

We also recall that $J_i$ and $G_{ij}$ , $i,j\in [\![ 1,N ]\!]$ , were defined in (9). In this proof we will define $f(0)\,:\!=\,\lim_{x\downarrow0}f(x)$ , for every $f\,:\,\mathbb{R}_+\to\mathbb{R}$ , as soon as the limit exists and even if f is not actually continuous around zero. In this way we make it continuous, which allows us to apply the convergence lemmas.

Level. By $\boldsymbol{(\mathrm{C}_{1})}$ and the MCT, $\lim_{y\downarrow0}\mathbb{E}_y[\mathrm{VIX}_T]=\mathbb{E}[\mathrm{VIX}_T]$ and $(M_y)_{y\ge0}$ is continuous around zero, almost surely. By $\boldsymbol{(\mathrm{C}_{4})}$ and the dominated convergence theorem (DCT), we have $\lim_{y\downarrow0} \int_T^{T+\Delta} \mathrm{D}^i_y v_r\mathrm{d}r=\int_T^{T+\Delta} \mathrm{D}^i_0 v_r\mathrm{d}r$ and $\big(\int_T^{T+\Delta} \mathrm{D}^i_y v_r\mathrm{d}r)_{y\ge0}$ is continuous around zero, almost surely. Let $i\in [\![ 1,N ]\!]$ ; from $\boldsymbol{(\mathrm{C}_{1})}$ and $\boldsymbol{(\mathrm{C}_{2})}$ we also obtain that almost surely

$$\frac{1}{\mathrm{VIX}_T}\int_T^{T+\Delta}\mathrm{D}_y^i v_r \mathrm{d}r\le X^2 \Big\{(T+\Delta-y)^{H_{+}} - (T-y)^{H_{+}}\Big\},$$

for some $X\in L^2$ . Therefore it is dominated, and by Lemma 6, almost surely $m_y^i$ is continuous at zero and

$$\lim_{y\downarrow0} m^i_y = \mathbb{E}\left[\frac{\int_T^{T+\Delta} \mathrm{D}^i_0 v_r\mathrm{d}r}{2\Delta \mathrm{VIX}_T}\right].$$

Since $M_y \gt 0$ for all $y\le T$ , $\phi^i$ is also continuous at zero and $\lim_{y \le T\downarrow0} \phi^i_y = J_i / (2\Delta\mathrm{VIX}_0^2)$ . By virtue of Theorem 1 and Lemma 3, we obtain

\[\lim_{T\downarrow0} \mathcal{I}_{T} = \lim_{T\downarrow0} \mathbb{E}[u_0] = \lim_{y\le T\downarrow0} {\left\|{\boldsymbol\phi_y}\right\|}= \frac{{\left\|{\boldsymbol J}\right\|}}{2\Delta\mathrm{VIX}_0^2}.\]

Skew. To obtain the skew limit we need to compute a few Malliavin derivatives. For all $i,j\in [\![ 1,N ]\!]$ ,

\begin{align*}\mathrm{D}^j_s m^i_y &= \mathbb{E}_y \left[ \frac{\int_T^{T+\Delta}\mathrm{D}^j_s \mathrm{D}^i_y v_r \mathrm{d}r\, \mathrm{VIX}_T - \int_T^{T+\Delta} \mathrm{D}^i_y v_r \mathrm{d}r \, \mathrm{D}^j_s \mathrm{VIX}_T}{2\Delta \mathrm{VIX}_T^2}\right]\\[3pt] &= \mathbb{E}_y\left[\frac{\int_T^{T+\Delta}\mathrm{D}^j_s \mathrm{D}^i_y v_r \mathrm{d}r}{2\Delta\mathrm{VIX}_T} - \frac{\int_T^{T+\Delta}\mathrm{D}^i_y v_r \mathrm{d}r \int_T^{T+\Delta}\mathrm{D}^j_s v_r \mathrm{d}r}{4\Delta^2\mathrm{VIX}_T^3} \right],\end{align*}

which yields

\begin{align*}\mathrm{D}_s^j \phi_y^i & = \frac{\mathrm{D}^j_s m^i_y}{M_y} - \frac{ m^i_y \mathrm{D}^j_s M_y}{M_y^2}\\[3pt] & = \mathbb{E}_y\left[\frac{\int_T^{T+\Delta}\mathrm{D}^j_s \mathrm{D}^i_y v_r \mathrm{d}r}{2\Delta \mathrm{VIX}_T M_y} - \frac{ \int_T^{T+\Delta}\mathrm{D}_y^i v_r \mathrm{d}r \int_T^{T+\Delta}\mathrm{D}_s^j v_r \mathrm{d}r}{4\Delta^2 \mathrm{VIX}^3_T M_y} - \frac{ m_y^i \int_T^{T+\Delta} \mathrm{D}^j_s v_r \mathrm{d}r}{2\Delta\mathrm{VIX}_T M_y^2}\right]\\[3pt] & =: \mathbb{E}_y\left[A_T^{ij}(y,s) + B_T^{ij}(y,s) + C_T^{ij}(y,s)\right].\end{align*}

Based on $\boldsymbol{(\mathrm{C}_{1})}$ , $\boldsymbol{(\mathrm{C}_{2})}$ , and $\boldsymbol{(\mathrm{C}_{4})}$ , for each $T\ge0$ , $A^{ij}_T$ , $B^{ij}_T$ , and $C^{ij}_T$ are dominated and almost surely continuous in both arguments. For each $s\ge0$ , Lemma 6 and the DCT yield, almost surely, that $(\mathrm{D}^j_s \phi^i_y)_{y\ge0}$ and $(\mathrm{D}^j_s \phi^i_0)_{s\ge0}$ are continuous around zero. In particular,

\begin{align*}\lim_{s\downarrow0} \mathbb{E}_y\big[ A_T^{ij}(y,s)+B_T^{ij}(y,s)+C_T^{ij}(y,s)\big] &= \mathbb{E}\big[ A_T^{ij}(y,0)+B_T^{ij}(y,0)+C_T^{ij}(y,0)\big],\\[3pt] \lim_{y\downarrow0} \mathbb{E}_y\big[ A_T^{ij}(y,0)+B_T^{ij}(y,0)+C_T^{ij}(y,0)\big] &= \mathbb{E}\big[ A_T^{ij}(0,0)+B_T^{ij}(0,0)+C_T^{ij}(0,0)\big].\end{align*}

By the DCT again this yields

\begin{align*}\lim_{T\downarrow0}\mathbb{E}[ A_T^{ij}(0,0)] = \frac{G_{ij}}{2\Delta\mathrm{VIX}_0^2}\qquad \text{and}\qquad\lim_{T\downarrow0}\mathbb{E}[ B_T^{ij}(0,0)] =\lim_{T\downarrow0} \mathbb{E}[C_T^{ij}(0,0)]= -\frac{J_i J_j}{4\Delta^2 \mathrm{VIX}_0^4}.\end{align*}

Therefore $\phi_s^j \mathrm{D}_s^j (\phi_y^i)^2$ satisfies the continuity requirements of f(T, y, s) in Lemma 4. We combine this lemma with the limits above to see that, almost surely,

\begin{align*}\lim_{T\downarrow0} \frac{1}{T^2} \int_0^T \phi_s^j \int_s^T \mathrm{D}_s^j (\phi_y^i)^2 \mathrm{d}y \mathrm{d}s&= \lim_{T\downarrow0} \frac{1}{T^2} \int_0^T \int_0^y \phi_s^j \mathrm{D}_s^j (\phi_y^i)^2 \mathrm{d}s \mathrm{d}y= \frac{1}{2} \lim_{s\le y \le T\downarrow0} \phi_s^j \mathrm{D}_s^j (\phi_y^i)^2\\[3pt] &= \frac{J_j}{4\Delta \mathrm{VIX}_0^2}\left[ \frac{J_i G_{ij}}{2\Delta^2\mathrm{VIX}_0^4} - \frac{J_i^2 J_j}{2\Delta^3 \mathrm{VIX}_0^6} \right].\end{align*}

We also recall that $\lim_{T\downarrow0} u_0 =\frac{{\left\|{\boldsymbol J}\right\|}}{2\Delta\mathrm{VIX}_0^2}$ almost surely; hence, with $\boldsymbol{(\mathrm{C}_{2})}$ and $\boldsymbol{(\mathrm{C}_{3})}$ , the DCT implies

(32) \begin{align}\lim_{T\downarrow0} \mathcal{S}_T=\sum_{i,j=1}^N\lim_{T\downarrow0} \frac{1}{2} \mathbb{E}\left[\frac{\int_0^T \phi_s^j \int_s^T \mathrm{D}_s^j (\phi_y^i)^2 \mathrm{d}y \mathrm{d}s}{u_0^3 T^{2}}\right]= \frac{1}{2{\left\|{\boldsymbol J}\right\|}^3}\sum_{i,j=1}^N J_i J_j \left(G_{ij} - \frac{J_i J_j}{\Delta\mathrm{VIX}_0^2}\right).\end{align}

Curvature. We now turn our attention to the curvature. By the same arguments as above we have

\begin{align*}\lim_{T\downarrow0} \mathbb{E}\left[ \frac{\left(\sum_{i,j=1}^N \int_0^T \phi_s^j \int_s^T \mathrm{D}^j_s (\phi_y^i)^2 \mathrm{d}y \mathrm{d}s\right)^2}{u_0^7 T^4}\right]&= \frac{2\Delta\mathrm{VIX}^2_0}{{\left\|{\boldsymbol J}\right\|}^7}\left(\sum_{i,j=1}^N J_i J_j \left(G_{ij} - \frac{J_i J_j}{\Delta\mathrm{VIX}_0^2}\right)\right)^2.\end{align*}

For the last term of (8) we need to go one step further and compute more Malliavin derivatives, since

\begin{align*}\mathrm{D}^k_t \Theta^j_s &= \sum_{i=1}^N\left(\mathrm{D}^k_t \phi_s^j \int_s^T \mathrm{D}^j_s (\phi_y^i)^2\mathrm{d}y+2 \phi^j_s \int_s^T \mathrm{D}^k_t \phi^i_y \mathrm{D}^j_s \phi^i_y \mathrm{d}y + 2 \phi_s^j \int_s^T \phi^i_y \mathrm{D}^k_t \mathrm{D}^j_s \phi^i_y \mathrm{d}y\right) \\[3pt] & =: \sum_{i=1}^N \int_s^T \Upsilon^{ijk}(t,s,y,T)\mathrm{d}y. \nonumber\end{align*}

Thus we zoom in on the last term of the display above:

\begin{align*}\mathrm{D}^k_t \mathrm{D}^j_s \phi_y^i &= \frac{\mathrm{D}^k_t \mathrm{D}^j_s m^i_y \, M_y -\mathrm{D}^j_s m^i_y \mathrm{D}^k_t M_y}{M_y^2}- \frac{\mathrm{D}^k_t m^i_y \mathrm{D}^j_s M_y + m_y^i \mathrm{D}^k_t \mathrm{D}^j_s M_y}{M_y^2} + \frac{2 m^i_y \mathrm{D}^j_s M_y \mathrm{D}^k_t M_y}{M_y^3}\\[3pt] &=:\sum_{n=1}^5 Q_n^{ijk}(t,s,y,T).\end{align*}

We zoom in again on $Q_1^{ijk}(t,s,y,T)$ :

\begin{align*}\mathrm{D}^k_t \mathrm{D}^j_s m_y^i = \mathrm{D}^k_t \mathrm{D}^j_s \mathrm{D}^i_y M_y & = \mathrm{D}^k_t \mathbb{E}_y\left[ \frac{\int_T^{T+\Delta} \mathrm{D}^j_s \mathrm{D}^i_y v_r \mathrm{d}r}{2\Delta\mathrm{VIX}_T} - \frac{\int_T^{T+\Delta} \mathrm{D}^i_y v_r\mathrm{d}r \int_T^{T+\Delta} \mathrm{D}^j_s v_r\mathrm{d}r}{4\Delta^2\mathrm{VIX}_T^3}\right] \\[3pt] & =: \mathbb{E}_y\left[ \alpha_T^{ijk}+\beta_T^{ijk}\right].\end{align*}

Some additional computations lead to

\begin{align*}\alpha_T^{ijk}&= \frac{\mathrm{VIX}_T \int_T^{T+\Delta} \mathrm{D}^k_t \mathrm{D}^j_s\mathrm{D}^i_y v_r\mathrm{d}r - \mathrm{D}^k_t \mathrm{VIX}_T \, \int_T^{T+\Delta} \mathrm{D}^j_s \mathrm{D}^i_y v_r\mathrm{d}r}{2\Delta \mathrm{VIX}_T^2}\\[3pt] &=\frac{\int_T^{T+\Delta} \mathrm{D}^k_t \mathrm{D}^j_s\mathrm{D}^i_y v_r\mathrm{d}r}{2\Delta \mathrm{VIX}_T} - \frac{\int_T^{T+\Delta} \mathrm{D}^k_t v_r\mathrm{d}r \, \int_T^{T+\Delta} \mathrm{D}^j_s\mathrm{D}^i_y v_r\mathrm{d}r}{4\Delta^2\mathrm{VIX}_T^3},\end{align*}
\begin{align*} \beta_T^{ijk} &=- \frac{\int_T^{T+\Delta} \mathrm{D}^k_t \mathrm{D}^i_y v_r\mathrm{d}r \, \int_T^{T+\Delta} \mathrm{D}^j_s v_r\mathrm{d}r + \int_T^{T+\Delta} \mathrm{D}^i_y v_r\mathrm{d}r \, \int_T^{T+\Delta} \mathrm{D}^k_t \mathrm{D}^j_s v_r\mathrm{d}r}{4\Delta^2 \mathrm{VIX}_T^3} \\[3pt] & \qquad\qquad + \frac{\mathrm{D}^k_t \mathrm{VIX}_T^3 \int_T^{T+\Delta} \mathrm{D}^i_y v_r\mathrm{d}r \, \int_T^{T+\Delta} \mathrm{D}^j_s v_r\mathrm{d}r}{4\Delta^2 \mathrm{VIX}_T^6} \\[3pt] &=- \frac{\int_T^{T+\Delta} \mathrm{D}^k_t \mathrm{D}^i_y v_r\mathrm{d}r \, \int_T^{T+\Delta} \mathrm{D}^j_s v_r\mathrm{d}r + \int_T^{T+\Delta} \mathrm{D}^i_y v_r\mathrm{d}r \, \int_T^{T+\Delta} \mathrm{D}^k_t \mathrm{D}^j_s v_r\mathrm{d}r}{4\Delta^2 \mathrm{VIX}_T^3}\\[3pt] &\qquad\qquad + \frac{3 \int_T^{T+\Delta} \mathrm{D}^k_t v_r\mathrm{d}r \, \int_T^{T+\Delta} \mathrm{D}^i_y v_r\mathrm{d}r \, \int_T^{T+\Delta} \mathrm{D}^j_s v_r\mathrm{d}r}{8\Delta^3 \mathrm{VIX}_T^5} .\end{align*}

We notice, crucially, that we have already justified the continuity of $\phi$ and $\mathrm{D}\phi$ around zero in the proofs of level and skew, respectively. Furthermore, by Lemma 6, the first two terms in $\Upsilon^{ijk}$ as well as $Q_2,Q_3,Q_4,Q_5$ all converge to some finite limit as $t\le s\le y\downarrow0$ and are continuous around zero, almost surely. Similarly, $\beta_T$ and the second term in $\alpha_T$ are almost surely continuous around zero, and their conditional expectation converges almost surely to some finite limit as $t\le s\le y\downarrow0$ by the DCT and Lemma 6. Taking the limit as T goes to zero afterwards, we see that all of the aforementioned terms tend to a finite limit. On the other hand, by $\boldsymbol{(\mathrm{C}_{4})}$ , the DCT, and Lemma 6 we know that the conditional expectation of the first term in $\alpha_T$ is almost surely continuous around zero, and its limit is

\[\lim_{t\le s\le y\downarrow0} \mathbb{E}_y\left[\frac{\int_T^{T+\Delta} \mathrm{D}_t^k \mathrm{D}_s^j \mathrm{D}^i_y v_r \mathrm{d}r}{2\Delta\mathrm{VIX}_T}\right] = \mathbb{E}\left[\frac{\int_T^{T+\Delta} \mathrm{D}_0^k \mathrm{D}_0^j \mathrm{D}^i_0 v_r \mathrm{d}r}{2\Delta\mathrm{VIX}_T}\right].\]

Since $\gamma \lt 0$ , only this term contributes in the limit:

\begin{align*}\lim_{t\le s\le y\le T\downarrow0} \frac{ \phi_t^k\,\Upsilon^{ijk}(t,s,y,T)}{T^{\gamma}}&= \lim_{t\le s\le y\le T\downarrow0} 2 \phi_t^k \phi^j_s \phi^i_y \mathbb{E}_y\left[\frac{\int_T^{T+\Delta} \mathrm{D}_t^k \mathrm{D}_s^j \mathrm{D}^i_y v_r \mathrm{d}r}{2 T^\gamma\Delta\mathrm{VIX}_T M_y}\right] \\[3pt] &= \frac{J_i J_j J_k}{8\Delta^4\mathrm{VIX}_0^8} \, \lim_{T\downarrow0} \frac{\mathbb{E}\left[\int_T^{T+\Delta} \mathrm{D}_0^k \mathrm{D}_0^j \mathrm{D}^i_0 v_r \mathrm{d}r\right]}{T^\gamma},\end{align*}

where we applied the DCT at the end. Moreover, we know by $\boldsymbol{(\mathrm{C}_{2})}$ that this limit is finite for $\gamma=3H-\frac{1}{2}$ ; hence the conditions of Lemma 5 are satisfied. We also recall that $\lim_{T\downarrow0} u_0 =\frac{{\left\|{\boldsymbol J}\right\|}}{2\Delta\mathrm{VIX}_0^2}$ almost surely; hence Lemma 5 yields the almost sure limit

\begin{align*}& \lim_{T\downarrow0}\frac{1}{u_0^5 T^{3+\gamma}} \int_0^T \sum_{k=1}^N \left\{ \phi^k_t \mathrm{D}^k_t \left( \int_t^T |\boldsymbol\Theta_{s}|\mathrm{d}s \right) \right\} \mathrm{d}t \\[3pt] & \qquad = \sum_{i,j,k=1}^N \lim_{T\downarrow0}\frac{1}{u_0^5 T^{3}} \int_0^T \int_0^y\int_0^s \frac{\phi^k_t\,\Upsilon^{ijk}(t,s,y,T)}{T^{3H-\frac{1}{2}}} \mathrm{d}y\mathrm{d}s\mathrm{d}t \\[3pt] & \qquad = \frac{2 \Delta \mathrm{VIX}_0^{2}}{3{\left\|{\boldsymbol J}\right\|}^5} \sum_{i,j,k=1}^N J_i J_j J_k \, \lim_{T\downarrow0} \frac{\mathbb{E}\left[\int_T^{T+\Delta} \mathrm{D}_0^k \mathrm{D}_0^j \mathrm{D}^i_0 v_r \mathrm{d}r\right]}{T^{3H-\frac{1}{2}}} .\end{align*}

The first two terms in (8) tend to zero since $\gamma \lt 0$ ; hence Theorem 3 and the DCT yield the final result:

\begin{align*}\lim_{T\downarrow0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}}&=\frac{2\Delta \mathrm{VIX}_0^{2}}{3{\left\|{\boldsymbol J}\right\|}^5} \sum_{i,j,k=1}^N J_i J_j J_k \lim_{T\downarrow0} \,\frac{\int_T^{T+\Delta} \mathbb{E}\left[\mathrm{D}^k_0 \mathrm{D}^j_0\mathrm{D}^i_0 v_r\right]\mathrm{d}r}{T^{3H-\frac{1}{2}}}. \end{align*}

6.4. Proofs in the two-factor rough Bergomi model

6.4.1. Proof of Proposition 2

We start with a useful lemma for Gaussian processes.

Lemma 7. If B is a continuous Gaussian process with ${\left\|{B}\right\|}_T\,:\!=\,\sup\limits_{t\le T}|B_t|$ , then $\mathbb{E}[\mathrm{e}^{p \|B\|_T}]$ is finite for all $p\in\mathbb{R}$ .

Proof. The Borell–TIS inequality asserts that $\mathbb{E}[{\left\|{B}\right\|}_T] \lt \infty$ and

$$\mathbb{P}({\left\|{B}\right\|}_T-\mathbb{E}{\left\|{B}\right\|}_T \gt x)\le \exp\left\{-\frac{x^2}{2\sigma^2_T}\right\},$$

where $\sigma^2_T\,:\!=\,\sup_{t\le T} \mathbb{E}[B_t^2]$ ; see [Reference Adler and Taylor1, Theorem 2.1.1]. We then follow the proof of [Reference Adler and Taylor1, Theorem 2.1.2]:

\begin{align*} \mathbb{E} \left[\mathrm{e}^{p {\left\|{B}\right\|}_T}\right] = \int_0^\infty \mathbb{P}\left( \mathrm{e}^{p{\left\|{B}\right\|}_T} \gt x\right) \mathrm{d}x \le \mathrm{e}^p + \mathbb{E}[{\left\|{B}\right\|}_T] + \int_{\mathrm{e}^p \vee \mathbb{E}[{\left\|{B}\right\|}_T]}^\infty \mathbb{P} \left({\left\|{B}\right\|}_T \gt \frac{\log(x)}{p} \right) \mathrm{d}x. \end{align*}

The Borell–TIS inequality in particular reads as follows:

\[ \mathbb{P} \left({\left\|{B}\right\|}_T \gt \log(x^{1/p}) \right) \le \exp\left\{-\frac{\left(\log(x^{1/p})-\mathbb{E}[{\left\|{B}\right\|}_T]\right)^2}{2\sigma_T^2}\right\}, \qquad \text{for all } x \text{ such that } \log(x^{1/p}) \gt \mathbb{E}[{\left\|{B}\right\|}_T]. \]

After a change of variable this yields

\[ \int_{\mathrm{e}^p \vee \mathbb{E}[{\left\|{B}\right\|}_T]}^\infty \mathbb{P} \left({\left\|{B}\right\|}_T \gt \frac{\log(x)}{p} \right) \mathrm{d}x \le \int_{\frac{\log\left(\mathrm{e}^p \vee \mathbb{E}[{\left\|{B}\right\|}_T]\right)}{p}}^\infty \exp\left\{-\frac{\left(x-\mathbb{E}[{\left\|{B}\right\|}_T]\right)^2}{2\sigma_T^2}\right\} p \mathrm{e}^{px} \mathrm{d}x, \]

which is finite as desired.
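To make Lemma 7 concrete, the following Monte Carlo sketch (not part of the proof) estimates $\mathbb{E}[\mathrm{e}^{p\|B\|_T}]$ for a Riemann–Liouville-type Gaussian process, using a crude left-point discretisation of the kernel; all numerical values are illustrative assumptions.

```python
import numpy as np

# Monte Carlo illustration of Lemma 7 for Z_t = int_0^t (t-s)^{H-1/2} dW_s,
# a Riemann-Liouville-type Gaussian process (left-point discretisation;
# H, T, grid size and sample size are illustrative choices).
rng = np.random.default_rng(0)
H, T, n, n_mc = 0.1, 1.0, 200, 20_000
dt = T / n
t = np.arange(1, n + 1) * dt

dW = rng.standard_normal((n_mc, n)) * np.sqrt(dt)
K = np.zeros((n, n))                  # K[j, i] = (t_j - t_i)^{H-1/2}, i < j
for j in range(n):
    K[j, :j] = (t[j] - t[:j]) ** (H - 0.5)
Z = dW @ K.T                          # simulated paths on the grid
sup_Z = np.abs(Z).max(axis=1)         # pathwise ||Z||_T

for p in (1.0, 2.0, 4.0):             # finite estimates for every p, as the lemma asserts
    print(f"p = {p}: E[exp(p ||Z||_T)] ~ {np.exp(p * sup_Z).mean():.3f}")
```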

By the above lemma, ${\left\|{v}\right\|}_T\in L^p$ for all $p\ge1$ , so that we can compute its Malliavin derivatives

(33) \begin{align}\mathrm{D}^1_y v_r = v_0 (r-y)^{H_{-}} \Big(\chi \nu \mathcal{E}^1_r + \overline{\chi} \eta \rho \mathcal{E}_r^2\Big)\qquad\text{and}\qquad\mathrm{D}^2_y v_r = v_0 \overline{\chi} \eta \overline{\rho} (r-y)^{H_{-}} \mathcal{E}_r^2.\end{align}

Without explicitly computing further derivatives, one notices that $\boldsymbol{(\mathrm{C}_{4})}$ holds and that there exist $C \gt 0$ and a random variable $X=C{\left\|{\mathcal{E}^1+\mathcal{E}^2}\right\|}_{T+\Delta}\in L^p$ for all $p \gt 1$ such that $\mathrm{D}^i_y v_r \le X (r-y)^{H_{-}}$ , $\mathrm{D}^j_s \mathrm{D}^i_y v_r \le X (r-s)^{H_{-}} (r-y)^{H_{-}}$ , and $\mathrm{D}^k_t \mathrm{D}^j_s \mathrm{D}^i_y v_r \le X (r-t)^{H_{-}} (r-s)^{H_{-}} (r-y)^{H_{-}}$ , implying $\boldsymbol{(\mathrm{C}_{2})}$ . The following lemma yields $\boldsymbol{(\mathrm{C}_{1})}$ .

Lemma 8. In the two-factor rough Bergomi model (10) with $0\leq T_1 \lt T_2$ ,

$$ \mathbb{E}\left[ \sup_{y\le T_1}\left(\mathbb{E}_{y}\left[\frac{1}{T_2-T_1}\int_{T_1}^{T_2} v_{r} \mathrm{d}r\right]\right)^{-p} \right] $$

is finite for all $p \gt 1$ . In particular, $1/M$ is dominated in $L^p$ .

Proof. We first write the expectation as the exponential of a logarithm, then apply Jensen’s inequality (using the concavity of the logarithm), to obtain

\begin{align*}\mathbb{E}_{y}\left[\frac{1}{T_2-T_1}\int_{T_1}^{T_2} v_{r} \mathrm{d}r\right]^{-p} & = \exp\left\{-p \log \mathbb{E}_{y}\left[\frac{1}{T_2-T_1}\int_{T_1}^{T_2} v_{r}\mathrm{d}r \right]\right\}\\[3pt] & \le \exp\left\{-\frac{p}{T_2-T_1}\int_{T_1}^{T_2} \log(\mathbb{E}_y [v_r]) \mathrm{d}r\right\}.\end{align*}

We further bound $-\log\mathbb{E}_y [v_r]$ as follows, using the concavity of the logarithm and (10):

(34) \begin{align}-\log\mathbb{E}_y [v_r] \le -\frac{1}{2} \Big\{\log\left(2\chi v_0 \mathbb{E}_y[\mathcal{E}^1_r]\right) + \log\left(2\overline{\chi} v_0 \mathbb{E}_y[\mathcal{E}^2_r]\right) \Big\},\end{align}

which we now compute as

(35) \begin{align} \mathbb{E}_y[\mathcal{E}^1_r]& = \exp\left\{-\frac{\nu^2 r^{2H}}{4H} + \nu \int_0^y (r-s)^{H_{-}}\mathrm{d} W^1_s\right\} \mathbb{E}_y\left[\exp\left\{\nu \int_y^r (r-s)^{H_{-}}\mathrm{d} W^1_s\right\}\right] \nonumber\\[3pt] & = \exp\left\{\frac{\nu^2}{4H} \left[(r-y)^{2H}-r^{2H}\right] + \nu \int_{0}^{y} (r-s)^{H_{-}} \mathrm{d} W^1_s\right\}, \nonumber \\[3pt] \mathbb{E}_y[\mathcal{E}^2_r] &=\exp\left\{\frac{\eta^2}{4H}\left[(r-y)^{2H}-r^{2H}\right] + \eta \int_0^{y} (r-s)^{H_{-}} \mathrm{d} (\rho W^1_s+\overline{\rho} W^2_s)\right\}.\end{align}

Let us deal with the first term of (34), as the second one is analogous. We have

\[ \int_{T_1}^{T_2} \left[(r-y)^{2H}-r^{2H}\right] \mathrm{d}r = \frac{(T_2-y)^{2H_+}-(T_1-y)^{2H_+}-T_2^{2H_+}+T_1^{2H_+}}{2H_+}, \]

which is clearly bounded below for all $0\le y\le T_1$ . Moreover, by Fubini’s theorem,

$$\int_{T_1}^{T_2} \int_0^y (r-t)^{H_{-}} \mathrm{d} W^1_t \mathrm{d}r=\int_0^y \int_{T_1}^{T_2} (r-t)^{H_{-}} \mathrm{d}r\mathrm{d} W^1_t= \int_0^y \frac{(T_2-t)^{H_{+}}-(T_1-t)^{H_{+}}}{H_{+}} \mathrm{d} W^1_t=: \overline{B}_y$$

is a Gaussian process. Since $\exp\{\cdot\}$ is increasing, $\sup_{y\in[0,T_1]} \exp\{\overline{B}_y\} = \exp\{\sup_{y\in[0,T_1]} \overline{B}_y\}$ ; thus

\[ \mathbb{E}\left[\sup_{y\le T_1} \exp\left({-}\frac{p}{T_2-T_1}\int_{T_1}^{T_2} \int_0^y (r-s)^{H_{-}} \mathrm{d} W^1_s\mathrm{d}r\right)\right] \le \mathbb{E}\left[\exp\left(\frac{p}{T_2-T_1} {\left\|{\overline{B}}\right\|}_{T_1}\right)\right] \lt \infty, \]

by Lemma 7, which concludes the proof.

Combining (33) and (35), we obtain $\mathbb{E}_y[\mathrm{D}^i_y v_r]$ , $i=1,2$ . The following lemma proves that $\boldsymbol{(\mathrm{C}_{3})}$ is satisfied.

Lemma 9. For any $p \gt 1$ , $ \mathbb{E}[u_s^{-p}]$ is uniformly bounded in s and T, with $s\le T$ .

Proof. Since $\nu,\eta,\rho+\overline{\rho} \gt 0$ , we have $\mathrm{D}^1_y v_r+\mathrm{D}^2_y v_r \gt 0$ almost surely for all $y\le r$ . Moreover, VIX and $1/\mathrm{VIX}$ are dominated by some $X \in L^p$ for all $p \gt 1$ , so, almost surely and independently of the sign of the numerator, we obtain

\[ m^i_y = \mathbb{E}_y\left[\frac{\int_T^{T+\Delta} \mathrm{D}^i_y v_r \mathrm{d}r}{2\Delta \mathrm{VIX}_T}\right] \ge \mathbb{E}_y\left[\frac{\int_T^{T+\Delta} \mathrm{D}^i_y v_r \mathrm{d}r}{2\Delta X}\right]. \]

Therefore, using that $1/M$ is dominated by X and Jensen’s inequality, we get

(36) \begin{align} \frac{1}{u_s^2} = \frac{T-s}{\int_s^T \sum_{i=1}^N (\phi^i_y)^2\mathrm{d}y} & \le \frac{X^2(T-s)}{\int_s^T \sum_{i=1}^N (m^i_y)^2\mathrm{d}y} \le \frac{X^2 N(T-s)}{\int_s^T \left(\sum_{i=1}^N m^i_y \right)^2\mathrm{d}y} \le X^2 N \left( \frac{T-s}{\int_s^T \sum_{i=1}^N m^i_y\mathrm{d}y}\right)^2 \nonumber\\[3pt] &\le 4 X^2 N \left(\frac{\int_s^T\int_T^{T+\Delta}\sum_{i=1}^N \mathbb{E}_y\left[\mathrm{D}^i_y v_r /X\right] \mathrm{d}r\mathrm{d}y}{\Delta(T-s)}\right)^{-2}. \end{align}

Hence we turn our attention to

(37) \begin{align} &\mathbb{E}\left[ \left(\frac{1}{\Delta(T-s)} \int_s^T \int_T^{T+\Delta} \mathbb{E}_y\left[\frac{\mathrm{D}^1_y v_r+\mathrm{D}^2_y v_r}{X}\right] \mathrm{d}r \mathrm{d}y\right)^{-p}\right] \nonumber\\[3pt] &= \mathbb{E}\left[ \exp\left\{-p\log \left(\frac{1}{\Delta(T-s)} \int_s^T \int_T^{T+\Delta} \mathbb{E}_y\left[\frac{\mathrm{D}^1_y v_r+\mathrm{D}^2_y v_r}{X}\right] \mathrm{d}r \mathrm{d}y\right)\right\}\right] \nonumber \\[3pt] &\le \mathbb{E}\left[ \exp\left\{-\frac{p}{\Delta(T-s)} \int_s^T \int_T^{T+\Delta} \mathbb{E}_y\left[\log \left(\mathrm{D}^1_y v_r+\mathrm{D}^2_y v_r\right) -\log(X)\right] \mathrm{d}r \mathrm{d}y \right\}\right] \nonumber\\[3pt] &\le \left(\mathbb{E}\left[ \exp\left\{-\frac{2p}{\Delta(T-s)} \int_s^T \int_T^{T+\Delta} \mathbb{E}_y\left[\log \left(\mathrm{D}^1_y v_r+\mathrm{D}^2_y v_r \right)\right] \mathrm{d}r \mathrm{d}y \right\}\right]\right)^\frac{1}{2} \sqrt{\mathbb{E}\left[X^{2p}\right]}, \end{align}

using Jensen’s inequality, the Cauchy–Schwarz inequality, and the fact that $\mathrm{e}^{p\mathbb{E}_y[\log(X)]}\le \mathbb{E}_y[X^p]$ . Convexity and (33) imply

\begin{align*} -\log \left(\mathrm{D}^1_y v_r+\mathrm{D}^2_y v_r\right) \le &-\frac12 \Big\{\log \left(2 v_0\chi \nu(r-y)^{H_{-}} \mathcal{E}^1_r \right) + \log\left(2 v_0 \overline{\chi} \eta(\rho+\overline{\rho})(r-y)^{H_{-}} \mathcal{E}_r^2\right) \Big\}. \end{align*}

We focus on the first term; the other can be treated identically. From (35) we have

(38) \begin{align} \mathbb{E}_y\left[\log \left(2 v_0 \chi \nu(r-y)^{H_{-}} \mathcal{E}^1_r \right)\right] = \log\left(2 v_0 \chi \nu(r-y)^{H_{-}}\right) - \frac{\nu^2 r^{2H}}{4H} + \nu \int_0^y (r-t)^{H_{-}} \mathrm{d} W^1_t . \end{align}

Let us start with

\begin{align*} &\int_s^T \int_T^{T+\Delta} \log\left(2 v_0 \chi \nu(r-y)^{H_{-}}\right) \mathrm{d}r \mathrm{d}y \\[3pt] &= \Delta(T-s)\log\left(2 v_0 \chi \nu\right) + H_- \int_s^T \int_T^{T+\Delta} \log(r-y) \mathrm{d}r \mathrm{d}y \\[3pt] &= \Delta(T-s)\log\left(2 v_0 \chi \nu\right) + H_- \\[3pt] & \qquad \int_s^T \Big[(T+\Delta-y)\log(T+\Delta-y) - (T+\Delta-y)- (T-y)\log(T-y) + (T-y) \Big]\mathrm{d}y \\[3pt] & = \Delta(T-s)\log\left(2 v_0 \chi \nu\right) + H_- \left\{-\Delta (T-s) + \int_\Delta^{T+\Delta-s} x\log(x)\mathrm{d}x - \int_0^{T-s} x\log(x)\mathrm{d}x \right\} \\[3pt] & = \Delta(T-s)\log\left(2 v_0 \chi \nu\right) + H_- \Bigg\{-\Delta (T-s) + \left(\frac{(T+\Delta-s)^2 \log(T+\Delta-s)}{2} - \frac{(T+\Delta-s)^2}{4}\right) \\[3pt] & \qquad\qquad\quad - \left(\frac{\Delta^2 \log(\Delta)}{2} - \frac{\Delta^2}{4}\right) - \left(\frac{(T-s)^2 \log(T-s)}{2} - \frac{(T-s)^2}{4}\right) \Bigg\}\\[3pt] &= \Delta(T-s)\log\left(2 v_0 \chi \nu\right) + H_- \Bigg\{ -\frac{3\Delta (T-s)}{2} + \frac{(T-s)^2}{2}\log\left(\frac{T+\Delta-s}{T-s}\right) \\[3pt] & \qquad\qquad\quad + \Delta(T-s)\log(T+\Delta-s) + \frac{\Delta^2}{2} \big(\log(T+\Delta-s)-\log(\Delta) \big) \Bigg\}. \end{align*}
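As an aside, the closed form just obtained can be checked against direct numerical quadrature; here is a minimal sketch (the test values of $s$ , $T$ , $\Delta$ are arbitrary):

```python
import numpy as np
from scipy.integrate import dblquad

# Check of the closed form for int_s^T int_T^{T+Delta} log(r-y) dr dy
# (the H_- part of the computation above); test values are arbitrary.
s, T, Delta = 0.2, 0.5, 0.3
tau = T - s

numeric, _ = dblquad(lambda r, y: np.log(r - y), s, T, T, T + Delta)
closed = (-1.5 * Delta * tau
          + 0.5 * tau**2 * np.log((tau + Delta) / tau)
          + Delta * tau * np.log(tau + Delta)
          + 0.5 * Delta**2 * np.log((tau + Delta) / Delta))
print(numeric, closed)   # the two values agree up to quadrature error
```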

By Taylor’s theorem, $\log(T+\Delta-s)-\log(\Delta) = \frac{T-s}{\Delta} + \varepsilon(T-s)$ , where $\varepsilon:\mathbb{R}_+\to\mathbb{R}_+$ is such that $\varepsilon(x)/x$ tends to zero at the origin; hence, after division by $\Delta(T-s)$ , each term in the braces above remains bounded for $0\le s\le T$ in a bounded interval. We conclude that

\[ -\frac{p}{2\Delta(T-s)} \int_s^T \int_T^{T+\Delta} \log\left(2 v_0 \chi \nu(r-y)^{H_{-}}\right) \mathrm{d}r \mathrm{d}y \]

is uniformly bounded. Now we study the second term of (38):

\[ - \int_s^T \int_T^{T+\Delta} r^{2H} \mathrm{d}r\mathrm{d}y = (T-s) \frac{T^{2H_+}-(T+\Delta)^{2H_+}}{2H_+}. \]

Therefore the following is uniformly bounded:

\[ \frac{p}{2\Delta(T-s)} \int_s^T \int_T^{T+\Delta} \frac{\nu^2 r^{2H}}{4H} \mathrm{d}r \mathrm{d}y. \]

For the last term, by the stochastic Fubini theorem [Reference Protter48, Theorem 65], we get

\begin{align*} \int_s^T \int_T^{T+\Delta} \int_0^y (r-t)^{H_{-}} \mathrm{d} W^1_t \mathrm{d}r\mathrm{d}y & = \int_s^T \int_0^y \int_T^{T+\Delta} (r-t)^{H_{-}} \mathrm{d}r \mathrm{d} W^1_t \mathrm{d}y\\[3pt] & =\int_0^T \int_{s\vee t}^T \frac{(T+\Delta-t)^{H_{+}} - (T-t)^{H_{+}}}{H_{+}} \mathrm{d}y \mathrm{d} W^1_t. \end{align*}

Standard Gaussian computations then yield

(39) \begin{align} &\mathbb{E}\left[\exp\left\{-\frac{p}{4\Delta(T-s)} \int_s^T \int_T^{T+\Delta} \nu \int_0^y (r-t)^{H_{-}} \mathrm{d} W^1_t \mathrm{d}r\mathrm{d}y \right\}\right] \nonumber \\[3pt] &= \exp\left\{\frac{1}{2} \left(\frac{p\nu}{4\Delta(T-s)}\right)^2 \int_0^T \left( \int_{s\vee t}^T \frac{(T+\Delta-t)^{H_{+}} - (T-t)^{H_{+}}}{H_{+}} \mathrm{d}y\right)^2\mathrm{d}t \right\}. \end{align}

The incremental function $x\mapsto (x+\Delta)^{H_{+}}-x^{H_{+}}$ is decreasing by concavity; hence $(T+\Delta-t)^{H_{+}} - (T-t)^{H_{+}} \le \Delta^{H_{+}}$ , and we obtain

\[ \int_0^T (T-s\vee t)^2 \mathrm{d}t = \int_0^s (T-s)^2\mathrm{d}t + \int_s^T (T-t)^2\mathrm{d}t = s(T-s)^2 + \frac{(T-s)^3}{3}, \]

which implies that (39) is uniformly bounded. We have thus shown that (37) is uniformly bounded in s, T.

Coming back to (36), by the Cauchy–Schwarz inequality we have

\[ \mathbb{E}[u_s^{-p}]^2 \le 2^{p} \mathbb{E}[X^{2p}] \, \mathbb{E}\left[ \left(\frac{T-s}{\int_s^T (m^1_y + m^2_y) \mathrm{d}y} \right)^{2p} \right], \]

which is uniformly bounded for all $s\le T$ , and this concludes the proof.

6.4.2. Proof of Proposition 3

Level. We start with the derivatives

\begin{align*}\mathrm{D}^1_s v_t = v_0\Big[\chi \nu(t-s)^{H_{-}} \mathcal{E}^1_t + \overline{\chi} \eta \rho (t-s)^{H_{-}} \mathcal{E}_t^2\Big]\qquad\text{and}\qquad\mathrm{D}^2_s v_t = v_0 \overline{\chi} \eta \overline{\rho} (t-s)^{H_{-}} \mathcal{E}_t^2,\end{align*}

and recall, from the definitions in (9),

$$ J_1 = \int_{0}^\Delta v_{0} {\mathbb{E}}\left[\chi \nu r^{H_{-}} \mathcal{E}^1_r +\overline{\chi}\eta \rho r^{H_{-}} \mathcal{E}^2_r \right] \mathrm{d}r = v_0(\chi\nu+\overline{\chi}\eta\rho)\frac{\Delta^{H_{+}}}{H_{+}}\quad \text{and}\quad J_{2} = v_0\overline{\chi} \eta \overline{\rho} \frac{\Delta^{H_{+}}}{H_{+}}.$$

Here we used $\mathbb{E}[\mathcal{E}^i_t]=1$ . This yields the norm

\[{\left\|{\boldsymbol J}\right\|} \,:\!=\, \left(J_1^2 + J_2^2 \right)^{\frac{1}{2}}= \frac{v_0 \Delta^{H_{+}}}{H_{+}}\sqrt{ (\chi\nu+\overline{\chi}\eta\rho)^2 + \overline{\chi}^2\eta^2\overline{\rho}^2}= \frac{v_0 \Delta^{H_{+}}}{H_{+}} \psi(\rho,\nu,\eta,\chi),\]

with the function $\psi$ defined in the proposition, which grants us the first limit by Proposition 1. To simplify the notation below, we introduce $\mathfrak{w} \,:\!=\, \chi\nu + \overline{\chi}\eta\rho$ .

Skew. We compute the further derivatives

\begin{align*}\mathrm{D}^1_0 \mathrm{D}^1_0 v_t &= v_0\left(\chi\nu^2 t^{2H-1} \mathcal{E}_t^1 + \overline{\chi}\eta^2 \rho^2 t^{2H-1} \mathcal{E}_t^2 \right),\\[3pt] \mathrm{D}^1_0 \mathrm{D}^2_0 v_t & = v_0\overline{\chi}\eta^2 \rho \overline{\rho} t^{2H-1} \mathcal{E}^2_t,\\[3pt] \mathrm{D}^2_0 \mathrm{D}^2_0 v_t & = v_0\overline{\chi} \eta^2 \overline{\rho}^2 t^{2H-1} \mathcal{E}^2_t.\end{align*}

Similarly to J, we recall that $G_{ij}= \int_0^{\Delta} \mathbb{E}\big[\mathrm{D}^j_0 \mathrm{D}^i_0 v_r \big]\mathrm{d}r$ , so that

\begin{align*}G_{11}=\frac{\Delta^{2H}}{2H} v_0 (\chi\nu^2 + \overline{\chi}\eta^2 \rho^2), \qquad G_{12} = \frac{\Delta^{2H}}{2H} v_0\overline{\chi} \eta^2\rho \overline{\rho}, \qquad G_{22}= \frac{\Delta^{2H}}{2H} v_0\overline{\chi} \eta^2 \overline{\rho}^2.\end{align*}

Notice that $\mathrm{VIX}^2_0=v_0$ ; thus we have

\begin{align*}J_1^2 \left(G_{11} -\frac{ J_1^2}{\Delta\mathrm{VIX}_0^2} \right)&= \frac{v_0^3\Delta^{4H+1}}{2HH_+^2}\mathfrak{w}^2 (\chi\nu^2+\overline{\chi}\eta^2\rho^2)- \frac{v_0^3 \Delta^{4H+1}}{H_+^4} \mathfrak{w}^4\\[3pt] &= v_0^3 \frac{\Delta^{4H+1}}{H_+^2} \mathfrak{w}^2\left[ \frac{\chi\nu^2+\overline{\chi}\eta^2\rho^2}{2H}-\frac{\mathfrak{w}^2}{H_+^2}\right],\end{align*}
\begin{align*} J_1 J_2 \left( G_{12} -\frac{J_1 J_2}{\Delta\mathrm{VIX}_0^2} \right)&= \frac{v_0^3\Delta^{4H+1}}{2HH_+^2}\mathfrak{w}\overline{\chi}^2\eta^3\rho\overline{\rho}^2- \frac{v_0^3\Delta^{4H+1}}{H_+^4} \mathfrak{w}^2\overline{\chi}^2\eta^2\overline{\rho}^2 \\[3pt] &= v_0^3 \frac{\Delta^{4H+1}}{H_+^2} \mathfrak{w}\overline{\chi}^2\eta^2\overline{\rho}^2 \left[ \frac{\eta\rho}{2H} - \frac{\mathfrak{w}}{H_+^2}\right],\\[3pt] J_2^2 \left( G_{22} -\frac{J_2^2}{\Delta\mathrm{VIX}_0^2} \right)&= v_0^3 \frac{\Delta^{4H+1}}{2HH_+^2} \overline{\chi}^3 \eta^4\overline{\rho}^4 - v_0^3 \overline{\chi}^4 \frac{\Delta^{4H+1}}{H_+^4} \eta^4\overline{\rho}^4 \\[3pt] &= v_0^3 \frac{\Delta^{4H+1}}{H_+^2}\overline{\chi}^3 \eta^4\overline{\rho}^4 \left(\frac{1}{2H}-\frac{\overline{\chi}}{H_+^2}\right). \end{align*}

Finally, by Proposition 1 we obtain

\begin{align*}\lim_{T\downarrow0} \mathcal{S}_T =& \frac{H_+\Delta^{H_{-}}}{2\psi(\rho,\nu,\eta,\chi)^{3}} \Bigg\{\mathfrak{w}^2 \left[ \frac{\chi\nu^2+\overline{\chi}\eta^2\rho^2}{2H}-\left(\frac{\mathfrak{w}}{H_{+}}\right)^2\right]\\[3pt] & + 2 \mathfrak{w}\overline{\chi}^2\eta^2\overline{\rho}^2 \left[ \frac{\eta\rho}{2H} - \frac{\mathfrak{w}}{H_+^2}\right]+ \overline{\chi}^3 \eta^4\overline{\rho}^4 \left(\frac{1}{2H}-\frac{\overline{\chi}}{H_+^2}\right) \Bigg\}.\end{align*}
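The consolidation of the three terms above can be cross-checked symbolically; the following SymPy sketch treats $\overline{\chi}$ and $\overline{\rho}$ as free positive symbols (the relations $\overline{\chi}=1-\chi$ and $\overline{\rho}^2=1-\rho^2$ are not needed for this purely algebraic identity):

```python
import sympy as sp

# Cross-check: the sum of J_i J_j (G_ij - J_i J_j/(Delta v0)) against the
# bracketed closed form used in the skew limit above.
H, D, v0, nu, eta, rho, chi = sp.symbols('H Delta v0 nu eta rho chi', positive=True)
chib, rhob = sp.symbols('chibar rhobar', positive=True)
Hp = H + sp.Rational(1, 2)
w = chi * nu + chib * eta * rho       # the quantity \mathfrak{w}

J1 = v0 * w * D**Hp / Hp
J2 = v0 * chib * eta * rhob * D**Hp / Hp
G11 = D**(2 * H) / (2 * H) * v0 * (chi * nu**2 + chib * eta**2 * rho**2)
G12 = D**(2 * H) / (2 * H) * v0 * chib * eta**2 * rho * rhob
G22 = D**(2 * H) / (2 * H) * v0 * chib * eta**2 * rhob**2

lhs = (J1**2 * (G11 - J1**2 / (D * v0))
       + 2 * J1 * J2 * (G12 - J1 * J2 / (D * v0))
       + J2**2 * (G22 - J2**2 / (D * v0)))
rhs = v0**3 * D**(4 * H + 1) / Hp**2 * (
    w**2 * ((chi * nu**2 + chib * eta**2 * rho**2) / (2 * H) - w**2 / Hp**2)
    + 2 * w * chib**2 * eta**2 * rhob**2 * (eta * rho / (2 * H) - w / Hp**2)
    + chib**3 * eta**4 * rhob**4 * (1 / (2 * H) - chib / Hp**2))
print(sp.simplify(lhs - rhs))         # expected output: 0
```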

Curvature. For the last step, we need the third-order Malliavin derivatives:

\begin{alignat*}{2}\mathrm{D}^1_0\mathrm{D}^1_0 \mathrm{D}^1_0 v_t &= v_0 \left(\chi\nu^3 \mathcal{E}_t^1 + \overline{\chi}\eta^3 \rho^3 \mathcal{E}_t^2 \right)t^{3H_-}, \qquad\qquad&&\mathrm{D}^2_0 \mathrm{D}^1_0 \mathrm{D}^1_0 v_t = v_0\overline{\chi} \eta^3 \rho^2 \overline{\rho} t^{3H_-} \mathcal{E}_t^2, \\[3pt] \mathrm{D}^2_0 \mathrm{D}^2_0 \mathrm{D}^1_0 v_t &= v_0\overline{\chi} \eta^3 \rho \overline{\rho}^2 t^{3H_-} \mathcal{E}_t^2, \qquad\qquad&&\mathrm{D}^2_0 \mathrm{D}^2_0 \mathrm{D}^2_0 v_t = v_0\overline{\chi} \eta^3 \overline{\rho}^3 t^{3H_-} \mathcal{E}_t^2.\end{alignat*}

We notice that

\[\lim_{T\downarrow0} \frac{\int_T^{T+\Delta} r^{3H_-} \mathrm{d}r}{T^{3H-\frac{1}{2}}} = \lim_{T\downarrow0} \frac{(T+\Delta)^{3H-\frac{1}{2}} - T^{3H-\frac{1}{2}}}{T^{3H-\frac{1}{2}} (3H-\frac{1}{2})} = \frac{2}{1-6H}.\]
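This limit can also be observed numerically; a quick sketch follows (convergence is slow, since $T^{3H-\frac{1}{2}}$ diverges only polynomially; the values of $H \lt \frac16$ and $\Delta$ are illustrative):

```python
# Numerical check that int_T^{T+Delta} r^{3H-3/2} dr / T^{3H-1/2} -> 2/(1-6H)
# as T -> 0; requires H < 1/6, parameter values are illustrative.
H, Delta = 0.10, 1.0 / 12.0
a = 3 * H - 0.5                        # the (negative) exponent gamma
for T in (1e-4, 1e-8, 1e-12):
    ratio = ((T + Delta) ** a - T ** a) / (a * T ** a)
    print(f"T = {T:.0e}: ratio = {ratio:.4f}, limit = {2 / (1 - 6 * H):.4f}")
```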

By the curvature limit in Proposition 1, we have

\begin{align*} \lim_{T\downarrow0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}} &= \frac{2\Delta v_0}{3\left(\frac{v_0 \Delta^{H_{+}}}{H_+}\right)^5 \psi(\rho,\nu,\eta,\chi)^{5}} \Bigg\{v_0^3\frac{\Delta^{3H_+}}{H_+^3} \mathfrak{w}^3\, v_0 \frac{\chi\nu^3+\overline{\chi}\eta^3\rho^3}{\frac{1}{2}-3H} + 3v_0^3\frac{\Delta^{3H_+}}{H_+^3} \mathfrak{w}^2 \overline{\chi}\eta\overline{\rho}\, v_0 \frac{\overline{\chi}\eta^3\rho^2\overline{\rho}}{\frac{1}{2}-3H} \\[3pt] &\qquad + 3v_0^3\frac{\Delta^{3H_+}}{H_+^3} \mathfrak{w}\, \overline{\chi}^2\eta^2\overline{\rho}^2\, v_0\frac{\overline{\chi}\eta^3\rho\overline{\rho}^2}{\frac{1}{2}-3H} + v_0^3\frac{\Delta^{3H_+}}{H_+^3} \overline{\chi}^3 \eta^3\overline{\rho}^3\, v_0 \frac{\overline{\chi}\eta^3\overline{\rho}^3}{\frac{1}{2}-3H} \Bigg\} \\[3pt] &= \frac{4 \Delta^{-2H} H_+^2}{3\psi(\rho,\nu,\eta,\chi)^{5}(1-6H)} \Big\{ \mathfrak{w}^3 (\chi\nu^3+\overline{\chi}\eta^3\rho^3)+ 3 \mathfrak{w}^2 \overline{\chi}^2 \eta^4\overline{\rho}^2 \rho^2+ 3 \mathfrak{w}\overline{\chi}^3 \eta^5\overline{\rho}^4 \rho+ \overline{\chi}^4\eta^6\overline{\rho}^6 \Big\},\end{align*}

which yields the claim.
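For concreteness, the three small-time limits just derived can be evaluated numerically; a minimal sketch, with $\overline{\chi}=1-\chi$ and $\overline{\rho}=\sqrt{1-\rho^2}$ as in the model (10), and purely illustrative (uncalibrated) parameter values:

```python
import numpy as np

# Closed-form small-time VIX level, skew and (rescaled) curvature limits
# in the two-factor rough Bergomi model; parameters are illustrative.
H, Delta = 0.10, 1.0 / 12.0            # H < 1/6 is needed for the curvature limit
v0, nu, eta, rho, chi = 0.04, 1.5, 1.0, -0.7, 0.3
chib, rhob = 1.0 - chi, np.sqrt(1.0 - rho**2)
Hp, w = H + 0.5, chi * nu + chib * eta * rho

psi = np.sqrt(w**2 + (chib * eta * rhob) ** 2)
level = Delta ** (H - 0.5) * psi / (2.0 * Hp)   # lim u_0 = ||J||/(2 Delta VIX_0^2)

skew = Hp * Delta ** (H - 0.5) / (2.0 * psi**3) * (
    w**2 * ((chi * nu**2 + chib * eta**2 * rho**2) / (2 * H) - (w / Hp) ** 2)
    + 2 * w * chib**2 * eta**2 * rhob**2 * (eta * rho / (2 * H) - w / Hp**2)
    + chib**3 * eta**4 * rhob**4 * (1 / (2 * H) - chib / Hp**2)
)

curv = 4 * Delta ** (-2 * H) * Hp**2 / (3 * psi**5 * (1 - 6 * H)) * (
    w**3 * (chi * nu**3 + chib * eta**3 * rho**3)
    + 3 * w**2 * chib**2 * eta**4 * rhob**2 * rho**2
    + 3 * w * chib**3 * eta**5 * rhob**4 * rho
    + chib**4 * eta**6 * rhob**6
)                                       # limit of C_T / T^{3H-1/2}

print(f"level: {level:.4f}, skew: {skew:.4f}, rescaled curvature: {curv:.4f}")
```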

6.5. Proofs for the stock price

6.5.1. Proof of Proposition 4

Since $\phi$ and $u^{-p}$ are dominated by the conditions (i) and (iii) respectively, with the same notation as in the proof of Proposition 7, we obtain by (ii), as T goes to zero,

\begin{align*}\mathrm{D} \phi_s \lesssim (T-s)^{H_{-}}, \qquad\boldsymbol\Theta_s \lesssim (T-s)^{H_{+}}, \qquad\mathrm{D} \boldsymbol\Theta_s \lesssim (T-s)^{2H}, \qquad\mathrm{D} \mathrm{D} \boldsymbol\Theta_s \lesssim (T-s)^{3H-\frac{1}{2}}.\end{align*}

Under our three assumptions it is straightforward to see that $\boldsymbol{(\mathrm{H}_{12345})}$ are satisfied. Moreover, the terms in $\boldsymbol{(\mathrm{H}}_{\boldsymbol{6}}^{\lambda}\boldsymbol{)}$ behave as $T^{2H-\lambda}$ and the one in $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ as $T^{H_{-}-\lambda}$ , which means that if we set $\lambda=H_-$ the former vanishes and the latter yields a non-trivial behaviour.

Let us have a look at the short-time implied volatility. By Lemma 3 and the continuity of v we have $\lim_{T\downarrow0} u_0 = \sqrt{\sum_{i=1}^N v_0 \rho_i^2}=\sqrt{v_0}$ almost surely; hence by Theorem 1 and the DCT,

\[\lim_{T\downarrow0}\widehat{\mathcal{I}}_{T} = \lim_{T\downarrow0} \mathbb{E}[u_0] = \sqrt{v_0}.\]

We then turn our attention to the short-time skew. With $\lambda=H_-$ , Theorem 2 and the DCT imply

\begin{align*}\lim_{T\downarrow0} \frac{\widehat{\mathcal{S}}_T}{T^{H_{-}}}=\sum_{i,j=1}^N\lim_{T\downarrow0} \frac{1}{2} \mathbb{E}\left[\frac{\int_0^T \phi_s^j \int_s^T \mathrm{D}_s^j (\phi_y^i)^2 \mathrm{d}y \mathrm{d}s}{u_0^3 T^{\frac32+H}}\right]&= \sum_{j=1}^N \frac{\rho_j}{2 v_0^{3/2}} \mathbb{E}\left[ \lim_{T\downarrow0} \frac{\int_0^T \int_s^T \sqrt{v_s}\, \mathrm{D}^j_{s} v_y\mathrm{d}y\mathrm{d}s}{T^{\frac32+H}} \right],\end{align*}

where we used $\sum_{i=1}^N \rho_i^2=1$ . For any $j\in [\![ 1,N ]\!]$ , the Cauchy–Schwarz inequality yields

\[\mathbb{E}\left[ \left( \sqrt{\frac{v_s}{v_0}}-1\right) \mathrm{D}^j_s v_y \right] \le \mathbb{E}\left[ \left( \sqrt{\frac{v_s}{v_0}}-1\right)^2\right]^\frac{1}{2} \, \mathbb{E}\big[ (\mathrm{D}^j_s v_y)^2 \big]^\frac{1}{2},\]

where $\mathbb{E}\big[ (\mathrm{D}^j_s v_y)^2 \big]^\frac{1}{2}\le C (y-s)^{H_{-}}$ for some finite constant C by (ii). Therefore,

\begin{align*}\lim_{T\downarrow0} \bigg(&\frac{\int_0^T \int_s^T \mathbb{E}[ \sqrt{v_s}\, \mathrm{D}^j_{s} v_y]\mathrm{d}y\mathrm{d}s}{\sqrt{v_0}\,T^{\frac32+H}} - \frac{\int_0^T \int_s^T \mathbb{E}[ \mathrm{D}^j_{s} v_y]\mathrm{d}y\mathrm{d}s}{T^{\frac32+H}} \bigg) \\[3pt] &\qquad\qquad\le C \lim_{T\downarrow0}\left( \sup_{t\le T} \mathbb{E}\left[ \left( \sqrt{\frac{v_t}{v_0}}-1\right)^2\right]^\frac{1}{2} \frac{ \int_0^T \int_s^T (y-s)^{H_{-}}\mathrm{d}y\mathrm{d}s}{T^{\frac32+H}}\right).\end{align*}

Since the fraction is equal to $((H+\frac32)H_+)^{-1}$ and $\limsup_{T\downarrow0} \sup_{t\le T} \mathbb{E}\big[(\sqrt{v_t/v_0}-1)^2 \big]$ is null by (iv), we obtain

\[\lim_{T\downarrow0} \frac{\widehat{\mathcal{S}}_T}{T^{H_{-}}}= \sum_{j=1}^N \frac{\rho_j}{2 v_0} \mathbb{E}\left[ \lim_{T\downarrow0} \frac{\int_0^T \int_s^T \mathrm{D}^j_{s} v_y\mathrm{d}y\mathrm{d}s}{T^{\frac32+H}} \right]. \]

6.5.2. Proof of Corollary 1

Since

$$\mathbb{E}[u_s^{-p}] = \mathbb{E}\left[\left(\frac{1}{T-s}\int_s^T v_r\mathrm{d}r \right)^{-\frac{p}{2}}\right],$$

Lemmas 7 and 8 show that the assumptions (i)–(iii) of Proposition 4 hold. Moreover, v has almost surely continuous paths; hence $\sqrt{\frac{v_t}{v_0}}$ tends to one almost surely and (iv) holds by the reverse Fatou lemma. For $0\le s\le y$ , (33) implies

$$\mathbb{E}[\mathrm{D}^1_s v_y] = v_0 (y-s)^{H_{-}} \big(\chi \nu +\overline{\chi}\eta\rho\big)\qquad\text{and}\qquad\mathbb{E}[\mathrm{D}^2_s v_y] = v_0 (y-s)^{H_{-}}\overline{\chi}\eta\overline{\rho},$$

and clearly $\mathbb{E}[\mathrm{D}^3_s v_y] =0$ . Therefore, Proposition 4 implies

\[\lim_{T\downarrow0} \frac{\widehat{\mathcal{S}}_T}{T^{H_{-}}} = \frac{\rho_1}{2v_0}\frac{v_0(\chi\nu+\overline{\chi}\eta\rho)}{H_+(H+\frac32)} + \frac{\rho_2}{ 2v_0}\frac{v_0\overline{\chi}\eta\overline{\rho}}{H_+(H+\frac{3}{2})}= \frac{\rho_1 \chi\nu + \eta \overline{\chi} (\rho_1 \rho+\rho_2\overline{\rho})}{(2H_+)(H+\frac32)}.\]
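As a quick numerical illustration of this formula (a sketch: $\rho_1,\rho_2$ are the stock–volatility correlation loadings, with $\rho_1^2+\rho_2^2\le1$ , and all values are illustrative):

```python
# Small-time SPX ATM skew limit of Corollary 1; values are illustrative.
H, nu, eta, rho, chi = 0.10, 1.5, 1.0, -0.7, 0.3
rho1, rho2 = -0.5, -0.3
chib, rhob, Hp = 1 - chi, (1 - rho**2) ** 0.5, H + 0.5
skew = (rho1 * chi * nu + eta * chib * (rho1 * rho + rho2 * rhob)) / (2 * Hp * (H + 1.5))
print(f"lim S_T / T^(H - 1/2) = {skew:.4f}")
```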

6.6. Partial derivatives of the Black–Scholes function

Recall the Black–Scholes formula (6) and assume $\varsigma\,:\!=\,\sigma\sqrt{T-t} \gt 0$ fixed. Then

\begin{align*}\partial_{x} \mathrm{BS}(t,x,k,\sigma) & = \mathrm{e}^x \mathcal{N}(d_+(x,k,\sigma)),\\[3pt] \partial_{x}^2 \mathrm{BS}(t,x,k,\sigma) & = \mathrm{e}^x \left\{\mathcal{N}(d_+(x,k,\sigma)) + \frac{\mathcal{N}^{\prime}(d_+(x,k,\sigma))}{\varsigma}\right\},\end{align*}

so that (we drop the dependence on t and $\sigma$ in the $G({\cdot})$ notation)

$$G(x,k) \,:\!=\, (\partial_{x}^2 - \partial_{x}) \mathrm{BS}(t,x,k,\sigma) = \frac{\mathrm{e}^{x -\frac{1}{2} d_+(x,k,\sigma)^2}}{\varsigma\sqrt{2\pi}}= \frac{\mathrm{e}^{k-\frac{1}{2} d_-(x,k,\sigma)^2}}{\varsigma\sqrt{2\pi}}.$$

Now define

$$f(x,k) \,:\!=\, x- \frac{d_+(x,k,\sigma)^2}{2} = k - \frac{d_-(x,k,\sigma)^2}{2} = \frac{x+k}{2} - \frac{(x-k)^2}{2\varsigma^2} -\frac{\varsigma^2}{8}.$$

We then have

\begin{align*}\partial_{x} f(x,k) & = \frac{1}{2} - \frac{x-k}{\varsigma^2},\qquad\partial_{k} f(x,k) = \frac{1}{2}+ \frac{x-k}{\varsigma^2},\\[3pt] \partial_{x}^2 f(x,k) & = \partial_{k}^2 f(x,k) = -\partial^2_{xk} f(x,k) = - \frac{1}{\varsigma^2}.\end{align*}

Turning to the partial derivatives, we note that $\partial_{x} G = \frac{1}{\varsigma\sqrt{2\pi}} \partial_{x} f \,\mathrm{e}^{f}$ , which at the money yields

\begin{align*}\partial_{x} G(x,x) = \frac{1}{2\varsigma\sqrt{2\pi}}\exp\left\{x-\frac{\varsigma^2}{8}\right\},\end{align*}

and furthermore,

$$\partial_{xk} G = \frac{\mathrm{e}^f}{\varsigma\sqrt{2\pi}} \Big( \partial_{xk} f + \partial_{x} f \partial_{k} f \Big)\qquad\text{and}\qquad\partial_{xk} G(x,x) = \frac{\mathrm{e}^{f(x,x)}}{\varsigma\sqrt{2\pi}} \left( \frac{1}{\varsigma^2} + \frac14 \right).$$

We further define the partial derivatives appearing in the proof of Theorem 2, after (24):

\begin{align*}L(x,k) &\,:\!=\, \left(\frac14 \partial_x \!+\! \frac{1}{2} \partial_{xk}\right)\!G(x,k) = \frac{1}{\varsigma\sqrt{2\pi}} \mathrm{e}^{f(x,k)} \!\left(\frac14 \!+\! \frac14 \partial_k f(x,k) \!-\! \frac{1}{2} (\partial_k f(x,k))^2 \!-\! \frac{1}{2} \partial_{kk} f(x,k) \right),\\[3pt] L(x,x)&= \frac{\mathrm{e}^{f(x,x)}}{\varsigma\sqrt{2\pi}} \left( \frac14 + \frac{1}{2\varsigma^2} \right).\end{align*}

Using $\partial_k f = 1-\partial_x f$ and $\partial_{xk} f = - \partial_{xx} f = -\partial_{kk}f$ , we compute

\begin{align*}\partial_k L &= \frac{\mathrm{e}^f}{\varsigma\sqrt{2\pi}} \left[ \frac34 \partial_x f - \frac54 (\partial_x f)^2+\frac{1}{2} (\partial_x f)^3 - \frac54 \partial_{xx} f + \frac32 \partial_x f \partial_{xx}f \right],\\[3pt] \partial_k L(x,x) &= \frac{\mathrm{e}^{f(x,x)}}{\varsigma\sqrt{2\pi}} \left( \frac18 + \frac{1}{2\varsigma^2}\right).\end{align*}

Finally, we need the derivatives featuring in the proof of Theorem 3. We start with

\begin{align*}\widetilde H & = \partial_{xk} L = \frac{\mathrm{e}^f}{\varsigma\sqrt{2\pi}} \\[3pt] & \qquad \left[ \frac34 (\partial_{x} f)^2\! -\!\frac54 (\partial_{x} f)^3 \!+\! \frac{1}{2} (\partial_{x} f)^4 + \frac34 \partial_{xx} f \!-\! \frac{15}{4} \partial_{x} f \partial_{xx} f \!+\! 3 (\partial_{x} f)^2 \partial_{xx} f \!+\! \frac32 (\partial_{xx} f)^2 \right],\\[3pt] \partial_{xk} L(x,x) &= \frac{\mathrm{e}^{f(x,x)}}{\varsigma\sqrt{2\pi}} \left(\frac{1}{16} + \frac{3}{8\varsigma^2} + \frac{3}{2\varsigma^4}\right).\end{align*}

The next partial derivative yields

\begin{align*}\partial_{xxk} L & = \frac{\mathrm{e}^f}{\varsigma\sqrt{2\pi}} \left[ \frac34 (\partial_{x} f)^3 - \frac54 (\partial_{x} f)^4 + \frac{1}{2} (\partial_{x} f)^5 + \frac94 \partial_{xx} f \partial_{x} f - \frac{15}{2} \partial_{xx} f (\partial_{x} f)^2 \right.\\[3pt] & \qquad\qquad \left.- \frac{15}{4} (\partial_{xx} f)^2 + 5\partial_{xx} f (\partial_{x} f)^3 + \frac{15}{2}(\partial_{xx} f)^2 \partial_{x} f\right],\\[3pt] \partial_{xxk} L(x,x) &= \frac{\mathrm{e}^{f(x,x)}}{\varsigma\sqrt{2\pi}} \left(\frac{1}{32} + \frac{1}{8\varsigma^2} \right),\end{align*}

and differentiating one last time we reach

\begin{align*}\partial_{xxxk} L & = \frac{\mathrm{e}^f}{\varsigma\sqrt{2\pi}} \bigg[\frac34 (\partial_{x} f)^4 - \frac54 (\partial_{x} f)^5 + \frac{1}{2} (\partial_{x} f)^6 + \frac92 \partial_{xx} f (\partial_{x} f)^2 - \frac{25}{2} \partial_{xx} f (\partial_{x} f)^3 \\[3pt] &\qquad - \frac{75}{4} (\partial_{xx} f)^2 \partial_{x} f + \frac{15}{2} \partial_{xx} f (\partial_{x} f)^4 + \frac{45}{2} (\partial_{xx} f)^2(\partial_{x} f)^2 + \frac94 (\partial_{xx} f)^2 + \frac{15}{2} (\partial_{xx} f)^3\bigg],\\[3pt] \partial_{xxxk} L(x,x) & = \frac{\mathrm{e}^{f(x,x)}}{\varsigma\sqrt{2\pi}} \bigg(\frac{1}{64} - \frac{1}{32\varsigma^2} - \frac{3}{2\varsigma^4} - \frac{15}{2\varsigma^6} \bigg).\end{align*}

We conclude that

\begin{align*}H(x,x) = (\partial_{xxxk} - \partial_{xxk})L(x,x)= \frac{\mathrm{e}^{f(x,x)}}{\varsigma\sqrt{2\pi}} \left({-}\frac{1}{64} - \frac{5}{32\varsigma^2} - \frac{3}{2\varsigma^4}- \frac{15}{2\varsigma^6} \right).\end{align*}
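These at-the-money identities are mechanical but error-prone; they can be verified symbolically, for instance with the following SymPy sketch:

```python
import sympy as sp

# Symbolic verification of the ATM formulae for L, its k-derivative and H,
# starting from f(x,k) as defined above; varsigma > 0.
x, k, vs = sp.symbols('x k varsigma', positive=True)
f = (x + k) / 2 - (x - k) ** 2 / (2 * vs**2) - vs**2 / 8
G = sp.exp(f) / (vs * sp.sqrt(2 * sp.pi))

L = sp.Rational(1, 4) * sp.diff(G, x) + sp.Rational(1, 2) * sp.diff(G, x, k)
Hfun = sp.diff(L, x, x, x, k) - sp.diff(L, x, x, k)

pref = sp.exp(x - vs**2 / 8) / (vs * sp.sqrt(2 * sp.pi))   # e^{f(x,x)}/(varsigma sqrt(2 pi))
atm = lambda e: sp.simplify(e.subs(k, x) / pref)

print(atm(L))               # should equal 1/4 + 1/(2*varsigma**2)
print(atm(sp.diff(L, k)))   # should equal 1/8 + 1/(2*varsigma**2)
print(atm(Hfun))            # should equal -1/64 - 5/(32*vs**2) - 3/(2*vs**4) - 15/(2*vs**6)
```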

Funding information

A. P. acknowledges financial support from the EPSRC CDT in Financial Computing and Analytics, while A. P. and A. J. acknowledge support from the EPSRC EP/T032146/1 grant. Part of the project was carried out while A. P. was at Imperial.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

References

Adler, R. J. and Taylor, J. E. (2007). Random Fields and Geometry. Springer, New York.
Alòs, E., García-Lorite, D. and Muguruza, A. (2022). On smile properties of volatility derivatives and exotic products: understanding the VIX skew. SIAM J. Financial Math. 13, 32–69.
Alòs, E. and León, J. A. (2017). On the curvature of the smile in stochastic volatility models. SIAM J. Financial Math. 8, 373–399.
Alòs, E., León, J. A. and Vives, J. (2007). On the short-time behavior of the implied volatility for jump-diffusion models with stochastic volatility. Finance Stoch. 11, 571–589.
Alòs, E. and Shiraya, K. (2019). Estimating the Hurst parameter from short term volatility swaps. Finance Stoch. 23, 423–447.
Baldeaux, J. and Badran, A. (2014). Consistent modelling of VIX and equity derivatives using a 3/2 plus jumps model. Appl. Math. Finance 21, 299–312.
Barletta, A., Nicolato, E. and Pagliarani, S. (2018). The short-time behavior of VIX implied volatilities in a multifactor stochastic volatility framework. Math. Finance 29, 928–966.
Bayer, C. et al. (eds) (2023). Rough Volatility. Society for Industrial and Applied Mathematics, Philadelphia.
Bayer, C., Friz, P. K. and Gatheral, J. (2016). Pricing under rough volatility. Quant. Finance 16, 887–904.
Bayer, C., Qiu, J. and Yao, Y. (2022). Pricing options under rough volatility with backward SPDEs. SIAM J. Financial Math. 13, 179–212.
Bennedsen, M., Lunde, A. and Pakkanen, M. S. (2022). Decoupling the short- and long-term behavior of stochastic volatility. J. Financial Econometrics 20, 961–1006.
Berestycki, H., Busca, J. and Florent, I. (2004). Computing the implied volatility in stochastic volatility models. Commun. Pure Appl. Math. 57, 1352–1373.
Bergomi, L. (2005). Smile dynamics II. Risk, 67–73.
Bergomi, L. (2008). Smile dynamics III. Risk, 90–96.
Bonesini, O., Callegaro, G. and Jacquier, A. (2023). Functional quantization of rough volatility and applications to the VIX. Quant. Finance 23, 1769–1792.
Bonesini, O., Jacquier, A. and Pannier, A. (2023). Rough volatility, path-dependent PDEs and weak rates of convergence. Preprint. Available at https://arxiv.org/abs/2304.03042.
Bourgey, F. and De Marco, S. (2022). Multilevel Monte Carlo simulation for VIX options in the rough Bergomi model. J. Comput. Finance 26, 53–82.
Carr, P. and Madan, D. (2014). Joint modeling of VIX and SPX options at a single and common maturity with risk management applications. IIE Trans. 46, 1125–1131.
Cont, R. and Kokholm, T. (2013). A consistent pricing model for index options and volatility derivatives. Math. Finance 23, 248–274.
De Marco, S. (2018). Volatility derivatives in (rough) forward variance models. Slides, Stochastic Analysis and its Applications, Oaxaca, May 2018, and the Bachelier Congress, Dublin. Available at http://webfiles.birs.ca/cmo-workshops/2018/18w5080/files/Marco.pdf.
De Marco, S. and Henry-Labordère, P. (2015). Linking vanillas and VIX options: a constrained martingale optimal transport problem. SIAM J. Financial Math. 6, 1171–1194.
El Euch, O. and Rosenbaum, M. (2019). The characteristic function of rough Heston models. Math. Finance 29, 3–38.
Forde, M., Gerhold, S. and Smith, B. (2021). Small-time VIX smile and the stationary distribution for the rough Heston model. Preprint. Available at https://nms.kcl.ac.uk/martin.forde.
Fouque, J.-P. and Saporito, Y. F. (2018). Heston stochastic vol-of-vol model for joint calibration of VIX and S&P 500 options. Quant. Finance 18, 635–654.
Fukasawa, M. (2011). Asymptotic analysis for stochastic volatility: martingale expansion. Finance Stoch. 15, 635–654.
Gatheral, J. (2008). Slides at the Fifth World Congress of the Bachelier Finance Society, London, July 18, 2008. Available at https://pdfs.semanticscholar.org/9c56/d0bc9c3f75791a623daa8474c6620b8f03e1.pdf.
Gatheral, J., Jaisson, T. and Rosenbaum, M. (2018). Volatility is rough. Quant. Finance 18, 933–949.
Gatheral, J., Jusselin, P. and Rosenbaum, M. (2020). The quadratic rough Heston model and the joint S&P 500/VIX smile calibration problem. Risk.
Goutte, S., Ismail, A. and Pham, H. (2017). Regime-switching stochastic volatility model: estimation and calibration to VIX options. Appl. Math. Finance 24, 38–75.
Guennoun, H., Jacquier, A., Roome, P. and Shi, F. (2018). Asymptotic behaviour of the fractional Heston model. SIAM J. Financial Math. 9, 1017–1045.
Guyon, J. (2020). The joint S&P 500/VIX smile calibration puzzle solved. Risk.
Guyon, J., Menegaux, R. and Nutz, M. (2017). Bounds for VIX futures given S&P 500 smiles. Finance Stoch. 21, 593–630.
Horvath, B., Jacquier, A. and Lacombe, C. (2019). Asymptotic behaviour of randomised fractional volatility models. J. Appl. Prob. 56, 496–523.
Horvath, B., Jacquier, A. and Tankov, P. (2020). Volatility options in rough volatility models. SIAM J. Financial Math. 11, 437–469.
Jacquier, A., Martini, C. and Muguruza, A. (2018). On VIX futures in the rough Bergomi model. Quant. Finance 18, 45–61.
Jacquier, A., Pakkanen, M. and Stone, H. (2018). Pathwise large deviations for the rough Bergomi model. J. Appl. Prob. 55, 1078–1092.
Jacquier, A. and Pannier, A. (2022). Large and moderate deviations for stochastic Volterra systems. Stoch. Process. Appl. 149, 142–187.
Jacquier, A. and Zuric, Z. (2023). Random neural networks for rough volatility. Preprint. Available at https://arxiv.org/abs/2305.01035.
Jaisson, T. and Rosenbaum, M. (2016). Rough fractional diffusions as scaling limits of nearly unstable heavy tailed Hawkes processes. Ann. Appl. Prob. 26, 2860–2882.
Kokholm, T. and Stisen, M. (2015). Joint pricing of VIX and SPX options with stochastic volatility and jump models. J. Risk Finance 16, 27–48.
Lacombe, C., Muguruza, A. and Stone, H. (2021). Asymptotics for volatility derivatives in multi-factor rough volatility models. Math. Financial Econom. 15, 545–577.
Nualart, D. (2006). The Malliavin Calculus and Related Topics. Springer, Berlin.
Pacati, C., Pompa, G. and Reno, R. (2018). Smiling twice: the Heston++ model. J. Banking Finance 96, 185–206.
Pannier, A. (2023). Path-dependent PDEs for volatility derivatives. Preprint. Available at https://arxiv.org/abs/2311.08289.
Pannier, A. and Salvi, C. (2024). A path-dependent PDE solver based on signature kernels. Preprint. Available at https://arxiv.org/abs/2403.11738.
Papanicolaou, A. (2018). Extreme-strike comparisons and structural bounds for SPX and VIX options. SIAM J. Financial Math. 9, 401–434.
Papanicolaou, A. and Sircar, R. (2014). A regime-switching Heston model for VIX and S&P 500 implied volatilities. Quant. Finance 14, 1811–1827.
Protter, P. (2005). Stochastic Integration and Differential Equations. Springer, Berlin.
Rosenbaum, M. and Zhang, J. (2022). Deep calibration of the quadratic rough Heston model. Risk.
Viens, F. and Zhang, J. (2019). A martingale approach for fractional Brownian motions and related path-dependent PDEs. Ann. Appl. Prob. 29, 3489–3540.