1. Introduction
Time series of intraday prices are typically described as a discretized path of a continuous-time stochastic process. For markets to be arbitrage-free, the log-price process should be a semimartingale. Risk estimation based on high-frequency data at the highest available observation frequencies has to take microstructure frictions into account. Disentangling these market microstructure effects from the dynamics of the long-run price evolution has led to observation models with additive noise; see, for instance, [Reference Aït-Sahalia and Jacod2, Reference Hansen and Lunde13, Reference Li and Linton19]. The market microstructure noise, which models among other effects the oscillation of traded prices between bid and ask order levels in an electronic market, is classically a centred (white) noise process. These models can explain many stylized facts of high-frequency data. When full limit order books are available, including data on submissions, cancellations, and executions of bid and ask limit orders, it is, however, not clear which time series to consider at all. This challenges the concept of a single price process and raises the question of whether the information can be exploited more efficiently, in particular to improve risk quantification. The stochastic boundary model considered for limit order prices of an order book has been discussed by [Reference Bibinger, Jirak and Reiß5], [Reference Liu, Liu, Liu and Ding20], and [Reference Bishwal8, Chapter 1.8]. It preserves the concept of an underlying efficient, semimartingale log-price which determines the long-run price dynamics and an additive, exogenous noise which models market-specific microstructure frictions. Its key idea is that ask order prices should (in most cases) lie above the unobservable efficient price and bid prices below the efficient price.
This leads to observation errors which are irregular in the sense of having non-zero expectation and a distribution with a lower- or upper-bounded support. Considering without loss of generality a model for (best) ask order prices, we obtain lower-bounded observation errors and use local minima for the estimation. Modelling (best) bid prices instead would yield a model with upper-bounded observation errors and local maxima could be used for an analogous estimation. Both can be combined in practice.
It is known that the statistical and probabilistic properties of models with irregular noise are very different than for regular noise and require other methods; see, for instance, [Reference Jirak, Meister and Reiß17, Reference Meister and Reiß23, Reference Reiß and Wahl24]. Therefore, our estimation methods and asymptotic theory are quite different compared to the market microstructure literature, while we can still profit from some of the techniques used there. In [Reference Bibinger, Jirak and Reiß5] an estimator for the quadratic variation of a continuous semimartingale, that is, the integrated volatility, was proposed with convergence rate $n^{-1/3}$ , based on n discrete observations with one-sided noise. Optimality of the rate was proved in the standard asymptotic minimax sense. The main insight was that this convergence rate is better than the optimal rate, $n^{-1/4}$ , under regular market microstructure noise.
A recent strand of literature proposes structural, parametric market microstructure noise models incorporating information based on observed order book quantities as volume or trade types; see [Reference Chaker9–Reference Clinet and Potiron11, Reference Li, Xie and Zheng18]. Splitting the noise into a parametric function of such quantities and residual noise, a plug-in estimation of integrated volatility can also yield faster convergence rates than in the classical model with uninformative noise. While this effect of improved volatility estimation appears to be a similarity to our work, our viewpoint on market microstructure is quite distinct. We focus on a model with one-sided instead of centred noise, but we neither impose a parametric assumption on the noise, nor do we include additional trading information. Such refinements of a one-sided noise model, as discussed in the mentioned works for the centred noise model, might be of interest for future research when microstructure effects of bid and ask quotes are better understood. This could potentially further improve volatility estimation.
Inference on the spot volatility is one of the most important topics in the financial literature; see, for instance, [Reference Bibinger, Hautsch, Malec and Reiß4, Reference Mancini, Mattiussi and Renò22] and the references therein. In this work, we address spot volatility estimation for the model from [Reference Bibinger, Jirak and Reiß5]. Using local minima over blocks of shrinking lengths $h_n\propto n^{-2/3}\propto (nh_n)^{-2}$ , the resulting distribution of local minima in [Reference Bibinger, Jirak and Reiß5] became involved and infeasible, such that a central limit theorem for the integrated volatility estimator could not be obtained. Our spot volatility estimator is related to a localized version of the estimator from [Reference Bibinger, Jirak and Reiß5], combined with truncation methods to eliminate jumps of the semimartingale. For the asymptotic theory, however, we follow a different approach choosing blocks of lengths $h_n$ , where $h_n n^{2/3}\to\infty$ slowly. This allows us to establish stable central limit theorems with the best achievable rate, arbitrarily close to $n^{-1/6}$ , in the important special case of a semimartingale volatility. We exploit this to construct pointwise asymptotic confidence intervals.
Although the asymptotic theory relies on block lengths that are slightly unbalanced, so that the impact of the noise distribution on the distribution of local minima is smoothed out asymptotically, our numerical study demonstrates that the confidence intervals work well in realistic scenarios with block lengths which optimize the estimation performance. Robustness to different noise specifications is an advantage that is naturally implied by our approach. Our estimator is surprisingly simple: it is a local average of squared differences of block-wise minima times a constant factor which comes from moments of the half-normal distribution of the minimum of a Brownian motion over the unit time interval. This estimator is consistent. However, the stable central limit theorem at a fast convergence rate requires a subtle bias correction which incorporates a more precise approximation of the asymptotic distribution of local minima. For that purpose, our analysis is based on a generalization of the arcsine law which gives the distribution of the proportion of time over some interval that a Brownian motion is positive. In order to compute the bias-correction function numerically, we introduce an efficient algorithm. Reducing local minima over many random variables to iterated minima of two random variables in each step combined with a convolution step can be interpreted as a kind of dynamic programming approach. It turns out to be much more efficient compared to the natural approximation by a Monte Carlo simulation and is a crucial ingredient of our numerical application. Our convergence rate is much faster than the optimal rate, $n^{-1/8}$ , for spot volatility estimation under regular noise [Reference Bibinger, Hautsch, Malec and Reiß4, Reference Hoffmann, Munk and Schmidt-Hieber14].
The main contribution of this work is to develop the probabilistic foundation for the asymptotic analysis of the estimator and to establish the stable central limit theorems, asymptotic confidence intervals, and a numerically practicable method.
The methods and proof techniques to deal with jumps are inspired by the truncation methods pioneered in [Reference Mancini21] and summarized in [Reference Jacod and Protter15, Chapter 13]. Overall, the strategy and restrictions on jump processes are to some extent similar, while several details under irregular noise using order statistics are rather different compared to settings without noise or with regular centred noise as in [Reference Bibinger and Winkelmann7].
We introduce and further discuss our model in Section 2. Section 3 presents estimation methods and Section 4 asymptotic results. The numerical application is considered in Section 5 and a Monte Carlo simulation study illustrates the appealing finite-sample performance of the method. All proofs are given in Section 6.
2. Model with lower-bounded, one-sided noise and assumptions
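Consider an Itô semimartingale
$$X_t=X_0+\int_0^t a_s\,\textrm{d} s+\int_0^t\sigma_s\,\textrm{d} W_s+\int_0^t\int_{\mathbb{R}}\delta(s,z)\mathbf{1}_{\{|\delta(s,z)|\le 1\}}(\mu-\nu)(\textrm{d} s,\textrm{d} z)+\int_0^t\int_{\mathbb{R}}\delta(s,z)\mathbf{1}_{\{|\delta(s,z)|>1\}}\mu(\textrm{d} s,\textrm{d} z),\quad t\ge 0, \tag{1}$$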
with a one-dimensional standard Brownian motion $(W_t)$ , defined on some filtered probability space $(\Omega^X,\mathcal{F}^X,(\mathcal{F}^X_t),\mathbb{P}^X)$ . For the drift process $(a_t)$ and the volatility process $(\sigma_t)$ we impose the following quite general assumptions.
Assumption 1. The processes $(a_t)_{t\ge 0}$ and $(\sigma_t)_{t\ge 0}$ are locally bounded. The volatility process is strictly positive, $\inf_{t\in[0,1]}\sigma_t>0$ , $\mathbb{P}^X$ -almost surely. For all $0\leq t+s\leq1$ , $t\ge 0$ , $s\ge 0$ , with some constants $C_{\sigma}>0$ , and $\alpha>0$ ,
$$\mathbb{E}\big[(\sigma_{t+s}-\sigma_t)^2\big]\le C_{\sigma}\,s^{2\alpha}. \tag{2}$$
Condition (2) introduces a regularity parameter $\alpha$ , governing the smoothness of the volatility process. The parameter $\alpha$ is crucial, since it will naturally influence the convergence rates of spot volatility estimation. Inequality (2) is less restrictive than $\alpha$ -Hölder continuity, since it does not rule out volatility jumps. For instance, any compound Poisson jump process with a jump size distribution having finite second moments satisfies (2) with $\alpha=\frac12$ . This readily follows, since the second moments in (2) of such a process are bounded by a constant times $(s^2+s)$ , which is the second moment of a Poisson distribution with parameter s. Similar bounds for more general jump processes are given, for instance, in [Reference Jacod and Protter15, Section 13]. This is important as empirical evidence for volatility jumps, in particular simultaneous price and volatility jumps, has been reported for intraday high-frequency financial data [Reference Bibinger, Neely and Winkelmann6, Reference Tauchen and Todorov28]. Moreover, the presented theory holds for general stochastic volatilities and also allows for rough volatility. Rough fractional stochastic volatility models recently became popular and are used, for instance, in the macroscopic model of [Reference El Euch, Fukasawa and Rosenbaum12, Reference Rosenbaum and Tomas25].
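The compound Poisson example can be made explicit in a short calculation. As a sketch, assume that (2) bounds the second moment of volatility increments by $C_\sigma s^{2\alpha}$ and that the volatility has a compound Poisson component $\sum_{j\le N_t}\xi_j$ . The intensity $\lambda_0$ , the i.i.d. jump sizes $(\xi_j)$ , and the Poisson process $(N_t)$ are notation introduced only for this illustration. For $0\le s\le 1$ ,

$$\mathbb{E}\bigg[\bigg(\sum_{j=N_t+1}^{N_{t+s}}\xi_j\bigg)^2\bigg]=\lambda_0 s\,\mathbb{E}[\xi_1^2]+(\lambda_0 s)^2\,(\mathbb{E}[\xi_1])^2\le \max\!(\lambda_0,\lambda_0^2)\,\mathbb{E}[\xi_1^2]\,(s+s^2)\le 2\max\!(\lambda_0,\lambda_0^2)\,\mathbb{E}[\xi_1^2]\,s,$$

using $(\mathbb{E}[\xi_1])^2\le\mathbb{E}[\xi_1^2]$ and $s^2\le s$ . The bound is of order $s=s^{2\alpha}$ with $\alpha=\frac12$ .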
The jump component of (1) is illustrated as in [Reference Jacod and Protter15] and related literature, where the predictable function $\delta$ is defined on $\Omega\times \mathbb{R}_+\times \mathbb{R}$ , and the Poisson random measure $\mu$ is compensated by $\nu(\textrm{d} s,\textrm{d} z)=\lambda(\textrm{d} z)\otimes \textrm{d} s$ , with a $\sigma$ -finite measure $\lambda$ . We impose the following standard condition with a generalized Blumenthal–Getoor or jump activity index r, $0\le r\le 2$ .
Assumption 2. Assume that $\sup_{\omega,x}|\delta(t,x)|/\gamma(x)$ is locally bounded with a non-negative, deterministic function $\gamma$ which satisfies $\int_{\mathbb{R}}(\gamma^r(x)\wedge 1)\,\lambda(\textrm{d} x)<\infty$ .
We use the notation $a\wedge b=\min\!(a,b)$ , and $a\vee b=\max\!(a,b)$ , throughout this paper. Assumption 2 is most restrictive in the case $r=0$ , when jumps are of finite activity. The larger r is, the more general jump components are allowed. We will develop results under mild restrictions on r.
The process $(X_t)$ , which can be decomposed into
$$X_t=C_t+J_t, \tag{3}$$
with a continuous component $(C_t)$ and a càdlàg jump component $(J_t)$ , provides a model for the latent efficient log-price process in continuous time.
High-frequency (best) ask order prices from a limit order book at times $t_i^n$ , $0\le i\le n$ , on the fixed time interval [0, 1] cannot be adequately modelled by discrete recordings of $(X_t)$ . Instead, we propose the additive model with lower-bounded, one-sided microstructure noise:
$$Y_i=X_{t_i^n}+\varepsilon_i,\quad 0\le i\le n. \tag{4}$$
The crucial property of the model is that the support of the noise is lower bounded. It is not that important that this boundary is zero—it could be a different constant, or even a regularly varying function over time. The methods and results presented are robust with respect to such model generalizations. We set the bound equal to zero, which appears to be the most natural choice for limit orders.
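Assumption 3. The independent and identically distributed (i.i.d.) noise $(\varepsilon_i)_{0\le i\le n}$ has a cumulative distribution function (CDF) $F_{\eta}$ satisfying
$$F_{\eta}(x)=\eta x\,\big(1+{\scriptstyle{\mathcal{O}}}(1)\big),\quad \text{as } x\downarrow 0. \tag{5}$$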
This is a nonparametric model in that the extreme value index is $-1$ for the minimum domain of attraction close to the boundary. This standard assumption on one-sided noise has already been used in [Reference Jirak, Meister and Reiß17, Reference Reiß and Wahl24] within different frameworks. We do not require assumptions about the maximum domain of attraction, moments, and the tails of the noise distribution. Parametric examples which satisfy (5) are, for instance, the uniform distribution on some interval, the exponential distribution, and the standard Pareto distribution with heavy tails.
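The parametric examples can be checked directly. As a sketch, assume that (5) requires $F_{\eta}(x)=\eta x\,(1+{\scriptstyle{\mathcal{O}}}(1))$ as $x\downarrow 0$ ; the interval length $b>0$ and the shift of the standard Pareto distribution to the boundary zero are choices made only for this illustration:

$$\text{Exp}(\eta)\colon\ F(x)=1-e^{-\eta x}=\eta x\,\big(1+{\scriptstyle{\mathcal{O}}}(1)\big);\qquad \text{U}[0,b]\colon\ F(x)=\frac{x}{b},\ \ \eta=\frac1b;\qquad \text{shifted Pareto}\colon\ F(x)=1-\frac{1}{1+x}=x\,\big(1+{\scriptstyle{\mathcal{O}}}(1)\big),\ \ \eta=1.$$

In all three cases only the behaviour of the CDF close to the boundary enters, while the tails differ considerably.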
The i.i.d. assumption on the noise is crucial, and generalizations to weakly dependent noise will require considerable work and new proof concepts. Heterogeneity, that is, a time-dependent noise level $\eta(t)$ , could be included in our asymptotic analysis under mild assumptions.
3. Construction of spot volatility estimators
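We partition the observation interval [0, 1] into $h_n^{-1}$ equispaced blocks, $h_n^{-1}\in\mathbb{N}$ , and take local minima on each block. We hence obtain, for $k=0,\ldots,h_n^{-1}-1$ , the local, block-wise minima
$$m_{k,n}=\min_{i\in\mathcal{I}_k^n}Y_i,\quad \mathcal{I}_k^n=\big\{i\in\{0,\ldots,n\}\,:\,t_i^n\in[kh_n,(k+1)h_n)\big\}.$$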
While $h_n^{-1}$ is an integer, $nh_n$ is in general not integer valued. For a simple interpretation, however, we can think of $nh_n$ as an integer-valued sequence which gives the number of noisy observations per block in the case of equidistant observations. A spot volatility estimator could be obtained as a localized version of the estimator from [Reference Bibinger, Jirak and Reiß5, (2.9)] for the integrated volatility in the analogous model. The idea is that differences $m_{k,n}-m_{k-1,n}$ of local minima estimate differences of efficient prices, and a sum of squared differences can be used to estimate the volatility. However, things are not that simple. To determine the expectation of squared differences of local minima we introduce the function
where $(B_t)$ and $(\tilde B_t)$ denote two independent standard Brownian motions. In [Reference Bibinger, Jirak and Reiß5], the block length balanced the order of block-wise minimal errors $(nh_n)^{-1}$ under (5) and the order $h_n^{1/2}$ of the movement of the stochastic semimartingale boundary over a block. For $h_n n^{2/3}\to \infty$ , $\Psi_n$ tends to the identity function, so we have that
In this asymptotic regime local minima are mainly determined by local minima of the boundary process, such that the first-order approximation equals (6) when neglecting the noise $(\varepsilon_i)$ on the right-hand side. The half-normal distribution of the minimum of a Brownian motion over an interval and its moments then readily yield (7). A formal proof of (7) is contained in Step 3 of the proof of Theorem 1 in Section 6.2. Note that we defined $\Psi_n$ differently than in [Reference Bibinger, Jirak and Reiß5], e.g. in their (A.35), with the additional factor $\pi/(\pi-2)$ . By the simple asymptotic approximation in (7), we do not require $\Psi_n^{-1}$ for a consistent estimator.
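The constant obtained from the half-normal moments can be checked numerically. The following minimal sketch uses the reflection principle, by which the minimum of a standard Brownian motion over the unit interval has the law of $-|Z|$ with $Z\sim\mathcal{N}(0,1)$ , so that the second moment of the difference of two independent minima is $2(1-2/\pi)$ . The function names are hypothetical and chosen for this illustration only.

```python
import math
import numpy as np


def half_normal_factor() -> float:
    """Return pi / (2 (pi - 2)), the reciprocal of E[(|Z1| - |Z2|)^2]."""
    return math.pi / (2.0 * (math.pi - 2.0))


def second_moment_of_minimum_difference(num_samples: int = 1_000_000,
                                        seed: int = 0) -> float:
    """Monte Carlo estimate of E[(min B - min B~)^2] over the unit interval.

    Reflection principle: min_{t <= 1} B_t has the law of -|Z|, Z ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    min_b = -np.abs(rng.standard_normal(num_samples))
    min_b_tilde = -np.abs(rng.standard_normal(num_samples))
    return float(np.mean((min_b - min_b_tilde) ** 2))


# Closed form: 2 * Var(|Z|) = 2 * (1 - 2 / pi).
exact_second_moment = 2.0 * (1.0 - 2.0 / math.pi)
```

Multiplying the simulated second moment by `half_normal_factor()` should give a value close to one, which is why this factor normalizes squared differences of local minima when the noise is neglected.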
When there are no price jumps, a simple consistent estimator for the spot squared volatility $\sigma_{\tau}^2$ is given by
for suitable sequences $h_n\to 0$ and $K_n\to\infty$ . Using only observations before time $\tau$ , the estimator is available online at time $\tau\in(0,1]$ during a trading day. For $\tau$ close to 0, when $\lfloor h_n^{-1}\tau\rfloor\le K_n$ , the factor $K_n^{-1}$ can be adjusted to get an average. Since this is unimportant for asymptotic theory, we keep $K_n^{-1}$ for simple notation. Working with ex post data over the whole interval, instead of using only observations before time $\tau$ , we may also use
or an estimator with an average centred around time $\tau\in(0,1)$ . The difference between the two estimators (9) and (8) can be used to infer a possible jump in the volatility process at time $\tau\in(0,1)$ , as in [Reference Bibinger, Neely and Winkelmann6].
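The construction described above can be sketched in a few lines. The sketch assumes equidistant observations with an integer number of observations per block and takes the estimator to be the factor $\pi/(2(\pi-2))$ times the average of $h_n^{-1}(m_{k,n}-m_{k-1,n})^2$ over the $K_n$ blocks before time $\tau$ . The exact index range, the normalization by the number of available differences, and all function names are assumptions of this illustration; `demo_estimate` uses a constant volatility, which is a simplification.

```python
import math
import numpy as np

HALF_NORMAL_FACTOR = math.pi / (2.0 * (math.pi - 2.0))


def block_minima(y: np.ndarray, obs_per_block: int) -> np.ndarray:
    """Minima of consecutive blocks of obs_per_block observations."""
    num_blocks = len(y) // obs_per_block
    trimmed = y[: num_blocks * obs_per_block]
    return trimmed.reshape(num_blocks, obs_per_block).min(axis=1)


def spot_variance(minima: np.ndarray, h_n: float, K_n: int, k_tau: int) -> float:
    """Average of rescaled squared differences of minima on blocks before k_tau.

    k_tau plays the role of floor(tau / h_n); differences m_k - m_{k-1} are
    used for k = max(k_tau - K_n, 1), ..., k_tau - 1.
    """
    k_start = max(k_tau - K_n, 1)
    diffs = minima[k_start:k_tau] - minima[k_start - 1 : k_tau - 1]
    if diffs.size == 0:
        raise ValueError("no complete pair of blocks before k_tau")
    # Dividing by diffs.size (instead of K_n) gives an average near time 0.
    return float(HALF_NORMAL_FACTOR * np.sum(diffs ** 2) / (h_n * diffs.size))


def demo_estimate(seed: int = 1) -> float:
    """Constant volatility sigma = 0.01 with Exp(eta) noise, eta = 10000."""
    n, obs_per_block, sigma, eta = 200_000, 100, 0.01, 10_000.0
    rng = np.random.default_rng(seed)
    x = sigma * np.cumsum(rng.standard_normal(n)) / math.sqrt(n)
    y = x + rng.exponential(1.0 / eta, n)  # one-sided, lower-bounded noise
    minima = block_minima(y, obs_per_block)
    return spot_variance(minima, obs_per_block / n, len(minima), len(minima))
```

In this toy configuration the blocks are long relative to the noise, so `demo_estimate()` should be close to the true squared volatility $10^{-4}$ up to Monte Carlo error and a small positive bias.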
To construct confidence intervals for the spot volatility, it is useful to also establish a spot quarticity estimator:
A spot volatility estimator that is robust with respect to jumps in $(X_t)$ is obtained with threshold versions of these estimators. We truncate differences of local minima whose absolute values exceed a threshold $u_n= \beta\cdot h_n^{\kappa}$ , $\kappa\in\big(0,\frac12\big)$ , with some positive constant $\beta$ , which leads to
4. Asymptotic results
We establish asymptotic results for equidistant observations, $t_i^n=i/n$ . We begin with the asymptotic theory in a setup without jumps in $(X_t)$ .
Theorem 1. (Stable central limit theorem for continuous $(X_t)$ .) Set $h_n$ such that $h_n n^{2/3}\to \infty$ and $K_n=C_K h_n^{\delta -2\alpha/(1+2\alpha)}$ for arbitrary $\delta$ , $0<\delta<2\alpha/(1+2\alpha)$ , and some constant $C_K>0$ . If $(X_t)$ is continuous, i.e. $J_t=0$ in (3), under Assumptions 1 and 3, the spot volatility estimator (8) is consistent, $\hat\sigma^2_{\tau-}\stackrel{\mathbb{P}}{\rightarrow} \sigma_{\tau-}^2$ , and satisfies the stable central limit theorem
There is only a difference between $\sigma_{\tau}^2$ and its left limit $\sigma_{\tau-}^2$ in the case of a volatility jump at time $\tau$ . In particular, the estimator is also consistent for $\sigma_{\tau}^2$ for any fixed $\tau\in(0,1)$ . The convergence rate $K_n^{-1/2}$ gets arbitrarily close to $n^{ -2\alpha/(3+6\alpha)}$ , which is optimal in our model. The optimal rate is attained, according to [Reference Bibinger, Jirak and Reiß5], for $h_n\propto n^{-2/3}$ and $K_n\propto h_n^{-2\alpha/(1+2\alpha)}$ , i.e. $\delta\downarrow 0$ . In the important special case when $\alpha=\frac12$ , for a semimartingale volatility, the rate is arbitrarily close to $n^{-1/6}$ . This is much faster than the optimal rate of convergence in the model with additive centred microstructure noise, which is known to be $n^{ -1/8}$ [Reference Bibinger, Hautsch, Malec and Reiß4, Reference Hoffmann, Munk and Schmidt-Hieber14]. The constant in the asymptotic variance is obtained from several variance and covariance terms including (squared) local minima, and is approximately 2.44. The function $\Psi_n$ was shown to be monotone and invertible in [Reference Bibinger, Jirak and Reiß5], and $\Psi_n$ and its inverse $\Psi_n^{-1}$ can be approximated using Monte Carlo simulations, see Section 5.1. The asymptotic distribution of the estimator does not hinge on the noise level $\eta$ , which is different to methods for centred noise. Hence, we do not require any pre-estimation of noise parameters and the theory directly extends to a time-varying noise level $\eta(t)$ in (5) under the mild assumption that $0<\eta(t)<\infty$ for all t. The stable convergence in (12) is stronger than weak convergence and is important, since the limit distribution is mixed normal depending on the stochastic volatility. We refer to [Reference Jacod and Protter15, Section 2.2.1] for an introduction to stable convergence. For a normalized central limit theorem, we can use the spot quarticity estimator (10).
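The rate arithmetic can be spelled out for the boundary case $\delta\downarrow 0$ with $h_n\propto n^{-2/3}$ and $K_n\propto h_n^{-2\alpha/(1+2\alpha)}$ :

$$K_n^{-1/2}\propto h_n^{\alpha/(1+2\alpha)}\propto n^{-\frac23\cdot\frac{\alpha}{1+2\alpha}}=n^{-2\alpha/(3+6\alpha)},\qquad \alpha=\tfrac12\colon\ n^{-1/6},\qquad \alpha=1\colon\ n^{-2/9}.$$

Since Theorem 1 requires $\delta>0$ and $h_n n^{2/3}\to\infty$ , these rates are approached but not attained by the central limit theorem.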
Proposition 1. (Feasible central limit theorem.) Under the conditions of Theorem 1, the spot quarticity estimator (10) is consistent, such that we get for the spot volatility estimation the normalized central limit theorem
Proposition 1 yields asymptotic confidence intervals for spot volatility estimation. For $q\in (0,1)$ , we have
by the monotonicity of $\Psi_n^{-1}$ , with $\Phi$ the CDF of the standard normal distribution. Since $\Psi_n^{-1}$ is differentiable by [Reference Bibinger, Jirak and Reiß5, (A.35)] and the derivative is $\big(\Psi_n^{-1}\big)'=1+{\scriptstyle{\mathcal{O}}}(1)$ by (7), the delta method (for stable convergence) also yields asymptotic confidence intervals and the central limit theorem
We cannot simply replace $\Psi_n\big(\sigma_{\tau-}^2\big)$ in (12) by its first-order approximation $\sigma_{\tau-}^2$ , or $\Psi_n^{-1}\big(\hat\sigma^2_{\tau-}\big)$ in (13) by $\hat\sigma^2_{\tau-}$ , since the biases do not converge to zero sufficiently fast. That is, $(\hat\sigma^2_{\tau-}-\sigma_{\tau-}^2) =\mathcal{O}_{\mathbb{P}}\big(K_n^{-1/2}\big)$ does not hold true in general. Furthermore, if the condition $h_n n^{2/3}\to\infty$ is violated, the central limit theorems do not apply.
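A pointwise confidence interval along these lines could be computed as in the following minimal sketch. It assumes that the asymptotic variance in (12) is approximately $2.44\,\sigma_{\tau-}^4/K_n$ , that the spot quarticity estimate is plugged in for $\sigma_{\tau-}^4$ , and that an approximation of $\Psi_n^{-1}$ is supplied by the user. The function name, the parametrization by a coverage `level`, and the interface are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Callable, Tuple


def spot_variance_confidence_interval(
    sigma2_hat: float,
    quarticity_hat: float,
    K_n: int,
    psi_inverse: Callable[[float], float],
    level: float = 0.95,
    avar_constant: float = 2.44,
) -> Tuple[float, float]:
    """Two-sided interval for the spot squared volatility.

    The normal interval is built around the uncorrected estimate and then
    mapped through the monotone bias-correction function psi_inverse.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(avar_constant * quarticity_hat / K_n)
    return psi_inverse(sigma2_hat - half_width), psi_inverse(sigma2_hat + half_width)


# Usage with a linear approximation of the inverse; the slope 1.046 is the
# value reported in Section 5.1 for n = 23400 and n * h_n = 15.
lower, upper = spot_variance_confidence_interval(
    1.0e-4, 1.0e-8, 180, lambda value: value / 1.046
)
```

Applying the inverse to the interval end points, rather than to a symmetric interval around the corrected estimate, relies only on the monotonicity of $\Psi_n^{-1}$ .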
Theorem 2. (Stable central limit theorem with jumps in $(X_t)$ .) Set $h_n$ such that $h_n n^{2/3}\to \infty$ and $K_n=C_K h_n^{\delta -2\alpha/(1+2\alpha)}$ for arbitrary $\delta$ , $0<\delta<2\alpha/(1+2\alpha)$ , and some constant $C_K>0$ . Under Assumptions 1, 2, and 3, with
the truncated spot volatility estimator (11) with
is consistent, $\hat\sigma^{2,\textrm{(tr)}}_{\tau-}\stackrel{\mathbb{P}}{\rightarrow} \sigma_{\tau-}^2$ , and satisfies the stable central limit theorem
In order to obtain a central limit theorem at (almost) optimal rate, we thus have to impose mild restrictions on the jump activity. For the standard model with a semimartingale volatility, i.e. $\alpha=\frac12$ , we need $r<\frac32$ , and for $\alpha=1$ we have the stronger condition that $r<\frac43$ . These conditions are equivalent to those of [Reference Bibinger and Winkelmann7, Theorem 1], which gives a central limit theorem for spot volatility estimation under similar assumptions on $(X_t)$ , but with a slower rate of convergence for centred microstructure noise. Using a truncated quarticity estimator with the same thresholding again yields a feasible central limit theorem and asymptotic confidence intervals.
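The thresholding step can be illustrated with a minimal sketch. It assumes that the truncated estimator discards differences of local minima whose absolute values exceed $u_n=\beta h_n^{\kappa}$ and keeps the normalization by the total number of differences; this normalization, the index range, and the function name are assumptions of this illustration.

```python
import math
import numpy as np

HALF_NORMAL_FACTOR = math.pi / (2.0 * (math.pi - 2.0))


def truncated_spot_variance(minima: np.ndarray, h_n: float, K_n: int,
                            k_tau: int, beta: float, kappa: float) -> float:
    """Spot variance estimate from block minima with jump truncation.

    Differences m_k - m_{k-1}, k = max(k_tau - K_n, 1), ..., k_tau - 1, are
    set to zero when their absolute value exceeds u_n = beta * h_n ** kappa.
    """
    if not 0.0 < kappa < 0.5:
        raise ValueError("kappa must lie in (0, 1/2)")
    u_n = beta * h_n ** kappa
    k_start = max(k_tau - K_n, 1)
    diffs = minima[k_start:k_tau] - minima[k_start - 1 : k_tau - 1]
    if diffs.size == 0:
        raise ValueError("no complete pair of blocks before k_tau")
    kept = np.where(np.abs(diffs) <= u_n, diffs, 0.0)
    return float(HALF_NORMAL_FACTOR * np.sum(kept ** 2) / (h_n * diffs.size))


# Toy example: the fourth block minimum contains a jump of size 0.5.
toy_minima = np.array([0.0, 0.001, 0.0, 0.5, 0.501, 0.5])
toy_estimate = truncated_spot_variance(toy_minima, h_n=0.01, K_n=5, k_tau=6,
                                       beta=0.1, kappa=0.4)
```

In the toy example the threshold is about 0.016, so only the difference of size 0.5 is removed and the four small differences determine the estimate.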
Remark 1. From a theoretical point of view we might ponder why we do not work out an asymptotic theory for $h_n\propto n^{-2/3}$ when noise and efficient price both influence the asymptotic distribution of the local minima. However, in this balanced case, the asymptotic distribution is infeasible. For this reason, [Reference Bibinger, Jirak and Reiß5] could not establish a central limit theorem for their integrated volatility estimator. Moreover, their estimator was only implicitly defined depending on the unknown function $\Psi_n^{-1}$ . Even imposing a parametric assumption on the noise as an exponential distribution would not render a feasible limit theory for $h_n\propto n^{-2/3}$ —see the discussion in [Reference Bibinger, Jirak and Reiß5]. Choosing $h_n$ such that $h_n n^{2/3}\to\infty$ slowly instead yields a simple, explicit, and consistent estimator and a feasible central limit theorem for spot volatility estimation. In particular, we use $\Psi_n$ only for the bias correction of the simple estimator, while the estimator itself and the (estimated) asymptotic variance do not hinge on $\Psi_n$ . Central limit theorems for spot volatility estimators are in general only available at almost optimal rates, when the variance dominates the squared bias in the mean squared error; see, for instance, Theorem 13.3.3 and the remarks below it in [Reference Jacod and Protter15]. Therefore, (12) is the best achievable central limit theorem. Moreover, our choice of $h_n$ avoids strong assumptions on the noise that would be inevitable for smaller blocks. Our numerical work will demonstrate that the asymptotic results presented are useful in practice and facilitate efficient inference on the spot volatility. In particular, Section 5.2 revolves around the question of how to choose block lengths in practice.
5. Implementation and simulations
5.1. Monte Carlo approximation of $\Psi_n$
Although the function $\Psi_n$ from (6) tends to the identity asymptotically, it has a crucial role as a bias correction of our estimator in (12). We can compute the function numerically based on a Monte Carlo simulation. Hence, we have to compute $\Psi_n(\sigma^2)$ as a Monte Carlo mean over many iterations and over a fine grid of values for the squared volatility. Then, we can also numerically invert the function and use $\Psi_n^{-1}(\!\cdot\!)$ . To make this procedure feasible without too high a computational expense we require an algorithm to efficiently sample from the law of the local minima for some given n and block length $h_n$ .
Consider, for $nh_n\in\mathbb{N}$ with $Z_i\stackrel{\textrm{iid}}{\sim}\mathcal{N}(0,1)$ and the observation errors $(\varepsilon_k)_{k\ge 0}$ , the minimum
for some fixed $\sigma>0$ , and, for $l\in\{0,\ldots,nh_n\}$ ,
where we set $Z_0\;:\!=\;0$ . Since
with $M_0^{nh_n-1}$ generated independently from $M_1^{nh_n}$ , we want to simulate samples distributed as $M_0^{nh_n-1}$ and $M_1^{nh_n}$ , respectively. Note that for finite $nh_n$ there is no exact equality between the moments of $M_0^{nh_n-1}$ and $M_1^{nh_n}$ , which can be relevant in particular for moderate values of $nh_n$ . As in the simulation of Section 5.2, we implement exponentially distributed observation errors $(\varepsilon_k)$ , with some given noise level $\eta$ . In data applications, we can do the same with an estimated noise level
This estimator works for all noise distributions with finite fourth moments. In view of the discussion of the model in [Reference Bibinger, Jirak and Reiß5], exponentially distributed noise is the most natural example satisfying (5). Simulations with other noise distributions lead to similar results. This is expected, since the estimator only hinges on local minima and their distribution is asymptotically more determined by the Brownian motion than by the noise distribution. To simulate the local minima for given n, $h_n$ , $\eta$ , and squared volatility $\sigma^2$ in an efficient way we use a specific dynamic programming principle. Observe that
In the baseline noise model $\varepsilon_k\stackrel{\textrm{iid}}{\sim}\text{Exp}(\eta)$ , the random variable $({\sigma}/{\sqrt{n}})Z_{nh_n}+\varepsilon_{nh_n}$ has an exponentially modified Gaussian (EMG) distribution. With any fixed noise distribution, we can easily generate realizations from this convolution. A pseudorandom variable distributed as $M_1^{nh_n}$ is now generated following the last transformation in the reverse direction. Algorithmically, this reads
1. Generate $U_{nh_n} \sim \textrm{EMG}(\sigma^2/n,\eta)$ , that is, $U_{nh_n}\stackrel{d}{=} \textrm{Exp}(\eta) + (\sigma/\sqrt{n})\,\mathcal{N}(0,1)$ .
2. Set $U_{nh_n-1}=\min\!(U_{nh_n},\textrm{Exp}(\eta))+(\sigma/\sqrt{n})\,\mathcal{N}(0,1)$ with independent draws.
3. Iterate step 2 until $U_1$ is reached,
where the end point $U_1$ has the target distribution of $M_1^{nh_n}$ . In each iteration step, we thus take the minimum of the current state of the process with one independent exponentially distributed random variable and the convolution with one independent normally distributed random variable. To sample from the distribution of $M_0^{nh_n-1}$ instead, we use the same algorithm and just drop the convolution with the normal distribution in the last step.
This algorithm samples from the distribution of local minima, and hence approximates $\Psi_n$ numerically, many times faster than a standard Monte Carlo simulation run for each value, in which local minima are computed over blocks of length $h_n$ .
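The recursion above can be sketched in vectorized form as follows. The sketch assumes exponentially distributed noise with rate $\eta$ and takes $\Psi_n(\sigma^2)$ to be $\pi/(2(\pi-2))\,h_n^{-1}$ times the second moment of the difference of independent copies of $M_0^{nh_n-1}$ and $M_1^{nh_n}$ ; this form of $\Psi_n$ and all function names are assumptions of this illustration.

```python
import math
import numpy as np

HALF_NORMAL_FACTOR = math.pi / (2.0 * (math.pi - 2.0))


def sample_local_minimum(sigma: float, n: int, obs_per_block: int, eta: float,
                         num_samples: int, rng: np.random.Generator,
                         drop_last_gaussian: bool = False) -> np.ndarray:
    """Draw num_samples values by iterated minimum and convolution steps.

    With drop_last_gaussian=False the output is distributed as M_1^{n h_n};
    with drop_last_gaussian=True the final normal increment is omitted,
    which targets M_0^{n h_n - 1}.
    """
    if obs_per_block < 2:
        raise ValueError("obs_per_block must be at least 2")
    step_sd = sigma / math.sqrt(n)
    # Step 1: exponential noise plus one Gaussian increment (EMG draw).
    u = rng.exponential(1.0 / eta, num_samples)
    u = u + step_sd * rng.standard_normal(num_samples)
    for step in range(obs_per_block - 1):
        # Step 2: minimum with fresh noise, then one more Gaussian increment.
        u = np.minimum(u, rng.exponential(1.0 / eta, num_samples))
        is_last = step == obs_per_block - 2
        if not (is_last and drop_last_gaussian):
            u = u + step_sd * rng.standard_normal(num_samples)
    return u


def psi_n_monte_carlo(sigma2: float, n: int, obs_per_block: int, eta: float,
                      num_samples: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo approximation of Psi_n(sigma2) under the assumed form."""
    rng = np.random.default_rng(seed)
    sigma = math.sqrt(sigma2)
    m0 = sample_local_minimum(sigma, n, obs_per_block, eta, num_samples, rng, True)
    m1 = sample_local_minimum(sigma, n, obs_per_block, eta, num_samples, rng, False)
    h_n = obs_per_block / n
    return float(HALF_NORMAL_FACTOR * np.mean((m0 - m1) ** 2) / h_n)
```

Each sample costs $nh_n$ vectorized steps and no path of the efficient price has to be stored, which is the source of the speed-up. For $\sigma=0$ both minima reduce to the minimum of $nh_n$ exponential variables, which provides a simple sanity check of the sampler.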
Figure 1 plots the result of the Monte Carlo approximation of $\Psi_n(\sigma^2)$ for $n=23\,400$ and $n\cdot h_n=15$ on a grid of 1500 values of $\sigma^2$ . In this case, $h_n$ is quite small, but this configuration turns out to be useful in Section 5.2. We know that $\Psi_n(\sigma^2)$ is monotone, such that the oscillation of the function in Figure 1 is due to the inaccuracy of the Monte Carlo means, although we use $N=100\,000$ iterations for each grid point. Nevertheless, we can see that the function is rather close to a linear function with slope $1{.}046$ based on a least squares estimate. The left panel of Figure 1 draws a comparison to the identity function which is illustrated by the dotted line, while the right panel draws a comparison to the linear function with slope $1{.}046$ . We see that it is crucial to correct for the bias in (12) when using such small values of $h_n$ . Although the function $\Psi_n(\sigma^2)$ is not exactly linear, a simple bias correction dividing estimates by $1{.}046$ is almost as good as using the more precise numerical inversion based on the Monte Carlo approximation. Since the Monte Carlo approximations of $\Psi_n(\sigma^2)$ look close to linear functions in all the cases considered, we report the estimated slopes based on least squares and $N=100\,000$ Monte Carlo iterations for different values of $h_n$ in Table 1 to summarize concisely the distance between the function $\Psi_n(\sigma^2)$ and the identity. Simulating all iterations for all grid points with our algorithm takes only a few hours with a standard computer.
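The simple linear bias correction can be sketched as follows. The sketch assumes that the reported slope stems from a least squares fit through the origin, which the text does not state explicitly; the function names and the synthetic grid in the usage lines are hypothetical.

```python
import numpy as np


def fit_slope_through_origin(sigma2_grid: np.ndarray, psi_values: np.ndarray) -> float:
    """Least squares slope b minimizing sum (psi - b * sigma2)^2."""
    x = np.asarray(sigma2_grid, dtype=float)
    y = np.asarray(psi_values, dtype=float)
    return float(np.dot(x, y) / np.dot(x, x))


def linear_bias_correction(estimate: float, slope: float) -> float:
    """Approximate Psi_n^{-1}(estimate) by dividing by the fitted slope."""
    return estimate / slope


# Usage on a synthetic, exactly linear curve with 1500 grid points.
grid = np.linspace(1.0e-5, 2.0e-4, 1500)
slope = fit_slope_through_origin(grid, 1.046 * grid)
corrected = linear_bias_correction(1.046e-4, slope)
```

With Monte Carlo values of $\Psi_n$ in place of the synthetic curve, the fitted slope averages out the oscillation of the individual grid values.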
5.2. Simulation study of estimators
We simulate $n=23\,400$ observations corresponding to one observation per second over a (NASDAQ) trading day of 6.5 hours. The efficient price process is simulated from the model
The factor $(\nu_t)$ generates a typical U-shaped intraday volatility pattern. $(W_t,B_t)$ is a two-dimensional Brownian motion with leverage $\textrm{d}[W,B]_t=-0{.}2\,\textrm{d} t$ . The stochastic volatility component has several realistic features and the simulated model is in line with recent literature; see [Reference Bibinger, Neely and Winkelmann6] and references therein. We do not include a drift in $X_t$ to avoid introducing another process or more parameters. Any drift evolving within a reasonable range of values will not affect the numerical results presented. Observations with lower-bounded, one-sided microstructure noise are generated by $Y_i=X_{{i}/{n}}+\varepsilon_i$ , $0\le i\le n$ , with exponentially distributed noise $\varepsilon_i\stackrel{\textrm{iid}}{\sim}\text{Exp}(\eta)$ , with $\eta=10\,000$ . The noise variance is then rather small, but this is in line with stylized facts of real NASDAQ data such as, for instance, those analyzed in [Reference Bibinger, Hautsch, Malec and Reiß4, Reference Bibinger, Neely and Winkelmann6]. Note that the noise level estimate is analogous to the one used for regular market microstructure noise. Typical noise levels obtained e.g. for Apple are approximately 15 000, and approximately 4000 for 3M; see the supplement of [Reference Bibinger, Hautsch, Malec and Reiß4].
Figure 2 shows a fixed path of the squared volatility. We fix this path for the following Monte Carlo simulation and generate new observations of $(X_t)$ and $(Y_i)$ in each iteration according to our model. The dashed line in Figure 2 gives the estimated volatility by the Monte Carlo means over $N=50\,000$ iterations based on $n\cdot h_n=15$ observations per block using the non-adjusted estimator (8), but with windows which are centred around the block on which we estimate the spot volatility, i.e. windows centred around the time $\tau$ , and with $K_n=180$ . We plot estimates on each block, where the estimates close to the boundaries rely on fewer observations. The solid line gives the bias-corrected volatility estimates using the numerically evaluated function $\Psi_n$ , based on the algorithm from Section 5.1 with $n\cdot h_n=15$ and $n=23\,400$ . We determined the values $n\cdot h_n=15$ and $K_n=180$ as suitable values to obtain a small mean squared error. In fact, the choice of $K_n=180$ is rather large in favour of a smaller variance that yields a rather smooth estimated spot volatility in Figure 2. The estimated volatility hence appears smoother compared to the true semimartingale volatility, but the intraday pattern is captured well by our estimation. We expect that this is typically an appealing implementation in practice as smaller $K_n$ results in a larger variance. Choosing $K_n=180$ rather large, we have to use quite small block sizes $h_n$ to control the overall bias of the estimation. Since $h_n\cdot n^{2/3}\approx 0{.}52$ is small, the bias correction becomes crucial here. Still, our asymptotic results work well for this implementation. This can be seen by the comparison of pointwise empirical 10% and 90% quantiles from the Monte Carlo iterations illustrated by the grey area and the 10% and 90% quantiles of the limit normal distribution with the asymptotic variance from (12). 
The latter are drawn as dotted lines for the blocks with distance larger than $K_n/2$ from the boundaries, where the variances are of order $K_n^{-1}$ . Close to the boundaries the empirical variances increase due to the smaller number of blocks used for the estimates. Moreover, the bias correction, which is almost identical to dividing each estimate by $1{.}046$ , correctly scales the simple estimates which have a significant positive bias for the chosen tuning parameters. Overall, our asymptotic results provide a good finite-sample fit even though we have $h_n\cdot n^{2/3}<1$ here. Note, however, that $\sigma_t \cdot \eta\approx 100$ , and our asymptotic expansion in fact requires that $h_n^{3/2} n\sigma_t \eta$ is large when taking constants into account. Since the simulated scenario uses realistic values, we recommend similar block lengths for applications to real high-frequency financial data. According to the summary statistics in the supplement of [Reference Bibinger, Hautsch, Malec and Reiß4], some assets exhibit higher noise-to-signal ratios, and for those larger blocks are preferable.
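The tuning quantities quoted above can be checked with a short computation. This sketch only uses the values stated in the text ($n=23\,400$, $n\cdot h_n=15$, $K_n=180$, $\sigma_t\cdot\eta\approx 100$); the variable names are illustrative.

```python
n = 23_400            # observations per day
block_obs = 15        # n * h_n, observations per block
k_n = 180             # number of blocks per window
sigma_eta = 100.0     # sigma_t * eta, as quoted in the text

h_n = block_obs / n

# h_n * n^(2/3): about 0.52, i.e. below 1 as stated
ratio = h_n * n ** (2 / 3)

# h_n^(3/2) * n * sigma_t * eta: the quantity that should be large
expansion_term = h_n ** 1.5 * n * sigma_eta

# K_n * h_n: share of the trading day covered by one estimation window
window_share = k_n * h_n

print(ratio, expansion_term, window_share)
```

With these values, $h_n^{3/2}n\sigma_t\eta$ is close to 38, although $h_n\cdot n^{2/3}$ is below 1, and one window spans roughly 11.5% of the day.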
All values are multiplied by a factor of $10^6$.
Table 2 summarizes the performance of the estimation across different choices of $nh_n$ and $K_n$ using the following quantities:
- MSD: the mean standard deviation over the N iterations, averaged over all grid points;
- MAB: the mean absolute bias over the N iterations, averaged over all grid points, for the estimator (8) without any bias correction;
- MABC: the mean absolute bias over the N iterations, averaged over all grid points, for the estimator (8) with a simple bias correction dividing estimates by the factors given in Table 1.
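The three summary quantities can be computed from an array of Monte Carlo estimates as sketched below. This is one plausible reading of the definitions above, not the code used for Table 2: the function name `performance_summary`, the array layout (iterations in rows, grid points in columns), and the synthetic input data are assumptions made for illustration. The factor 1.046 is the one quoted above for the setting of Figure 2.

```python
import numpy as np

def performance_summary(estimates, truth, correction=1.0):
    """Return (mean standard deviation, mean absolute bias) over grid points.

    estimates:  array of shape (N, G), N Monte Carlo iterations, G grid points
    truth:      array of shape (G,), true squared volatility on the grid
    correction: factor by which estimates are divided (1.0 = no correction)
    """
    est = np.asarray(estimates, dtype=float) / correction
    msd = est.std(axis=0, ddof=1).mean()              # std per grid point, then average
    mab = np.abs(est.mean(axis=0) - truth).mean()     # |bias| per grid point, then average
    return msd, mab

# Synthetic example: estimates with a 4.6% multiplicative bias plus noise
rng = np.random.default_rng(0)
truth = np.full(5, 1e-4)
raw = 1.046 * truth + 1e-5 * rng.standard_normal((2_000, 5))

msd, mab = performance_summary(raw, truth)                     # MSD and MAB
_, mabc = performance_summary(raw, truth, correction=1.046)    # MABC
```

In this synthetic example the correction removes the multiplicative bias, so `mabc` is much smaller than `mab`.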
All the results are based on $N=50\,000$ Monte Carlo iterations. First of all, the values used for Figure 2 are not unique minimizers of the mean squared error. Several other combinations given in Table 2 yield equally good results. Overall, the performance is comparable within a broad range of block lengths and window sizes. The variances decrease for larger $K_n$, while the bias increases with larger $K_n$ for fixed $h_n$. Important for the bias is the total window size, $K_n\cdot h_n$, over which the volatility is approximated by a constant for the estimation. The variance only depends on $K_n$: changing the block length for fixed $K_n$ does not significantly affect the variance. While the MSD is hence almost constant within the columns of Table 2, the bias after correction, MABC, increases from the top down due to the increasing window size. Without the bias correction two effects interfere for MAB. Larger blocks reduce the systematic bias due to $\Psi_n(\sigma_t^2)-\sigma_t^2$, but the increasing bias due to the increasing window size prevails for $n\cdot h_n=78$ and the two larger values of $K_n$.
6. Proofs
6.1. Law of the integrated negative part of a Brownian motion
A crucial ingredient of our theory is an upper bound for the CDF of the integrated negative part of a Brownian motion. We prove the corresponding lemma based on a generalization of Lévy’s arcsine law by [Reference Takács27]. The result is in line with the conjecture in [Reference Janson16, (261)], where one finds an expansion of the density with a precise constant for the leading term. Denote by $f_+$ the positive part and by $f_-$ the negative part of some real-valued function f.
Lemma 1. For a standard Brownian motion $(W_t)_{t\ge 0}$ ,
Proof. Observe the equality in distribution $\int_0^1(W_t)_-\,\textrm{d} t\stackrel{\textrm{d}}{=}\int_0^1(W_t)_+\,\textrm{d} t$ , such that
For any $\varepsilon>0$ , the inequality
leads us to
Using [Reference Takács27, (15) and (16)], we obtain
with $\Phi$ the CDF of the standard normal distribution. Thereby, we obtain
and elementary bounds give the upper bound
Choosing $\varepsilon=x^{1/3}$ , we obtain the upper bound
6.2. Asymptotics of the spot volatility estimation in the continuous case
Proof of Theorem 1. In the following, we write $A_n\lesssim B_n$ for two real sequences if there exist some $n_0\in\mathbb{N}$ and a constant K such that $A_n\le K B_n$ for all $n\ge n_0$.
Step 1. In the first step, we prove the approximation
with
We show that, for $k\in\{1,\ldots,h_n^{-1}-1\}$ ,
We subtract $X_{kh_n}$ from $m_{k,n}$ and $m_{k-1,n}$ , and use that, for all i,
This implies that
Changing the roles of $(Y_{i}-X_{kh_n})$ and $(\sigma_{(k-1)h_n}(W_{t_i^n}-W_{kh_n})+\varepsilon_{i})$ , we obtain by the analogous inequalities and the triangle inequality, with $M_t\;:\!=\;X_{kh_n} + \int_{kh_n}^{t}\sigma_{(k-1)h_n}\,\textrm{d} W_s$ , that
We write $(C_t)$ for $(X_t)$ to emphasize continuity; see (3). Then (14) follows from
and the analogous estimate for $m_{k-1,n}$ and $\tilde m^*_{k-1,n}$ . We decompose
Under Assumption 1, we can assume that $(\sigma_t)$ and $(a_t)$ are bounded on [0, 1] by the localization from [Reference Jacod and Protter15, Section 4.4.1]. Using Itô’s isometry and Fubini’s theorem, we obtain that
such that Assumption 1 yields, for any $t\in[kh_n,(k+1)h_n]$ ,
By Doob’s martingale maximal inequality and since $\sup_{t\in[kh_n,(k+1)h_n]}\int_{kh_n}^t |a_s|\,\textrm{d} s = \mathcal{O}_{\mathbb{P}}(h_n)$ ,
We conclude that (15) holds, since $\alpha>0$ . Since
and analogously for $(m_{k-1,n}-\tilde m^*_{k-1,n})$ , we conclude Step 1 by writing
Step 2. We bound the bias of the spot volatility estimation using Step 1. For $\lfloor h_n^{-1}\tau\rfloor>K_n$, we obtain from the definition of the function $\Psi_n$ in (6) that
The first $\lesssim$ estimate is in fact an equality up to an additional factor $(1+{\scriptstyle{\mathcal{O}}}(1))$ , since $\Psi'_{\!\!n}(x)=1+{\scriptstyle{\mathcal{O}}}(1)$ for all $x\ge 0$ , exploiting the abovementioned differentiability based on [Reference Bibinger, Jirak and Reiß5, (A.35)]. For the asymptotic upper bounds we used the binomial formula
exploiting as in Step 1 that $(\sigma_t)$ is bounded with some upper bound C, and Hölder’s inequality to conclude with (2) from Assumption 1. Finally, we used that $\big(\alpha\wedge\frac12\big) > \alpha/(2\alpha+1)$ for all $\alpha$ .
Step 3. For the consistency of $\hat\sigma^2_{\tau-}$, we prove that
This includes a proof of (7). Denote by $\mathbb{P}_{\sigma_{(k-1)h_n}}$ the regular conditional probabilities conditioned on $\sigma_{(k-1)h_n}$ , and by $\mathbb{E}_{\sigma_{(k-1)h_n}}$ the expectations with respect to the conditional measures. We obtain by the tower rule that
by the conditional independence of $\tilde m_{k,n}$ and $\tilde m^*_{k-1,n}$ .
We establish and use an approximation of the tail probabilities of $(\tilde m_{k,n})$ and $(\tilde m^*_{k-1,n})$ , respectively. For $x\in\mathbb{R}$ , we have
by the tower rule for conditional expectations, and since $\varepsilon_{i}\stackrel{\textrm{iid}}{\sim}F_{\eta}$ . We have
From (5), and with a first-order Taylor expansion of $z\mapsto \log\!(1-z)$ , we have
as $y\to 0$ , where we add the positive part in the last equality since $F_{\eta}(y)=0$ for any $y\le 0$ . We obtain
In the last equality we used that the Riemann sums tend almost surely to the integral with a standard Brownian motion $(B_t)_{t\ge 0}$ in the integrand. Since the expression in the expectation is bounded, as a product of conditional probabilities, by 1, we conclude with dominated convergence. If $nh_n^{3/2}\to \infty$ , we deduce that
We do not have a lower bound for $\int_0^1(B_t-x)_{-}\,\textrm{d} t$ . However, using that the first entry time $T_x$ of $(B_t)$ in x, conditional on $\{\inf_{0\le t\le 1} B_t< x \}$ , has a continuous conditional density $f(t\mid T_x<1)$ , by Lemma 1 and properties of the Brownian motion we obtain, for any $\delta>0$ ,
We focus on the second addend of the first factor, since the exponential term decays faster. It is bounded by a constant times
for any sequence $(b_n)$ , $b_n\in(0,1)$ , where we used Lemma 1. Choosing a $b_n$ which minimizes the order yields
almost surely, with a remainder that satisfies $R_n = \mathcal{O}\big(\big(h_n^{3/2}n\big)^{-({1+\delta})/{4}}\big)$. From the unconditional Lévy distribution of $T_x$, $f(s\mid T_x<1)$ is explicit, but we omit its precise form, which does not influence the asymptotic order. Under the condition $nh_n^{3/2}\to \infty$, the minimum of the Brownian motion over the interval hence dominates the noise in the distribution of local minima, in contrast to the choice $h_n\propto n^{-2/3}$. By the reflection principle,
for $x\ge 0$ .
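As a numerical illustration, recall the reflection principle in its standard form, which may differ in notation from the display used in the proof: for a standard Brownian motion, $-\inf_{0\le t\le 1}B_t$ has the same law as $|B_1|$, so that its tail probability at $x\ge 0$ is $2(1-\Phi(x))$. Moments of a non-negative random variable $V$ can be written as integrals over tail probabilities, $\mathbb{E}[V^p]=\int_0^\infty p\,x^{p-1}\,\mathbb{P}(V>x)\,\textrm{d} x$. The sketch below evaluates this integral numerically for the half-normal law; the function name, the integration range, and the grid size are illustrative choices.

```python
import math
import numpy as np

def half_normal_moment(p, upper=12.0, steps=200_001):
    """p-th moment of |Z|, Z standard normal, via the tail-probability integral.

    Uses E|Z|^p = int_0^inf p x^(p-1) P(|Z| > x) dx with
    P(|Z| > x) = 2 (1 - Phi(x)) = erfc(x / sqrt(2)),
    evaluated with the trapezoidal rule on [0, upper].
    """
    x = np.linspace(0.0, upper, steps)
    tail = np.vectorize(math.erfc)(x / math.sqrt(2.0))
    integrand = p * x ** (p - 1) * tail
    dx = x[1] - x[0]
    return float(dx * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])))

moments = [half_normal_moment(p) for p in (1, 2, 3, 4)]
```

The results can be compared with the known first four half-normal moments $\sqrt{2/\pi}$, $1$, $2\sqrt{2/\pi}$, and $3$.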
Using the representation of moments as integrals over tail probabilities, we exploit this, and a completely analogous estimate for $\tilde m^*_{k-1,n}$, to approximate conditional expectations. This yields, for all $k\in\{1,\ldots,h_n^{-1}-1\}$,
We used (19). An analogous computation yields the same result for $\tilde m^*_{k-1,n}$ :
For the second conditional moments, we obtain, for all $k\in\{1,\ldots,h_n^{-1}-1\}$ ,
The last identity uses the representation of the second moment of the normal distribution as an integral over tail probabilities. An analogous computation yields
Inserting the identities for the conditional moments in (17) yields
such that
This proves (16). Since the next step shows that the variance of the estimator tends to zero, consistency holds true.
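The leading-order behaviour behind this step can be illustrated in the noise-free case with constant volatility. There, the minima over two adjacent blocks, each measured relative to the common endpoint $X_{kh_n}$, are independent and distributed as $-\sigma\sqrt{h_n}\,|U|$ with $U$ standard normal, so that the expected squared difference of the minima equals $2(1-2/\pi)\,\sigma^2h_n$. This constant is derived here from the variance of the half-normal law and is not quoted from the displays of the proof. The following Monte Carlo sketch checks it on a discretized Brownian path; the function name and all default parameters are hypothetical, and the discretization introduces a small bias.

```python
import numpy as np

def mean_squared_min_difference(n_paths=10_000, steps=250, sigma=1.0, h=1.0, seed=1):
    """Monte Carlo mean of (m_k - m_{k-1})^2 for minima over two adjacent blocks.

    Noise-free sketch with constant volatility `sigma` and block length `h`;
    each block is discretized into `steps` increments.
    """
    rng = np.random.default_rng(seed)
    dt = h / steps
    incr = sigma * np.sqrt(dt) * rng.standard_normal((n_paths, 2 * steps))
    # Paths on the grid 0, dt, ..., 2h, started at zero
    w = np.concatenate((np.zeros((n_paths, 1)), np.cumsum(incr, axis=1)), axis=1)
    m_prev = w[:, : steps + 1].min(axis=1)   # minimum over the first block [0, h]
    m_next = w[:, steps:].min(axis=1)        # minimum over the second block [h, 2h]
    return float(np.mean((m_next - m_prev) ** 2))

theory = 2.0 * (np.pi - 2.0) / np.pi         # 2 (1 - 2/pi) for sigma = h = 1
estimate = mean_squared_min_difference()
```

Rescaling squared differences by $\pi/(2(\pi-2))\,h_n^{-1}$ hence recovers $\sigma^2$ on average in this idealized setting.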
Step 4. We determine the asymptotic variance of the estimator. Representing moments as integrals over tail probabilities, with the analogous approximation as above, we obtain, for all $k\in\{1,\ldots,h_n^{-1}-1\}$,
We have used the first four moments of the half-normal distribution and their representation via integrals over tail probabilities. The dependence structure between $\tilde m_{k,n}$ and $\tilde m^*_{k,n}$ also affects the variance of $\hat\sigma^2_{\tau-}$. We perform approximation steps for covariances similar to those for the moments of local minima above, using
This shows that the joint distribution of $(\tilde m_{k,n},\tilde m^*_{k,n})$ relates to the distribution of the minimum and the difference between the minimum and the endpoint of Brownian motion over an interval, or equivalently the distribution of the maximum and the difference between the maximum and the endpoint. The latter is readily obtained from the joint density of the maximum and the endpoint, which is a well-known result on stochastic processes; see, e.g., [Reference Shepp26]. Utilizing this, we obtain, for all $k\in\{1,\ldots,h_n^{-1}-1\}$ ,
The additional remainder of order $h_n^{\alpha}$ in probability is due to the different approximations of $(\sigma_t)$ in $\tilde m_{k,n}$ and $\tilde m^*_{k,n}$ . This implies that, for all $k\in\{1,\ldots,h_n^{-1}-1\}$ ,
With analogous steps, we deduce two more covariances which contribute to the asymptotic variance:
All covariance terms which enter the asymptotic variance are of one of these forms. For the conditional variance given $\sigma^2_{\tau-}$ , we obtain
Step 5. For a central limit theorem, the squared bias needs to be asymptotically negligible compared to the variance, which is satisfied for $K_n={\scriptstyle{\mathcal{O}}}(h_n^{-2\alpha/(1+2\alpha)})$. By the existence of higher moments of $\tilde m_{k,n}$ and $\tilde m^*_{k-1,n}$, a Lyapunov-type condition is straightforward to verify, such that asymptotic normality conditional on $\sigma^2_{\tau-}$ is implied by a classical central limit theorem for m-dependent triangular arrays such as the one in [Reference Berk3]. A feasible central limit theorem is implied by this conditional asymptotic normality in combination with $\mathcal{F}^X$-stable convergence. For the stability, we show that the statistics $\alpha_n=K_n^{1/2}\big(\hat\sigma_{\tau-}^2-\sigma_{\tau-}^2\big)$ satisfy
for any $\mathcal{F}^X$ -measurable bounded random variable Z and continuous bounded function g, where
with U a standard normally distributed random variable which is independent of $\mathcal{F}^X$ . By the above approximations it suffices to prove this for the statistics based on $\tilde m_{k,n}$ and $\tilde m^*_{k-1,n}$ from (14), and Z measurable with respect to $\sigma\big(\int_0^t\sigma_s\,\textrm{d} W_s,0\le t\le 1\big)$ . Set
Denote by $\mathcal{H}_n$ the $\sigma$-field generated by $\bar X(n)_t$ and $\mathcal{F}^X_0$. The sequence $(\mathcal{H}_n)_{n\in\mathbb{N}}$ is isotonic with limit $\bigvee_n \mathcal{H}_n=\sigma(\int_0^t\sigma_s\,\textrm{d} W_s,0\le t\le 1)$. Since $\mathbb{E}[Z\mid\mathcal{H}_n]\rightarrow Z$ in $L^1(\mathbb{P})$ as $n\rightarrow\infty$, it is enough to show that $\mathbb{E}[Zg(\alpha_n)]\rightarrow \mathbb{E}[Z]\mathbb{E}[g(\alpha)]$ for Z being $\mathcal{H}_{n_0}$-measurable for some $n_0\in\mathbb{N}$. Observe that $\alpha_n$ includes only increments of local minima based on $\tilde X(n)_t$, which are uncorrelated with those of $\bar X(n)_t$. For all $n\ge n_0$, we hence obtain $\mathbb{E}[Zg(\alpha_n)]=\mathbb{E}[Z]\mathbb{E}[g(\alpha_n)] \rightarrow \mathbb{E}[Z]\mathbb{E}[g(\alpha)]$ by a standard central limit theorem. This shows (20), and completes the proof of (12).
Proof of Proposition 1. For the quarticity estimator (10), when $\lfloor h_n^{-1}\tau\rfloor>K_n$ we have
by using the same moments as in the computation of the asymptotic variance. We can bound its variance by
which readily implies Proposition 1.
6.3. Asymptotics of the truncated spot volatility estimation with jumps
Proof of Theorem 2. Denote by $D^X_k\;:\!=\;m_{k,n}-m_{k-1,n}$ , $k=1,\ldots,h_n^{-1}-1$ , the differences of local minima based on the observations (4), with the general semimartingale (3) with jumps. Denote by $D^C_k\;:\!=\;\tilde m_{k,n}-\tilde m_{k-1,n}^*$ , $k=1,\ldots,h_n^{-1}-1$ , the differences of the unobservable local minima considered in Section 6.2. In particular, the statistics $D^C_k$ are based only on the continuous part $(C_t)$ in (3) such that the jumps are eliminated. Theorem 2 is implied by Proposition 1 if we can show that
We decompose this difference of the truncated estimator, which is based on the available observations with jumps, and the non-truncated estimator, which uses non-available observations without jumps, in the following way:
with some arbitrary constant $c\in(0,1)$. Without loss of generality we can set $\beta=1$ in this proof, i.e. $u_n=h_n^{\kappa}$. We consider the three addends separately; they correspond to the following error terms:
- large absolute statistics based on the continuous part $(C_t)$;
- non-truncated statistics which contain (small) jumps;
- the truncation also of the continuous parts in the statistics $(D_k^X)$ which exceed the threshold.
The probability $\mathbb{P}(|D^C_k|> c u_n)$ can be bounded using the estimate from (18) and Gaussian tail bounds. Observe that the remainder in (18) is non-negative. This yields that, for some $y>0$ , we have
which is intuitive, since the errors $(\varepsilon_i)$ are non-negative. We apply the triangle inequality and then Hölder’s inequality to the expectation of the absolute first error term and obtain, for any $p\in \mathbb{N}$,
Since $2\kappa-1<0$ and p can be chosen arbitrarily large, we conclude that the first error term is asymptotically negligible. We will use the elementary inequalities
Therefore, we can bound $|D^X_k-D^C_k|$ by
and the remainder term of the approximation for the continuous part, which is $\mathcal{O}_{\mathbb{P}}\big(h_n^{\alpha\wedge 1/2}\big)$ . Since the compensated small jumps of a semimartingale admit a martingale structure, Doob’s inequality for càdlàg $L_2$ -martingales can be used to bound these suprema. Based on these preliminaries, we obtain, for the expected absolute value of the second error term,
Applying the elementary inequalities from above, a cross term in the upper bound for $\big(D^X_k\big)^2-\big(D^C_k\big)^2$ is of smaller order and directly neglected. It can be handled using the Cauchy–Schwarz inequality. In the last step, we adopt a bound on the expected absolute thresholded jump increments from [Reference Aït-Sahalia and Jacod1, (54)]. For the negligibility of the second error term, we thus get the condition that
Doob’s inequality also yields
For this upper bound, we decomposed the jumps into the sum of large jumps and the martingale of compensated small jumps, to which we applied Doob’s inequality. We derive the following estimate for the expectation of the third (absolute) error term:
For the negligibility of the third error term, we thus get the condition that
Since, under the conditions of Theorem 2, (21) and (22) are satisfied, the proof is finished by the negligibility of all addends in the decomposition above.
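The truncation step analysed in this proof can be sketched as follows. This is a hedged illustration only: the exact form of the truncated estimator is given in an earlier part of the paper that is not reproduced here, so the function name `truncated_squared_differences`, the use of a non-strict inequality, and the default $\kappa=0{.}4$ (some value with $2\kappa-1<0$) are assumptions. The threshold is taken as $u_n=\beta h_n^{\kappa}$, in line with the normalization $\beta=1$ used above.

```python
import numpy as np

def truncated_squared_differences(local_minima, h_n, kappa=0.4, beta=1.0):
    """Squared differences of consecutive local minima, set to zero above a threshold.

    local_minima: block-wise minima m_{0,n}, m_{1,n}, ...
    h_n:          block length
    kappa, beta:  threshold parameters, u_n = beta * h_n**kappa (assumed form)
    """
    d = np.diff(np.asarray(local_minima, dtype=float))   # D_k = m_k - m_{k-1}
    u_n = beta * h_n ** kappa
    keep = np.abs(d) <= u_n                              # differences not attributed to jumps
    return np.where(keep, d ** 2, 0.0), keep

# Toy example: the second difference contains a jump of size about 0.5
squares, keep = truncated_squared_differences([0.0, 0.01, 0.5, 0.51], h_n=1e-3)
```

In the toy example the threshold is about 0.063, so only the difference containing the jump is discarded.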
Acknowledgement
The author is grateful to an anonymous reviewer for helpful comments.
Funding information
Financial support from the Deutsche Forschungsgemeinschaft (DFG) under grant 403176476 is gratefully acknowledged.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.