1 Introduction
In pairs trading, we focus on two securities whose spread of (log) prices is modelled as a mean-reverting stochastic process. When the spread widens, we make the higher security short and the cheaper one long. When the spread reverts to its mean, we clear the positions and make a profit. For a comprehensive review of studies on pairs trading, we can refer to Krauss [Reference Krauss11] and the references therein, where various schemes of pairs trading are well categorized and explained. As for the formation period, that is, the period to find and identify the comovements of a pair of security price processes, the distance approach and the cointegration approach are introduced as two major ones. As for the trading period after a suitable pair of securities is selected in the formation period, several trading rules have been studied in the pairs trading literature. Among these, the so-called threshold rules are widely employed to trigger trading signals in constructing pairs trading strategies, and a one-dimensional Ornstein–Uhlenbeck (OU) process is popularly used as a simple tractable model for the mean-reverting stochastic spread process. Zeng and Lee [Reference Zeng and Lee14] studied a static optimization problem for determining “optimal” preset thresholds, where a series expansion formula for the expected value of a certain first passage time of an OU process is utilized (see Remark 2.7 in Section 2). A related study of the OU process model by Bertram [Reference Bertram1] influenced Zeng and Lee’s work. In the present paper, we present a unified approach to Zeng and Lee’s problem for determining optimal thresholds for pairs trading with a general ergodic one-dimensional diffusion model for the stochastic spread process. Our results are summarized as follows.
-
(a) We present a general class of one-dimensional diffusion models (see Assumptions 2.1–4.2) which have symmetric stationary distributions. The class contains various tractable models: not only the OU process having a Gaussian stationary distribution, but also the Pearson diffusion process having a t-stationary distribution [Reference Forman and Sørensen7, Reference Wong and Bellman13], and the Jacobi diffusion process having a $\beta $ -stationary distribution [Reference Forman and Sørensen7, Reference Karlin and Taylor10].
-
(b) For the class of one-dimensional diffusion models stated in (a) above,
-
– an explicit, analytic formula for the expected value of a certain first passage time is derived (see Theorem 3.1 for details), and
-
– the static optimization problem for selecting the thresholds of pairs trading is solved, where the long-time averaged profit is used as the criterion function; a simple equation for the optimal thresholds, which involves a one-dimensional integral, is explicitly described (see Theorem 4.5 for details).
These analytically tractable results for general non-Gaussian models seem to be important, as the drawbacks of applying a Gaussian OU process model to non-Gaussian financial data have been pointed out by Bertram [Reference Bertram1] and Krauss [Reference Krauss11].
-
As evidence of the non-Gaussian nature of real financial time series data, we observe daily Nikkei 225 (Nikkei stock average) and TOPIX (Tokyo stock price index) data from the Japanese stock market from 1 June 2011 to 30 December 2020. Figure 1 shows price movements, Figure 2 shows log-price movements, Figure 3 shows the regression residual, $\log (\text {Nikkei 225})- 1.144\times \log (\text {TOPIX})-1.502$ (the spread of log-prices), and Figure 4 shows the quantile–quantile plot (see Chambers et al. [Reference Chambers, Cleveland, Kleiner and Tukey3]) of the residual data, which seem to be non-Gaussian (with sample kurtosis 3.412061).
2 Model set-up
Consider two security price processes $(P_t)_{t\ge 0}$ and $(Q_t)_{t\ge 0}$ in a continuous-time economy. We may regard P and Q as discounted price processes for simplicity. Suppose that fir some constants $\beta ,\lambda \in {\mathbb R}$ ,
follows a one-dimensional diffusion process. Concretely, $X=(X_t)_{t\ge 0}$ satisfies the stochastic differential equation (SDE)
on a filtered probability space $(\Omega ,{\mathcal F},{\mathbb P}, ({\mathcal F}_t)_{t\ge 0})$ endowed with the ${\mathcal F}_t$ -Brownian motion $(W_t)_{t\ge 0}$ , where $E=(l,r)$ ( $-\infty \le l<0<r\le \infty $ ) is the state space. Here, $\mu ,\sigma :E\to {\mathbb R}$ are continuous functions and $\sigma (x)^2>0$ for all $x\in E$ . Associated with X, we define the scale function
and the speed measure on E by
For the basic roles of the scale function and the speed measure in one-dimensional diffusion theory, we can refer to the works of Durret [Reference Durret4, Chapter 6] and Karatzas and Shreve [Reference Karatzas and Shreve9, Section 5.5], for example. We make the following assumptions.
Assumption 2.1 The boundary values of the scale function are
and $m(E)<\infty $ .
Assumption 2.2 The boundaries of the state space are $-l=r>0$ . In the SDE (2.2), $\mu $ is an odd function and $\sigma ^2$ is an even function, so, $\mu (x)=-\mu (-x)$ and $\sigma (x)^2=\sigma (-x)^2$ for all $x\in E$ .
Remark 2.3 Assumption 2.1 implies that X is recurrent and has the invariant probability measure, given by
Hence, X is ergodic, and its stationary distribution is given by (2.3). Moreover, we see that
where ${\mathbb P}_x(\cdot )={\mathbb P}(\cdot \mid X_0=x)$ . This relation is seen in Karatzas and Shreve [Reference Karatzas and Shreve9], which is an extension of a result obtained by Pollack and Siegmund [Reference Pollak and Siegmund12]. On the other hand, Assumption 2.2 implies the symmetry of the stationary distribution $\bar {m}$ on E. Further, it implies that s and $s''$ are odd functions and $s'$ is an even function.
Here are some examples which satisfy Assumptions 2.1 and 2.2.
Example 2.4 (Ornstein–Uhlenbeck process)
Let $\sigma \in {\mathbb R}_{++}$ be constant, let $\mu (x)=-\kappa \sigma ^2 x$ with $\kappa \in {\mathbb R}_{++}:=(0,\infty )$ , and let $E={\mathbb R}$ . The associated process is written as
In this case, we see that
and the stationary distribution is a centred Gaussian distribution.
Example 2.5 (Pearson diffusion)
Let $\mu (x)=-\kappa \gamma ^2 x$ and $\sigma (x)= \gamma \sqrt {\delta +x^2}$ with $\kappa ,\gamma ,\delta \in {\mathbb R}_{++}$ , and let $E={\mathbb R}$ . The associated process is written as
In this case, we see that
and the stationary distribution is a scaled t-distribution.
Example 2.6 (Jacobi diffusion)
Let $\mu (x)=-\kappa \gamma ^2x$ and $\sigma (x)= \gamma \sqrt {\delta ^2-x^2}$ with $\kappa , \gamma ,\delta \in {\mathbb R}_{++}$ , and let $E=(-\delta ,\delta )$ . The associated process is written as
In this case, we see that
and the stationary distribution is a centred and scaled beta distribution.
Remark 2.7 Forman and Sørensen [Reference Forman and Sørensen7] studied the parametrized one-dimensional diffusion process
with $\theta>0$ , $\mu ,a,b,c\in {\mathbb R}$ , which contains the above three examples, and shows the feasibility of explicit statistical inference of parameters.
3 Expected value formula for first passage time
For $y\in E$ , denote the first hitting time by
where we make the interpretation $\inf \emptyset =+\infty $ , with $\emptyset $ being the empty set. Using the simplified notation ${\mathbb E}_x[(\cdot )] ={\mathbb E}[ (\cdot ) \mid X_0=x]$ , we obtain the following theorem.
Theorem 3.1. Under Assumptions 2.1 and 2.2 , and for $l<-\alpha <\beta <\alpha <r$ ,
Remark 3.2 For the OU process given in Example 2.4, Zeng and Lee [Reference Zeng and Lee14] derived the series expansion formula
where we use the notation
for the gamma function. Using Theorem 3.1 for Example 2.4, we have the following different analytic representation:
Proof. Let $l<a<x<b<r$ . Recall that
where we define the Green function by
(see, for example, Durret [Reference Durret4, Chapter 6] or Karatzas and Shreve [Reference Karatzas and Shreve9, Section 5.5]). Note that
Letting $b\uparrow r$ and using Assumption 2.1 yields
Then, combining (3.2) with $x=\beta $ , $a=-\alpha $ , $b=\alpha $ and (3.3) with $x=\alpha $ , $a=\beta $ , we have
where we use the property that s is an odd function, that is, $s(x)=-s(-x)$ for $x\in E$ , and the symmetry of m, that is, $m(A)=m(-A)$ for $A\in {\mathcal B}(E)$ , which follow from Assumption 2.2. This completes the proof.
Remark 3.3 Under Assumption 2.1 (without imposing Assumption 2.2), we have for $l<\beta <\alpha <r$ ,
This assertion follows by combining equation (3.2) with $x=\alpha $ , $a=\beta $ , and
with $x=\beta $ , $b=\alpha $ . Equation (3.5) follows from (3.2), by letting $a\downarrow l$ and using Assumption 2.1. Note that formula (3.4) appeared in Karatzas and Shreve [Reference Karatzas and Shreve9]. Also, Bertram [Reference Bertram1] has derived this formula for the OU process given in Example 2.4.
4 Optimal thresholds for pairs trading
In this section, we apply Theorem 3.1 to compute optimal thresholds for pairs trading. Consider the situation where the stochastic spread process $X=(X_t)_{t\ge 0}$ (2.1) is governed by the one-dimensional SDE (2.2), and that Assumptions 2.1 and 2.2 are satisfied. First, we note that the SDE (2.2) has a weak solution, which is unique in the sense of probability law (for the definition of the uniqueness of the solution of SDE in the sense of probability law, see [Reference Karatzas and Shreve9, Chapter 5]). For the proof, we refer to Karatzas and Shreve [Reference Karatzas and Shreve9], where the case $E={\mathbb R}$ was treated, which can be straightforwardly modified to apply to our situation with $E\subset {\mathbb R}$ . Next, we note that
where the superscript $(x)$ denotes the starting position of the process X, that is, $x=X_0^{(x)} \in E$ , which follows from the uniqueness of the solution of SDE (2.2) and Assumption 2.2.
Now, inspired by Zeng and Lee [Reference Zeng and Lee14], we consider the following trading strategy. Let $a>0$ and $b \in [-a,a]$ be two preset thresholds, which determine the trading signals. We set $T_0=0$ , and, for $n\in {\mathbb N}$ , determine the starting time $S_n$ and the completion time $T_n$ of the nth trading cycle by
respectively. The nth trading cycle consists of the following two steps.
-
(A) At time $S_n$ , if $X_{S_n}=a$ (respectively, $X_{S_n}=-a$ ), set a one dollar short (respectively, long) position of security P and a $\beta $ dollar long (respectively, short) position of security Q. Keep these positions until time $T_n$ .
-
(B) At time $T_n$ , clear the positions and make a profit. Wait until the next starting time $S_{n+1}$ .
The profit of the nth trading cycle is (approximately) given by
where $\text {sgn}(x)$ denotes the sign of $x\in {\mathbb R}$ , and $c>0$ is the total transaction cost in the trading cycle. Here, recall that
are independent, identically distributed random variables, which are deduced from the time-homogeneity of X and the strong Markov property (the strong Markov property of the solution of the Markovian SDE can be found in Karatzas and Shreve [Reference Karatzas and Shreve9, Chapter 5]). We then define
which is a renewal process. The cumulative profit obtained from the completed trading until time t is
and the long-time averaged profit is
where we use the strong law of large numbers result in renewal theory [Reference Borovkov2, Chapter 10]. To determine the optimal threshold levels, we are interested in the maximization problem
where
and the objective function can be expressed as
by Theorem 3.1. To solve the problem expressed by (4.2)–(4.4), we impose the following conditions.
Assumption 4.1 For all $x\in E\setminus \{0\}$ , $x\mu (x)<0$ .
Assumption 4.2 In the case where $r=\infty $ , $\displaystyle \lim \nolimits _{x\uparrow r} s'(x)=\infty $ .
Remark 4.3 Assumption 4.1 implies that X is mean-reverting: the drift $\mu $ of X is always directed “inward”. It also implies that for all $x\in (0,r)$ ,
where we recall that for all $x\in E$ ,
which follow from Assumption 2.2 and Remark 2.3. Assumption 4.2 ensures that $L(a,b) \to 0$ as $a \uparrow r(=\infty )$ ; the long-time averaged profit becomes small if the threshold a is too large. The details can be found in the proof of Theorem 4.5.
We obtain the following theorem concerning the solution of the maximization problem (4.2).
Theorem 4.5. Suppose that Assumptions 2.1 – 4.2 are satisfied, and let
Then there exists a unique constant $a^*>0$ such that
and it defines the maximizer for (4.2) with (4.3) and (4.4) as follows:
Proof. Let
We see that
and $\partial _b f(a,b)\le 0$ for any $(a,b)\in {\mathcal A}_{+}=\{ (a,b)\in {\mathcal A} \mid b\ge 0 \}$ . Indeed, the inequality
holds for $(a,b)\in {\mathcal A}_{+}$ as s is convex on $(0,r)$ . Hence, we deduce that
where $ {\mathcal A}_{-}=\{ (a,b)\in {\mathcal A} \,|\, b\le 0 \}.$ Then we check the following.
-
(i) For $(a,b)\in {\mathcal A}_{-}$ , we see that
$$ \begin{align*} f(a,b)\le \frac{2a-c}{s(a)}=\bar{f}(a), \end{align*} $$as s is increasing. Further, we see that $\lim \nolimits _{a\uparrow r}\bar {f}(a)=0$ by Assumption 2.1 (if $r<\infty $ ) and l’Hôpital’s rule combined with Assumption 4.2 (if $r=\infty $ ). -
(ii) We compute the boundary values of f on ${\mathcal A}_-$ as
-
(a) $f(a,b)=f(a,a-c)=0$ on $\displaystyle \{(a,b)\in {\mathcal A} \,|\, b=a-c, a\ge {c}/{2}\}$ .
-
(b) $\displaystyle f(a,b)=f(a,0)=({a-c})/{s(a)}$ on $\displaystyle \{ (a,b)\in {\mathcal A} \,|\, b=0, a\ge c \}$ .
-
(c) $\displaystyle f(a,b)=f(a,-a)=({a-c/2})/{s(a)}$ on $\displaystyle \{ (a,b)\in E^2 \,|\, b=-a, a\ge {c}/{2}\}$ .
-
Hence, we deduce that the maximum of $f(a,b)$ on ${\mathcal A}_{-}$ exists, and that the maximizer(s) exist(s) in ${\mathcal A}_0\cup \partial _1 A$ , where
Here, we see that ${\mathcal A}_0\subset \partial _1 A$ . Indeed, $\partial _a f(a,b)=\partial _b f(a,b)=0$ for $(a,b)\in {\mathcal A}_-$ implies
from which we deduce the relations $s'(a)-s'(b)=0$ and $b=-a$ . Now, to compute the maximizer of
on $\partial _1{\mathcal A}(={\mathcal A}_0\cup \partial _1 A)$ , we first note the results
where the latter follows from $g(a)\le \bar {f}(a) \to 0$ as $a\uparrow r$ as we have seen. We next compute the derivative as
Here, note that $g'(c/2)=1/s( c/2)>0$ for $c>0$ and that $h'(a)=-(a-c/2)s''(a)< 0$ for $a> c/2$ . So, combining these observations with (4.6), we deduce that there exists a unique $a^*\in (c/2,r)$ such that $h(a^*)=0=g'(a^*)$ , and we have that $g'(a)>0$ for $a\in ( c/2,a^*)$ and $g'(a)<0$ for $a\in ( a^*,r)$ . Therefore, recalling (4.5), we conclude that
Remark 4.6 The resulting trading strategy with the optimal thresholds $(a^*,-a^*)$ is called the new optimal rule (NOR) by Zeng and Lee [Reference Zeng and Lee14]. Step (B) of this strategy has no waiting time, which is different from that of the conventional trading rule with thresholds $0=b<a$ . Thus, one trading cycle of the NOR is described as follows.
-
(A) When $X_{t_1}=a^*$ (respectively, $X_{t_1}=-a^*$ ), set a one dollar short (respectively, long) position of security P and a $\beta $ dollar long (respectively, short) position of security Q. Keep these positions as long as $X_t>-a^*$ (respectively, $X_t<a^*$ ).
-
(B) When $X_{t_2}=-a^*$ (respectively, $X_{t_2}=a^*$ ) ( $t_2>t_1$ ), clear the positions and make a profit. Restart immediately with step (A).
5 Asymptotic arbitrage
Pairs trading is sometimes explored in the context of so-called “statistical arbitrage”, as it is sometimes executed algorithmically at high frequency. In this section, we consider the asymptotic arbitrage property of the threshold strategy, which is a weak form of arbitrage in the long run. Let us consider the threshold strategy described in Section 4. The cumulative gain of the (self-financing) threshold strategy until time t is given by
where we regard P and Q as discounted price processes to cancel the interest rate effect. Here, we make the interpretation that
for example. As we have seen in Section 4, we can approximate the cumulative gain
Remark 5.1 On $\{ T_{N_t}\le t<S_{N_t+1}\}$ ,
which implies that the cumulative gain is always nonnegative and constant, until the new $(N_t+1)$ th trading cycle starts at time $S_{N_t+1}$ after the previous $N_t$ th trading cycle completes. On the other hand, note that, on $\{S_{N_t+1}\le t<T_{N_t+1}\}$ ,
which implies the cumulative gain can be negative, and hence the threshold strategy becomes risky until the new $(N_t+1)$ th trading cycle completes after starting at time $S_{N_t+1}$ .
We recall the notion of asymptotic arbitrage in the long run, which was introduced by Föllmer and Schachermayer [Reference Föllmer and Schachermayer6].
Definition 5.2 (Asymptotic arbitrage)
The cumulative gain process $(G_t)_{t\ge 0}$ realizes a strong asymptotic arbitrage (SAA) in the long run if it satisfies the following conditions:
-
(AA1) $G_0=0$ ;
-
(AA2) for any $\epsilon>0$ sufficiently small, there exists $T>0$ such that
$$ \begin{align*} G_T>-\epsilon \ \text{almost surely}\quad\text{and}\quad {\mathbb P}(G_T>\epsilon^{-1})\ge 1-\epsilon. \end{align*} $$
Let us impose the following condition.
Assumption 5.3 $\displaystyle \int _E |x|m\,(dx)<\infty $ .
We then obtain the following result.
Proposition 5.4 Suppose Assumptions 2.1 , 2.2 and 5.3 hold and let $a-b-c>0$ . Then the cumulative gain process $(G_t)_{t\ge 0}$ given by (5.1) satisfies
where $L(a,b)$ is given by (4.1), and the existence of an SAA follows.
Proof. First, recall the relation
Further, we see that
for any $x\in E$ , where we use Assumption 5.3 and relation (2.4). Combining (5.2), (5.3) and the law of large numbers for the renewal process $(N_t)_{t\ge 0}$ , we deduce that
So, for any $\delta>0$ ,
follows from Markov’s inequality. Further, for any $\delta ,\epsilon>0$ sufficiently small, there exists ${T}_*>0$ such that
Hence, for $T\ge \max ( T_*, (\{L(a,b)-\delta \}\epsilon )^{-1})$ ,
and an SAA is realized.
Remark 5.5 (Statistical arbitrage)
Hogan et al. [Reference Hogan, Jarrow, Teo and Warachka8] used the following definition. If the cumulative gain process $(G_{t})_{t\ge 0}$ satisfies the following four conditions, then we say that a statistical arbitrage (SA) opportunity exists:
-
(SA1) $G_0=0$ ;
-
(SA2) $\displaystyle \lim \nolimits _{t\to \infty }{\mathbb E}[G_{t}]>0$ ;
-
(SA3) $\displaystyle \lim \nolimits _{t\to \infty }{\mathbb P}(G_{t}<0)=0$ ;
-
(SA4) $\displaystyle \lim \nolimits _{t\to \infty }({1}/{t}) {\mathbb V}[G_{t}]=0$ if ${\mathbb P}(G_{t}<0)>0$ , for all $t\ge 0$ .
We conjecture that an SA does not exist in the cumulative gain process (5.1) of the threshold strategy. Indeed, (SA4) seems to be violated, if we recall the central limit theorem for the renewal process $(N_t)_{t\ge 0}$ discussed in Section 7.
6 Numerical experiment
In this section, using two examples of the stochastic spread processes, that is, Examples 2.4 and 2.5, we show some numerical experiments. Recall that the OU process (2.5) has the stationary distribution
the centered normal distribution with variance $1/2\kappa $ , and that Pearson diffusion process (2.6) has the stationary distribution,
which is a scaled t-distribution; concretely, we have
with
Hence, the limiting variance and kurtosis are given by
respectively, where ${\mathbb V}[ (\cdot )]$ denotes the variance and ${\mathbb K}[ (\cdot )]$ denotes the kurtosis.
For the optimization problem studied in Section 4, we consider the following numerical experiment for Pearson diffusion model (2.6).
-
(i) Set $\delta =(0.2)^2$ , $\gamma =1$ , and choose several values of $\kappa $ , which control the limiting variance and kurtosis (6.1) of the t-stationary distribution.
-
(ii) For each parameter set, numerically compute the optimal threshold value $a^*_{\mathrm {P}}$ and the optimal long-time averaged profit
$$ \begin{align*} L_{\mathrm{P}}^*=L_{\mathrm{P}}(a^*_{\mathrm{P}},-a^*_{\mathrm{P}}), \end{align*} $$given in Theorem 4.5, where $L_{\mathrm {P}}(a,b)$ is given by (4.4) for the Pearson diffusion model with $c=0.01$ .
Further, as a comparison, we consider the following numerical experiment for the OU process model (2.5).
-
(iii) Set the limiting variance $1/2\kappa_{\mathrm {OU}}$ of the OU process equal to the limiting variance $V^{\infty} =\delta/(2\kappa-1)$ of the Pearson diffusion process, to solve for the parameter value $\kappa_{\mathrm {OU}}$ :
$$ \begin{align*} \frac{1}{2\kappa_{\mathrm{OU}}}=\frac{\delta}{2\kappa-1} \quad \Leftrightarrow \quad \kappa_{\mathrm{OU}}= \frac{2\kappa-1}{2\delta}. \end{align*} $$Set $\sigma =0.2$ . -
(iv) For each corresponding parameter set, numerically compute the optimal threshold value $a^*_{\mathrm {OU}}$ and the optimal long-time averaged profit
$$ \begin{align*} L_{\mathrm{OU}}^*=L_{\mathrm{OU}}(a^*_{\mathrm{OU}},-a^*_{\mathrm{OU}}), \end{align*} $$given in Theorem 4.5, where $L_{\mathrm {OU}}(a,b)$ is given by (4.4) for the OU model with $c=0.01$ . -
(v) Compute
$$ \begin{align*} L_{\mathrm{mis}}=L_{\mathrm{P}}( a^*_{\mathrm{OU}}, -a^*_{\mathrm{OU}}), \end{align*} $$which is interpreted as the long-time expected profit for a “misspecified” agent who observes the limiting variance and “misapplies” OU model (2.5) instead of Pearson model (2.6). Also, compute the loss rate,$$ \begin{align*} \mathrm{Loss}= \frac{L_{\mathrm{P}}^*-L_{\mathrm{mis}}}{L_{\mathrm{P}}^*}. \end{align*} $$
From the result (Table 1) we see that $L^*_{\mathrm {P}}>L^*_{\mathrm {OU}}$ always holds though a model misspecification in the profit is rather small (as we see in the loss rate, Loss). Also, we see that higher mean-reversion (with larger $\kappa $ and $\kappa _{\mathrm {OU}}$ ) yields higher optimized profits.
7 Concluding remarks
The long-time maximization of profit for pairs trading with thresholds, discussed in Section 4, is an idealized problem, based on the law of large numbers in the long run:
To consider a suitable “mean–variance” optimization of profit is an interesting and important challenge as the fluctuation from the mean value seems to have a considerable effect in realistic situations with a finite time horizon. One such mean–variance optimization is to consider the criterion function
where
is the long-time limit of the mean-value of profit, which was analysed in Section 4. Note that
is the long-time limit of the variance of profit, and $\alpha>0$ is the risk-aversion parameter. The limiting variance (7.1) is obtained from the central limit theorem
for the scaled and centred renewal process (see Borovkov [Reference Borovkov2, Chapter 10]). Defining ${\mathbb V}_x[(\cdot )]={\mathbb E}_x[(\cdot )^2]- \{ {\mathbb E}_x[(\cdot )]\}^2$ , the variance ${\mathbb V}[U_1]$ in (7.1), rewritten as
can be computed by using Kac’s moment formula [Reference Fitzsimmons and Pitman5],
where we use the Green function (3.1), combined with the expected value formula (3.2). Although we have an analytic representation of $V(a,b)$ in closed form, the maximization of $\mathrm {MV}(a,b)$ does not seem to be straightforward, so this is left as a future research topic. For a different but related optimization for determining thresholds to trigger trading signals, we refer the reader to the Sharpe ratio maximization problem, discussed by Bertram [Reference Bertram1].
Acknowledgments
The authors are grateful to Chiaki Hara for fruitful discussions, and to the anonymous referees for helpful feedback leading to the improvements of this paper. Masaaki Fukasawa’s research is supported by a Grant-in-Aid for Scientific Research (A), no. 25245046, from the Japan Society for the Promotion of Science. Jun Sekine’s research is supported by a Grant-in-Aid for Scientific Research (C), no. 15K03540, from the Japan Society for the Promotion of Science.