
Continuous-time locally stationary time series models

Published online by Cambridge University Press:  20 June 2023

Annemarie Bitter*
Affiliation:
Ulm University
Robert Stelzer*
Affiliation:
Ulm University
Bennet Ströh*
Affiliation:
Imperial College London
*Postal address: Helmholtzstraße 18, 89069 Ulm, Germany.
*Postal address: Helmholtzstraße 18, 89069 Ulm, Germany.
****Postal address: 180 Queen’s Gate, London, SW7 2AZ, UK. Email address: b.stroh@imperial.ac.uk

Abstract

We adapt the classical definition of locally stationary processes in discrete time (see e.g. Dahlhaus, ‘Locally stationary processes’, in Time Series Analysis: Methods and Applications (2012)) to the continuous-time setting and obtain equivalent representations in the time and frequency domains. From this, a unique time-varying spectral density is derived using the Wigner–Ville spectrum. As an example, we investigate time-varying Lévy-driven state space processes, including the class of time-varying Lévy-driven CARMA processes. First, the connection between these two classes of processes is examined. Considering a sequence of time-varying Lévy-driven state space processes, we then give sufficient conditions on the coefficient functions that ensure local stationarity with respect to the given definition.

Type
Original Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

To model non-stationary time series that locally behave in a stationary manner, Dahlhaus and others, starting with the seminal paper [Reference Dahlhaus11], developed a comprehensive theory and powerful estimation procedures, using a parameterized sequence of processes for the definition of local stationarity (see e.g. [Reference Dahlhaus12, Reference Dahlhaus13, Reference Dahlhaus and Polonik15], or [Reference Dahlhaus14] for an overview). Notable examples include ARMA processes with continuous coefficient functions (see [Reference Dahlhaus11]). More recently, non-parametric approaches that allow for linear and non-linear locally stationary models have been introduced and investigated in [Reference Bardet, Doukhan and Wintenberger3, Reference Dahlhaus, Richter and Wu16, Reference Dahlhaus and Subba Rao17, Reference Vogt44, Reference Vogt and Dette45].

Despite this success, the above approaches have mainly been carried out for models defined on $\mathbb{Z}$ , i.e. in a discrete-time framework. Surprisingly, there is so far no general theory available for locally stationary models defined on $\mathbb{R}$ , i.e. in a continuous-time framework, following the original approach of Dahlhaus.

In this paper, we define local stationarity for continuous-time models in the spirit of [Reference Dahlhaus11, Reference Dahlhaus12]. More precisely, we establish a definition motivated by [Reference Dahlhaus11, Reference Dahlhaus12] in the frequency and time domains, and because we consistently use $L^2$ -integration theory (see e.g. [Reference Applebaum1] for an introduction), we readily obtain that both definitions are equivalent. Based on the definition in the frequency domain, we define a time-varying spectral density and show that it can be uniquely determined by a sequence of locally stationary processes, using the Wigner–Ville spectrum (see also [Reference Dahlhaus11]). This uniqueness is a powerful property, as it is known to pave the way for a likelihood approximation (comparable to the Whittle likelihood for stationary processes), leading to powerful estimation methods (see [Reference Dahlhaus13]).

As an example, we consider time-varying Lévy-driven state space processes, which include the continuous-time analogue of time-varying ARMA (time-varying CARMA) processes. Lévy-driven CARMA processes are known to provide a flexible yet analytically tractable class of processes that have been applied to model a variety of phenomena from different areas [Reference Benth, Klüppelberg, Müller and Vos4, Reference Larsson and Mossberg25, Reference Marquardt and Stelzer28].

In the time-invariant setting, it is known from [Reference Schlemm and Stelzer41] that the class of CARMA processes is equivalent to the class of Lévy-driven state space processes. While it is easy to see that every time-varying CARMA process is also a time-varying Lévy-driven state space process, we show that the converse inclusion fails to hold, at least for non-continuous coefficient functions. This motivates us to look at the class of time-varying Lévy-driven state space models.

As for previous appearances of continuous-time local stationarity in the literature, there are some noteworthy works on locally stationary Hawkes processes [Reference Mammen27, Reference Roueff and von Sachs34, Reference Roueff, von Sachs and Sansonnet35]. The approach in these works, as well as the nature of Hawkes processes, deviates significantly from our approach and the models we investigate, since for Hawkes processes the focus is on time-varying immigrant intensities and fertility functions. The paper [Reference Koo and Linton21] considers certain locally stationary diffusion models and their semiparametric estimation. The paper [Reference Matsuada and Yajima30] employs spatio-temporal random fields, especially of CARMA type, which are stationary in time and locally stationary in space. The definition of local stationarity in [Reference Matsuada and Yajima30] requires the moments of all orders to exist. The most recently published concurrent paper [Reference Kurisu24] defines locally stationary fields in the spirit of [Reference Vogt44] (see also [Reference Dahlhaus and Subba Rao17]) using a definition of local stationarity involving an almost sure condition. The jump parts of the Lévy-driven CARMA fields investigated there are restricted to the case of finite variation, and absolute continuity of the Lévy measure is assumed. Furthermore, the focus of that paper is on a non-parametric regression problem. In contrast to these works, we consistently work in an $L^2$ setting throughout, and we aim to establish a concise probabilistic theory of local stationarity in continuous time and locally stationary state space models.

The paper is structured as follows. In Section 2, we first review the definition of local stationarity in the discrete-time framework. Then in Section 2.2, we summarize basic facts about Lévy processes and orthogonal random measures, including integration with respect to them.

The novel definition of local stationarity for continuous-time models in both the frequency and time domains is given in Section 3. Moreover, we investigate asymptotic distributional properties of such models and show that the autocovariance function evaluated at distinct points tends to zero.

In Section 4, we investigate time-varying state space processes in the context of local stationarity. We start with a simple example in Section 4.1, where we consider a sequence of time-varying CAR(1) processes and give sufficient conditions on the coefficient function for the sequence to be locally stationary according to the given definition. Sections 4.2 and 4.3 are dedicated to general time-varying state space processes. First, in Section 4.2, the connection between the class of time-varying CARMA processes and time-varying state space processes is examined. Then we give sufficient conditions for a sequence of time-varying state space processes to be locally stationary.

Finally, in Section 5 we investigate the time-varying spectral density and the Wigner–Ville spectrum of locally stationary processes.

2. Preliminaries

Throughout this paper, we denote the set of positive integers by $\mathbb{N}$ , the set of non-negative real numbers by $\mathbb{R}^+_0$ , and the set of $m\times n$ matrices over a ring R by $M_{m\times n}(R)$ ; $\textbf{1}_n$ stands for the $n\times n$ identity matrix. Given a complex number z, we denote the complex conjugate of z by $\overline{z}$ . For square matrices $A,B\in M_{n\times n}(R)$ , $[A,B]=AB-BA$ denotes the commutator of A and B, Rank(A) the rank of A, and $\sigma(B)$ the spectrum of B. We write the transpose of a matrix $A \in M_{m\times n}(\mathbb{R})$ as A’ and the adjoint of a matrix $B \in M_{m\times n}(\mathbb{C})$ as $B^*$ . Norms of matrices and vectors are denoted by $\big\lVert \cdot \big\rVert$ . If the norm is not further specified, we take the Euclidean norm or its induced operator norm, respectively. For a complex number $z\in\mathbb{C}$ , the real part of z is denoted by $\mathfrak{Re}(z)$ . The Borel $\sigma$ -algebras are denoted by $\mathcal{B}(\!\cdot\!)$ and $\lambda$ stands for the Lebesgue measure, at least in the context of measures. In the following, we will assume all stochastic processes and random variables to be defined on a common complete probability space $(\Omega,\mathcal{F},\mathbb{P})$ equipped with an appropriate filtration if necessary. We simply write $L^p$ to denote the space $L^p(\Omega,\mathcal{F},\mathbb{P})$ and $L^p(X)$ to denote the space $L^p(X,\mathcal{B}(X),\lambda)$ for some set $X\subset\mathbb{R}$ with corresponding norms $\big\lVert \cdot \big\rVert_{L^p}$ . The ring of continuous functions in t from $\mathbb{R}$ to $\mathbb{R}$ is denoted by $\mathcal{R}[t]$ .

2.1. Locally stationary time series in discrete time

We follow the concept of local stationarity as established in [Reference Dahlhaus14] for discrete-time locally stationary time series models. In [Reference Dahlhaus14], the author considers a parametric representation of a sequence of non-stationary time-varying processes in either the time domain or the frequency domain, which has to satisfy certain regularity conditions.

In the following we briefly review the mathematical details of the aforementioned concepts, as well as the most important results. To this end, we define the total variation of a function g on [0, 1], denoted by V(g), as

\begin{align*}V(g)\;:\!=\;\sup\Bigg\{ \sum_{k=1}^m |g(x_k)-g(x_{k-1})|, 0\leq x_0 < \ldots < x_m \leq 1, m\in\mathbb{N} \Bigg\},\end{align*}

and for $\kappa>0$ we define

\begin{align*}\ell_\kappa(j)\;:\!=\;\begin{cases} 1,& |j|\leq1, \\[5pt] |j|\log^{1+\kappa}|j|, & |j|>1,\end{cases}\end{align*}

for all $j\in\mathbb{Z}$ . For further details on the following two definitions we refer to [Reference Dahlhaus14].

Definition 1. ([Reference Dahlhaus14, Assumption 1].) Let $\{X_{t,T}, t=1,\ldots,T\}_{T\in\mathbb{N}}$ be a sequence of stochastic processes. Then $X_{t,T}$ is called locally stationary in the time domain if there exists a representation

\begin{align*}X_{t,T}=\sum_{j=-\infty}^{\infty} a_{t,T,j}\varepsilon_{t-j}, \qquad T\in\mathbb{N}, \ t=1,\ldots,T,\end{align*}

where

  1. (a) $\{\varepsilon_t,t\in\mathbb{Z}\}$ is an independent and identically distributed (i.i.d.) sequence with $\mathbb{E}[\varepsilon_t]=0$ and $Var(\varepsilon_t)=1$ ,

  2. (b) for all $j\in\mathbb{Z}$ it holds that

    \begin{align*}\sup_{\substack{t=1,\ldots,T\\ T\in\mathbb{N}}}|a_{t,T,j}| \leq \frac{K}{\ell_\kappa(j)},\end{align*}
    where $\kappa,K>0$ are constants, and
  3. (c) there exist functions $a_j(\!\cdot\!)\;:\;(0,1]\rightarrow\mathbb{R}$ , $j\in\mathbb{Z}$ , satisfying

    (1) \begin{align}\sup_{u\in(0,1]}|a_j(u)|\leq \frac{K}{\ell_\kappa(j)}, \qquad\sup_{j\in\mathbb{Z}}\sum_{t=1}^T \bigg|a_{t,T,j}-a_j\bigg(\frac{t}{T}\bigg)\bigg| \leq K , \quad \text{ and }\ V(a_j(\!\cdot\!)) \leq \frac{K}{\ell_\kappa(j)}\end{align}
    for some constant K.
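As a minimal numerical sketch of Definition 1, consider a time-varying AR(1) recursion $X_{t,T}=\varphi(t/T)X_{t-1,T}+\varepsilon_t$, whose one-sided moving-average weights are $a_{t,T,j}=\prod_{k=0}^{j-1}\varphi((t-k)/T)$ with candidate limit $a_j(u)=\varphi(u)^j$. The coefficient curve `phi`, the choice $\kappa=1$, and all function names below are assumptions made for illustration only; they are not taken from the text. The code evaluates the second condition in (1) and the decay bound numerically.

```python
import math

def ell(j, kappa=1.0):
    """Weight l_kappa(j): 1 for |j| <= 1, else |j| * log(|j|)^(1 + kappa)."""
    j = abs(j)
    return 1.0 if j <= 1 else j * math.log(j) ** (1.0 + kappa)

def phi(u):
    """Hypothetical AR coefficient curve; the argument is clipped to [0, 1]."""
    return 0.3 + 0.4 * min(max(u, 0.0), 1.0)

def coeff_exact(t, T, j):
    """a_{t,T,j} = prod_{k=0}^{j-1} phi((t - k) / T) for j >= 0."""
    a = 1.0
    for k in range(j):
        a *= phi((t - k) / T)
    return a

def coeff_limit(u, j):
    """a_j(u) = phi(u)^j, the weight of a stationary AR(1) with parameter phi(u)."""
    return phi(u) ** j

def approximation_gap(T, max_lag=40):
    """max over 0 <= j <= max_lag of sum_t |a_{t,T,j} - a_j(t/T)|, cf. (1)."""
    return max(
        sum(abs(coeff_exact(t, T, j) - coeff_limit(t / T, j))
            for t in range(1, T + 1))
        for j in range(max_lag + 1)
    )

def decay_constant(max_lag=40):
    """max over j of sup_u |a_j(u)| * l_kappa(j); sup_u phi(u) = 0.7 here."""
    return max(0.7 ** j * ell(j) for j in range(max_lag + 1))

for T in (50, 200):
    print(T, approximation_gap(T))
print(decay_constant())
```

For this particular curve the gap stays bounded as T grows, because each factor of the product differs from $\varphi(t/T)$ by at most $0.4k/T$; this is the sense in which a single constant K can serve for all T.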

Definition 2. ([Reference Dahlhaus14, p. 382].) Let $\{X_{t,T}\;:\; t=1,\ldots,T\}_{T\in\mathbb{N}}$ be a sequence of stochastic processes. Then $X_{t,T}$ is called locally stationary in the frequency domain with transfer functions $A_{t,T}^0\;:\;[\!-\!\pi,\pi]\rightarrow\mathbb{C}$ , $T\in\mathbb{N}$ , $t=1,\ldots,T$ , if it has the representation

\begin{align*}X_{t,T}=\int_{-\pi}^{\pi} e^{i\lambda t} A_{t,T}^0(\lambda) \xi(d\lambda) \qquad \text{for all } T\in\mathbb{N},\ t=1,\ldots,T\end{align*}

(with the integrals existing in $L^2$ ), where

  1. (a) $\xi(\lambda)$ is a stochastic process on $[\!-\!\pi,\pi]$ with mean zero and orthogonal increments,

  2. (b) there exist a constant K and a function $A\;:\;[0,1]\times[\!-\!\pi,\pi]\rightarrow\mathbb{C}$ which is continuous in the first component, satisfying $\overline{A(u,\lambda)}=A(u,-\lambda)$ and

    (2) \begin{align}\sup_{\substack{t=1,\ldots,T, \\ \lambda\in[\!-\!\pi,\pi]}}\bigg|A_{t,T}^0(\lambda)-A\bigg(\frac{t}{T},\lambda\bigg)\bigg| \leq \frac{K}{T}, \qquad T\in\mathbb{N}.\end{align}

Remark 1. (a) Thanks to the smoothness conditions on the coefficient functions $a_j(u)$ and the transfer function $A(u,\lambda)$ , the sequence $X_{t,T}$ shows locally stationary behavior (see e.g. [Reference Dahlhaus12, Definition 2.1]).

  (b) For a comprehensive introduction to orthogonal increment processes, orthogonal random measures, and the related $L^2$ -integration theory, we refer to [Reference Brockwell and Davis8].

  (c) We note that the given definitions of local stationarity in the time and frequency domains are not equivalent.

    However, using the spectral representation of the noise $\varepsilon_t = \int_{(-\pi,\pi]} \frac{1}{\sqrt{2\pi}} e^{i\lambda t} \xi(d\lambda)$ (see [Reference Brockwell and Davis8]), the Fourier transform allows for the following connections (see [Reference Dahlhaus13, Remark 2.2]) between the two concepts. It holds that

    \begin{align*}A_{t,T}^0(\lambda) &= \frac{1}{\sqrt{2\pi}} \sum_{j=-\infty}^{\infty} a_{t,T,j} e^{-i\lambda j}, \quad & \quad A(u,\lambda) &= \frac{1}{\sqrt{2\pi}} \sum_{j=-\infty}^{\infty} a_j(u) e^{-i\lambda j}, \\[5pt] a_{t,T,j} &= \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} A_{t,T}^0(\lambda) e^{i\lambda j} d\lambda, \quad \text{ and} \quad & \quad a_j(u) &= \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} A(u,\lambda) e^{i\lambda j} d\lambda,\end{align*}
    since $A_{t,T}^0(\!\cdot\!)\in L^2([\!-\!\pi,\pi],\mathbb{C})$ and $a_{t,T,j}\in\ell^2$ .

    Sufficient conditions for Definitions 1 and 2 to be equivalent can be found in [Reference Dahlhaus and Polonik15, Remark 2.2]. In particular, this includes additional smoothness assumptions on $A(u,\lambda)$ and a stronger version of the second condition in (1).
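The Fourier relations in Remark 1(c) can be checked numerically in a simple case. The sketch below assumes the hypothetical limiting coefficients $a_j(u)=\varphi(u)^j$ for $j\geq0$ (and zero otherwise) with an illustrative curve `phi`; neither is taken from the text. It compares the series for $A(u,\lambda)$ with its closed form and recovers $a_j(u)$ from $A(u,\cdot)$ by a Riemann sum over $[-\pi,\pi)$.

```python
import cmath
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def phi(u):
    """Hypothetical coefficient curve on [0, 1] with values in [0.3, 0.7]."""
    return 0.3 + 0.4 * u

def transfer(u, lam):
    """Closed form of A(u, lam) = (2 pi)^(-1/2) sum_{j>=0} phi(u)^j e^{-i lam j}."""
    return 1.0 / (SQRT_2PI * (1.0 - phi(u) * cmath.exp(-1j * lam)))

def transfer_series(u, lam, terms=200):
    """Truncated series for A(u, lam); the tail is geometrically small."""
    return sum(phi(u) ** j * cmath.exp(-1j * lam * j) for j in range(terms)) / SQRT_2PI

def coefficient(u, j, grid=512):
    """a_j(u) = (2 pi)^(-1/2) int_{-pi}^{pi} A(u, lam) e^{i lam j} d lam (Riemann sum)."""
    h = 2.0 * math.pi / grid
    total = 0j
    for m in range(grid):
        lam = -math.pi + m * h
        total += transfer(u, lam) * cmath.exp(1j * lam * j)
    return (total * h / SQRT_2PI).real

print(coefficient(0.5, 3), phi(0.5) ** 3)
```

The equally spaced Riemann sum is used because the integrand is $2\pi$-periodic, so the only discretization error is aliasing from coefficients whose index differs from j by a multiple of the grid size.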

The following two propositions give further insight into Definition 2 and the notion of local stationarity.

Proposition 1. Let $X_{t,T}$ be a locally stationary process in the frequency domain and $\{T_n\}_{n\in\mathbb{N}}\subset\mathbb{N}$ an increasing sequence. If $sT_n\in\{1,\ldots,T_n\}$ for some fixed $s\in[0,1]$ and all $n>n_0$ , $n_0\in\mathbb{N}$ , then it holds that

\begin{align*}A_{sT_n,T_n}^0(\!\cdot\!) \xrightarrow[n\rightarrow\infty]{L^2} A(s,\cdot).\end{align*}

Proof. This follows directly from (2).
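To spell out this step (a short elaboration under the assumptions of Proposition 1): for $t=sT_n$ we have $t/T_n=s$, so the uniform bound (2) gives

\begin{align*}\big\lVert A_{sT_n,T_n}^0(\!\cdot\!)-A(s,\cdot) \big\rVert_{L^2([\!-\!\pi,\pi])}^2 = \int_{-\pi}^{\pi} \big| A_{sT_n,T_n}^0(\lambda)-A(s,\lambda) \big|^2 d\lambda \leq \frac{2\pi K^2}{T_n^2} \underset{n\rightarrow\infty}{\longrightarrow} 0,\end{align*}

since an increasing sequence $\{T_n\}_{n\in\mathbb{N}}\subset\mathbb{N}$ tends to infinity.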

For instance, the choice $T_n=2^n$ and $s=k/2^{n_0}$ for some $n_0\in\mathbb{N}$ and $k\in\{1,\ldots,T_{n_0}\}$ suits the conditions of Proposition 1.

Proposition 2. Let $X_{t,T}$ be a locally stationary process in the frequency domain with associated orthogonal increment process $\{\xi(\lambda),\lambda\in[\!-\!\pi,\pi]\}$ such that the sequence $\varepsilon_t = \int_{(-\pi,\pi]} \frac{1}{\sqrt{2\pi}} e^{i\lambda t} \xi(d\lambda)$ is i.i.d., and let $\{T_n\}_{n\in\mathbb{N}}\subset\mathbb{N}$ be an increasing sequence. If $sT_n\in\{1,\ldots,T_n\}$ for some fixed $s\in[0,1]$ and all $n>n_0$ , $n_0\in\mathbb{N}$ , then it holds that

\begin{align*}X_{sT_n,T_n}\overset{d}{\underset{n\rightarrow\infty}{\longrightarrow}}\int_{-\pi}^{\pi} A(s,\lambda) \xi(d\lambda).\end{align*}

Proof. First observe that every time series of the form $\int_{-\pi}^{\pi} e^{i\lambda t} A(\lambda) \xi(d\lambda)$ , $t\in\mathbb{Z}$ , where $\xi$ is an orthogonal increment process coming from an i.i.d. noise, is strictly stationary. Thus,

\begin{align*}\int_{-\pi}^{\pi} e^{i\lambda t_1} A_{sT_n,T_n}^0(\lambda) \xi(d\lambda)\overset{d}{=} \int_{-\pi}^{\pi} e^{i\lambda t_0} A_{sT_n,T_n}^0(\lambda) \xi(d\lambda)\end{align*}

for all $t_0,t_1\in\mathbb{Z}$ . In particular, for $t_1=sT_n$ , where $s\in[0,1]$ is such that $sT_n\in\{1,\ldots,T_n\}$ , and $t_0=0$ , we obtain

\begin{align*}X_{sT_n,T_n} =\int_{-\pi}^{\pi} e^{i\lambda sT_n} A_{sT_n,T_n}^0(\lambda) \xi(d\lambda) \overset{d}{=} \int_{-\pi}^{\pi} A_{sT_n,T_n}^0(\lambda) \xi(d\lambda).\end{align*}

The remainder follows from Proposition 1 and the continuity of the stochastic integral in mean square and thus in distribution with respect to the integrand.

Remark 2. A notable class of processes that are locally stationary in the time and frequency domains is that of time-varying AR(p) processes with continuous coefficient functions. For the mathematical details of this result we refer to [Reference Dahlhaus11, p. 147].

Among the variety of different concepts for local stationarity in the literature, we mention the results of [Reference Dahlhaus, Richter and Wu16, Reference Vogt44]. In [Reference Vogt44], the author considers a triangular array $X_{t,T}$ , $T\in\mathbb{N}$ , $t=1,\ldots,T$ , to be locally stationary if for each $u\in[0,1]$ there exists a strictly stationary process $\{X_t(u), t=1,\ldots,T\}$ such that almost surely

\begin{align*}\big| X_{t,T}-X_t(u) \big|\leq \bigg(\bigg| \frac{t}{T}-u \bigg|+\frac{1}{T}\bigg) U_{t,T}(u),\end{align*}

where $U_{t,T}(u)$ are positive random variables satisfying $\mathbb{E}[(U_{t,T}(u))^\rho]<\infty$ for some $\rho>0$ uniformly in u, t, and T. Time-varying AR(p) processes with continuous coefficient functions can also be embedded in this framework using arguments similar to those of [Reference Dahlhaus and Subba Rao17].

More recently, the authors of [Reference Dahlhaus, Richter and Wu16] developed a general theory for locally stationary processes based on stationary approximations. Similarly to [Reference Vogt44], the framework of [Reference Dahlhaus, Richter and Wu16] assumes that there exists a strictly stationary process $\{X_t(u),t=1,\ldots,T\}$ such that for some $q,C>0$ ,

(3) \begin{align}\big\lVert X_t(u)-X_t(v) \big\rVert_{L^q}\leq C \big| u-v \big| \quad \text{and}\qquad \bigg\lVert X_{t,T}-X_t\bigg(\frac{t}{T}\bigg) \bigg\rVert_{L^q}\leq \frac{C}{T}\end{align}

uniformly in $t=1,\ldots,T$ and $u,v\in[0,1]$ . Based on these approximations, the authors establish asymptotic results such as a law of large numbers and a central limit theorem, which in turn are used to derive asymptotic results for a maximum likelihood estimator (see [Reference Dahlhaus, Richter and Wu16, Section 5]). Again, time-varying AR(p) processes with continuous coefficient functions can be embedded in this framework. Recently, in [Reference Bardet, Doukhan and Wintenberger3], this work has been extended to models with infinite memory.

In view of the statistical results obtained from the approximations (3), a possible characterization of local stationarity in terms of similar approximations for continuous-time models will be the topic of future work.

2.2. Lévy processes and orthogonal random measures

In this section we lay the foundation for the definition of continuous-time locally stationary processes and briefly review Lévy processes and orthogonal random measures. We also cover basic results, including stochastic integration with respect to Lévy processes and orthogonal random measures. For further insight we refer to [Reference Applebaum1, Reference Sato39].

Definition 3. A real-valued stochastic process $L=\{L(t),t\in \mathbb{R}_0^+\}$ is called a Lévy process if

  1. (a) $L(0)=0$ almost surely,

  2. (b) for any $n\in\mathbb{N}$ and $t_0<t_1<t_2<\dots<t_n$ , the random variables $(L(t_0),L(t_1)-L(t_0),\dots,L(t_n)-L(t_{n-1}))$ are independent,

  3. (c) for all $s,t \geq 0$ , the distribution of $L(s+t)-L(s)$ does not depend on s, and

  4. (d) L is stochastically continuous.

Theorem 1. Let $L=\{L(t),t\geq0\}$ be a real-valued Lévy process. Then L(1) is an infinitely divisible real-valued random variable with characteristic triplet $(\gamma,\Sigma,\nu)$ , where $\gamma \in \mathbb{R}$ , $\Sigma\geq0$ , and $\nu$ is a Lévy measure on $\mathbb{R}$ , i.e. $\nu(\{0\})=0$ and $\int_{\mathbb{R}}(1\wedge\big| x \big|^2)\nu(dx)<\infty$ . The characteristic function of L(t) is given by

(4) \begin{align}\begin{aligned}\varphi_{L(t)}(z)&=\mathbb{E}\Big[e^{izL(t)}\Big]=\exp(t \Psi_{L}(z)),\\[5pt] \Psi_L(z)&=\Bigg( i\gamma z -\frac{\Sigma z^2}{2}+\int_{\mathbb{R} } \Big(e^{i z x}-1-i z x \mathbb{1}_{Z}(x) \Big)\nu(dx)\Bigg),\end{aligned}\end{align}

where $z \in \mathbb{R}$ and $Z=\{x \in \mathbb{R}, \big| x \big|\leq 1\}$ .

In the remainder we work with a two-sided Lévy process, i.e. $L(t)=L_1(t)\mathbb{1}_{\{t\geq0\}} - L_2(\!-\!t)\mathbb{1}_{\{t<0\}}$ , where $L_1$ and $L_2$ are independent copies of a one-sided Lévy process. Throughout this paper, it will be assumed that

(5) \begin{align} \mathbb{E}[L(1)]=\gamma + \int_{|x|> 1} x \nu(dx)=0 \text{ and } \mathbb{E}[L(1)^2]=\Sigma + \int_{x\in\mathbb{R}} x^2 \nu(dx)<\infty.\end{align}

Thus, the above assumptions on the Lévy process imply that $\int_\mathbb{R} x^2 \nu(dx)<\infty$ . Occasionally, we will write $\Sigma_L\;:\!=\;Var(L(1))=\Sigma+\int_{x\in\mathbb{R}} x^2 \nu(dx)$ .
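The moment identities in (5) can be read off from the characteristic exponent in (4), since $\Psi_L'(0)=i\,\mathbb{E}[L(1)]$ and $\Psi_L''(0)=-Var(L(1))$ when second moments exist. The following sketch checks this by finite differences for a hypothetical triplet whose Lévy measure has finitely many atoms; the numbers in `SIGMA` and `ATOMS` and all function names are illustrative assumptions, not values from the text.

```python
import cmath

# Hypothetical characteristic triplet: Gaussian variance SIGMA and a Levy
# measure given by finitely many atoms (jump size, mass).
SIGMA = 0.5
ATOMS = [(-2.0, 0.3), (0.5, 1.0), (3.0, 0.1)]
# Drift chosen so that gamma + int_{|x|>1} x nu(dx) = 0, as required in (5).
GAMMA = -sum(x * m for x, m in ATOMS if abs(x) > 1)

def levy_exponent(z):
    """Psi_L(z) from (4) for the discrete Levy measure above."""
    jumps = sum(
        m * (cmath.exp(1j * z * x) - 1 - (1j * z * x if abs(x) <= 1 else 0))
        for x, m in ATOMS
    )
    return 1j * GAMMA * z - SIGMA * z * z / 2 + jumps

def variance_from_triplet():
    """Sigma_L = Sigma + int x^2 nu(dx)."""
    return SIGMA + sum(x * x * m for x, m in ATOMS)

def moments_from_exponent(h=1e-3):
    """Mean and variance of L(1) via central differences of Psi_L at 0."""
    first = (levy_exponent(h) - levy_exponent(-h)) / (2 * h)   # approx i * mean
    second = (levy_exponent(h) - 2 * levy_exponent(0.0) + levy_exponent(-h)) / h ** 2
    return (first / 1j).real, (-second).real                  # (mean, variance)

print(moments_from_exponent(), variance_from_triplet())
```

The finite-difference step `h` trades truncation error against rounding error; the default is a pragmatic choice for this example rather than a recommendation.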

If the Lévy process satisfies (5) and $f\;:\;\mathbb{R}\times\mathbb{R}\rightarrow \mathbb{R}$ is a $\mathcal{B}(\mathbb{R}\times \mathbb{R}) - \mathcal{B}(\mathbb{R})$ -measurable function satisfying $f(t,\cdot)\in L^2(\mathbb{R})$ , then the integral $X(t)=\int_{\mathbb{R}} f(t,s) L(ds)$ , $t\in\mathbb{R}$ , exists in $L^2$ (see e.g. [Reference Marquardt and Stelzer28]).

Definition 4. ([Reference Krylov22, Definition 2.3.5].) A family $\{\xi(\Delta)\}_{\Delta\in\mathcal{B}(\mathbb{R})}$ of $\mathbb{C}$ -valued random variables is called an orthogonal random measure (ORM) if

  1. (a) $\xi(\Delta)\in L^2(\mathcal{B}(\mathbb{R}),\mathbb{C})$ for all bounded $\Delta\in\mathcal{B}(\mathbb{R})$ ,

  2. (b) $\xi(\emptyset)=0$ ,

  3. (c) $\xi(\Delta_1\cup\Delta_2)=\xi(\Delta_1)+\xi(\Delta_2)$ almost surely whenever $\Delta_1\cap\Delta_2=\emptyset$ , and

  4. (d) $F\;:\;\mathcal{B}(\mathbb{R})\rightarrow \mathbb{C}$ such that $F(\Delta)=\mathbb{E}[\xi(\Delta)\overline{\xi(\Delta)}]$ defines a $\sigma$ -additive positive definite measure, and it holds that $\mathbb{E}[\xi(\Delta_1)\overline{\xi(\Delta_2)}]=F(\Delta_1\cap\Delta_2)$ for all $\Delta_1,\Delta_2\in\mathcal{B}(\mathbb{R})$ .

F is referred to as the spectral measure of $\xi$ .

Theorem 2. ([Reference Marquardt and Stelzer28, Theorem 3.5].) Let L be a two-sided Lévy process satisfying (5). Then there exists an ORM $\Phi_L$ with spectral measure $F_L$ such that

  1. (a) $\mathbb{E}[\Phi_L(\Delta)]=0$ for any bounded $\Delta\in\mathcal{B}(\mathbb{R})$ ,

  2. (b) $F_L(dt)=\frac{\Sigma_L}{2\pi}$ dt, and

  3. (c) $\Phi_L$ is uniquely determined by $\Phi_L([a,b))\;:\!=\;\int_{-\infty}^\infty \frac{e^{-i\mu a}-e^{-i\mu b}}{2\pi i\mu} L(d\mu)$ .

In the proof of the above theorem, the standard theory of Fourier transforms on $L^2(\mathbb{R})$ (see e.g. [Reference Chandrasekharan10, Chapter 2] for an introduction) is used to show that

(6) \begin{align}\int_{-\infty}^\infty \varphi(\mu) \Phi_L(d\mu)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty \widehat{\varphi}(u) L(du)\end{align}

for any complex function $\varphi\in L^2(\mathbb{R})$ and its Fourier transform $\widehat{\varphi}$ , where

\begin{align*}\widehat{\varphi}(u)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-i\mu u} \varphi(\mu) d\mu.\end{align*}

Note that the above definition of the integral refers to the Plancherel extension theorem for the Fourier transform on $L^1(\mathbb{R})\cap L^2(\mathbb{R})$ to $L^2(\mathbb{R})$ . Hence, the corresponding integral is understood symbolically as the corresponding limit in $L^2(\mathbb{R})$ . Similarly, we define the inverse Fourier transform g of a function $\widehat{g}\in L^2(\mathbb{R})$ symbolically as

\begin{align*}g(\mu)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{i\mu u} \widehat{g}(u) du.\end{align*}

We also recall that for two complex functions $f,g\in L^2(\mathbb{R})$ and their Fourier transforms $\widehat{f}, \widehat{g}$ , it follows that $\widehat{f}, \widehat{g}\in L^2(\mathbb{R})$ and, by [Reference Rudin37, p. 189],

\begin{align*}\int_{-\infty}^\infty f(\mu)\overline{g(\mu)} d\mu = \int_{-\infty}^\infty \widehat{f}(u)\overline{\widehat{g}(u)} du.\end{align*}
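As a brief consistency check of these conventions (a worked example, not part of the cited proof), take $\varphi=\mathbb{1}_{[a,b)}$ with $-\infty<a<b<\infty$ . Then

\begin{align*}\widehat{\varphi}(u)=\frac{1}{\sqrt{2\pi}}\int_a^b e^{-i\mu u} d\mu = \frac{1}{\sqrt{2\pi}}\,\frac{e^{-iua}-e^{-iub}}{iu}, \qquad u\neq0,\end{align*}

so that (6) yields

\begin{align*}\Phi_L([a,b))=\int_{-\infty}^\infty \mathbb{1}_{[a,b)}(\mu) \Phi_L(d\mu) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty \widehat{\varphi}(u) L(du) = \int_{-\infty}^\infty \frac{e^{-iua}-e^{-iub}}{2\pi i u} L(du),\end{align*}

in agreement with Theorem 2(c). Moreover, the $L^2$ -isometry of the integral with respect to L together with the identity above gives

\begin{align*}\mathbb{E}\big[|\Phi_L([a,b))|^2\big] = \frac{\Sigma_L}{2\pi}\int_{-\infty}^\infty |\widehat{\varphi}(u)|^2 du = \frac{\Sigma_L}{2\pi}\int_{-\infty}^\infty |\varphi(\mu)|^2 d\mu = \frac{\Sigma_L}{2\pi}(b-a),\end{align*}

which matches the spectral measure $F_L$ in Theorem 2(b).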

3. Locally stationary processes in continuous time

Analogously to Section 2.1, one can define a (stationary) stochastic process $\{Y(t)\}_{t\in\mathbb{R}}$ via the representation as a linear process or the spectral representation, i.e.

\begin{align*}Y(t) = \int_{\mathbb{R}} g(t-u) L(du) \quad \text{or} \quad Y(t) = \int_{\mathbb{R}} e^{i\mu t} A(\mu) \Phi_L(d\mu), \qquad t\in\mathbb{R},\end{align*}

where g and A are square-integrable functions and L is a two-sided Lévy process with corresponding ORM $\Phi_L$ . As we consistently use $L^2$ -integrals to define the process in both the time and the frequency domain, and the Fourier transform is an isometry on $L^2$ , the two definitions are equivalent. Hence, from (6) it follows that the transfer function A and the kernel g satisfy

\begin{align*}g(u) = \frac{1}{2\pi} \int_{-\infty}^\infty e^{i\mu u} A(\mu) d\mu \quad \text{and} \quad A(\mu)= \int_{-\infty}^\infty e^{-i\mu v} g(v) dv.\end{align*}
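To make the kernel and transfer function pair concrete, the sketch below uses the exponential kernel $g(u)=e^{-au}\mathbb{1}_{\{u\geq0\}}$ , for which the formula above gives $A(\mu)=1/(a+i\mu)$ . This kernel and the value of `A_PARAM` are assumptions chosen for illustration; the code compares the closed form with a trapezoidal approximation of $\int e^{-i\mu v}g(v)dv$ and checks the symmetry $\overline{A(\mu)}=A(\!-\!\mu)$ that corresponds to a real-valued kernel.

```python
import cmath
import math

A_PARAM = 1.0  # hypothetical decay rate a > 0

def kernel(u):
    """g(u) = exp(-a u) for u >= 0 and 0 otherwise."""
    return math.exp(-A_PARAM * u) if u >= 0 else 0.0

def transfer(mu):
    """Closed form A(mu) = int e^{-i mu v} g(v) dv = 1 / (a + i mu)."""
    return 1.0 / (A_PARAM + 1j * mu)

def transfer_numeric(mu, upper=40.0, steps=40000):
    """Trapezoidal approximation of int_0^upper e^{-i mu v} g(v) dv."""
    h = upper / steps

    def integrand(v):
        return cmath.exp(-1j * mu * v) * kernel(v)

    inner = sum(integrand(k * h) for k in range(1, steps))
    return h * (0.5 * integrand(0.0) + inner + 0.5 * integrand(upper))

print(transfer(2.0), transfer_numeric(2.0))
```

Truncating the integral at `upper` is harmless here because the kernel decays exponentially; for slowly decaying kernels the cut-off would need more care.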

Now, we allow the kernel function and the transfer function to be time-dependent, leading to

\begin{align*}Y(t) = \int_{\mathbb{R}} e^{i\mu t} A(t,\mu) \Phi_L(d\mu) = \int_{\mathbb{R}} g(t,t-u) L(du), \qquad t\in\mathbb{R},\end{align*}

where

\begin{align*} g(t,u) = \frac{1}{2\pi} \int_{-\infty}^\infty e^{i\mu u} A(t,\mu) d\mu\quad \text{and} \quad A(t,\mu) = \int_{-\infty}^\infty e^{-i\mu u} g(t,u) du.\end{align*}

As we are interested in real-valued processes, we require that g be real-valued, or equivalently that $\overline {A(t,\mu)}=A(t,\!-\!\mu)$ for all $t,\mu\in\mathbb{R}$ .

To be able to define local stationarity analogously to Section 2.1, we need not only a time-varying representation, but also a sequence of stochastic processes. The intuitive idea is to take a limiting kernel g and a sequence of kernels $g_N^0$ defining the processes in the time domain, such that

\begin{align*}\bigg\lVert g_N^0(t,\cdot)-g\bigg(\frac{t}{N},\cdot\bigg) \bigg\rVert_{L^2} \underset{N\rightarrow\infty}{\longrightarrow} 0.\end{align*}

However, for the limiting (stationary) process we prefer to fix a time $t\in\mathbb{R}$ rather than dealing with fractions $\frac{t}{N}$ . Replacing t by Nt leads to the following definition.

Definition 5. A sequence of stochastic processes $\{Y_N(t),t\in\mathbb{R}\}_{N\in\mathbb{N}}$ is said to be locally stationary in the time domain if it can be represented as

\begin{align*}Y_N(t) = \int_{\mathbb{R}} g_N^0(Nt,Nt-u) L(du),\quad \text{for all }t\in\mathbb{R},\ N\in\mathbb{N},\end{align*}

where L is a two-sided Lévy process and the kernel functions $g_N^0\;:\;\mathbb{R}\times\mathbb{R}\rightarrow\mathbb{R}$ satisfy the following:

  1. (a) $g_N^0(t,\cdot)\in L^2(\mathbb{R})$ for all $t\in\mathbb{R}$ , $N\in\mathbb{N}$ , and

  2. (b) there exists a (local/limiting kernel) function $g\;:\;\mathbb{R}\times\mathbb{R}\rightarrow\mathbb{R}$ such that the mapping $\mathbb{R}\rightarrow L^2(\mathbb{R})$ , $t\mapsto g(t,\cdot)$ is continuous and

    \begin{align*}g_N^0(Nt,\cdot) \underset{N\rightarrow\infty}{\overset{L^2}{\longrightarrow}} g(t,\cdot) \quad \text{for all } t\in\mathbb{R}.\end{align*}
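Condition (b) can be illustrated with a hypothetical time-varying CAR(1)-type kernel $g_N^0(Nt,v)=\mathbb{1}_{\{v\geq0\}}\exp\big(\!-\!\int_{Nt-v}^{Nt}a(s/N)ds\big)$ and candidate limit $g(t,v)=\mathbb{1}_{\{v\geq0\}}e^{-a(t)v}$ . The coefficient function $a(s)=1+0.5\sin(s)$ and all names in the code are assumptions for illustration only. For this choice the inner integral has the closed form $v-0.5N(\cos t-\cos(t-v/N))$ , and the sketch approximates the $L^2$ distance between the two kernels for growing N.

```python
import math

def rate(s):
    """Hypothetical coefficient function a(s) = 1 + 0.5 sin(s), bounded below by 0.5."""
    return 1.0 + 0.5 * math.sin(s)

def kernel_n(N, t, v):
    """g_N^0(Nt, v) = 1_{v>=0} exp(-int_{Nt-v}^{Nt} a(s/N) ds), via the closed form."""
    if v < 0:
        return 0.0
    integral = v - 0.5 * N * (math.cos(t) - math.cos(t - v / N))
    return math.exp(-integral)

def kernel_limit(t, v):
    """g(t, v) = 1_{v>=0} exp(-a(t) v), the kernel with the coefficient frozen at t."""
    return math.exp(-rate(t) * v) if v >= 0 else 0.0

def l2_distance(N, t, upper=60.0, steps=6000):
    """Trapezoidal approximation of || g_N^0(Nt, .) - g(t, .) ||_{L^2}."""
    h = upper / steps

    def squared_gap(v):
        return (kernel_n(N, t, v) - kernel_limit(t, v)) ** 2

    total = 0.5 * (squared_gap(0.0) + squared_gap(upper))
    total += sum(squared_gap(k * h) for k in range(1, steps))
    return math.sqrt(h * total)

for N in (10, 100, 1000):
    print(N, l2_distance(N, 1.0))
```

In this example the distance shrinks as N grows, which is the behaviour condition (b) asks for at each fixed t; the sketch does not address the continuity of $t\mapsto g(t,\cdot)$ .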

Definition 6. A sequence of stochastic processes $\{Y_N(t),t\in\mathbb{R}\}_{N\in\mathbb{N}}$ is said to be locally stationary in the frequency domain if it can be represented as

(7) \begin{align}Y_N(t) = \int_{\mathbb{R}} e^{i\mu Nt} A_N^0(Nt,\mu) \Phi_L(d\mu),\quad \text{for all }t\in\mathbb{R},\ N\in\mathbb{N},\end{align}

where $\Phi_L$ is the ORM of a two-sided Lévy process L, and the transfer functions $A_N^0\;:\;\mathbb{R}\times\mathbb{R}\rightarrow\mathbb{C}$ satisfy the following:

  1. (a) $A_N^0(t,\cdot)\in L^2(\mathbb{R})$ for all $t\in\mathbb{R}$ , $N\in\mathbb{N}$ ,

  2. (b) $\overline{A_N^0(\cdot,\cdot)}=A_N^0(\cdot,-\!\cdot\!)$ , and

  3. (c) there exists a (local/limiting transfer) function $A\;:\;\mathbb{R}\times\mathbb{R}\rightarrow\mathbb{C}$ with $\overline{A(\cdot,\cdot)}=A(\cdot,-\!\cdot\!)$ such that the mapping $\mathbb{R} \rightarrow L^2(\mathbb{R})$ , $t\mapsto A(t,\cdot)$ is continuous and

    \begin{align*}A_N^0(Nt,\cdot) \underset{N\rightarrow\infty}{\overset{L^2}{\longrightarrow}} A(t,\cdot),\quad \text{for all } t\in\mathbb{R}.\end{align*}

In contrast to the discrete-time case, it is now irrelevant whether we use the definition in the time or the frequency domain. Therefore, we will just use the term ‘locally stationary’ in both cases.

Proposition 3. Definitions 5 and 6 are equivalent. Moreover, the relationship between the (limiting) transfer function and the (limiting) kernel, using their Fourier transforms, is given by

\begin{align*}A_N^0(Nt,\mu) &= \int_{-\infty}^\infty e^{-i\mu u} g_N^0(Nt,u) du ,\qquad & \quad A(t,\mu) &= \int_{-\infty}^\infty e^{-i\mu u} g(t,u) du,\\[5pt] g_N^0(Nt,u) &= \frac{1}{2\pi} \int_{-\infty}^\infty e^{i\mu u} A_N^0(Nt,\mu) d\mu, \quad \textit{ and}\qquad & \quad g(t,u) &= \frac{1}{2\pi} \int_{-\infty}^\infty e^{i\mu u} A(t,\mu) d\mu.\end{align*}

Proof. The result follows immediately from Definitions 5 and 6 using Plancherel’s theorem.

The following lemma provides sufficient conditions for the continuity conditions on the mappings $t\mapsto A(t,\cdot)$ and $t\mapsto g(t,\cdot)$ from Definitions 5 and 6.

Lemma 1. Let $A\;:\;\mathbb{R}\times\mathbb{R}\rightarrow\mathbb{C}$ be a function which is continuous in the first argument, such that for all $t\in\mathbb{R}$ there exist an $\epsilon_t>0$ and a real function $f_t\in L^2(\mathbb{R})$ such that $\big| A(s,\cdot) \big|\leq f_t(\!\cdot\!)$ for all $s\in[t-\epsilon_t,t+\epsilon_t]$ . Then the mapping $\mathbb{R} \to L^2(\mathbb{R})$ , $t\mapsto A(t,\cdot)$ is continuous.

Proof. This is a straightforward application of the dominated convergence theorem.
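In more detail (a sketch of the omitted argument): fix $t\in\mathbb{R}$ and let $s_n\rightarrow t$ . For n large enough, $s_n\in[t-\epsilon_t,t+\epsilon_t]$ , and hence

\begin{align*}\big| A(s_n,\mu)-A(t,\mu) \big|^2 \leq 4 f_t(\mu)^2 \quad \text{for all } \mu\in\mathbb{R}, \qquad \text{with } \int_{\mathbb{R}} 4f_t(\mu)^2 d\mu<\infty.\end{align*}

Since $A(s_n,\mu)\rightarrow A(t,\mu)$ for every $\mu$ by continuity in the first argument, dominated convergence gives

\begin{align*}\big\lVert A(s_n,\cdot)-A(t,\cdot) \big\rVert_{L^2}^2 = \int_{\mathbb{R}} \big| A(s_n,\mu)-A(t,\mu) \big|^2 d\mu \underset{n\rightarrow\infty}{\longrightarrow} 0.\end{align*}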

In principle it is possible to replace the Lévy process by a process with weakly stationary uncorrelated increments and the ORM induced by the Lévy process by an arbitrary ORM. The resulting processes would be (locally) weakly stationary. However, at the moment it does not seem worthwhile to us to pursue this any further, for the following reason. To derive a continuous-time analogue of Proposition 2, the stationary and independent increments of the driving Lévy process L are essential. Therefore, the ORM in Definition 6 also has to be generated by a stochastic process on $\mathbb{R}$ with independent and stationary increments, i.e. by a Lévy process.

We note that this also ensures for all $\tilde{t}\in\mathbb{R}$ that the limiting process $Y_{\tilde{t}}(t)=\int_\mathbb{R} g(\tilde{t}, t-u) L(du)$ is strictly stationary. The next proposition provides the aforementioned continuous-time analogue of Proposition 2.

Proposition 4. Let $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ be a locally stationary process. Then, for fixed $t\in\mathbb{R}$ ,

\begin{align*}Y_N(t) \overset{d}{\underset{N\rightarrow\infty}{\longrightarrow}} \int_{\mathbb{R}} A(t,\mu) \Phi_L(d\mu) = \int_{\mathbb{R}} g(t,-u) L(du).\end{align*}

Proof. For $t\in\mathbb{R}$ we obtain, using a stationarity argument,

\begin{align*}Y_N(t) = \int_{\mathbb{R}} e^{i\mu Nt} A_N^0(Nt,\mu) \Phi_L(d\mu) \overset{d}{=} \int_{\mathbb{R}} A_N^0(Nt,\mu) \Phi_L(d\mu).\end{align*}

The remainder follows from the continuity of the stochastic integral in mean square and thus in distribution with respect to the integrand.
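The mean-square step can be made explicit using the spectral measure $F_L(d\mu)=\frac{\Sigma_L}{2\pi}d\mu$ from Theorem 2 (a short elaboration, not part of the original proof):

\begin{align*}\mathbb{E}\Bigg[\bigg| \int_{\mathbb{R}} A_N^0(Nt,\mu) \Phi_L(d\mu) - \int_{\mathbb{R}} A(t,\mu) \Phi_L(d\mu) \bigg|^2\Bigg] = \frac{\Sigma_L}{2\pi} \big\lVert A_N^0(Nt,\cdot)-A(t,\cdot) \big\rVert_{L^2}^2 \underset{N\rightarrow\infty}{\longrightarrow} 0\end{align*}

by Definition 6(c), and convergence in $L^2$ implies convergence in distribution.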

Proposition 5. Let $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ be a locally stationary sequence and $t_1,t_2\in\mathbb{R}$ such that $t_1\neq t_2$ . Then $Y_N(t_1)$ and $Y_N(t_2)$ are asymptotically uncorrelated, i.e. $Cov(Y_N(t_1),Y_N(t_2))\rightarrow0$ as $N\rightarrow\infty$ .

The intuition behind this proposition is that the kernel functions $g_N^0(Nt,Nt-\cdot)$ are square-integrable and therefore roughly vanish if the second argument tends to infinity. For $t_1\neq t_2$ , the difference between $Nt_1$ and $Nt_2$ increases for $N\rightarrow\infty$ . Therefore, for large N, the bulks of the kernels for $t_1$ and $t_2$ rest on far-apart segments of the Lévy process, which has independent increments.

Proof. Let $Y_N(t) = \int_{\mathbb{R}} g_N^0(Nt,Nt-u) L(du)$ be a sequence of locally stationary processes. Without loss of generality we assume that $t_1>t_2$ and set $h=t_1-t_2>0$ . It is sufficient to show that for all $t_2\in\mathbb{R}$ and $\varepsilon>0$ there exists an $N_0\in\mathbb{N}$ such that for all $N>N_0$ ,

\begin{align*}&\big| Cov(Y_N(t_2),Y_N(t_2+h)) \big| \\[5pt] &=\Sigma_L \Bigg| \int_\mathbb{R} g_N^0(Nt_2,Nt_2-u) g_N^0(N(t_2+h),N(t_2+h)-u) du \Bigg| <\varepsilon.\end{align*}

Let $t\in\mathbb{R}$ , and define $\mathcal{E}$ as the set of all elementary real functions in $L^2(\mathbb{R})$ , i.e.

\begin{align*}\mathcal{E}=\Bigg\{ f\in L^2(\mathbb{R})\;:\; f=\sum_{i=1}^n c_i \mathbb{1}_{[a_i,b_i)}, n\in\mathbb{N}, c_i\in\mathbb{R}, -\infty<a_i<b_i<\infty, i=1,\ldots,n \Bigg\}.\end{align*}

Then for all $\eta>0$ there exist $N_1\in\mathbb{N}$ and elementary functions $\hat{g}(t_2,\cdot),\hat{g}(t_2+h,\cdot)\in\mathcal{E}$ such that for all $N>N_1$ ,

\begin{align*}\big\lVert g(t_2,\cdot)-\hat{g}(t_2,\cdot) \big\rVert_{L^2}&<\eta,\qquad & \qquad \big\lVert g(t_2+h,\cdot)-\hat{g}(t_2+h,\cdot) \big\rVert_{L^2}&<\eta,\\[5pt] \Big\lVert g_N^0(Nt_2,\cdot)-g(t_2,\cdot) \Big\rVert_{L^2}&<\eta, \quad \text{ and}\qquad & \qquad \Big\lVert g_N^0(N(t_2+h),\cdot)-g((t_2+h),\cdot) \Big\rVert_{L^2}&<\eta,\end{align*}

using [Reference Royden36, Proposition 6.8]. For the remainder of the proof, it will be assumed that $N>N_1$ . Thus,

\begin{align*} \Big\lVert g_N^0(N(t_2+h),\cdot) \Big\rVert_{L^2} \leq \eta + \big\lVert g(t_2+h,\cdot) \big\rVert_{L^2}\quad \text{ and}\quad \Big\lVert g_N^0(N(t_2),\cdot) \Big\rVert_{L^2} \leq \eta + \big\lVert g(t_2,\cdot) \big\rVert_{L^2}.\end{align*}

We define the constant $K= \eta+\max\big\{\big\lVert g(t_2,\cdot) \big\rVert_{L^2} , \big\lVert g(t_2+h,\cdot) \big\rVert_{L^2} \big\}<\infty$ . Then, using the triangle inequality and the Cauchy–Schwarz inequality shows that

\begin{align*}|Cov(Y_N(t_2),Y_N(t_2+h))| &= \Sigma_L \Bigg| \int_\mathbb{R} g_N^0(Nt_2,Nt_2-u) g_N^0(N(t_2+h),N(t_2+h)-u) du \Bigg| \\[5pt] &\leq \Sigma_L \Big( \Big\|g_N^0(Nt_2,\cdot)-g(t_2,\cdot)\Big\|_{L^2} \Big\|g_N^0(N(t_2+h),\cdot)\Big\|_{L^2} \\[5pt] &\quad+ \big\|g(t_2,\cdot)\big\|_{L^2} \Big\|g_N^0(N(t_2+h),\cdot)-g(t_2+h,\cdot)\Big\|_{L^2} \\[5pt] &\quad+ \big\|g(t_2,\cdot)-\hat{g}(t_2,\cdot)\big\|_{L^2} \big\|g(t_2+h,\cdot)\big\|_{L^2} \\[5pt] &\quad+ \big\|\hat{g}(t_2,\cdot)\big\|_{L^2} \big\|g(t_2+h,\cdot)-\hat{g}(t_2+h,\cdot)\big\|_{L^2} \\[5pt] &\quad+ \int_\mathbb{R} \big|\hat{g}(t_2,Nt_2-u)\big| \; \big|\hat{g}(t_2+h,N(t_2+h)-u)\big| du \Big) \\[5pt] &\leq \Sigma_L \Bigg( 4\eta K+ \int_\mathbb{R} \big|\hat{g}(t_2,Nt_2-u)\big| \; \big|\hat{g}(t_2+h,N(t_2+h)-u)\big| du \Bigg),\end{align*}

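where the last integral tends to zero for $N\rightarrow\infty$ , by the dominated convergence theorem and the fact that the elementary functions $\hat{g}$ have bounded support. Choosing first $\eta>0$ such that $4\eta K\Sigma_L<\varepsilon/2$ and then $N_0\geq N_1$ such that $\Sigma_L$ times the last integral is smaller than $\varepsilon/2$ for all $N>N_0$ completes the proof.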
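As a numerical sanity check of Proposition 5, the following Python sketch treats the simplest case of a constant-coefficient CAR(1) kernel $g(t,u)=\mathbb{1}_{\{u\geq 0\}}e^{-au}$ , for which the covariance integral above evaluates to $\Sigma_L e^{-aNh}/(2a)$ . The values $a=1$ , $\Sigma_L=1$ , $h=0.5$ , the truncation width, and all function names are illustrative assumptions and are not taken from the paper.

```python
import math

def kernel(v, a):
    """Constant-coefficient CAR(1) kernel 1_{v >= 0} * exp(-a v) (illustrative choice)."""
    return math.exp(-a * v) if v >= 0.0 else 0.0

def cov_numeric(N, h, a=1.0, sigma_L=1.0, width=40.0, steps=40000):
    """Trapezoidal approximation of Sigma_L * int g(N t2 - u) g(N (t2 + h) - u) du.

    After the substitution v = N t2 - u the integrand is g(v) g(v + N h),
    so the result does not depend on t2. The integral is truncated at v = width.
    """
    step = width / steps
    total = 0.0
    for k in range(steps + 1):
        v = k * step
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * kernel(v, a) * kernel(v + N * h, a)
    return sigma_L * total * step

def cov_exact(N, h, a=1.0, sigma_L=1.0):
    """Closed form Sigma_L * exp(-a N h) / (2 a) for the constant-coefficient kernel."""
    return sigma_L * math.exp(-a * N * h) / (2.0 * a)

def covariance_table(h=0.5, Ns=(1, 2, 4, 8, 16)):
    """Pairs (numerical, exact) covariance for increasing N."""
    return {N: (cov_numeric(N, h), cov_exact(N, h)) for N in Ns}
```

Calling `covariance_table()` shows the covariance shrinking geometrically in $N$ for fixed $h>0$ , which is the behaviour the proposition establishes in general.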

4. Classes of locally stationary processes in continuous time

In this section, we consider sequences of time-varying CARMA processes, for which we derive sufficient conditions for local stationarity.

4.1. Locally stationary CAR(1) processes

The simplest Lévy-driven CARMA process is the Lévy-driven CAR(1) or Ornstein–Uhlenbeck-type process.

For a constant coefficient $a>0$ , a CAR(1) process is the stationary solution to the stochastic differential equation $dY(t)=-aY(t)dt+L(dt)$ , which can be expressed as

\begin{align*}Y(t)=\int_{-\infty}^t e^{-a (t-u)} L(du).\end{align*}

We replace the constant a by a time-varying function a(t) and arrive at a so-called time-varying CAR(1) process, which is given by

\begin{align*}Y(t)=\int_{-\infty}^t e^{-\int_u^t a(s) ds} L(du).\end{align*}

Additional rescaling results in a sequence of time-varying CAR(1) processes that could be locally stationary. We consider the sequence of stochastic processes $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ defined by

(8) \begin{align} Y_N(t) &= \int_{-\infty}^{Nt} e^{-\int_u^{Nt} a(\frac{s}{N}) ds} L(du),\end{align}

where $a\;:\;\mathbb{R}\rightarrow\mathbb{R}$ is a continuous coefficient function such that

\begin{align*} u\mapsto e^{-\int_u^{Nt} a(\frac{s}{N}) ds}\end{align*}

is square-integrable for all $t\in\mathbb{R}$ , $N\in\mathbb{N}$ , and L is a two-sided Lévy process. Recall that the Lévy process satisfies (5). In view of Definition 3, we obtain from (8) that

(9) \begin{align}\begin{aligned}g_N^0(Nt,Nt-u) &= \mathbb{1}_{\{Nt-u\geq 0\}} e^{-\int_u^{Nt} a(\frac{s}{N}) ds} &&\!\!\!= \mathbb{1}_{\{Nt-u\geq 0\}} e^{-\int_{-(Nt-u)}^0 a(\frac{s}{N}+t) ds} \quad \text{ and} \\[5pt] A_N^0(Nt,\mu) &= \int_\mathbb{R} e^{-i\mu u} g_N^0(Nt,u) du &&\!\!\!= \int_\mathbb{R} e^{-i\mu v} \mathbb{1}_{\{v\geq 0\}} e^{-\int_{-v}^0 a(\frac{s}{N}+t) ds} dv.\end{aligned}\end{align}

Proposition 6. Let $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ be a sequence of time-varying CAR(1) processes as defined in (8). If

  1. (C1) $a(\!\cdot\!)$ is continuous and

  2. (C2) for every $T\in\mathbb{R}^+$ there exists $\varepsilon_T>0$ such that $a(s)\geq \varepsilon_T$ for all $s\leq T$ ,

then $Y_N(t)$ is well-defined and locally stationary, with the limiting kernel g and limiting transfer function given by

\begin{align*}g(t,u) =\mathbb{1}_{\{u\geq 0\}}e^{-a(t)u}\quad \textit{ and}\qquad A(t,\mu)=\int_\mathbb{R} e^{-i\mu u} \mathbb{1}_{\{u\geq 0\}}e^{-a(t)u} du.\end{align*}

Proof. From (C2) we readily obtain that $Y_N(t)$ is well-defined. For all $t\in\mathbb{R}$ it holds that

\begin{align*}\Big\lVert g_N^0(Nt,\cdot)-g(t,\cdot) \Big\rVert_{L^2}^2 &=\Big\lVert g_N^0(Nt,Nt-\cdot)-g(t,Nt-\cdot) \Big\rVert_{L^2}^2 \\[5pt] &=\int_\mathbb{R}\! \Bigg| \mathbb{1}_{\{Nt-u\geq 0\}} e^{-\int_{-(Nt-u)}^0 a(\frac{s}{N}+t) ds} - \mathbb{1}_{\{Nt-u\geq 0\}}e^{-a(t)(Nt-u)} \Bigg|^2 du \\[5pt] &= \int_\mathbb{R} \mathbb{1}_{\{u\leq 0\}} \Bigg| e^{-\int_u^0 a(\frac{s}{N}+t) ds} - e^{a(t)u} \Bigg|^2 du\underset{N\rightarrow\infty}{\longrightarrow}0,\end{align*}

using the dominated convergence theorem. For the inner integral, the continuity of a on a compact set is sufficient for an application of the dominated convergence theorem. As majorant for the outer integral we consider the integrable function $u\mapsto4\mathbb{1}_{\{u\leq 0\}} e^{2\varepsilon_t u}\in L^1(\mathbb{R})$ . The required $L^2$ -continuity of the limiting kernel g can be obtained similarly, using Lemma 1.
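The convergence in the proof of Proposition 6 can be observed numerically. The sketch below assumes the hypothetical coefficient function $a(s)=1+0.5\sin(s)$ , which is continuous and bounded below by $0.5$ , so that (C1) and (C2) hold. For this choice the inner integral $\int_u^0 a(\frac{s}{N}+t)ds$ has a closed form, and the squared $L^2$ -distance is approximated by a trapezoidal rule on a truncated interval. The coefficient function, the truncation point, and the function names are assumptions made only for illustration.

```python
import math

def a(s):
    """Hypothetical coefficient function, continuous and bounded below by 0.5."""
    return 1.0 + 0.5 * math.sin(s)

def inner_integral(u, N, t):
    """Closed form of int_u^0 a(s / N + t) ds for a(s) = 1 + 0.5 sin(s)."""
    return -u + 0.5 * N * (math.cos(u / N + t) - math.cos(t))

def sq_l2_distance(N, t, lower=-40.0, steps=40000):
    """Trapezoidal approximation of
    int_{u <= 0} | exp(-int_u^0 a(s/N + t) ds) - exp(a(t) u) |^2 du,
    truncated at u = lower (both terms are below exp(0.5 * lower) there).
    """
    step = -lower / steps
    total = 0.0
    for k in range(steps + 1):
        u = lower + k * step
        diff = math.exp(-inner_integral(u, N, t)) - math.exp(a(t) * u)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * diff * diff
    return total * step

def distances(t=0.3, Ns=(1, 10, 100, 1000)):
    """Squared L2 distance between g_N^0(Nt, .) and g(t, .) for increasing N."""
    return {N: sq_l2_distance(N, t) for N in Ns}
```

For this particular coefficient function the distances returned by `distances()` decrease as $N$ grows, in line with the limit stated in the proof.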

Remark 3. The condition (C1) is intrinsically related to the continuity of the limiting kernel demanded in the definition of local stationarity. The condition (C2) is obviously satisfied if $a(\!\cdot\!)$ is bounded away from zero. However, as time goes to infinity, $a(\!\cdot\!)$ may go to zero arbitrarily fast. The latter is clearly connected to the fact that our time-varying CAR processes are causal by definition. It may be possible to weaken (C2) and allow $a(\!\cdot\!)$ also to approach 0 as time goes to minus infinity. Then the convergence to zero must be slow enough for all integrals to exist in $L^2$ . Carrying this out in detail appears rather intricate and not of relevance for the applications of locally stationary CAR(1) processes.

4.2. Time-varying CARMA(p, q) processes and time-varying state space models

Consider $p,q\in\mathbb{N}$ , where $p>q$ . The formal differential equation for a time-varying Lévy-driven CARMA(p, q) process is given by $p(t,D)Y(t) = q(t,D)DL(t)$ , i.e.

\begin{align*}&D^p Y(t) + a_1(t) D^{p-1}Y(t) + \ldots +a_p(t) Y(t) \\[5pt] &= b_0(t) DL(t) + b_1(t) D^2L(t) +\ldots +b_q(t) D^{q+1}L(t),\end{align*}

where D denotes the differential operator with respect to time and L(t) is a two-sided Lévy process satisfying (5). For continuous functions $a_i(t),b_i(t)$ , $i=1,\ldots,p$ , where $b_i(t)=0$ for all $i>q$ , the polynomials

(10) \begin{align}\begin{aligned} p(t,z) &= z^p+a_1(t)z^{p-1}+\ldots+a_{p-1}(t)z+a_p(t) \quad \text{ and} \\[5pt] q(t,z) &= b_0(t)+b_1(t)z+\ldots+b_{q-1}(t)z^{q-1}+b_q(t)z^q\end{aligned} \end{align}

are called autoregressive (AR) and moving average (MA) polynomials. For a rigorous definition we interpret the differential equations to be equivalent to the state space representation

(11) \begin{align}\begin{aligned}Y(t) &= \mathcal{B}(t)' \mathcal{X}(t), \quad \text{ and}\\[5pt] d\mathcal{X}(t) &= \mathcal{A}(t) \mathcal{X}(t) dt + \mathcal{C} L(dt), \quad t\in\mathbb{R},\end{aligned}\end{align}

with

\begin{align*}\mathcal{A}(t)&=\left( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c}0 & 1 & \ldots & 0 \\[5pt] \vdots & \; & \ddots & \vdots \\[5pt] 0 & \; & \; & 1 \\[5pt] -a_p(t) & -a_{p-1}(t) & \ldots & -a_1(t) \\[5pt] \end{array}\right)\in M_{p\times p}(\mathcal{R}[t]) \quad \text{ and}\\[5pt] \mathcal{B}(t)&=\left( \begin{array}{c}b_0(t) \\[5pt] b_1(t)\\[5pt] \vdots \\[5pt] b_{p-1}(t)\\[5pt] \end{array}\right)\in M_{p\times 1}(\mathcal{R}[t]), \qquad\mathcal{C}=\left( \begin{array}{c}0 \\[5pt] \vdots\\[5pt] 0 \\[5pt] 1\\[5pt] \end{array}\right)\in M_{p\times 1}(\mathbb{R}),\end{align*}

where $\mathcal{R}[t]$ denotes the ring of continuous functions in t from $\mathbb{R}$ to $\mathbb{R}$ .

It is obvious that (11) has a unique solution when one fixes the value $\mathcal{X}(t_0)$ at some point $t_0\in\mathbb{R}$ . For a Brownian motion as driving noise, such equations were investigated in [Reference Surulescu43, Section 2.1.1].

Provided the integrals exist in $L^2$ , it can be shown that a solution is given by

(12) \begin{align}\mathcal{X}(t) = \int_{-\infty}^t \Psi(t,s) \mathcal{C} L(ds) \quad \text{and} \quad Y(t) = \mathcal{B}(t)' \int_{-\infty}^t \Psi(t,s) \mathcal{C} L(ds),\end{align}

where $\Psi(t,t_0)$ is the unique matrix solution of the homogeneous initial value problem (IVP) $\frac{d}{dt}\Psi(t,t_0)=\mathcal{A}(t)\Psi(t,t_0)$ , $\Psi(t_0,t_0)=\textbf{1}_p$ , for $t>t_0$ (see [Reference Brockett7, Sections 3 and 4]). The transition matrix $\Psi$ satisfies $\Psi(t,t_0)=\Psi(t,u)\Psi(u,t_0)$ for all $t>u>t_0$ (see [Reference Brockett7, Section 4, Theorem 2]). In particular, the integrals in (12) are well-defined (see Section 2.2) if there exist $\gamma,\lambda>0$ such that

\begin{align*}\big\lVert \Psi(t,t_0) \big\rVert \leq \gamma e^{-\lambda(t-t_0)}\qquad \text{ for all } t,t_0 \text{ with } t\geq t_0.\end{align*}

This condition corresponds to uniform exponential stability of the state space model in (11) and will be explained in more detail in Section 4.3.

The usual integral representation of stationary causal CARMA processes motivates the following definition.

Definition 7. A solution $\{Y(t)\}_{t\in\mathbb{R}}$ of the observation and state equations (11) in the form (12) is called a time-varying Lévy-driven CARMA(p, q) process (tvCARMA(p, q)).

For some initial time $t_0\in\mathbb{R}$ the process satisfies the following relation (see [Reference Surulescu43, Section 2.1.1]):

(13) \begin{align}\mathcal{X}(t) = \Psi(t,t_0)\Bigg(\mathcal{X}(t_0)+\int_{t_0}^t \Psi(u,t_0)^{-1} \mathcal{C} L(du)\Bigg).\end{align}

From [Reference Baake and Schlägel2, Remark 2] it follows that if for all $t,t_0\in\mathbb{R}$ and $t>t_0$

(14) \begin{align}\mathcal{A}(t)\int_{t_0}^t \mathcal{A}(s)ds = \int_{t_0}^t \mathcal{A}(s)ds \mathcal{A}(t),\end{align}

then $\Psi(t,t_0)=e^{\int_{t_0}^t \mathcal{A}(s)ds}$ .

If the commutativity assumption (14) holds, the equations (12) and (13) simplify to

\begin{align*}\mathcal{X}(t) &= e^{\int_{t_0}^t \mathcal{A}(s)ds}\mathcal{X}(t_0)+\int_{t_0}^t e^{\int_u^t \mathcal{A}(s)ds} \mathcal{C} L(du)\!\!\! &&= \int_{-\infty}^t e^{\int_u^t \mathcal{A}(s)ds} \mathcal{C} L(du) \quad \text{ and} \\[9pt] Y(t) &= \mathcal{B}(t)' e^{\int_{t_0}^t \mathcal{A}(s)ds}\mathcal{X}(t_0)+\int_{t_0}^t \mathcal{B}(t)' e^{\int_u^t \mathcal{A}(s)ds} \mathcal{C} L(du)\!\!\!&&= \int_{-\infty}^t \mathcal{B}(t)' e^{\int_u^t \mathcal{A}(s)ds} \mathcal{C} L(du)\end{align*}

for $t,t_0\in\mathbb{R}$ , where $t>t_0$ .

If the assumption (14) does not hold, $\Psi(t,t_0)$ can be expressed by the Peano–Baker series (see [Reference Baake and Schlägel2, Section 2])

\begin{align*}\Psi(t,t_0) &= \textbf{1}_p + \int_{t_0}^t \mathcal{A}(\tau_1) d\tau_1 + \int_{t_0}^t \mathcal{A}(\tau_1) \int_{t_0}^{\tau_1} \mathcal{A}(\tau_2) d\tau_2 d\tau_1 + \ldots = \sum_{n=0}^\infty \mathcal{I}_n(t),\end{align*}

where $\mathcal{I}_0(t)= \textbf{1}_p$ and $\mathcal{I}_n(t)= \int_{t_0}^t \mathcal{A}(\tau_1) \int_{t_0}^{\tau_1} \mathcal{A}(\tau_2) \cdots \int_{t_0}^{\tau_{n-1}} \mathcal{A}(\tau_n) d\tau_n \ldots d\tau_2 d\tau_1$ .

Remark 4. If $\mathcal{A}(s)$ and $\mathcal{A}(t)$ commute, i.e. $[\mathcal{A}(s),\mathcal{A}(t)]=0$ for all $s,t\in\mathbb{R}$ , then the commutativity assumption (14) holds. However, the matrices $\mathcal{A}(t)$ , $t\in\mathbb{R}$ , are in companion form and are not in general commutative (see also Proposition 12). For further insight into the commutativity of some matrices $\mathcal{A}(t)$ and $\int_{t_0}^t \mathcal{A}(s)ds$ as well as $\mathcal{A}(s)$ and $\mathcal{A}(t)$ , we refer to [Reference Rugh38, Exercise 4.8] and [Reference Wu and Sherif47].

The previous remark shows that, when considering time-varying CARMA(p, q) processes, it is in general not possible to describe the solution of the state space equations explicitly in the form of a matrix exponential. Instead one has to use the Peano–Baker series.
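To make the failure of (14) concrete, the following Python sketch evaluates the commutator of $\mathcal{A}(t)$ and $\int_{t_0}^t\mathcal{A}(s)ds$ for a hypothetical tvCARMA(2, q) companion matrix with $a_1(t)=1$ , $a_2(t)=1+t$ , and $t_0=0$ . These coefficient functions and all function names are assumptions chosen only because the integral has a simple closed form.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def A(t):
    """Hypothetical companion matrix with a_1(t) = 1 and a_2(t) = 1 + t."""
    return [[0.0, 1.0], [-(1.0 + t), -1.0]]

def integral_A(t):
    """Closed form of int_0^t A(s) ds for the choice above (t_0 = 0)."""
    return [[0.0, t], [-(t + t * t / 2.0), -t]]

def commutator_14(t):
    """A(t) int_0^t A(s) ds - int_0^t A(s) ds A(t); it vanishes iff (14) holds at (t, 0)."""
    left = matmul(A(t), integral_A(t))
    right = matmul(integral_A(t), A(t))
    return [[left[i][j] - right[i][j] for j in range(2)] for i in range(2)]
```

A direct calculation for this example gives the commutator $\frac{t^2}{2}\big(\begin{smallmatrix}1 & 0\\ -1 & -1\end{smallmatrix}\big)$ , which is non-zero for every $t\neq 0$ , so the matrix exponential formula is not available here.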

In [Reference Schlemm and Stelzer41, Corollary 3.4] it is proved that in the time-invariant case, the class of CARMA processes is equivalent to the class of continuous-time state space models. This motivates us to look at time-varying state space processes. We consider the observation and state equations

(15) \begin{align}\begin{aligned}Y(t) &= B(t)' X(t) \quad \text{ and} \\[5pt] dX(t) &= A(t)X(t)dt + C(t) L(dt),\end{aligned}\end{align}

where $t\in\mathbb{R}$ , $A(t)\in M_{p\times p}(\mathcal{R}[t])$ and $B(t),C(t)\in M_{p\times1}(\mathcal{R}[t])$ are arbitrary continuous coefficient functions, and L is a two-sided Lévy process satisfying (5).

Now, the representation of a time-varying CARMA process as given in (12) can be adapted to (general) state space processes. Provided the integrals exist in $L^2$ , it can be shown that a solution of (15) is given by

(16) \begin{align}X(t) = \int_{-\infty}^t \Psi(t,u) C(u) L(du) \quad \text{and} \quad Y(t) = B(t)' \int_{-\infty}^t \Psi(t,u) C(u) L(du),\end{align}

where $\Psi(t,t_0)$ is the unique matrix solution of the IVP $\frac{d}{dt}\Psi(t,t_0)=A(t)\Psi(t,t_0)$ , $\Psi(t_0,t_0)=\textbf{1}_p$ , for $t>t_0$ . In particular, in the representation of (16), the integrals are well-defined if $C(\!\cdot\!)\in L^2(\mathbb{R})$ and there exist $\gamma,\lambda>0$ such that

\begin{align*}\big\lVert \Psi(t,t_0) \big\rVert \leq \gamma e^{-\lambda(t-t_0)}\qquad \text{for all }t\geq t_0 \text{ (uniform exponential stability)}.\end{align*}

For some initial time $t_0\in\mathbb{R}$ , the process satisfies the relation

(17) \begin{align}X(t) = \Psi(t,t_0)\Bigg(X(t_0)+\int_{t_0}^t \Psi(u,t_0)^{-1} C(u) L(du)\Bigg).\end{align}

Finally, we make the following definition.

Definition 8. A solution $\{Y(t)\}_{t\in\mathbb{R}}$ of the observation and state equations (15) in the form (16) is called a time-varying Lévy-driven state space process.

The natural question arises of whether all time-varying state space processes are tvCARMA processes, as in the time-invariant case. A comprehensive investigation of this question seems to be beyond the scope of this work. Below we present a result indicating that this is probably not the case in general (and definitely not if we allow the coefficient functions to have a discontinuity). Moreover, we give sufficient conditions for a positive answer.

Proposition 7. The classes of time-varying Lévy-driven CARMA models (11) and time-varying Lévy-driven state space models (15) with not necessarily continuous coefficient functions do not coincide in general.

Proof. Consider a two-dimensional time-varying state space model as defined in (15) with a structural break at $t=1$ . As coefficient functions we consider the step functions

(18) \begin{align}B(t)=\begin{cases} B_1 &\text{ if } t\leq1, \\[5pt] B_2 & \text{ if } t>1, \end{cases} \qquad A(t)=\begin{cases} A_1 &\text{ if } t\leq1, \\[5pt] A_2 & \text{ if } t>1, \end{cases} \quad \text{ and} \quad C(t)=\begin{cases} C_1 &\text{ if } t\leq1, \\[5pt] C_2 & \text{ if } t>1, \end{cases}\end{align}

which satisfy the uniform exponential stability assumption for the solution of (15).

We assume that the system is in the form of a CARMA process for $t\leq1$ and assume (for contradiction) that there exists an equivalent CARMA model as defined in (11) for all $t\in\mathbb{R}$ . Then the CARMA model exhibits the same structural break at $t=1$ as the corresponding state space model. In the following we denote the coefficients of the CARMA model by $\mathcal{B}(t)$ , $\mathcal{A}(t)$ , and $\mathcal{C}(t)$ . Using the same notation as in (18), we obtain

(19) \begin{align}&\mathcal{B}_1 = B_1, &&\mathcal{A}_1 = A_1, &&\mathcal{C}_1 = C_1, \nonumber \\&\mathcal{B}_2 = \begin{pmatrix} \ast \\ \ast \end{pmatrix}, &&\mathcal{A}_2 = \begin{pmatrix} 0\;\;\;\; & 1 \\ \ast\;\;\;\; & \ast \end{pmatrix},\;\;\;\;\;\;\;\text{ and}&&\mathcal{C}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.\end{align}

Since the structural break divides the model into two separate linear models, the CARMA representations $(\mathcal{B}_1,\mathcal{A}_1,\mathcal{C}_1)$ and $(\mathcal{B}_2,\mathcal{A}_2,\mathcal{C}_2)$ are unique. From the proof of [Reference Schlemm and Stelzer41, Theorem 3.3] we obtain that $\mathcal{B}_2 e^{\mathcal{A}_2 (t-u)} \mathcal{C}_2 = B_2 e^{A_2 (t-u)} C_2$ for all $t>1$ and $u\in(1,t]$ .

On the one hand, we have

(20) \begin{align}\begin{aligned}Y(t) &= \mathbb{1}_{\{t\leq1\}} \Bigg( \int_{-\infty}^t B(t)' \Psi_A(t,u) C(u) L(du) \Bigg) \\[5pt] &\quad+ \mathbb{1}_{\{t>1\}} \Bigg( B(t)' \Psi_A(t,1) X(1) + B(t)' \Psi_A(t,1) \int_1^t \Psi_A(u,1)^{-1} C(u) L(du) \Bigg) \\[5pt] &= \mathbb{1}_{\{t\leq1\}} \Bigg( \int_{-\infty}^t {B}'_{\!\!1} e^{A_1(t-u)} C_1 L(du) \Bigg)\\[5pt] &\quad+ \mathbb{1}_{\{t>1\}} \Bigg( {B}'_{\!\!2} e^{A_2(t-1)} X(1) + \int_1^t {B}'_{\!\!2} e^{A_2(t-u)} C_2 L(du) \Bigg),\end{aligned}\end{align}

where $\Psi_A(s,s_0)$ denotes the solution of the aforementioned IVP with respect to A. On the other hand, Y(t) can be written as

(21) \begin{align}\begin{aligned}Y(t)&= \mathbb{1}_{\{t\leq1\}} \Bigg( \int_{-\infty}^t \mathcal{B}_1' e^{\mathcal{A}_1(t-u)} \mathcal{C}_1 L(du) \Bigg) \\[5pt] &\quad+\mathbb{1}_{\{t>1\}} \Bigg( \mathcal{B}'_{\!\!2} e^{\mathcal{A}_2(t-1)} \mathcal{X}(1) + \int_1^t \mathcal{B}'_{\!\!2} e^{\mathcal{A}_2(t-u)} \mathcal{C}_2 L(du) \Bigg).\end{aligned}\end{align}

From (19) it follows that

\begin{align*}X(1) &= \int_{-\infty}^1 e^{A_1(1-u)} C_1 L(du) =\int_{-\infty}^1 e^{\mathcal{A}_1(1-u)} \mathcal{C}_1 L(du)=\mathcal{X}(1).\end{align*}

Thus, combining (20) and (21), using (19) and the independent increments of the Lévy process, the equality ${B}'_{\!\!2} e^{A_2(t-1)} X(1)=\mathcal{B}'_{\!\!2} e^{\mathcal{A}_2(t-1)} X(1)$ has to hold almost surely for all $t>1$ . Therefore, for almost all x in the support of X(1) we obtain

(22) \begin{align}{B}'_{\!\!2} e^{A_2(t-1)} x=\mathcal{B}'_{\!\!2} e^{\mathcal{A}_2(t-1)} x.\end{align}

In the sequel we give a particular Lévy process and coefficient functions that lead to a contradiction in (22).

Assume that the Lévy process is a Brownian motion, so that it has the characteristic triplet $(0,\Sigma,0)$ for some $\Sigma>0$ . From [Reference Sato40] it follows that X(t) is infinitely divisible with characteristic triplet $(0,\Sigma_X^t,0)$ , where

\begin{align*} \Sigma_X^1= \int_{-\infty}^1 e^{A_1(1-u)} C_1 \Sigma C'_{\!\!1} e^{A'_{\!\!1}(1-u)} du = \Sigma \int_0^\infty e^{A_1 u} C_1 C'_{\!\!1} e^{A'_{\!\!1} u} du \in M_{2\times 2}(\mathbb{R}).\end{align*}

The regularity of $\Sigma_X^1$ can be shown by investigating $Im(\Sigma_X^1)$ , where $Im(D)=\{Dx\;:\; x\in\mathbb{R}^d\}$ denotes the image of a matrix $D\in M_{d\times d}(\mathbb{R})$ . Using [Reference Bernstein6, Lemma 12.6.2] (see also [Reference Schlemm and Stelzer41, p. 54]), we obtain

\begin{align*}Im\Bigg(\int_0^\infty e^{A_1 u} C_1 C'_{\!\!1} e^{A'_{\!\!1} u} du\Bigg) = Im\bigg(\bigg[C_1\ \ A_1 C_1\ \ \cdots\ \ A_1^{p-1}C_1\bigg]\bigg).\end{align*}

Therefore, in our setting, it is sufficient to find $A_1,C_1$ such that $[C_1 \ A_1 C_1]$ is regular, which also implies that $\Sigma_X^1$ is positive definite. Then X(1) has characteristic function

\begin{align*} \mathbb{E}\Big[e^{i\langle z,X(1) \rangle }\Big] = e^{-\frac{1}{2}\langle z,\Sigma_X^1 z \rangle},\end{align*}

which corresponds to a two-dimensional $N(0,\Sigma_X^1)$ -distributed random variable having positive density for all values $x\in\mathbb{R}^2$ . To contradict (22), it is enough to show that for some $t>1$ ,

\begin{align*}{B}'_{\!\!2} e^{A_2(t-1)} x \neq \mathcal{B}'_{\!\!2} e^{\mathcal{A}_2(t-1)} x \qquad \text{ for all } x\in I \text{, where } I\subset\mathbb{R}^2 \text{ with } \lambda(I)>0.\end{align*}

We define

\begin{align*} &B_1 = \begin{pmatrix} 1 \\[5pt] 2 \end{pmatrix},\ B_2 = \begin{pmatrix} 1 \\[5pt] 1 \end{pmatrix},\ \mathcal{B}_2 = \begin{pmatrix} 5 \\[5pt] 2 \end{pmatrix}, \quad A_1 = \begin{pmatrix} 0\;\;\;\; & 1 \\[5pt] 1\;\;\;\; &1 \end{pmatrix}, \ A_2 = \begin{pmatrix} -2\;\;\;\; & 0 \\[5pt] 0\;\;\;\; & -3 \end{pmatrix},\\[5pt] &\mathcal{A}_2 = \begin{pmatrix} 0\;\;\;\; & 1 \\[5pt] -6\;\;\;\; & -5 \end{pmatrix},\ C_1 = \begin{pmatrix} 0 \\[5pt] 1 \end{pmatrix},\ C_2 = \begin{pmatrix} 1 \\[5pt] 1 \end{pmatrix},\ \text{ and } \mathcal{C}_2 = \begin{pmatrix} 0 \\[5pt] 1 \end{pmatrix}.\end{align*}

From this we obtain that the CARMA model has the same transfer function as the state space model, since

\begin{align*} {B}'_{\!\!2} (z\textbf{1}_2-A_2)^{-1} C_2 = \frac{2z+5}{z^2+5z+6}=\mathcal{B}'_{\!\!2} (z\textbf{1}_2-\mathcal{A}_2)^{-1} \mathcal{C}_2.\end{align*}

Moreover,

\begin{align*} [C_1\ \ A_1 C_1]=\begin{pmatrix} 0\;\;\;\;\; & 1 \\[5pt] 1\;\;\;\;\; & 1\end{pmatrix}\end{align*}

is regular. Given a vector $x=(x_1, x_2)'$ , it remains to investigate $\mathcal{B}'_{\!\!2} e^{\mathcal{A}_2(t-1)} x - {B}'_{\!\!2} e^{A_2(t-1)} x$ . For a matrix $D\in\mathbb{C}^{2\times 2}$ with eigenvalues $\sigma(D)=\{\mu,\lambda\}$ , [Reference Bernstein6, Proposition 11.3.2] gives that

\begin{align*}e^D=\begin{cases} e^\lambda ((1-\lambda)\textbf{1}_2+D) & \text{if } \mu=\lambda, \\[5pt] \frac{\mu e^\lambda - \lambda e^\mu}{\mu-\lambda} \textbf{1}_2 + \frac{e^\mu - e^\lambda}{\mu-\lambda} D & \text{if } \mu\neq\lambda. \end{cases}\end{align*}

Since $\sigma(\mathcal{A}_2(t-1))=\{-2(t-1),-3(t-1)\}$ , we obtain

\begin{align*}&\mathcal{B}'_{\!\!2} e^{\mathcal{A}_2(t-1)} x - {B}'_{\!\!2} e^{A_2(t-1)} x \\[5pt] &= \begin{pmatrix} 5\;\;\;\;\; & 2 \end{pmatrix}\Bigg(\frac{-3(t-1) e^{-2(t-1)} - (\!-\!2)(t-1) e^{-3(t-1)}}{-3(t-1)-(\!-\!2)(t-1)} \begin{pmatrix} 1\;\;\;\;\; & 0 \\[5pt] 0\;\;\;\;\; & 1 \end{pmatrix} \\[5pt] &\qquad\qquad\qquad+ \frac{e^{-3(t-1)} - e^{-2(t-1)}}{-3(t-1)-(\!-\!2)(t-1)} \begin{pmatrix} 0\;\;\;\;\; & 1 \\[5pt] -6\;\;\;\;\; & -5 \end{pmatrix} (t-1) \Bigg)\begin{pmatrix} x_1 \\[5pt] x_2 \end{pmatrix} \\[5pt] &\qquad\qquad\qquad- \begin{pmatrix} 1\;\;\;\;\; & 1 \end{pmatrix}\Bigg(\begin{pmatrix} e^{-2(t-1)}\;\;\;\;\; & 0 \\[5pt] 0\;\;\;\;\; & e^{-3(t-1)} \end{pmatrix}\Bigg)\begin{pmatrix} x_1 \\[5pt] x_2 \end{pmatrix} \\[5pt] &= 2x_1 \Big( e^{-2(t-1)} + e^{-3(t-1)}\Big) +x_2 \Big( e^{-2(t-1)} \Big) >0\end{align*}

for all $x\in I=\{x\in\mathbb{R}^2 \;:\; x_1>0,x_2>0\}$ and $t>1$ .
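The algebra at the end of this proof can be double-checked numerically. The Python sketch below uses the matrices $B_2, A_2, C_2$ and $\mathcal{B}_2, \mathcal{A}_2, \mathcal{C}_2$ defined above, evaluates both transfer functions at a test point, and compares $\mathcal{B}'_{\!\!2} e^{\mathcal{A}_2\tau} x - {B}'_{\!\!2} e^{A_2\tau} x$ with the closed form $2x_1(e^{-2\tau}+e^{-3\tau})+x_2e^{-2\tau}$ , where $\tau=t-1$ . The helper names, the test point $z$ , and the restriction to $2\times2$ matrices with distinct real eigenvalues are assumptions of the sketch.

```python
import math

B2, A2, C2 = [1.0, 1.0], [[-2.0, 0.0], [0.0, -3.0]], [1.0, 1.0]
cal_B2, cal_A2, cal_C2 = [5.0, 2.0], [[0.0, 1.0], [-6.0, -5.0]], [0.0, 1.0]

def transfer(B, A, C, z):
    """Evaluate B' (z I - A)^{-1} C for a 2x2 matrix A and 2-vectors B, C."""
    m00, m01 = z - A[0][0], -A[0][1]
    m10, m11 = -A[1][0], z - A[1][1]
    det = m00 * m11 - m01 * m10
    # (z I - A)^{-1} C via the explicit 2x2 inverse (adjugate over determinant)
    v0 = (m11 * C[0] - m01 * C[1]) / det
    v1 = (-m10 * C[0] + m00 * C[1]) / det
    return B[0] * v0 + B[1] * v1

def expm_2x2(D):
    """exp(D) for a real 2x2 matrix with distinct real eigenvalues,
    using the second case of the formula for e^D quoted above."""
    tr = D[0][0] + D[1][1]
    det = D[0][0] * D[1][1] - D[0][1] * D[1][0]
    root = math.sqrt(tr * tr - 4.0 * det)  # assumes a positive discriminant
    mu, lam = (tr + root) / 2.0, (tr - root) / 2.0
    c0 = (mu * math.exp(lam) - lam * math.exp(mu)) / (mu - lam)
    c1 = (math.exp(mu) - math.exp(lam)) / (mu - lam)
    return [[c0 + c1 * D[0][0], c1 * D[0][1]],
            [c1 * D[1][0], c0 + c1 * D[1][1]]]

def output_weight(B, A, tau, x):
    """B' exp(A tau) x for tau > 0."""
    E = expm_2x2([[A[i][j] * tau for j in range(2)] for i in range(2)])
    return sum(B[i] * E[i][j] * x[j] for i in range(2) for j in range(2))

def difference(tau, x):
    """cal_B2' exp(cal_A2 tau) x - B2' exp(A2 tau) x."""
    return output_weight(cal_B2, cal_A2, tau, x) - output_weight(B2, A2, tau, x)

def closed_form(tau, x):
    """Right-hand side of the last display in the proof."""
    return 2.0 * x[0] * (math.exp(-2.0 * tau) + math.exp(-3.0 * tau)) + x[1] * math.exp(-2.0 * tau)
```

Since `difference(tau, x)` agrees with `closed_form(tau, x)` and the latter is strictly positive whenever $x_1,x_2>0$ , the two output maps differ on a set of positive Lebesgue measure even though the transfer functions coincide.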

Under stronger conditions on the coefficient functions, the concept of controllability from linear system theory allows for a characterization for special canonical forms, which occur in the state space representation of CARMA processes (where $\mathcal{A}$ is in companion matrix form). The following results summarize the key aspects of this characterization, which is based mainly on [Reference Silverman42], but also on [Reference Benmahammed5, Reference Ramar and Ramaswami32, Reference Ramaswami and Ramar33].

Definition 9. ([Reference Rugh38, Chapters 9 and 10].) Let Y(t) be a state space model as defined in (15), where A(t) is $(p-1)$ times continuously differentiable and C(t) is p times continuously differentiable. We define the controllability matrix $W_p(t)$ as

\begin{align*}W_p(t)&=[K_0(t)\ K_1(t)\ \cdots \ K_{p-1}(t)], \quad \text{ where}\\[5pt] K_0(t)&=C(t), \quad K_{i+1}(t) = - A(t) K_i(t) + \frac{d}{dt}K_i(t), \quad i=0,\ldots,p-2.\end{align*}

Then the state space process X(t) is called

  1. (a) controllable on $[t_0,t_1]$ , $t_0<t_1$ , if there exists $t\in[t_0,t_1]$ with $Rank(W_p(t))=p$ , and

  2. (b) instantaneously controllable if $Rank(W_p(t))=p$ for all $t\in\mathbb{R}$ .
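For $p=2$ the controllability matrix reduces to $W_2(t)=[K_0(t)\ K_1(t)]$ with $K_1(t)=-A(t)C(t)+\frac{d}{dt}C(t)$ , reading the recursion as starting from $K_0$ . The following Python sketch builds $W_2(t)$ from user-supplied functions and tests its rank through the determinant. The first example uses the constant matrices $A_2$ and $C_2$ from the proof of Proposition 7. The second example, with $C(t)=(t,0)'$ , is a hypothetical coefficient function added only to show a degenerate case. All function names and the tolerance are assumptions.

```python
def controllability_matrix_2d(A, C, dC, t):
    """W_2(t) = [K_0(t) K_1(t)] with K_0 = C and K_1 = -A K_0 + dK_0/dt (case p = 2).

    A, C, dC are callables returning a 2x2 matrix, a 2-vector, and the
    derivative of C at time t, respectively.
    """
    a, c, dc = A(t), C(t), dC(t)
    k1 = [-(a[0][0] * c[0] + a[0][1] * c[1]) + dc[0],
          -(a[1][0] * c[0] + a[1][1] * c[1]) + dc[1]]
    return [[c[0], k1[0]], [c[1], k1[1]]]

def has_full_rank(W, tol=1e-12):
    """Rank 2 check for a 2x2 matrix via its determinant."""
    return abs(W[0][0] * W[1][1] - W[0][1] * W[1][0]) > tol

def A_diag(t):
    """Constant matrix A_2 from the proof of Proposition 7."""
    return [[-2.0, 0.0], [0.0, -3.0]]

def W_constant(t):
    """C(t) = (1, 1)' as in the proof of Proposition 7, with zero derivative."""
    return controllability_matrix_2d(A_diag, lambda s: [1.0, 1.0], lambda s: [0.0, 0.0], t)

def W_degenerate(t):
    """Hypothetical C(t) = (t, 0)' with derivative (1, 0)'; the second row of W_2 vanishes."""
    return controllability_matrix_2d(A_diag, lambda s: [s, 0.0], lambda s: [1.0, 0.0], t)
```

In the first example $W_2(t)=\big(\begin{smallmatrix}1 & 2\\ 1 & 3\end{smallmatrix}\big)$ has rank 2 for every $t$ , so that system is instantaneously controllable, whereas the second example has rank at most 1 everywhere.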

Proposition 8. ([Reference Silverman42, Theorem 1.].) Consider a state space process satisfying (15) such that A is $(p-1)$ times continuously differentiable and C is p times continuously differentiable. Then it is equivalent to a CARMA process satisfying (11) if and only if it is instantaneously controllable. Equivalence means that there exists a regular matrix $T(t)\in M_{p\times p}(\mathcal{R}[t])$ which is continuously differentiable and satisfies

\begin{align*}\mathcal{X}(t) = T(t) X(t)\end{align*}

almost surely. The relationship between the two systems is given by $T(t) = \mathcal{W}_p(t) W_p(t)^{-1}$ , $\mathcal{A}(t) = \Big( T(t) A(t) + \frac{d}{dt}T(t) \Big) T(t)^{-1}$ , and $\mathcal{C} = T(t) C(t)$ , where $\mathcal{W}_p(t)$ and $W_p(t)$ are the controllability matrices of the CARMA process and the state space model, respectively.

Corollary 1. The class of time-varying Lévy-driven state space models as defined in (15) with $(p-1)$ times continuously differentiable coefficient functions A, p times continuously differentiable coefficient functions C, and controllability matrices $W_p(t)$ that have rank p everywhere is equivalent to the class of time-varying CARMA(p, q) processes as defined in (11) with $(p-1)$ times continuously differentiable coefficient functions $\mathcal{A}$ and controllability matrices $W_p(t)$ that have rank p everywhere.

Proof. Any time-varying CARMA(p, q) process is obviously also a time-varying state space process. For the converse, let Y(t) be a time-varying state space process defined by (15) which is instantaneously controllable with controllability matrix $W_p(t)$ . Then, by Proposition 8, the state system $dX(t) = A(t)X(t)dt + C(t) L(dt)$ is equivalent to the CARMA system $d\mathcal{X}(t) = \mathcal{A}(t)\mathcal{X}(t)dt+\mathcal{C} L(dt)$ with

\begin{align*}\mathcal{X}(t) = T(t) X(t), \quad \mathcal{C} = T(t) C(t), \quad \text{and} \quad \mathcal{A}(t) = \Bigg( T(t) A(t) + \frac{d}{dt}T(t) \Bigg) T(t)^{-1},\end{align*}

where $T(t) = \mathcal{W}_p(t) W_p(t)^{-1}$ is regular. Thus,

\begin{align*}Y(t)&= B(t)' X(t) = B(t)' T(t)^{-1} \mathcal{X}(t) = \mathcal{B}(t)' \mathcal{X}(t)\quad \text{and}\\[5pt] d\mathcal{X}(t) &= \mathcal{A}(t)\mathcal{X}(t)dt+\mathcal{C} L(dt),\end{align*}

which is a representation for Y(t) as a time-varying CARMA(p, q) process in (11).

4.3. Locally stationary linear state space models—Peano–Baker series

We investigate sufficient conditions for sequences of time-varying state space processes, which obviously also include sequences of time-varying CARMA processes, to be locally stationary.

Let $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ be a sequence of time-varying linear state space processes defined by

\begin{align*}Y_N(t) &= B(t)' X_N(Nt)\quad \text{and} \\[5pt] X_N(Nt) &= \Psi_N(Nt,0) \int_{-\infty}^{Nt} \Psi_N(u,0)^{-1} C\bigg(\frac{u}{N}\bigg) L(du),\end{align*}

where $\Psi_N(s,s_0)$ is the solution of the matrix differential equation

\begin{align*}\Psi_N(s_0,s_0) &= \textbf{1}_p, \\[5pt] \frac{d}{ds}\Psi_N(s,s_0) &= A\bigg(\frac{s}{N}\bigg) \Psi_N(s,s_0), \quad \text{for all } s,s_0\in\mathbb{R} \text{ with } s>s_0,\end{align*}

which can be expressed as follows (see [Reference Baake and Schlägel2, Section 2]):

\begin{align*}\Psi_N(s,s_0) = \textbf{1}_p + \int_{s_0}^s A\bigg(\frac{\tau_1}{N}\bigg) d\tau_1 + \int_{s_0}^s A\bigg(\frac{\tau_1}{N}\bigg) \int_{s_0}^{\tau_1} A\bigg(\frac{\tau_2}{N}\bigg) d\tau_2 d\tau_1 + \ldots.\end{align*}

The substitution $s\mapsto s+Nt$ in (9) is necessary to achieve a dependence of the kernel function $g_N^0(t,\cdot)$ on $Nt-u$ . Therefore, we define $\widetilde{\Psi}_{N,t}(0,-(Nt-u))$ for a fixed point $t\in\mathbb{R}$ as the solution of the matrix differential equation

\begin{align*}\widetilde{\Psi}_{N,t}(s_0,s_0) &= \textbf{1}_p, \\[5pt] \frac{d}{ds}\widetilde{\Psi}_{N,t}(s,s_0) &= A\bigg(\frac{s}{N}+t\bigg) \widetilde{\Psi}_{N,t}(s,s_0), \quad \text{for all } s,s_0\in\mathbb{R} \text{ with } s>s_0,\end{align*}

which can again be expressed as

\begin{align*}\widetilde{\Psi}_{N,t}(s,s_0)= \textbf{1}_p + \int_{s_0}^s A\bigg(\frac{\tau_1}{N}+t\bigg) d\tau_1 + \int_{s_0}^s A\bigg(\frac{\tau_1}{N}+t\bigg) \int_{s_0}^{\tau_1} A\bigg(\frac{\tau_2}{N}+t\bigg) d\tau_2 d\tau_1 + \ldots.\end{align*}

From [Reference Brockett7, Theorem 4.2] we obtain

\begin{align*} \Psi_N(Nt,0)\Psi_N(u,0)^{-1} = \Psi_N(Nt,0)\Psi_N(0,u) = \Psi_N(Nt,u).\end{align*}

Since

\begin{align*}\Psi_N(Nt,u) &= \textbf{1}_p + \int_{u}^{Nt} A\bigg(\frac{\tau_1}{N}\bigg) d\tau_1 + \int_{u}^{Nt} A\bigg(\frac{\tau_1}{N}\bigg) \int_{u}^{\tau_1} A\bigg(\frac{\tau_2}{N}\bigg) d\tau_2 d\tau_1 + \ldots \\[5pt] &= \textbf{1}_p + \int_{u-Nt}^0 A\bigg(\frac{\tau_1}{N}+t\bigg) d\tau_1 + \int_{u-Nt}^0 A\bigg(\frac{\tau_1}{N}+t\bigg) \int_{u-Nt}^{\tau_1} A\bigg(\frac{\tau_2}{N}+t\bigg) d\tau_2 d\tau_1 + \ldots \\[5pt] &= \widetilde{\Psi}_{N,t}(0,-(Nt-u)),\end{align*}

we drop the tilde, write $\Psi_{N,t}^0$ for $\widetilde{\Psi}_{N,t}$ , and define a sequence of time-varying linear state space processes as follows.

Definition 10. A sequence of time-varying linear state space processes $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ is defined as

(23) \begin{align}Y_N(t) = \int_\mathbb{R} \mathbb{1}_{\{Nt-u\geq 0\}} B(t)' \Psi_{N,t}^0(0,-(Nt-u)) C\bigg(\frac{u}{N}\bigg) L(du) \end{align}

with (limiting) kernel function (in view of Definition 5) given by

\begin{align*}g_N^0(Nt,Nt-u) &= \mathbb{1}_{\{Nt-u\geq 0\}} B(t)' \Psi_{N,t}^0(0,-(Nt-u)) C\bigg(\frac{-(Nt-u)}{N}+t\bigg) \quad\text{and}\\[5pt] g(t,Nt-u) &= \mathbb{1}_{\{Nt-u\geq 0\}} B(t)' \Psi_t(0,-(Nt-u)) C(t),\end{align*}

where $\Psi_{N,t}^0(0,-(Nt-u))$ and $\Psi_t(0,-(Nt-u))$ are the solutions of the matrix differential equations

(24) \begin{align}\begin{aligned}\Psi_{N,t}^0(s_0,s_0) &= \textbf{1}_p, \quad &&\frac{d}{ds} \Psi_{N,t}^0(s,s_0) = A\bigg(\frac{s}{N}+t\bigg) \Psi_{N,t}^0(s,s_0),\\[5pt] \Psi_t(s_0,s_0)&= \textbf{1}_p, \quad \text{and} \quad && \frac{d}{ds} \Psi_t(s,s_0)= A(t) \Psi_t(s,s_0) \end{aligned}\end{align}

for $s>s_0$ .

Using the Peano–Baker series, if necessary, the solutions of the above matrix differential equations are given by $\Psi_t(s,s_0)=e^{A(t)(s-s_0)}$ and

\begin{align*}\Psi_{N,t}^0(s,s_0) = \textbf{1}_p+\int_{s_0}^s A\bigg(\frac{\tau_1}{N}+t\bigg) d\tau_1+ \int_{s_0}^s A\bigg(\frac{\tau_1}{N}+t\bigg) \int_{s_0}^{\tau_1} A\bigg(\frac{\tau_2}{N}+t\bigg) d\tau_2 d\tau_1 + \ldots.\end{align*}
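The relation between $\Psi_{N,t}^0$ and the frozen transition matrix $\Psi_t(s,s_0)=e^{A(t)(s-s_0)}$ can be explored numerically. Instead of summing the Peano–Baker series, the sketch below integrates both matrix differential equations in (24) with a classical fourth-order Runge–Kutta scheme, which is a swapped-in numerical technique and not the method of the paper. The companion-form coefficient matrix $A(\tau)$ with entries $-(2+\sin\tau)$ and $-4$ , the evaluation point $t=0.5$ , $u=-2$ , and all function names are hypothetical choices.

```python
import math

def A(tau):
    """Illustrative companion-form coefficient matrix (an assumption, not from the paper)."""
    return [[0.0, 1.0], [-(2.0 + math.sin(tau)), -4.0]]

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def axpy(X, c, Y):
    """Entrywise X + c * Y."""
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

def transition(coef, s0, s, steps=2000):
    """RK4 solution of d/ds Psi = coef(s) Psi with Psi(s0) = identity, evaluated at s."""
    h = (s - s0) / steps
    psi = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(steps):
        r = s0 + k * h
        k1 = matmul(coef(r), psi)
        k2 = matmul(coef(r + h / 2), axpy(psi, h / 2, k1))
        k3 = matmul(coef(r + h / 2), axpy(psi, h / 2, k2))
        k4 = matmul(coef(r + h), axpy(psi, h, k3))
        incr = axpy(axpy(axpy(k1, 2.0, k2), 2.0, k3), 1.0, k4)
        psi = axpy(psi, h / 6.0, incr)
    return psi

def max_abs_diff(X, Y):
    """Largest entrywise absolute difference of two 2x2 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

def frozen_vs_rescaled_errors(t=0.5, u=-2.0, Ns=(1, 10, 100, 1000)):
    """Entrywise distance between Psi_{N,t}^0(0, u) and Psi_t(0, u) for each N."""
    frozen = transition(lambda s: A(t), u, 0.0)   # constant coefficient A(t)
    errors = {}
    for N in Ns:
        rescaled = transition(lambda s, N=N: A(s / N + t), u, 0.0)
        errors[N] = max_abs_diff(rescaled, frozen)
    return errors
```

For this example the distance at $N=1000$ is much smaller than at $N=1$ , which illustrates the pointwise convergence of $\Psi_{N,t}^0(0,u)$ to $\Psi_t(0,u)$ used in the proof of the next proposition.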

Proposition 9. Let $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ be a sequence of time-varying state space processes as in Definition 10. If

  1. (C1) the coefficient functions $A(\!\cdot\!)$ , $B(\!\cdot\!)$ , and $C(\!\cdot\!)$ are continuous,

  2. (C2) $\left\lVert B(s) \right\rVert<\infty$ for all $s\in\mathbb{R}$ , $\sup_{s\in\mathbb{R}}\|C(s)\|<\infty$ , and

  3. (C3) $\left\lVert \Psi_{N,t}^0(0,u) \right\rVert \leq F_t(u)$ for some real function $F_t\in L^2((\!-\!\infty,0])$ for all $N\in\mathbb{N}$ and $t\in\mathbb{R}$ ,

then $Y_N(t)$ is locally stationary.

Proof. Consider $Y_N(t)$ , $g_N^0$ , g, $\Psi_{N,t}^0$ , and $\Psi_t$ as defined above. For fixed $u,t\in\mathbb{R}$ it holds that

\begin{align*}\left| g_N^0(Nt,-u)-g(t,-u) \right|&= \mathbb{1}_{\{u\leq 0\}} \bigg| B(t)' \Big( \Psi_{N,t}^0(0,u) - \Psi_t(0,u) \Big) C\bigg(\frac{u}{N}+t\bigg) \\[5pt] &\qquad\qquad + B(t)' \Psi_t(0,u) \bigg( C\bigg(\frac{u}{N}+t\bigg) - C(t) \bigg)\bigg| \\[5pt] &\leq \mathbb{1}_{\{u\leq 0\}}\bigg( \big\lVert B(t) \big\rVert \Big\lVert \Psi_{N,t}^0(0,u) - \Psi_t(0,u) \Big\rVert \bigg(\sup_{s\in\mathbb{R}}\big\lVert C(s) \big\rVert\bigg)\\[5pt] &\qquad\qquad+ \big\lVert B(t) \big\rVert\big\lVert \Psi_t(0,u) \big\rVert\bigg\lVert C\bigg(\frac{u}{N}+t\bigg) - C(t) \bigg\rVert\bigg) \;=\!:\;P_1+P_2.\end{align*}

Since $C(\!\cdot\!)$ is continuous, we immediately obtain $P_2\rightarrow0$ as $N\rightarrow\infty$ . In view of $P_1$ , it is sufficient to show that for all $\varepsilon>0$ and sufficiently large N,

(25) \begin{align}\Big\lVert \Psi_{N,t}^0(0,u)-\Psi_t(0,u) \Big\rVert\leq \varepsilon.\end{align}

Thanks to the equivalence of all norms on $M_{p\times p}(\mathbb{R})$ , it is sufficient to prove (25) for the norm of each column. By $\Psi_{N,t}^{0(j)}(0,u)$ and $\Psi_t^{(j)}(0,u)$ we denote the jth column, $j=1,\ldots,p$ , of $\Psi_{N,t}^0(0,u)$ and $\Psi_t(0,u)$ . Then, for functions $f_{N,t}(s,x)=A(\frac{s}{N}+t)x$ and $\tilde{f}_{t}(s,x)=A(t)x$ , we obtain

\begin{align*}\Psi_{N,t}^{0(j)}(u,u) &=e_j, \quad &&\frac{d}{ds} \Psi_{N,t}^{0(j)}(s,u) = f_{N,t}\Big(s,\Psi_{N,t}^{0(j)}(s,u)\Big),\\[5pt] \Psi_t^{(j)}(u,u)&= e_j, \quad \text{and} \quad && \frac{d}{ds} \Psi_t^{(j)}(s,u)= \tilde{f}_{t}\Big(s,\Psi_t^{(j)}(s,u)\Big), \end{align*}

where $e_j$ denotes the jth unit vector. Note that $f_{N,t}$ and $\tilde{f}_{t}$ are Lipschitz continuous in the second argument, with Lipschitz constant

\begin{align*} L=\sup_{s\in[u,0]}\bigg\lVert A\bigg(\frac{s}{N}+t\bigg) \bigg\rVert+\big\lVert A(t) \big\rVert<\infty.\end{align*}

Moreover,

\begin{align*}\Big\lVert \tilde{f}_t\Big(s,\Psi_{t}^{(j)}(s,u)\Big) - f_{N,t}\Big(s,\Psi_{t}^{(j)}(s,u)\Big) \Big\rVert \leq \delta \Big\lVert \Psi_{t}^{(j)}(s,u) \Big\rVert \leq \delta c,\quad s\in[u,0],\end{align*}

since, by the continuity of $A(\!\cdot\!)$ , for any $\delta>0$ we have $\sup_{s\in[u,0]}\left\lVert A(\frac{s}{N}+t)-A(t) \right\rVert<\delta$ for sufficiently large N, and $\Psi_{t}^{(j)}(\cdot,u)$ is continuous and thus bounded by some constant $c>0$ on [u,0]. An application of [Reference Walter46, Section 12.V] gives (25). Finally, by using the dominated convergence theorem with majorant

\begin{align*}\Big| g_N^0(Nt,-u) \Big| &\leq \mathbb{1}_{\{u\leq 0\}} \big\lVert B(t) \big\rVert \bigg(\sup_{s\in\mathbb{R}}\big\lVert C(s) \big\rVert\bigg) \Big\lVert \Psi_{N,t}^0(0,u) \Big\rVert \\[5pt] &\leq \mathbb{1}_{\{u\leq 0\}} c_t F_t(u)\in L^2(\mathbb{R})\end{align*}

for some constant $c_t>0$ , where the last inequality follows from (C2) and (C3), we can deduce that $\Big\lVert g_N^0(Nt,\cdot)-g(t,\cdot) \Big\rVert_{L^2}\rightarrow0$ as $N\rightarrow\infty$ .

In fact, the condition (C3) in Proposition 9 is satisfied whenever the state space system is uniformly exponentially stable.

Definition 11. ([Reference Rugh38, Chapter 6, Definition 6.5 and Theorem 6.7].) A sequence of linear state space models as in Definition 10 is called uniformly exponentially stable if there exist $\gamma>0$ and $\lambda>0$ such that

\begin{align*}\Big\lVert \Psi_{N,t}^0(s,s_0) \Big\rVert \leq \gamma e^{-\lambda(s-s_0)}\end{align*}

for all $s,s_0$ , where $s>s_0$ , $N\in\mathbb{N}$ , and $t\in\mathbb{R}$ .

If a linear state space model as in Definition 10 is uniformly exponentially stable, then the condition (C3) from Proposition 9 is satisfied for $F_t(u)\;:\!=\;\gamma e^{\lambda u}$ and some $\gamma,\lambda>0$ .

Proposition 10. Each of the following two conditions is sufficient for a state space model $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ as in Definition 10 to be uniformly exponentially stable:

  1. (a) Let $\lambda_{max}(t)$ , $t\in\mathbb{R}$ , denote the largest eigenvalue of $A(t)+A(t)'$ . If there exist positive constants $\gamma$ and $\lambda$ such that

    \begin{align*}\int_{s_0}^s \lambda_{max}\bigg(\frac{\nu}{N}+t\bigg) d\nu \leq -\lambda(s-s_0) + \gamma\end{align*}
    for all s, $s_0$ , t, and N with $s\geq s_0$ , then by [Reference Rugh38, Corollary 8.4], $Y_N(t)$ is uniformly exponentially stable.
  2. (b) Suppose A(t) is continuously differentiable and there exist positive constants $\alpha$ , $\mu$ , and $\beta$ such that $\lVert A(t) \rVert \leq\alpha$ , $\Big\lVert \frac{d}{dt}A(t) \Big\rVert \leq\beta$ , and the eigenvalues $\lambda_j(t)$ of A(t) for $j=1,\ldots,p$ satisfy $\mathfrak{Re}(\lambda_j(t))\leq-\mu$ for all t. Then by [Reference Rugh38, Theorem 8.7], $Y_N(t)$ is uniformly exponentially stable.

Remark 5. Part (b) of Proposition 10 corresponds to the condition (C2) in Proposition 6 for sequences of tvCAR(1) processes.

Remark 6. There is a strong structural resemblance between the above results and known results from the theory on locally stationary processes in discrete time. Indeed, let us consider a sequence of time-varying AR(p) processes given by $X_{t,T}=\sum_{j=1}^pa_j(t/T)X_{t-j,T}+\sigma(t/T)\varepsilon_t$ , where $\varepsilon_t$ is a sequence of i.i.d. random variables such that $\mathbb{E}[\varepsilon_t]=0$ and $Var(\varepsilon_t)=1$ , $a_j(u)$ is a family of arbitrary coefficient functions, and $\sigma(u)$ is an arbitrary function. For a comprehensive discussion on the sequence of processes $X_{t,T}$ we refer to [Reference Dahlhaus11]. In the following we consider the matrix function

\begin{align*}A(u)&\;:\!=\;\left( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c}a_1(u) & a_2(u) & \ldots & \ldots & a_p(u) \\[5pt] 1 & 0 & \ldots & \ldots & 0 \\[5pt] 0 & 1 & \ddots & \ldots & \vdots \\[5pt] \vdots & \vdots & \ddots &\ddots & \vdots \\[5pt] 0 & \ldots & 0 & 1 & 0 \\[5pt] \end{array}\right).\end{align*}

In this setting, the authors of [Reference Dahlhaus11, Reference Künsch23] show that $X_{t,T}$ is locally stationary if $a_j(u)$ is continuous for all $j=1,\ldots,p$ and the eigenvalues $\lambda_i(u)$ , $i=1,\ldots,p$ , of A(u) satisfy $|\lambda_i(u)|\leq\delta$ for some $\delta\in(0,1)$ . Moreover, in [Reference Dahlhaus11, Theorem 2.3(ii)], the authors consider the $a_j(u)$ to be parameterized and derive suitable properties for the treatment of maximum likelihood estimates under the additional assumption that the $a_j(u)$ as well as their derivatives are uniformly bounded.

If the commutativity assumption (14) holds, the transition matrix is given by

\begin{align*} \Psi_{N,t}^0(0,u)=e^{\int_u^{0} A\left(\frac{s}{N}+t\right) ds}.\end{align*}

Then $Y_N(t)$ simplifies to

\begin{align*}Y_N(t) = \int_{-\infty}^{Nt} B(t)' e^{\int_{-(Nt-u)}^{0} A(\frac{s}{N}+t) ds} C\bigg(\frac{u}{N}\bigg) L(du).\end{align*}

Proposition 11. Let $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ be a sequence of time-varying state space processes as in Definition 10. If (C1) and (C2) from Proposition 9 hold, $\{A(t)\}_{t\in\mathbb{R}}$ is mutually commutative, the eigenvalues $\lambda_j(t)$ of A(t) for $j=1,\ldots,p$ satisfy $\mathfrak{Re}(\lambda_j(t))\leq-\mu$ for all $t\in\mathbb{R}$ and some $\mu>0$ , and either

  1. (D1) A(t) is diagonalizable for all $t\in\mathbb{R}$ or

  2. (D2) there exists $\tau>0$ such that

    \begin{align*} \sup_{\tau\leq x<\infty}\Bigg\lVert \frac{1}{x}\int_\tau^xA\bigg(\frac{s}{N}+t\bigg)ds \Bigg\rVert<C\end{align*}
    for all N and a constant $C>0$ ,

then $Y_N(t)$ is locally stationary.

Proof. It is sufficient to check (C3) from Proposition 9. We start by assuming that (D1) holds. Then, by [Reference Horn and Johnson19, Theorem 1.3.12], $\{A(t)\}_{t\in\mathbb{R}}$ is simultaneously diagonalizable. Thus, there exists a non-singular matrix S such that $S^{-1}A(\frac{s}{N}+t)S=diag(\lambda_1(\frac{s}{N}+t),\ldots,\lambda_p(\frac{s}{N}+t))\;=\!: D(\frac{s}{N}+t)$ . Considering the spectral norm, we obtain for all $u\leq0$

\begin{align*}\Big\lVert \Psi_{N,t}^0(0,u) \Big\rVert&=\Big\lVert e^{\int_u^{0} A(\frac{s}{N}+t) ds} \Big\rVert=\Big\lVert e^{\int_u^{0}S D(\frac{s}{N}+t) dsS^{-1}} \Big\rVert\leq\big\lVert S \big\rVert\Big\lVert S^{-1} \Big\rVert \Big\lVert e^{\int_u^{0}D(\frac{s}{N}+t) ds} \Big\rVert\\[5pt]&\leq C \max \bigg\{ \sqrt{\mu}\;:\; \mu\in\sigma\bigg( e^{\int_u^0 D(\frac{s}{N}+t)^* + D(\frac{s}{N}+t) ds} \bigg) \bigg\}\\[5pt] &= C \max_{j=1,\ldots,p} \sqrt{ e^{2\int_u^0 \mathfrak{Re}(\lambda_j(\frac{s}{N}+t)) ds} }\\[5pt] &= C \max_{j=1,\ldots,p} e^{\int_u^0 \mathfrak{Re}(\lambda_j(\frac{s}{N}+t)) ds} \leq C e^{\int_u^0 -\mu ds} = C e^{\mu u}\end{align*}

for some constant $C>0$ .

In the case where (D2) holds, we have

\begin{align*}\Big\lVert \Psi_{N,t}^0(0,u) \Big\rVert&=\Big\lVert e^{\int_u^{0} A(\frac{s}{N}+t) ds} \Big\rVert=\Big\lVert e^{\int_0^{-u} A(\frac{-s}{N}+t) ds} \Big\rVert\\[5pt] &\leq \mathbb{1}_{\left\{-u\in[0,\tau]\right\}}e^{\left| \tau \right|\sup_{s\in[0,\tau]}\left\lVert A\left(\frac{-s}{N}+t\right) \right\rVert}\\[5pt] &\quad+ \mathbb{1}_{\{-u> \tau\}}e^{\left| \tau \right|\sup_{s\in[0,\tau]}\left\lVert A\left(\frac{-s}{N}+t\right) \right\rVert} \Big\lVert e^{\int_\tau^{-u}A\left(\frac{-s}{N}+t\right)ds} \Big\rVert,\end{align*}

where we used that the integrals $\int_0^{\tau}A\left(\frac{-s}{N}+t\right)ds$ and $\int_\tau^{-u}A\left(\frac{-s}{N}+t\right)ds$ commute. Therefore, it is sufficient to bound

\begin{align*} \Big\lVert e^{\int_\tau^{-u}A(\frac{-s}{N}+t)ds} \Big\rVert.\end{align*}

In the following we use [Reference Lukes26, Theorem 7.7.1]. Since the family $\{A(t)\}_{t\in\mathbb{R}}$ is mutually commutative, the family can be reduced simultaneously to an upper triangular form by a single unitary transformation, i.e. there exists a unitary matrix $U\in M_{p\times p}(\mathbb{C})$ such that $U^*A(t)U=T(t)$ is an upper triangular matrix for all $t\in\mathbb{R}$ (see [Reference Horn and Johnson19, Theorem 2.3.3]). For each $x>\tau$ , the diagonal entries (and hence also the eigenvalues) of $\frac{1}{x}\int_\tau^xT\left( \frac{s}{N}+t \right)ds$ are $\frac{1}{x}\int_\tau^x \lambda_1\left(\frac{s}{N}+t\right)ds,\ldots,\frac{1}{x}\int_\tau^x \lambda_p\left(\frac{s}{N}+t\right)ds$ . Moreover, these are also the eigenvalues of $\frac{1}{x}\int_\tau^x A\left( \frac{s}{N}+t \right)ds$ , since

\begin{align*}\frac{1}{x}\int_\tau^xT\bigg( \frac{s}{N}+t \bigg)ds=\frac{1}{x}U^*\int_\tau^x A\bigg( \frac{s}{N}+t \bigg)ds\ U.\end{align*}

For the real part of the eigenvalues we obtain

\begin{align*}\mathfrak{Re}\bigg(\frac{1}{x}\int_\tau^x \lambda_i\bigg(\frac{s}{N}+t\bigg)ds\bigg)=\frac{1}{x}\int_\tau^x \mathfrak{Re}\bigg(\lambda_i\bigg(\frac{s}{N}+t\bigg)\bigg)ds\leq-\mu \bigg(\frac{x-\tau}{x}\bigg)\leq-\mu\end{align*}

for all $i=1,\ldots,p$ , $N\in\mathbb{N}$ , and $x\in\mathbb{R}$ . Hence

\begin{align*}\overline{\bigcup_{\tau\leq x<\infty}\!\!\!\tilde\sigma\bigg(\frac{1}{x}\int_\tau^x A\bigg( \frac{s}{N}+t \bigg)ds\bigg)}\subset\!\!\overline{\bigcup_{\tau\leq x<\infty}\!\!\!\sigma\bigg(\frac{1}{x}\int_\tau^x A\bigg( \frac{s}{N}+t \bigg)ds\bigg)}\!\subset \big\{z\in\mathbb{C}\;:\;\mathfrak{Re}(z)\leq-\mu\big\}\end{align*}

for all $N\in\mathbb{N}$ , where $\tilde\sigma(B)$ denotes the collection of all distinct eigenvalues of the matrix B and $\sigma(B)$ the spectrum of B. Finally, an application of [Reference Lukes26, Theorem 7.7.1] gives

\begin{align*}\Big\lVert e^{\int_\tau^{-u}A\left(\frac{-s}{N}+t\right)ds} \Big\rVert\leq Ce^{\mu u}\end{align*}

for some constant $C>0$ .

Sequences of time-varying CARMA processes, i.e. where the family $\{\mathcal{A}(t)\}_{t\in\mathbb{R}}$ forms a family of companion matrices, are in general not covered by Proposition 11, since companion matrices do not commute in general. The following proposition gives further insight into when a family of companion matrices is mutually commutative.

Proposition 12. Let $\{\mathcal{A}(t)\}_{t\in\mathbb{R}}$ be a family of companion matrices and $\tau\in\mathbb{R}$ fixed. For any $t\in\mathbb{R}$ , the matrix $\mathcal{A}(t)$ commutes with $\mathcal{A}(\tau)$ if and only if it is a polynomial in $\mathcal{A}(\tau)$ over $\mathbb{C}$ .

Proof. It is clear that any polynomial in $\mathcal{A}(\tau)$ commutes with $\mathcal{A}(\tau)$ . For the other direction we refer to [Reference Horn and Johnson19, Exercise 3.3P17].
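The easy direction of Proposition 12, and the failure of commutativity for two different companion matrices, can be illustrated with exact integer arithmetic. The sketch below assumes a $3\times3$ companion matrix with the coefficients in the last row; this layout and all coefficient values are hypothetical choices for illustration and need not match the convention used for $\mathcal{A}(t)$ elsewhere in the paper.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def companion(a1, a2, a3):
    # Assumed layout: ones on the superdiagonal, coefficients in the last row.
    return [[0, 1, 0], [0, 0, 1], [-a3, -a2, -a1]]

def commute(X, Y):
    return mat_mul(X, Y) == mat_mul(Y, X)

def polynomial_in(M, coeffs):
    """Return coeffs[0] * I + coeffs[1] * M + coeffs[2] * M^2 + ..."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    power = identity(n)
    for c in coeffs:
        result = [[result[i][j] + c * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = mat_mul(power, M)
    return result

if __name__ == "__main__":
    A_tau = companion(1, 2, 3)
    A_t = companion(2, 2, 3)              # differs from A_tau in one coefficient
    P = polynomial_in(A_tau, [2, 3, -1])  # 2*I + 3*A_tau - A_tau^2
    print(commute(A_tau, A_t), commute(A_tau, P))
```

In this example the two companion matrices do not commute, whereas the polynomial `P` does commute with `A_tau`. Note that `P` itself is not of companion form.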

If, for a sequence of time-varying CARMA processes, the family $\{\mathcal{A}(t)\}_{t\in\mathbb{R}}$ is not mutually commutative, Proposition 9 provides sufficient conditions for local stationarity, where the condition (C3) can be derived from Proposition 10.

5. Time-varying spectrum

For a stationary process $\{Y(t),t\in\mathbb{R}\}$ , the autocovariance function $\gamma_Y(h)\;:\!=\;Cov(Y(t+h), Y(t))$ is related to the spectral density $f_Y(\lambda)$ by

\begin{align*} \gamma_Y(h)=\int_{-\infty}^\infty e^{ih\lambda} f_Y(\lambda) d\lambda \qquad \text{and} \qquad f_Y(\lambda) =\frac{1}{2\pi} \int_{-\infty}^\infty e^{-ih\lambda} \gamma_Y(h) dh.\end{align*}

To describe the time-varying spectrum of a discrete-time locally stationary time series, [Reference Dahlhaus11] uses the Wigner–Ville spectrum (see also [Reference Bruscato and Toloi9, Reference Flandrin and Martin18, Reference Martin and Flandrin29]). A comparable approach is presented in [Reference Priestley31, Section 11.2], where the author uses the evolutionary spectrum. However, in contrast to this approach, the Wigner–Ville spectrum has the important consequence of a unique spectral representation as discussed in [Reference Bruscato and Toloi9, p. 74] and [Reference Dahlhaus11, p. 143]. In view of this property, we follow the approach of [Reference Dahlhaus11] and define the Wigner–Ville spectrum and time-varying spectral density for a continuous-time locally stationary process as follows.

Definition 12. Let $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ be a sequence of locally stationary processes. For $N\in\mathbb{N}$ we define the Wigner–Ville spectrum as

\begin{align*}f_N(t,\lambda)=\frac{1}{2\pi} \int_{-\infty}^\infty e^{-i\lambda s} Cov\bigg(Y_N\bigg(t+\frac{s}{2N}\bigg),Y_N\bigg(t-\frac{s}{2N}\bigg)\bigg) ds,\end{align*}

and the (time-varying) spectral density of the process $Y_N(t)$ as

\begin{align*}f(t,\lambda)= \frac{\Sigma_L}{2\pi} |A(t,\lambda)|^2,\end{align*}

where $A(t,\lambda)$ denotes the limiting transfer function from Definition 6.
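As an illustration of the time-varying spectral density, consider a scalar example with frozen kernel $g(t,u)=\mathbb{1}_{\{u\geq0\}}e^{-a(t)u}$ . The sketch below rests on assumptions that are not taken from Definition 6: it assumes the transfer function is given by $A(t,\lambda)=\int_\mathbb{R} e^{-i\lambda u}g(t,u)du$ , uses a hypothetical coefficient `a`, and sets $\Sigma_L=1$ by default. Under these assumptions $A(t,\lambda)=1/(a(t)+i\lambda)$ , so that $f(t,\lambda)=\frac{\Sigma_L}{2\pi}\frac{1}{a(t)^2+\lambda^2}$ , which the code compares against a trapezoidal approximation of the Fourier integral.

```python
import cmath
import math

def a(t):
    # Hypothetical positive coefficient function, chosen for illustration only.
    return 1.5 + 0.5 * math.sin(t)

def transfer_numeric(t, lam, upper=40.0, h=0.002):
    """Trapezoidal approximation of int_0^upper exp(-i lam u) exp(-a(t) u) du."""
    n = int(round(upper / h))
    z = a(t) + 1j * lam
    total = 0.5 * (1.0 + cmath.exp(-z * upper))
    for k in range(1, n):
        total += cmath.exp(-z * k * h)
    return total * h

def spectral_density(t, lam, sigma_L=1.0):
    """Closed form Sigma_L / (2 pi) * |1 / (a(t) + i lam)|^2 under the assumed convention."""
    return sigma_L / (2.0 * math.pi) / (a(t) ** 2 + lam ** 2)

def spectral_density_numeric(t, lam, sigma_L=1.0):
    return sigma_L / (2.0 * math.pi) * abs(transfer_numeric(t, lam)) ** 2

if __name__ == "__main__":
    for t in (0.0, 0.3, 2.0):
        print(t, spectral_density(t, 1.0), spectral_density_numeric(t, 1.0))
```

For each fixed `t` this is the spectral density of a stationary Ornstein–Uhlenbeck-type process with mean reversion rate `a(t)`, which matches the interpretation of $f(t,\cdot)$ as the spectrum of the locally approximating stationary process.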

The following theorem is a continuous-time analogue to [Reference Dahlhaus11, Theorem 2.2].

Theorem 3. Let $\{Y_N(t), t\in\mathbb{R}\}_{N\in\mathbb{N}}$ be a sequence of locally stationary processes in the form (7). If

  1. (a) $\Big\lVert A_N^0\Big(N\Big(t\pm\frac{s}{2N}\Big),\cdot\Big)-A(t,\cdot) \Big\rVert_{L^2}\underset{N\rightarrow\infty}{\longrightarrow}0$ for all $s,t\in\mathbb{R}$ ,

  2. (b) $A_N^0$ and A are uniformly bounded in $L^2$ , i.e. $\Big\lVert A_N^0(Nt,\cdot) \Big\rVert_{L^2}\leq K$ and $\left\lVert A(t,\cdot) \right\rVert_{L^2}\leq K$ for all $t\in\mathbb{R}$ , $N\in\mathbb{N}$ and a constant $K>0$ , and

  3. (c) $A_N^0(Nt,\cdot)$ and $A(t,\cdot)$ are differentiable for all $t\in\mathbb{R}$ , $N\in\mathbb{N}$ , and the derivatives $\frac{d}{d\mu}A_N^0(Nt,\mu)$ , $\frac{d}{d\mu}A(t,\mu)$ are uniformly bounded in $L^2$ , i.e.

    \begin{align*} \Bigg\lVert \frac{d}{d\mu}A_N^0(Nt,\cdot) \Bigg\rVert_{L^2}\leq K \quad \textit{and} \quad \Bigg\lVert \frac{d}{d\mu}A(t,\cdot) \Bigg\rVert_{L^2}\leq K,\end{align*}
    for all t, N and a constant $K>0$ ,

then the Wigner–Ville spectrum tends pointwise for each $t\in\mathbb{R}$ in mean square to the time-varying spectral density, i.e.

\begin{align*}\int_\mathbb{R} \left| f_N(t,\lambda)-f(t,\lambda) \right|^2 d\lambda \overset{N\rightarrow\infty}{\longrightarrow} 0.\end{align*}

Remark 7. Since $A_N^0$ and A are defined as Fourier transforms of $g_N^0$ and g in $L^2$ , they exist as elements in $L^2$ , i.e. as representatives of equivalence classes. As usual, this does not allow for taking derivatives in the usual sense, but would lead to the concept of weak derivatives.

However, for a function $f\in L^1$ such that $uf(u)\in L^1$ , the derivative of the Fourier transform $\widehat{f}(\mu)$ can be expressed as $\frac{d}{d\mu}\widehat{f}(\mu) = \widehat{(\!-\!iuf(u))}(\mu)$ , by [Reference Katznelson20, Theorem 1.6, Chapter VI]. An application of this theorem to the Fourier transform pairs $A_N^0$ and $g_N^0$ , as well as A and g, ensures the existence of the pointwise derivatives in (c). The conditions on the kernel functions can be readily obtained, for instance, if the sequence of locally stationary state space models considered is uniformly exponentially stable, since then the kernel functions are of exponential decay.

Lemma 2. Let $f\;:\;\mathbb{R}\rightarrow\mathbb{R}$ be differentiable, such that either $f\in L^1(\mathbb{R})$ and $f'\in L^1(\mathbb{R})$ , or $f\in L^2(\mathbb{R})$ and $f'\in L^2(\mathbb{R})$ . Then $\lim_{x\rightarrow\pm\infty}f(x)=0$ and $f\in L^p(\mathbb{R})$ for all $p>2$ .

Proof. We first consider the case $f\in L^2(\mathbb{R})$ and $f'\in L^2(\mathbb{R})$ . Let $\varepsilon>0$ . Then, as $f\in L^2(\mathbb{R})$ and $f'\in L^2(\mathbb{R})$ , there exist $\delta_1,\delta_2>0$ such that for any measurable sets $E_1$ and $E_2$ it holds that

\begin{align*}\lambda(E_1)&<\delta_1 \implies \Bigg( \int_{E_1}f(s)^2 ds\Bigg)^{\frac{1}{2}}<\frac{\varepsilon}{\sqrt{2}} \quad \text{ and} {}\\[5pt] \lambda(E_2)&<\delta_2 \implies \Bigg( \int_{E_2}f'(s)^2 ds\Bigg)^{\frac{1}{2}}<\frac{\varepsilon}{\sqrt{2}},\end{align*}

where $\lambda$ denotes the Lebesgue measure. Let $x,y\in\mathbb{R}$ with $\left| x-y \right|<\delta\;:\!=\;\min(\delta_1,\delta_2)$ . Without loss of generality we assume $x>y$ and use Hölder’s inequality to obtain

(26) \begin{align}\Big| f(x)^2-f(y)^2 \Big|=\Bigg| \int_y^x\frac{d}{ds}(f(s)^2)ds \Bigg|\leq2\Bigg(\int_y^xf'(s)^2ds\Bigg)^{\frac{1}{2}}\Bigg(\int_y^xf(s)^2ds\Bigg)^{\frac{1}{2}}<\varepsilon.\end{align}

In fact, we showed that $f^2$ is uniformly continuous. Suppose now that $\lim_{x\rightarrow\infty}f(x)^2=0$ does not hold. Then there exist some $\varepsilon>0$ and an increasing sequence $(x_n)_{n\in\mathbb{N}}$ such that $x_n\rightarrow\infty $ as $n\rightarrow\infty$ and $\left| f(x_n)^2 \right|>\varepsilon$ for all $n\in\mathbb{N}$ . By passing to a subsequence we can additionally assume that $x_{n+1}-x_n>1$ for all $n\in\mathbb{N}$ . Using (26) we find some $1>\delta>0$ such that $\left| x-y \right|<\delta$ ensures that $\left| f(x)^2-f(y)^2 \right|<\frac{\varepsilon}{2}$ . Now, define $I_n\;:\!=\;[x_n,x_n+\delta]$ and let $x\in I_n$ . Then

\begin{align*} f(x)^2=\Big| f(x_n)^2-(f(x_n)^2-f(x)^2) \Big|\geq \Big| f(x_n)^2 \Big| - \Big| f(x_n)^2-f(x)^2 \Big|\geq \frac{\varepsilon}{2}.\end{align*}

Finally,

\begin{align*}\int_{\mathbb{R}}f(s)^2ds\geq \sum_{n=1}^\infty \int_{I_n}f(s)^2ds\geq \sum_{n=1}^\infty \frac{\varepsilon}{2}\delta=\infty\end{align*}

is a contradiction to $f\in L^2(\mathbb{R})$ , and we conclude that $\lim_{x\rightarrow\infty}f(x)=0$ . Considering $f(\!-\!x)$ instead of f(x), we also obtain $\lim_{x\rightarrow-\infty}f(x)=0$ .

If $f\in L^1(\mathbb{R})$ and $f'\in L^1(\mathbb{R})$ , similar yet slightly easier arguments also ensure that $\lim_{x\rightarrow\pm\infty}f(x)=0$ .

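Finally, define the set $J=\{x\in\mathbb{R}\;:\;\left| f(x) \right|\geq1 \}$ . Since f is continuous and $\lim_{x\rightarrow\pm\infty}f(x)=0$ , J is a compact set. Thus, there exists a constant C such that $\left| f(x) \right|\leq C$ for all $x\in J$ . Since $\left| f(x) \right|^p\leq\left| f(x) \right|^2$ for $x\notin J$ and $p>2$ , we conclude that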

\begin{align*}\int_\mathbb{R} \left| f(x) \right|^p dx \leq \int_J C^p dx + \int_{\mathbb{R}\backslash J} \left| f(x) \right|^2 dx < \infty.\end{align*}

Proof of Theorem 3 . In the following, K denotes the constant from the conditions (b) and (c). First, we note that the covariance of $Y_N(t)$ is given by

\begin{align*} Cov\left(Y_N(t_1),Y_N(t_2)\right) = \frac{\Sigma_L}{2\pi} \int_\mathbb{R} e^{i\mu (t_1-t_2) N} A_N^0(Nt_1,\mu) \overline{A_N^0(Nt_2,\mu)} d\mu.\end{align*}

Then we obtain for the Wigner–Ville spectrum and the (time-varying) spectral density

\begin{align*}f_N(t,\lambda)&= \frac{1}{2\pi} \int_{\mathbb{R}} e^{-i\lambda s} \frac{\Sigma_L}{2\pi} \int_{\mathbb{R}} e^{i\mu \left((t+\frac{s}{2N})-(t-\frac{s}{2N})\right) N} A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg)\\[5pt] &\qquad\qquad\qquad\qquad\qquad\times \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)} d\mu\ ds \\[5pt] &= \frac{\Sigma_L}{(2\pi)^2} \int_{\mathbb{R}} e^{-i\lambda s} \int_{\mathbb{R}} e^{i\mu s} A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg) \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)} d\mu\ ds \quad \text{ and}{}\\[5pt] f(t,\lambda)&= \frac{\Sigma_L}{2\pi} |A(t,\lambda)|^2= \frac{\Sigma_L }{(2\pi)^2} \int_{\mathbb{R}} e^{-i\lambda s} \int_{\mathbb{R}} e^{i\mu s} |A(t,\mu)|^2 d\mu\ ds.\end{align*}

We note that, because of the differentiability condition (c), the function $A(t,\cdot)$ is differentiable, $A(t,\cdot)\in L^2$ , and $\frac{d}{d\mu}A(t,\cdot)\in L^2$ . Then an application of Lemma 2 gives $A(t,\cdot)\in L^4$ , which implies $f(t,\cdot)\in L^2$ . Moreover, for all $t\in\mathbb{R}$ , $N\in\mathbb{N}$ , we obtain from Plancherel’s theorem, the Cauchy–Schwarz inequality, and the integration-by-parts formula, for some $C>0$ ,

\begin{align*}&\int_{\mathbb{R}} \big| f_N(t,\lambda) \big|^2 d\lambda= \frac{\Sigma_L^2}{(2\pi)^4} \int_{\mathbb{R}} \bigg| \int_{\mathbb{R}} e^{-i\lambda s} \int_{\mathbb{R}} e^{i\mu s} A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg)\\[5pt] &\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad\times \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)} d\mu\ ds\bigg|^2 d\lambda \\[5pt] &=C \int_{\mathbb{R}} \Bigg| \int_{\mathbb{R}} e^{i\mu s} A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg) \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)} d\mu \Bigg|^2 ds \\[5pt] &\leq C \int_{|s|< 1} \bigg\lVert A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\cdot\bigg) \bigg\rVert_{L^2}^2 \bigg\lVert A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\cdot\bigg) \bigg\rVert_{L^2}^2 ds\\[5pt] &\quad+ C \int_{|s|\geq 1} \Bigg| \int_{\mathbb{R}} e^{i\mu s} A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg) \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)} d\mu \Bigg|^2 ds \\[5pt] &\leq 2C K^4+ C \int_{|s|\geq 1} \Bigg|0-\int_{\mathbb{R}} \frac{e^{i\mu s}}{is} \bigg(A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg) \frac{d}{d\mu}\overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)} \\[5pt] &\qquad\qquad\qquad\qquad\qquad\qquad+\frac{d}{d\mu} A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg) \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)} \bigg)d\mu\Bigg|^2 ds\\[5pt] &\leq 2C K^4+ 4CK^4 \int_{|s|\geq 1} \frac{1}{s^2}ds<\infty,\end{align*}

where Lemma 2 implies $\lim_{\mu\rightarrow\pm\infty}A_N^0(t,\mu)=0$ , since $A_N^0(t,\cdot)\in L^2(\mathbb{R})$ and $\frac{d}{d\mu}A_N^0(t,\cdot)\in L^2(\mathbb{R})$ for all $t\in\mathbb{R}$ . Finally, the integral $\int_\mathbb{R} |f_N(t,\lambda)-f(t,\lambda)|^2 d\lambda$ is well-defined, since $f_N(t,\cdot)$ and $f(t,\cdot)$ are both in $L^2(\mathbb{R})$ for all $t\in\mathbb{R}$ . From Plancherel’s theorem we obtain

\begin{align*}&(2\pi)^2 \int_\mathbb{R} \big| f_N(t,\lambda)-f(t,\lambda) \big|^2 d\lambda \\[5pt] &= \frac{\Sigma_L^2}{2\pi}\int_{\mathbb{R}} \bigg| \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} e^{-i\lambda s} \Bigg( \int_{\mathbb{R}} e^{i\mu s} \bigg( A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg) \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)} \\[5pt] &\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad- A(t,\mu) \overline{A(t,\mu)} \bigg) d\mu \Bigg) ds \bigg|^2 d\lambda {}\\[5pt] &= \frac{\Sigma_L^2}{2\pi} \int_{\mathbb{R}} \Bigg| \int_{\mathbb{R}} e^{i\mu s} \bigg( A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg) \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)} - A(t,\mu) \overline{A(t,\mu)} \bigg) d\mu \Bigg|^2 ds\\[5pt] & \;=\!:\; \frac{\Sigma_L^2}{2\pi} \int_{\mathbb{R}} \bigg| \widehat{a}_N\bigg(\frac{s}{2N}\bigg) \bigg|^2 ds,\end{align*}

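where $\widehat{a}_N(\frac{s}{2N}) = \int_{\mathbb{R}} e^{i\mu s} a_N(\frac{s}{2N},\mu) d\mu$ with $a_N(\frac{s}{2N},\mu)\;:\!=\;A_N^0(N(t+\frac{s}{2N}),\mu) \overline{A_N^0(N(t-\frac{s}{2N}),\mu)} - A(t,\mu) \overline{A(t,\mu)}$ . It remains to show that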

(27) \begin{align}\int_{\mathbb{R}} \bigg| \widehat{a}_N\bigg(\frac{s}{2N}\bigg) \bigg|^2 ds\underset{N\rightarrow\infty}{\longrightarrow}0.\end{align}

The proof of (27) consists of several steps. We start by showing that $\widehat{a}_N(\frac{s}{2N}) \underset{N\rightarrow\infty}{\longrightarrow}0$ . Indeed, for fixed $s,t\in\mathbb{R}$ we obtain

\begin{align*}\bigg| \widehat{a}_N\bigg(\frac{s}{2N}\bigg) \bigg| &\leq \int_{\mathbb{R}} \bigg| A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu \bigg) \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu \bigg)} - A(t,\mu) \overline{A(t,\mu)} \bigg| d\mu \\[5pt] &\leq \int_{\mathbb{R}} \bigg| A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu \bigg) - A(t,\mu) \bigg|\bigg| \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu \bigg)} \bigg| \\[5pt] &\qquad+ \bigg| A(t,\mu) \bigg|\bigg| \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu \bigg)} - \overline{A(t,\mu)} \bigg| d\mu \\[5pt] &\leq \bigg\lVert A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\cdot \bigg) - A(t,\cdot) \bigg\rVert_{L^2} \bigg\lVert A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\cdot\bigg) \bigg\rVert_{L^2} \\[5pt] &\qquad+ \big\lVert A(t,\cdot) \big\rVert_{L^2}\bigg\lVert A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\cdot\bigg) - A(t,\cdot) \bigg\rVert_{L^2}.\end{align*}

Now, by the condition (a), it holds that

\begin{align*} \bigg\lVert A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\cdot\bigg) - A(t,\cdot) \bigg\rVert_{L^2}\rightarrow0\end{align*}

as $N\rightarrow\infty$ . Moreover, using the conditions (a) and (b), there exists a constant D, which may depend on s and t, such that

\begin{align*}\left\lVert A(t,\cdot) \right\rVert_{L^2} &\leq D \quad \text{ and}\\[5pt] \bigg\lVert A_N^0\bigg(N\bigg(t\pm\frac{s}{2N}\bigg),\cdot\bigg) \bigg\rVert_{L^2} &\leq \bigg\lVert A_N^0\bigg(N\bigg(t\pm\frac{s}{2N}\bigg),\cdot\bigg)-A(t,\cdot) \bigg\rVert_{L^2} + \big\lVert A(t,\cdot) \big\rVert_{L^2} \leq D\end{align*}

for sufficiently large N. Thus

(28) \begin{align}\begin{aligned}\bigg| \widehat{a}_N\bigg(\frac{s}{2N}\bigg) \bigg|&\leq\bigg\lVert A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\cdot\bigg)-A(t,\cdot) \bigg\rVert_{L^2} \bigg\lVert A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\cdot\bigg) \bigg\rVert_{L^2}\\[5pt] &\quad + \left\lVert A(t,\cdot) \right\rVert_{L^2} \bigg\lVert A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\cdot\bigg)-A(t,\cdot) \bigg\rVert_{L^2}\rightarrow 0,\end{aligned}\end{align}

as $N\rightarrow\infty$ .

Next, we show that $|\widehat{a}_N(\frac{s}{2N})|\leq\frac{E}{|s|}$ , for all $s\in\mathbb{R}$ , sufficiently large $N\in\mathbb{N}$ , and some constant $E>0$ , which may depend on t. On the one hand, we have

(29) \begin{align}\int_{\mathbb{R}} \bigg| \frac{d}{d\mu} a_N\bigg(\frac{s}{2N},\mu\bigg) \bigg| d\mu&= \int_{\mathbb{R}} \bigg|\bigg(\frac{d}{d\mu} A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg)\bigg) \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)} \nonumber \\[5pt] &\qquad+ A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\mu\bigg) \bigg(\frac{d}{d\mu} \overline{A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\mu\bigg)}\bigg) \nonumber \\[5pt] &\qquad- \bigg(\frac{d}{d\mu}A(t,\mu)\bigg) \overline{A(t,\mu)} - A(t,\mu) \bigg(\frac{d}{d\mu} \overline{A(t,\mu)}\bigg)\bigg| d\mu \nonumber \\[5pt] &\leq \bigg\lVert \frac{d}{d\mu} A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\cdot\bigg) \bigg\rVert_{L^2}\bigg\lVert A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\cdot\bigg) \bigg\rVert_{L^2} \\[5pt] &\qquad+ \bigg\lVert A_N^0\bigg(N\bigg(t+\frac{s}{2N}\bigg),\cdot\bigg) \bigg\rVert_{L^2}\bigg\lVert \frac{d}{d\mu} A_N^0\bigg(N\bigg(t-\frac{s}{2N}\bigg),\cdot\bigg) \bigg\rVert_{L^2} \nonumber\\[5pt] &\qquad+ 2 \left\lVert A(t,\cdot) \right\rVert_{L^2}\bigg\lVert \frac{d}{d\mu} A(t,\cdot) \bigg\rVert_{L^2}\leq E,\nonumber \end{align}

where the last inequality follows from (b) and (c). On the other hand, the integration-by-parts formula gives

\begin{align*}\int_{\mathbb{R}} e^{i\mu s} \bigg( \frac{d}{d\mu} a_N\bigg(\frac{s}{2N},\mu\bigg) \bigg) d\mu=\bigg[ e^{i\mu s} a_N\bigg(\frac{s}{2N},\mu\bigg) \bigg]\bigg|_{\mu=-\infty}^\infty - \int_{\mathbb{R}} (is) e^{i\mu s} a_N\bigg(\frac{s}{2N},\mu\bigg) d\mu.\end{align*}

In order to evaluate the limits in the first summand, we first note that (29) implies that $\frac{d}{d\mu}a_N(\frac{s}{2N},\cdot)\in L^1(\mathbb{R})$ . In addition, it holds that $a_N(\frac{s}{2N},\cdot)\in L^1(\mathbb{R})$ , since $A_N^0(t,\cdot),A(t,\cdot)\in L^2(\mathbb{R})$ . Hence, using Lemma 2 we obtain $\lim_{\mu\rightarrow\pm\infty}a_N(\frac{s}{2N},\mu)=0$ . Overall,

(30) \begin{align}\begin{aligned}\int_{\mathbb{R}} e^{i\mu s} \bigg( \frac{d}{d\mu} a_N\bigg(\frac{s}{2N},\mu\bigg) \bigg) d\mu&= (\!-\!is) \widehat{a}_N\bigg(\frac{s}{2N}\bigg).\end{aligned}\end{align}

Combining (29) and (30) we obtain

\begin{align*}\bigg| \widehat{a}_N\bigg(\frac{s}{2N}\bigg) \bigg|\leq \frac{1}{\left| s \right|} \int_{\mathbb{R}} \bigg| \frac{d}{d\mu} a_N\bigg(\frac{s}{2N},\mu\bigg) \bigg| d\mu\leq \frac{E}{\left| s \right|}.\end{align*}

Finally, for $s^*>0$ ,

\begin{align*}(2\pi)^2 \int_\mathbb{R} \left| f_N(t,\lambda)-f(t,\lambda) \right|^2 d\lambda &= \frac{\Sigma_L^2}{2\pi} \int_{s\in\mathbb{R}} \bigg| \widehat{a}_N\bigg(\frac{s}{2N}\bigg) \bigg|^2 ds \\[5pt] &= \frac{\Sigma_L^2}{2\pi} \int_{\left| s \right|\geq s^\ast} \bigg| \widehat{a}_N\bigg(\frac{s}{2N}\bigg) \bigg|^2 ds+\frac{\Sigma_L^2}{2\pi} \int_{\left| s \right|<s^\ast} \bigg| \widehat{a}_N\bigg(\frac{s}{2N}\bigg) \bigg|^2 ds\\[5pt] &\leq \frac{\Sigma_L^2E^2}{\pi s^*}+\frac{\Sigma_L^2}{2\pi} \int_{\left| s \right|<s^\ast} \bigg| \widehat{a}_N\bigg(\frac{s}{2N}\bigg) \bigg|^2 ds.\end{align*}

The second term converges to zero by the dominated convergence theorem, where pointwise convergence follows from (28) and an integrable majorant can be obtained from the boundedness conditions in (b) and the Cauchy–Schwarz inequality, noting that the domain of integration is bounded. Therefore, for all $\varepsilon>0$ and sufficiently large $s^*$ and N, it holds that

\begin{align*} \int_\mathbb{R} \left| f_N(t,\lambda)-f(t,\lambda) \right|^2 d\lambda &\leq \frac{\Sigma_L^2E^2}{\pi s^*}+\frac{\Sigma_L^2}{2\pi} \int_{\left| s \right|<s^\ast} \bigg| \widehat{a}_N\bigg(\frac{s}{2N}\bigg) \bigg|^2 ds \leq \varepsilon,\end{align*}

which concludes the proof.

Corollary 2. Let $Y_N(t)$ be a sequence of time-varying linear state space processes as in Definition 10, such that both IVPs in (24) are uniformly exponentially stable, the conditions (C1)–(C3) from Proposition 9 hold, and $\sup_{t\in\mathbb{R}}\left\lVert B(t) \right\rVert<\infty$ . Then the sequence of Wigner–Ville spectra tends in mean square to the time-varying spectral density.

Proof. It is sufficient to check the conditions (a), (b), and (c) from Theorem 3.

  1. (a) For $s,t\in\mathbb{R}$ we obtain from Plancherel’s theorem

    \begin{align*}&\bigg\lVert A_N^0\bigg(N\bigg(t\pm\frac{s}{2N}\bigg),\cdot\bigg)-A(t,\cdot) \bigg\rVert_{L^2}^2 = 4\pi^2 \bigg\lVert g_N^0\bigg(N\bigg(t\pm\frac{s}{2N}\bigg),\cdot\bigg)-g(t,\cdot) \bigg\rVert_{L^2}^2 \\[5pt] &= 4\pi^2 \int_\mathbb{R} \mathbb{1}_{\{u\leq 0\}} \bigg| B\bigg(t\pm\frac{s}{2N}\bigg)^{\prime} \Psi_{N,t\pm\frac{s}{2N}}^0(0,u) C\bigg(\frac{u}{N}+t\pm\frac{s}{2N}\bigg)\\[5pt] &\qquad\qquad\qquad\qquad- B(t)' \Psi_{t}(0,u) C(t) \bigg|^2 du,\end{align*}
    which tends to zero as $N\rightarrow\infty$ by the dominated convergence theorem. Pointwise convergence is secured by the continuity of A, B, and C in (C1) and the continuity of the solution of an IVP on the input (see the proof of Proposition 9). Since the sequence $Y_N(t)$ is uniformly exponentially stable, we have $\left\lVert \Psi_{N,t}^0(s_1,s_0) \right\rVert \leq \gamma e^{-\lambda(s_1-s_0)}$ for some $\gamma, \lambda >0$ and all $s_1>s_0$ . Therefore, a convergent majorant can be obtained by noting that
    \begin{align*}&\mathbb{1}_{\{u\leq 0\}} \bigg| B\bigg(t\pm\frac{s}{2N}\bigg)' \Psi_{N,t\pm\frac{s}{2N}}^0(0,u) C\bigg(\frac{u}{N}+t\pm\frac{s}{2N}\bigg) \bigg|\\[5pt] &\leq \mathbb{1}_{\{u\leq 0\}} \Bigg(\sup_{t\in\mathbb{R}}\left\lVert B(t) \right\rVert\Bigg) \gamma e^{\lambda u} \Bigg(\sup_{t\in\mathbb{R}}\left\lVert C(t) \right\rVert\Bigg).\end{align*}
  2. (b) For $t\in\mathbb{R}$ and $N\in\mathbb{N}$ it holds that $\left\lVert \Psi_{N,t}^0(s_1,s_0) \right\rVert \leq \gamma e^{-\lambda(s_1-s_0)}$ and $\left\lVert \Psi_t(s_1,s_0) \right\rVert \leq \gamma e^{-\lambda(s_1-s_0)}$ for some $\gamma, \lambda >0$ and all $s_1>s_0$. Thus

    \begin{align*}\Big\lVert A_N^0(Nt,\cdot) \Big\rVert_{L^2}^2 &= 4\pi^2 \Big\lVert g_N^0(Nt,\cdot) \Big\rVert_{L^2}^2\\[5pt] & = 4\pi^2 \int_\mathbb{R} \mathbb{1}_{\{u\leq 0\}} \bigg| B(t)' \Psi_{N,t}^0(0,u) C\bigg(\frac{u}{N}+t\bigg) \bigg|^2 du \\[5pt] &\leq\frac{2\pi^2\gamma^2}{\lambda} \Bigg(\sup_{t\in\mathbb{R}}\left\lVert B(t) \right\rVert\Bigg)^2 \Bigg(\sup_{t\in\mathbb{R}}\left\lVert C(t) \right\rVert\Bigg)^2 <\infty \quad \text{and}\\[5pt] \left\lVert A(t,\cdot) \right\rVert_{L^2}^2&= 4\pi^2 \left\lVert g(t,\cdot) \right\rVert_{L^2}^2 = 4\pi^2 \int_\mathbb{R} \mathbb{1}_{\{u\leq 0\}} \left| B(t)' \Psi_t(0,u) C(t) \right|^2 du \\[5pt] &\leq\frac{2\pi^2\gamma^2}{\lambda} \Bigg(\sup_{t\in\mathbb{R}}\left\lVert B(t) \right\rVert\Bigg)^2\Bigg(\sup_{t\in\mathbb{R}}\left\lVert C(t) \right\rVert\Bigg)^2 <\infty.\end{align*}
  3. (c) Since $Y_N(t)$ is uniformly exponentially stable, [Reference Katznelson20, Theorem 1.6] implies that

    \begin{align*}\frac{d}{d\mu} A_N^0(Nt,\mu) &= \int_\mathbb{R} e^{-i\mu u} (\!-\!iu) g_N^0(Nt,u) du \text{ and} \\[5pt] \frac{d}{d\mu} A(t,\mu) &= \int_\mathbb{R} e^{-i\mu u} (\!-\!iu) g(t,u) du,\end{align*}
    which are again in $L^2(\mathbb{R})$ , since
    \begin{align*}\bigg\lVert \frac{d}{d\mu} A_N^0(Nt,\cdot) \bigg\rVert_{L^2}^2&= 4\pi^2 \Big\lVert (\!-\!i\cdot) g_N^0(Nt,\cdot) \Big\rVert_{L^2}^2 \\[5pt] &= 4\pi^2 \int_\mathbb{R} \mathbb{1}_{\{u\leq 0\}} \bigg| (\!-\!iu) B(t)' \Psi_{N,t}^0(0,u) C\bigg(\frac{u}{N}+t\bigg) \bigg|^2 du \\[5pt] &\leq 4\pi^2 \Bigg(\sup_{t\in\mathbb{R}}\left\lVert B(t) \right\rVert\Bigg)^2 \Bigg(\gamma^2 \int_{-\infty}^0 u^2 e^{2\lambda u} du\Bigg) \Bigg(\sup_{t\in\mathbb{R}}\left\lVert C(t) \right\rVert\Bigg)^2 \\[5pt] &= \frac{\gamma^2\pi^2}{\lambda^3} \Bigg(\sup_{t\in\mathbb{R}}\left\lVert B(t) \right\rVert\Bigg)^2\Bigg(\sup_{t\in\mathbb{R}}\left\lVert C(t) \right\rVert\Bigg)^2<\infty\text{ and analogously} \\[5pt] \bigg\lVert \frac{d}{d\mu} A(t,\cdot) \bigg\rVert_{L^2}^2&\leq \frac{\gamma^2\pi^2}{\lambda^3} \Bigg(\sup_{t\in\mathbb{R}}\left\lVert B(t) \right\rVert\Bigg)^2\Bigg(\sup_{t\in\mathbb{R}}\left\lVert C(t) \right\rVert\Bigg)^2<\infty.\end{align*}
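
The constants in (b) and (c) rest on two elementary integrals. As a quick check, added here for the reader and not part of the original argument, the substitution $v=-2\lambda u$ gives, for any $\lambda>0$ and with $\Gamma$ denoting the gamma function (a symbol not used elsewhere in the text),

\begin{align*}\int_{-\infty}^0 e^{2\lambda u}\, du &= \frac{1}{2\lambda}\int_0^\infty e^{-v}\, dv = \frac{1}{2\lambda}, \\[5pt] \int_{-\infty}^0 u^2 e^{2\lambda u}\, du &= \frac{1}{(2\lambda)^3}\int_0^\infty v^2 e^{-v}\, dv = \frac{\Gamma(3)}{8\lambda^3} = \frac{1}{4\lambda^3}.\end{align*}

Multiplying by $4\pi^2\gamma^2$ yields the factors $2\pi^2\gamma^2/\lambda$ in (b) and $\gamma^2\pi^2/\lambda^3$ in (c).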

Acknowledgements

The third author was supported by the scholarship program of the Hanns-Seidel Foundation, funded by the Federal Ministry of Education and Research.

Funding information

There are no funding bodies to thank in relation to the creation of this article.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

References

Applebaum, D. (2009). Lévy Processes and Stochastic Calculus, 2nd edn. Cambridge University Press. doi:10.1017/CBO9780511809781
Baake, M. and Schlägel, U. (2011). The Peano–Baker series. Proc. Steklov Inst. Math. 275, 155–159. doi:10.1134/S0081543811080098
Bardet, J. M., Doukhan, P. and Wintenberger, O. (2020). Contrast estimation of general locally stationary processes using coupling. Preprint. Available at https://arxiv.org/abs/2005.07397.
Benth, F. E., Klüppelberg, C., Müller, G. and Vos, L. (2014). Futures pricing in electricity markets based on stable CARMA spot models. Energy Econom. 44, 392–406. doi:10.1016/j.eneco.2014.03.020
Benmahammed, K. (1987). Model reduction of uniformly controllable continuous time varying linear systems. In Proc. 1987 American Control Conference, Institute of Electrical and Electronics Engineers, Piscataway, NJ, pp. 1500–1503.
Bernstein, D. S. (2009). Matrix Mathematics: Theory, Facts, and Formulas, 2nd edn. Princeton University Press. doi:10.1515/9781400833344
Brockett, R. W. (1970). Finite Dimensional Linear Systems. John Wiley, New York.
Brockwell, P. J. and Davis, R. A. (1996). Time Series: Theory and Methods. Springer, New York.
Bruscato, A. and Toloi, C. M. C. (2004). Spectral analysis of non-stationary processes using the Fourier transform. Brazilian J. Prob. Statist. 18, 69–102.
Chandrasekharan, K. (1989). Classical Fourier Transforms. Springer, Berlin. doi:10.1007/978-3-642-74029-9
Dahlhaus, R. (1996). On the Kullback–Leibler information divergence of locally stationary processes. Stoch. Process. Appl. 62, 139–168. doi:10.1016/0304-4149(95)00090-9
Dahlhaus, R. (1997). Fitting time series models to nonstationary processes. Ann. Statist. 25, 1–37. doi:10.1214/aos/1034276620
Dahlhaus, R. (2000). A likelihood approximation for locally stationary processes. Ann. Statist. 28, 1762–1794. doi:10.1214/aos/1015957480
Dahlhaus, R. (2012). Locally stationary processes. In Time Series Analysis: Methods and Applications (Handbook Statist. 30), North-Holland, Amsterdam, pp. 351–413. doi:10.1016/B978-0-444-53858-1.00013-2
Dahlhaus, R. and Polonik, W. (2009). Empirical spectral processes for locally stationary time series. Bernoulli 15, 1–39. doi:10.3150/08-BEJ137
Dahlhaus, R., Richter, S. and Wu, W. B. (2019). Towards a general theory for nonlinear locally stationary processes. Bernoulli 25, 1013–1044. doi:10.3150/17-BEJ1011
Dahlhaus, R. and Subba Rao, T. (2006). Statistical inference for time-varying ARCH processes. Ann. Statist. 34, 1075–1114. doi:10.1214/009053606000000227
Flandrin, P. and Martin, W. (1984). A general class of estimators for the Wigner–Ville spectrum of non-stationary processes. In Analysis and Optimization of Systems: Proc. Sixth International Conference on Analysis and Optimization of Systems, Nice, June 19–22, 1984, Part 1 (Lecture Notes Control Inf. Sci. 62), Springer, Berlin, pp. 15–23. doi:10.1007/BFb0004941
Horn, R. A. and Johnson, C. R. (1990). Matrix Analysis. Cambridge University Press.
Katznelson, Y. (2004). An Introduction to Harmonic Analysis, 3rd edn. Cambridge University Press. doi:10.1017/CBO9781139165372
Koo, B. and Linton, O. (2012). Estimation of semiparametric locally stationary diffusion models. J. Econometrics 170, 210–233. doi:10.1016/j.jeconom.2012.05.003
Krylov, N. V. (2002). Introduction to the Theory of Random Processes. American Mathematical Society, Providence, RI. doi:10.1090/gsm/043
Künsch, H. R. (1995). A note on causal solutions for locally stationary AR-processes. Preprint. Available at ftp://ess.r-project.org/users/hkuensch/localstat-ar.pdf.
Kurisu, D. (2022). Nonparametric regression for locally stationary random fields under stochastic sampling design. Bernoulli 28, 1250–1275. doi:10.3150/21-BEJ1385
Larsson, E. K. and Mossberg, M. (2004). Fast and approximative estimation of continuous-time stochastic signals from discrete-time data. In 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, Institute of Electrical and Electronics Engineers, Piscataway, NJ, pp. 529–532.
Lukes, D. N. (1982). Differential Equations: Classical to Controlled. Academic Press, London.
Mammen, E. (2007). Nonparametric estimation of locally stationary Hawkes processes. Preprint. Available at https://arxiv.org/abs/1707.04469.
Marquardt, T. and Stelzer, R. (2007). Multivariate CARMA processes. Stoch. Process. Appl. 117, 96–120. doi:10.1016/j.spa.2006.05.014
Martin, W. and Flandrin, P. (1985). Wigner–Ville spectral analysis of nonstationary processes. IEEE Trans. Acoust. Speech Signal Process. 33, 1461–1470. doi:10.1109/TASSP.1985.1164760
Matsuada, Y. and Yajima, Y. (2018). Locally stationary spatio-temporal processes. Japanese J. Statist. Data Sci. 1, 41–57. doi:10.1007/s42081-018-0003-9
Priestley, M. B. (1994). Spectral Analysis and Time Series. Academic Press, London.
Ramar, K. and Ramaswami, B. (1971). Transformation of time-variable multi-input systems to a canonical form. IEEE Trans. Automatic Control 16, 371–374. doi:10.1109/TAC.1971.1099740
Ramaswami, B. and Ramar, K. (1969). On the transformation of time-variable systems to the phase-variable canonical form. IEEE Trans. Automatic Control 14, 417–419. doi:10.1109/TAC.1969.1099215
Roueff, F. and von Sachs, R. (2019). Time-frequency analysis of locally stationary Hawkes processes. Bernoulli 25, 1355–1385. doi:10.3150/18-BEJ1023
Roueff, F., von Sachs, R. and Sansonnet, L. (2016). Locally stationary Hawkes processes. Stoch. Process. Appl. 126, 1710–1743. doi:10.1016/j.spa.2015.12.003
Royden, H. L. (1988). Real Analysis, 3rd edn. Macmillan, New York.
Rudin, W. (1991). Functional Analysis, 2nd edn. McGraw-Hill, New York.
Rugh, W. J. (1996). Linear System Theory, 2nd edn. Prentice-Hall, Englewood Cliffs, NJ.
Sato, K.-I. (2013). Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press.
Sato, K.-I. (2014). Stochastic integrals with respect to Lévy processes and infinitely divisible distributions. Sugaku Expositions 27, 19–42.
Schlemm, E. and Stelzer, R. (2012). Multivariate CARMA processes, continuous-time state space models and complete regularity of the innovations of the sampled processes. Bernoulli 18, 46–63. doi:10.3150/10-BEJ329
Silverman, L. (1966). Transformation of time-variable systems to canonical (phase-variable) form. IEEE Trans. Automatic Control 11, 300–303. doi:10.1109/TAC.1966.1098312
Surulescu, N. M. (2010). On some classes of continuous-time series models and their use in financial economics. Doctoral Thesis, Ruprecht-Karls-Universität Heidelberg.
Vogt, M. (2012). Nonparametric regression for locally stationary time series. Ann. Statist. 40, 2601–2633. doi:10.1214/12-AOS1043
Vogt, M. and Dette, H. (2015). Detecting gradual changes in locally stationary processes. Ann. Statist. 43, 713–740. doi:10.1214/14-AOS1297
Walter, W. (2000). Gewöhnliche Differentialgleichungen, 7th edn. Springer, Berlin. doi:10.1007/978-3-642-57240-1
Wu, M.-A. and Sherif, A. (1976). On the commutative class of linear time-varying systems. Internat. J. Control. 23, 433–444. doi:10.1080/00207177608922171