Graph-based semi-supervised learning methods combine the graph structure and labeled data to classify unlabeled data. In this work, we study the effect of a noisy oracle on classification. In particular, we derive the maximum a posteriori (MAP) estimator for clustering a degree corrected stochastic block model when a noisy oracle reveals a fraction of the labels. We then propose an algorithm derived from a continuous relaxation of the MAP, and we establish its consistency. Numerical experiments show that our approach achieves promising performance on synthetic and real data sets, even in the case of very noisy labeled data.
We investigate some aspects of the problem of the estimation of birth distributions (BDs) in multi-type Galton–Watson trees (MGWs) with unobserved types. More precisely, we consider two-type MGWs called spinal-structured trees. This kind of tree is characterized by a spine of special individuals whose BD $\nu$ is different from the other individuals in the tree (called normal, and whose BD is denoted by $\mu$). In this work, we show that even in such a very structured two-type population, our ability to distinguish the two types and estimate $\mu$ and $\nu$ is constrained by a trade-off between the growth-rate of the population and the similarity of $\mu$ and $\nu$. Indeed, if the growth-rate is too large, large deviation events are likely to be observed in the sampling of the normal individuals, preventing us from distinguishing them from special ones. Roughly speaking, our approach succeeds if $r < \mathfrak{D}(\mu,\nu)$, where $r$ is the exponential growth-rate of the population and $\mathfrak{D}$ is a divergence measuring the dissimilarity between $\mu$ and $\nu$.
We study the community detection problem on a Gaussian mixture model, in which vertices are divided into $k\geq 2$ distinct communities. The major difference in our model is that the intensities for Gaussian perturbations are different for different entries in the observation matrix, and we do not assume that every community has the same number of vertices. We explicitly find the necessary and sufficient conditions for the exact recovery of the maximum likelihood estimation, which can give a sharp phase transition for the exact recovery even though the Gaussian perturbations are not identically distributed; see Section 7. Applications include the community detection on hypergraphs.
In this paper we study the drift parameter estimation for reflected stochastic linear differential equations of a large signal. We discuss the consistency and asymptotic distribution of the trajectory fitting estimator (TFE).
Let $(Z_n)_{n\geq0}$ be a supercritical Galton–Watson process. Consider the Lotka–Nagaev estimator for the offspring mean. In this paper we establish self-normalized Cramér-type moderate deviations and Berry–Esseen bounds for the Lotka–Nagaev estimator. The results are believed to be optimal or near-optimal.
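As a rough illustration of the estimator named above: for a Galton–Watson process $(Z_n)$, the Lotka–Nagaev estimator of the offspring mean is the ratio $Z_{n+1}/Z_n$. The sketch below simulates a process and evaluates that ratio. The offspring law, the seed, the initial size, and the function names are assumptions chosen for illustration and are not taken from the paper.

```python
import random


def simulate_galton_watson(values, probs, generations, rng, z0=1):
    """Return the generation sizes [Z_0, ..., Z_generations]."""
    sizes = [z0]
    for _ in range(generations):
        # Each of the current individuals draws its offspring count independently.
        offspring = rng.choices(values, weights=probs, k=sizes[-1])
        sizes.append(sum(offspring))
    return sizes


def lotka_nagaev(sizes):
    """Lotka-Nagaev estimate Z_{n+1} / Z_n from the last two generations."""
    z_n, z_next = sizes[-2], sizes[-1]
    if z_n == 0:
        raise ValueError("Z_n = 0: the estimator is undefined after extinction")
    return z_next / z_n


# Hypothetical supercritical offspring law with mean 0.5*1 + 0.3*2 + 0.2*3 = 1.7.
# Every individual has at least one child, so this toy process cannot die out.
rng = random.Random(0)
sizes = simulate_galton_watson([1, 2, 3], [0.5, 0.3, 0.2], generations=10, rng=rng, z0=5)
estimate = lotka_nagaev(sizes)
print(f"estimate of the offspring mean: {estimate:.3f}")
```

The estimate uses only the last two generation sizes, which is why its fluctuations shrink as the population grows.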
Bifurcating Markov chains (BMCs) are Markov chains indexed by a full binary tree representing the evolution of a trait along a population where each individual has two children. We provide a central limit theorem for additive functionals of BMCs under $L^2$-ergodic conditions with three different regimes. This completes the pointwise approach developed in a previous work. As an application, we study the elementary case of a symmetric bifurcating autoregressive process, which justifies the nontrivial hypothesis considered on the kernel transition of the BMCs. We illustrate in this example the phase transition observed in the fluctuations.
where $0 \leq \theta_j \leq 1$, $S_n=\sum_{j=1}^n X_j$ and $\mathcal{F}_n=\sigma\{X_1,\ldots,X_n\}$. The aim of this paper is to establish the strong law of large numbers, which extends some known results, and to prove the moderate deviation principle for the correlated Bernoulli model.
We investigate the large deviation properties of the maximum likelihood estimators for the Ornstein-Uhlenbeck process with shift. We propose a new approach to establish large deviation principles which allows us, via a suitable transformation, to circumvent the classical nonsteepness problem. We estimate simultaneously the drift and shift parameters. On the one hand, we prove a large deviation principle for the maximum likelihood estimates of the drift and shift parameters. Surprisingly, we find that the drift estimator shares the same large deviation principle as the estimator previously established for the Ornstein-Uhlenbeck process without shift. Sharp large deviation principles are also provided. On the other hand, we show that the maximum likelihood estimator of the shift parameter satisfies a large deviation principle with a very unusual implicit rate function.
Self-exciting point processes (SEPPs), or Hawkes processes, have found applications in a wide range of fields, such as epidemiology, seismology, neuroscience, engineering, and more recently financial econometrics and social interactions. In the traditional SEPP models, the baseline intensity is assumed to be a constant. This has restricted the application of SEPPs to situations where there is clearly a self-exciting phenomenon, but a constant baseline intensity is inappropriate. In this paper, to model point processes with varying baseline intensity, we introduce SEPP models with time-varying background intensities (SEPPVB, for short). We show that SEPPVB models are competitive with autoregressive conditional SEPP models (Engle and Russell 1998) for modeling ultra-high frequency data. We also develop asymptotic theory for maximum likelihood estimation based inference of parametric SEPP models, including SEPPVB. We illustrate applications to ultra-high frequency financial data analysis, and we compare performance with the autoregressive conditional duration models.
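To make the idea of a time-varying background intensity concrete, here is a minimal sketch of a self-exciting conditional intensity of the form $\lambda(t) = \mu(t) + \sum_{t_i < t} \alpha e^{-\beta (t - t_i)}$. The abstract does not specify the excitation kernel or the baseline, so the exponential kernel, the periodic baseline, the parameter values, and all names below are assumptions made purely for illustration.

```python
import math


def seppvb_intensity(t, event_times, baseline, alpha, beta):
    """Conditional intensity mu(t) + sum over past events of alpha * exp(-beta * (t - t_i))."""
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in event_times if t_i < t
    )
    return baseline(t) + excitation


def periodic_baseline(t):
    """Hypothetical time-varying background rate; it stays between 0.5 and 1.5."""
    return 1.0 + 0.5 * math.cos(2.0 * math.pi * t)


events = [0.10, 0.35, 0.40]
for t in (0.05, 0.41, 1.50):
    lam = seppvb_intensity(t, events, periodic_baseline, alpha=0.8, beta=5.0)
    print(f"t = {t:.2f}: intensity = {lam:.4f}")
```

Before the first event the intensity equals the baseline alone; shortly after a cluster of events the excitation term dominates and then decays back towards $\mu(t)$.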
A recursive scheme is proposed for identifying a single input single output (SISO) Wiener-Hammerstein system, which consists of two linear dynamic subsystems and a sandwiched nonparametric static nonlinearity. The first linear block is assumed to be a finite impulse response (FIR) filter and the second an infinite impulse response (IIR) filter. By letting the input be a sequence of mutually independent Gaussian random variables, the recursive estimates for coefficients of the two linear blocks and the value of the static nonlinear function at any fixed given point are proven to converge to the true values, with probability one as the data size tends to infinity. The static nonlinearity is identified in a nonparametric way and no structural information is directly used. A numerical example is presented that illustrates the theoretical results.
In this paper we study asymptotic consistency of law invariant convex risk measures and the corresponding risk averse stochastic programming problems for independent, identically distributed data. Under mild regularity conditions, we prove a law of large numbers and epiconvergence of the corresponding statistical estimators. This can be applied in a straightforward way to establish convergence with probability 1 of sample-based estimators of risk averse stochastic programming problems.
In this paper we study the asymptotic properties of the canonical plugin estimates for law-invariant coherent risk measures. Under rather mild conditions not relying on the explicit representation of the risk measure under consideration, we first prove a central limit theorem for independent and identically distributed data, and then extend it to the case of weakly dependent data. Finally, a number of illustrative examples are presented.
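A plugin estimate of a law-invariant risk measure is obtained by evaluating the risk measure on the empirical distribution of the sample. The sketch below does this for the Average Value-at-Risk, chosen here only as one familiar coherent risk measure; the results in the abstract are not specific to it. It uses the representation $\mathrm{AVaR}_\alpha(X) = \min_t \{ t + E[(X-t)^+]/(1-\alpha) \}$, and the function name and sample values are hypothetical.

```python
def empirical_avar(samples, alpha):
    """Plug-in Average Value-at-Risk at level alpha from the empirical distribution.

    For an empirical distribution the objective t + E[(X - t)^+] / (1 - alpha)
    is piecewise linear and convex in t with kinks at the sample points,
    so its minimum is attained at one of the sample points.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if not samples:
        raise ValueError("at least one sample is required")
    n = len(samples)

    def objective(t):
        mean_excess = sum(max(x - t, 0.0) for x in samples) / n
        return t + mean_excess / (1.0 - alpha)

    return min(objective(t) for t in samples)


losses = [1.0, 2.0, 3.0, 4.0]
print(empirical_avar(losses, 0.5))  # average of the two largest losses
```

This direct search costs quadratic time in the sample size; it is written for clarity rather than speed.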
The important task of evaluating the impact of random parameters on the output of stochastic ordinary differential equations (SODE) can be computationally very demanding, in particular for problems with a high-dimensional parameter space. In this work we consider this problem in some detail and demonstrate that by combining several techniques one can dramatically reduce the overall cost without impacting the predictive accuracy of the outputs of interest. We discuss how the combination of ANOVA expansions, different sparse grid techniques, and the total sensitivity index (TSI) as a pre-selective mechanism enables the modeling of problems with hundreds of parameters. We demonstrate the accuracy and efficiency of this approach on a number of challenging test cases drawn from engineering and science.
Results on asymptotic normality for the maximum likelihood estimate in hidden Markov models are extended in two directions. The stationarity assumption is relaxed, which allows for a covariate process influencing the hidden Markov process. Furthermore, a class of estimating equations is considered instead of the maximum likelihood estimate. The basic ingredients are mixing properties of the process and a general central limit theorem for weakly dependent variables.
We consider a critical discrete-time branching process with generation dependent immigration. For the case in which the mean number of immigrating individuals tends to ∞ with the generation number, we prove functional limit theorems for centered and normalized processes. The limiting processes are deterministically time-changed Wiener, with three different covariance functions depending on the behavior of the mean and variance of the number of immigrants. As an application, we prove that the conditional least-squares estimator of the offspring mean is asymptotically normal, which demonstrates an alternative case of normality of the estimator for the process with nondegenerate offspring distribution. The norming factor is where α(n) denotes the mean number of immigrating individuals in the nth generation.
We consider an epidemic model where the spread of the epidemic can be described by a discrete-time Galton-Watson branching process. Between times n and n + 1, any infected individual is detected with unknown probability π and the numbers of these detected individuals are the only observations we have. Detected individuals produce a reduced number of offspring in the time interval of detection, and no offspring at all thereafter. If only the generation sizes of a Galton-Watson process are observed, it is known that one can only estimate the first two moments of the offspring distribution consistently on the explosion set of the process (and, apart from some lattice parameters, no parameters that are not determined by those moments). Somewhat surprisingly, in our context, where we observe a binomially distributed subset of each generation, we are able to estimate three functions of the parameters consistently. In concrete situations, this often enables us to estimate π consistently, as well as the mean number of offspring. We apply the estimators to data for a real epidemic of classical swine fever.
We investigate a sequence of Galton-Watson branching processes with immigration, where the offspring mean tends to its critical value 1 and the offspring variance tends to 0. It is shown that the fluctuation limit is an Ornstein-Uhlenbeck-type process. As a consequence, in contrast to the case in which the offspring variance tends to a positive limit, it transpires that the conditional least-squares estimator of the offspring mean is asymptotically normal. The norming factor is $n^{3/2}$, in contrast to both the subcritical case, in which it is $n^{1/2}$, and the nearly critical case with positive limiting offspring variance, in which it is $n$.
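For orientation, a conditional least-squares estimator of the offspring mean $m$ can be sketched as follows. The sketch assumes the immigration mean is known and uses the textbook form obtained by minimising $\sum_k (Z_k - m Z_{k-1} - \lambda)^2$ over $m$; the exact setup in the paper may differ, and the function name and data are hypothetical.

```python
def cls_offspring_mean(sizes, immigration_mean):
    """Conditional least-squares estimate of the offspring mean.

    Minimising sum_k (Z_k - m * Z_{k-1} - immigration_mean)^2 over m gives
    sum_k Z_{k-1} * (Z_k - immigration_mean) / sum_k Z_{k-1}^2.
    """
    pairs = list(zip(sizes, sizes[1:]))
    numerator = sum(prev * (cur - immigration_mean) for prev, cur in pairs)
    denominator = sum(prev * prev for prev, _ in pairs)
    if denominator == 0:
        raise ValueError("all predecessor generations are empty")
    return numerator / denominator


# Noise-free toy data following Z_k = 2 * Z_{k-1} + 1, so the fit is exact.
print(cls_offspring_mean([2, 5, 11], immigration_mean=1))
```

With real data the residuals $Z_k - m Z_{k-1} - \lambda$ are random, and the rate at which the estimator concentrates is what the norming factors in the abstract describe.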