1. Introduction
In studies of extreme, or rare, features of a point process configuration, Poisson limits and extreme value distributions naturally appear; see, for instance, [Reference Bobrowski, Schulte and Yogeshwaran7], [Reference Otto20], or [Reference Owada21]. For geometric structures without significant local dependence, Poisson limits are not unexpected: high scores essentially arrive as isolated points, which makes the analysis relatively simple. In more complicated cases, one can try to remove clumps of large scores in the language of Aldous [Reference Aldous1]. However, if we simply substitute a clump (or a cluster) of large scores by a single score, then we lose important information about the geometric properties of the limiting extremal objects. Here we derive and present a mathematically rigorous solution to this problem.
In particular, we present techniques suitable for analysing extremes of marked point processes in Euclidean space. Intuitively, these processes can be viewed as a sequence of scores (i.e. random values) recorded at random locations. We concentrate on points whose scores are high, with the aim of understanding the appearance of other such points nearby; that is, we allow extreme scores to arise in clusters. Such a clustering phenomenon has been well described by Aldous [Reference Aldous1]; see [Reference Chenavier and Robert9] for a recent analysis of extremes in random tessellations.
Our key concept is that of the tail configuration, which is closely related to the tail process introduced in [Reference Basrak and Segers4] and since then widely used in time series [Reference Dombry, Hashorva and Soulier13], [Reference Janßen15], [Reference Kulik and Soulier17]. For stationary time series, this process arises after conditioning upon a high score at time zero. When adapting this idea to a spatial setting, we need to work with Palm versions of marked point processes, which necessarily have a point located at the origin, and then condition upon the event that the score at 0 is high. Passing to the limit involves scaling down the scores, but often also requires scaling of the locations.
The resulting tail configuration provides a local description of the clustering phenomenon related to high scores. Looking at high scores globally in a big domain shows that they build islands (clusters of points) in space. Our main result, Theorem 2, is a limit theorem showing convergence of such a global cluster process to a marked Poisson process in space, whose marks are themselves point processes, considered equivalent up to translation. In this way we factor out the positions of the clusters and explicitly describe the limiting distribution and the extremal index, which carries the information about the mean cluster size. Our result can be compared to its classical counterpart for discrete-time stationary sequences, as presented in [Reference Basrak, Planinić and Soulier5, Theorem 3.6] or [Reference Kulik and Soulier17, Theorem 6.1.4].
Although our theory applies much more generally, we illustrate our results on two examples with scores obtained from a background Poisson point process. In the first one, the scores are simply the reciprocals of the distance between a point and its kth nearest neighbor. As a special case of this example, for
$k=1$
, we describe the limiting structure of the process of points with large inradii in the Poisson–Voronoi tessellation, studied in [Reference Chenavier and Robert9] in dimension two. In our second example, the points are initially marked by independent, identically distributed (i.i.d.) random values, but the actual score at a point t, say, depends also on the (weighted) values at the points in a possibly random neighborhood of t. The example can be seen as a generalization of the moving maxima model from time series analysis. Here, however, we are particularly interested in seeing how large values propagate in such a random network.
The paper is organized as follows. Section 2.1 sets up our basic definitions. Section 3 introduces the tail configuration and its spectral counterpart. Its central result is a certain invariance property of the tail configuration (Theorem 1), which is related to the time-change formula from time series; see [Reference Basrak and Segers4], and cf. [Reference Last18] and [Reference Planinić24] for a discussion of the connection to standard Palm theory. Section 4 contains the main result, which provides a Poisson approximation of extremal clusters. The construction involves the standard idea of splitting the space into rectangular blocks. The three main assumptions impose conditions on block sizes and on the dependence within and between extremal blocks; see Section 4.4. In Section 4.5 we discuss the representation of the extremal index and the distribution of the typical cluster. The key concept here is that of an anchoring function, which is motivated by a similar concept for random fields indexed by
$\mathbb{Z}^d$
; see [Reference Basrak and Planinić3]. The proof of the main result is postponed to Section 6. Section 5.1 treats in detail the case of scores derived from the neighboring structure of a stationary Poisson process in Euclidean space. In Section 5.2 we deal with the moving maxima model. In both examples the main steps consist of determining the tail configuration and then checking the appropriate dependence conditions.
Consider first a very simple motivating example.
Example 1. Let
$ P = \sum \delta_{t}$
denote a homogeneous Poisson process on
$\mathbb{R}^d$
independently marked by i.i.d. points
$(h_t,\varepsilon_t,\zeta_t)$
in
$\mathbb{R}^d \times \{0,1\} \times\mathbb{R}_+$
. In particular,
$\varepsilon_t$
are i.i.d. Bernoulli random variables, with success probability p, say. Assume, for simplicity, that all three components of the mark are independent, that the
$h_t$
s have a symmetric continuous distribution around the origin with bounded support and the
$\zeta_t$
s have a regularly varying distribution with index
$\alpha>0$
, i.e.
$\mathbb{P}(\zeta > u) = L(u)\, u^{-\alpha}$
for some slowly varying function L. Consider now the following simple Poisson cluster process:
$X \;:\!=\; \sum_{t\in P} \big(\delta_{(t,\zeta_t)} + \varepsilon_t\, \delta_{(t+h_t,\zeta_t)}\big).$
Thus, at each point t of the background Poisson process P, in X we observe a score
$\zeta_t$
, which is then with probability p repeated at a shifted location
$t+h_t$
. Suppose now that we observe the process X on a hypercube
$[0,\tau]^d$
for
$\tau\to\infty$
. Note that we can always find a function
$ a_\tau$
so that
$\tau^d \mathbb{P}(\zeta>a_\tau \varepsilon) \to\varepsilon^{-\alpha}$
for any
$\varepsilon >0$
Unsurprisingly, rescaling the locations and scores of the points of X leads to a nontrivial limit. Indeed, this is immediate for
$p=0$
, when there are no clusters in X and, therefore (in the vague topology as explained below),
$\sum_{(t,s)\in X\colon t\in[0,\tau]^d} \delta_{(t/\tau,\, s/a_\tau)} \overset{\mathrm{d}}{\longrightarrow} \sum_{i\geq 1} \delta_{(U_i,\Gamma_i^{-1/\alpha})},$
where
$\{(U_i,\Gamma_i),i\geq1\}$
are points of the Poisson process on
$[0,1]^d\times \mathbb{R}_+$
with the intensity measure being the product of the Lebesgue measure on
$[0,1]^d$
and the Lebesgue measure on
$\mathbb{R}_+$
. If
$\smash{p=\frac{1}{2}}$
, a similar result holds; however, the large scores in X come in clusters of size 1 or 2, so that the limit becomes a compound Poisson process. Without further adjustments, the clusters would collapse to a single location in the limit. Below we explain how to prove, under relatively general assumptions, a version of this limiting result that also preserves the shape of the clusters in the limit.
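To make the clustering mechanism of this example concrete, here is a minimal simulation sketch (Python/NumPy). The uniform choice for the displacement distribution of the $h_t$, the parameter values, and all function names are ours, not taken from the text; they merely satisfy the stated assumptions (symmetric, bounded support shifts; regularly varying scores):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_cluster_process(tau, p, alpha, d=2, intensity=1.0):
    """Simulate the Poisson cluster process X of Example 1 on [0, tau]^d.

    Each background Poisson point t carries a Pareto(alpha) score zeta_t,
    which is repeated with probability p at the shifted location t + h_t,
    where h_t is uniform on a small cube (one choice of a symmetric,
    bounded-support displacement distribution)."""
    n = rng.poisson(intensity * tau ** d)
    t = rng.uniform(0.0, tau, size=(n, d))      # background Poisson points
    zeta = rng.pareto(alpha, size=n) + 1.0      # regularly varying scores
    eps = rng.random(n) < p                     # Bernoulli(p) repetition flags
    h = rng.uniform(-0.5, 0.5, size=(n, d))     # symmetric bounded shifts
    pts = np.vstack([t, (t + h)[eps]])
    scores = np.concatenate([zeta, zeta[eps]])
    return pts, scores

# High scores arrive in clusters of size 1 or 2: every score value exceeding
# a high threshold appears either once (the point was not duplicated) or
# twice (the point and its shifted copy).
pts, scores = simulate_cluster_process(tau=50.0, p=0.5, alpha=1.0)
u = np.quantile(scores, 0.99)
exceed = scores[scores > u]
_, counts = np.unique(exceed, return_counts=True)
assert set(counts) <= {1, 2}
```

For $p=\frac{1}{2}$, roughly half of the large scores are duplicated at a nearby location, which is exactly the size-1-or-2 cluster phenomenon described above.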
2. Random counting measures and marked point processes
2.1. Basic definitions
Consider the space
$\mathbb{E}\;:\!=\;\mathbb{R}^d\times(0,\infty)$
with the standard product topology and the Borel
$\sigma$
algebra
$\mathcal{B}(\mathbb{E})$
. A point in
$\mathbb{E}$
is written as (t, s), where
$t\in\mathbb{R}^d$
is said to be the position and
$s>0$
is the mark or score (at t).
A Borel measure
$\mu$
on
$\mathbb{E}$
is said to be a counting measure if it takes values in
$\{0,1,2,\ldots\}\cup\{\infty\}$
on
$\mathcal{B}(\mathbb{E})$
. Denote by
$\mu'$
the projection of
$\mu$
on
$\mathbb{R}^d$
, that is,
$\mu'(B)=\mu(B\times(0,\infty))$
for each Borel B in
$\mathbb{R}^d$
. We call a counting measure simple if its projection on
$\mathbb{R}^d$
is simple, that is,
$\mu(\{t\}\times(0,\infty)) \leq 1$
for all
$t\in\mathbb{R}^d$
. We write
$(t,s)\in\mu$
and
$\mu(t)={\mu(\{t\})=}s$
if
$\mu(\{(t,s)\})=1$
; for convenience, if
$\mu(\{t\}\times (0,\infty))=0$
, we sometimes write
$\mu(t)=0$
. Each simple counting measure
$\mu$
is uniquely represented by the set of its atoms. To emphasize this, we write
$\mu = \sum_{i=1}^{k} \delta_{(t_i,s_i)},$
or equivalently,
$\mu=\{(t_i,s_i)\colon i=1,\dots,k\}$
, where k is the total number of atoms in
$\mu$
(which may be infinite). For a Borel function
$f\colon\mathbb{E}\to\mathbb{R}$
, denote
$\mu(f) \;:\!=\; \int_{\mathbb{E}} f\,\mathrm{d}\mu = \sum_{(t,s)\in\mu} f(t,s).$
In the following, it is essential to single out families of sets where counting measures take finite values. Introduce subfamilies
$\mathcal{B}_{11}, \mathcal{B}_{10}, \mathcal{B}_{01} \subseteq\mathcal{B}(\mathbb{E})$
:
-
(i)
$A\in \mathcal{B}_{11}$ if
$A\subseteq B\times (\varepsilon,\infty)$ for some
$B\subseteq \mathbb{R}^d$ bounded and
$\varepsilon>0$ ;
-
(ii)
$A\in \mathcal{B}_{10}$ if
$A\subseteq B\times (0,\infty)$ for some
$B\subseteq \mathbb{R}^d$ bounded;
-
(iii)
$A\in \mathcal{B}_{01}$ if
$A\subseteq \mathbb{R}^d\times(\varepsilon,\infty)$ for some
$\varepsilon>0$ .
For consistency, we sometimes write
$\mathcal{B}_{00}\;:\!=\;\mathcal{B}(\mathbb{E})$
. These families provide examples of a boundedness or bornology on
$\mathbb{E}$
(see [Reference Basrak and Planinić2]), and clearly satisfy
$\mathcal{B}_{11}=\mathcal{B}_{10}\cap\mathcal{B}_{01}$
.
Let
$\mathcal{N}_{ij}$
denote the family of simple counting measures with finite values on
$\mathcal{B}_{ij}$
,
$i,j\in\{0,1\}$
. For convenience, in what follows denote
$\mathcal{N}\;:\!=\;\mathcal{N}_{11}$
. Note that
$\mathcal{N}_{00} \subset \mathcal{N}_{01}\cap \mathcal{N}_{10}$
and
$\mathcal{N}_{01}\cup \mathcal{N}_{10} \subset \mathcal{N}_{11} =\mathcal{N}$
. The families
$\mathcal{N}_{ij}$
are equipped with the vague topology determined by the choice of the boundedness; see [Reference Basrak and Planinić2].
Definition 1. Counting measures
$(\mu_n)_{n\in \mathbb{N}}$
from
$\mathcal{N}_{ij}$
with
$i,j\in \{0,1\}$
are said to converge to
$\mu\in \mathcal{N}_{ij}$
as
$n\to\infty$
in
$\mathcal{B}_{ij}$
or
$\mathcal{B}_{ij}$
vaguely (notation
$\mu_n\overset{\mathrm{v}}{\longrightarrow} \mu$
) if
$\mu_n(f)\to\mu(f)$
as
$n\to\infty$
for all continuous bounded functions
$f\;:\;\mathbb{E}\to\mathbb{R}$
whose support is in some
$B\in\mathcal{B}_{ij}$
.
The notion of
$\mathcal{B}_{ij}$
-vague convergence on
$\mathcal{N}_{ij}$
can be seen as convergence with respect to the smallest topology on
$\mathcal{N}_{ij}$
that makes the mappings
$\mu\mapsto\mu(f)$
continuous for all continuous bounded functions
$f\colon\mathbb{E}\to\mathbb{R}$
whose support is in some
$B\in\mathcal{B}_{ij}$
; call this topology the
$\mathcal{B}_{ij}$
-vague topology. Since the extension of this topology to the larger space of all Borel measures on
$\mathbb{E}$
that are finite on elements of
$\mathcal{B}_{ij}$
is known to be Polish (see [Reference Kallenberg16, Theorem 4.2] and [Reference Basrak and Planinić2, Theorem 3.1]), the
$\mathcal{B}_{ij}$
-vague topology on
$\mathcal{N}_{ij}$
is separable and metrizable. We have used the phrase
$\mathcal{B}_{ij}$
vague instead of simply vague, since we consider
$\mathcal{N}_{ij}$
with respect to the (weaker)
$\mathcal{B}_{i'j'}$
-vague topology whenever
$\mathcal{N}_{ij}\subseteq\mathcal{N}_{i'j'}$
, which is equivalent to
$\mathcal{B}_{i'j'}\subseteq\mathcal{B}_{ij}$
.
In what follows, the choice of a particular vague topology will depend on what kind of points in
$\mathbb{E}$
we want to control. For example, in the definition of the tail configuration below, we use the
$\mathcal{B}_{11}$
-vague topology since we want to control extremal scores located in a bounded neighborhood of a typical extremal score that is assumed to be at the origin.
Define the shift operators
$\varphi_{z}$
,
$z\in \mathbb{R}^d$
, on
$\mathbb{E}$
by letting
$ \varphi_{z}(t,s) \;:\!=\; (t-z,s)$
, and let
$\varphi_{z}\mu \;:\!=\; \mu\circ\varphi_{z}^{-1} = \sum \delta_{(t-z,s)} \quad \text{for } \mu = \sum \delta_{(t,s)}.$
Thus, if
$\mu$
at z has score s then s becomes the score of
$\varphi_{z} \mu$
at 0, so that the shift applies only to positions, leaving the scores unchanged. Since the families
$\mathcal{B}_{ij}$
are invariant under shifts, the families
$\mathcal{N}_{ij}$
are also invariant. Observe that the mapping
$(z,\mu)\mapsto \varphi_{z}\mu$
from
$\mathbb{R}^d\times \mathcal{N}_{ij}$
to
$\mathcal{N}_{ij}$
is continuous if
$\mathcal{N}_{ij}$
is equipped with the vague topology generated by any
$\mathcal{B}_{i'j'}\subseteq\mathcal{B}_{ij}$
.
The sets
$\mathcal{N}_{ij}$
,
$i,j=0,1$
, are equipped with the Borel
$\sigma$
algebra generated by the maps
$B\mapsto\mu(B)$
for all
$B\in\mathcal{B}(\mathbb{E})$
; this coincides with the Borel
$\sigma$
algebra generated by
$\mathcal{B}_{ij}$
-vaguely open sets. A random counting measure is a random element X in
$\mathcal{N}$
. It is called stationary if
$\varphi_{z}X$
and X coincide in distribution for all
$z\in\mathbb{R}^d$
.
If a random counting measure X takes values from a smaller family
$\mathcal{N}_{10}$
, then X is called a marked point process on
$\mathbb{R}^d$
(with marks in
$(0,\infty)$
). Then, for all bounded
$A\subset\mathbb{R}^d$
, we have
$X(A\times(0,\infty))<\infty$
almost surely (a.s.), that is, the number of
$t\in A$
such that
$(t,s)\in X$
for some
$s>0$
, is a.s. finite. We assume throughout that this number has a finite mean. If X is stationary, the expected value of
$X(A\times(0,\infty))$
is proportional to the Lebesgue measure of
$A\in\mathcal{B}(\mathbb{R}^d)$
. The coefficient of proportionality
$\lambda$
is said to be the intensity of X. Later on we usually assume that
$\lambda=1$
.
Each stationary marked point process X on
$\mathbb{R}^d$
of finite intensity admits its Palm version
$\tilde{X}$
, which is a marked point process on
$\mathbb{R}^d$
, satisfying the refined Campbell theorem
$\mathbb{E}\bigg[\sum_{(t,s)\in X} h(t,\varphi_{t}X)\bigg] = \lambda \int_{\mathbb{R}^d} \mathbb{E}\,h(t,\tilde{X})\,\mathrm{d}t$ (2.1)
for all measurable
$h\colon\mathbb{R}^d\times \mathcal{N}_{10}\to\mathbb{R}_+$
. The Palm version
$\tilde{X}$
has the invariance property
$\mathbb{E}\bigg[\sum_{(t,s)\in \tilde{X}} h(t,\varphi_{t}\tilde{X})\bigg] = \mathbb{E}\bigg[\sum_{(t,s)\in \tilde{X}} h({-}t,\tilde{X})\bigg]$ (2.2)
for all measurable
$h\colon\mathbb{R}^d\times \mathcal{N}_{10} \to\mathbb{R}_+$
; see [Reference Daley and Vere-Jones12, Theorem 13.2.VIII]. Note that
$\tilde{X}$
a.s. contains the point
$(0,\xi)$
; the random variable
$\xi\;:\!=\;\tilde{X}(0)$
is said to be the score of the Palm version at the origin.
Let
$\smash{\overset{\mathrm{w}}{\longrightarrow}}$
denote weak convergence of probability measures and
$\smash{\overset{\mathrm{d}}{\longrightarrow}}$
the corresponding convergence in distribution. Distributional convergence of random counting measures in
$\mathcal{N}$
is understood with respect to a particular version of the vague topology, and so relies on the choice of the corresponding boundedness. It is well known that
${X_n\overset{\mathrm{d}}{\longrightarrow} X}$
in
$\mathcal{B}_{ij}$
if and only if the Laplace functionals of
$X_n$
,
$L_f(X_n) \;:\!=\; \mathbb{E} \exp\{-X_n(f)\},$
converge to
$L_f(X)$
as
$n\to\infty$
for all continuous
$f\colon\mathbb{E}\to [0,\infty)$
with support in
$\mathcal{B}_{ij}$
; see [Reference Kallenberg16, Theorem 4.11].
2.2. A general construction of scores
In our main examples we deal with marked point processes X derived from a marked Poisson point process using the following general construction. Let P be an independently marked stationary Poisson process in
$\mathbb{R}^d$
, where
$\mathbb{R}^d$
is the space of locations and the marks take values from
$(0,\infty)$
. Note that trivial amendments make it possible to consider the marks taking values in a general Polish space, and allow this construction to be applied to a general marked point process. The intensity measure of P is the product of the Lebesgue measure on
$\mathbb{R}^d$
(possibly scaled by a constant) and a probability measure m on
$(0,\infty)$
.
Consider a measurable function
$\psi\colon \mathbb{R}^d \times{\mathcal{N}_{10}} \to (0,\infty)$
such that
$\psi(t-z,\varphi_z \mu) =\psi(t,\mu)$
for all
$z\in\mathbb{R}^d$
. In the following
$\psi$
is called a scoring function. For
$ \mu \;:\!=\; \textstyle\sum \delta_{(t,z)} \in{\mathcal{N}_{10}}$
, denote
$\Psi(\mu) \;:\!=\; \sum_{(t,z)\in\mu} \delta_{(t, \psi(t,\mu))}.$
This defines a mapping from
$\mathcal{N}_{10}$
to
$\mathcal{N}_{10}$
. While this mapping does not change locations of points, it equips each point
$t\in\mu'$
with a new score
$s=\psi(t,\mu)$
. The same construction can be clearly applied to a Poisson process in
$\mathbb{R}^d$
without marks; this fits into the above framework by letting all the marks be equal to 0, say.
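As an illustration of this construction, the following sketch implements one particular scoring function: the reciprocal of the distance to the kth nearest neighbor, as in the first example of the introduction. The code and its names are ours; marks are omitted, as allowed above. The final assertion checks numerically that scoring a shifted configuration yields the same scores, in line with the shift invariance $\psi(t-z,\varphi_z \mu) =\psi(t,\mu)$:

```python
import numpy as np

def psi_knn(t, locations, k=1):
    """Score at location t: the reciprocal of the distance from t to its
    kth nearest neighbour among the other points of the configuration."""
    d = np.linalg.norm(locations - t, axis=1)
    d = np.sort(d[d > 0])        # drop the point t itself (distance 0)
    return 1.0 / d[k - 1]

def Psi(locations, k=1):
    """Re-score every point of the configuration: positions are kept,
    each point t gets the new score s = psi(t, mu)."""
    return [(tuple(t), psi_knn(t, locations, k)) for t in locations]

rng = np.random.default_rng(1)
pts = rng.uniform(0, 10, size=(200, 2))

# Shifting the whole configuration by z changes positions but not scores,
# since psi depends only on relative distances.
z = np.array([3.0, -1.0])
s0 = [s for _, s in Psi(pts)]
s1 = [s for _, s in Psi(pts - z)]
assert np.allclose(s0, s1)
```

A point with a large score of this kind is one whose kth nearest neighbor is unusually close, which is the mechanism behind the large-inradius example mentioned in the introduction.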
The shift-invariance property of
$\psi$
implies that
$\Psi(\varphi_{z}\mu) = \varphi_{z}\Psi(\mu) \quad \text{for all } z\in\mathbb{R}^d.$ (2.3)
By the Poisson assumption, the Palm version of P is given by
$\tilde{P} = P + \delta_{(0,\zeta)},$
where
$\zeta$
has distribution m and is independent of P.
Lemma 1. The Palm version of
$\Psi(P)$
is given by
$\Psi(\tilde{P})$
.
Proof. The Palm version
$\tilde{P}$
satisfies

Therefore, (2.3) yields

By the definition of Palm measure, the left-hand side is

so that the Palm version of
$\Psi(P)$
is indeed
$\Psi(\tilde{P})$
.
3. Tail configuration
Define a family of scaling operators
$T_{v,u}\colon\mathbb{E}\to\mathbb{E}$
,
$u,v>0$
, by
$T_{v,u}(t,s) \;:\!=\; (t/v, s/u).$
For every
$\mu\in\mathcal{N}$
, define its scaled version
$T_{v,u}\mu$
by letting
$(T_{v,u}\mu)(A)\;:\!=\;\mu(T_{v^{-1},u^{-1}}A)$
for all Borel A. Equivalently,
$T_{v,u}\mu = \textstyle\sum \delta_{(t/v,s/u)}$
if
$\mu =\textstyle\sum \delta_{(t,s)}$
, meaning that the atoms of
$T_{v,u}\mu$
are obtained by applying the transformation
$T_{v,u}$
to the atoms of
$\mu$
.
In the following we mostly work with counting measures scaled by
${T_{r(u),u}}$
for
$u>0$
, where a function
$r\colon(0,\infty)\to(0,\infty)$
is fixed and regularly varying at infinity, i.e.
$r(u) = u^{\beta} l(u)$ (3.2)
for some
$\beta\in \mathbb{R}$
and a slowly varying function l. We refer to r and
$\beta$
as the scaling function and scaling index, respectively. Note that
$\beta=0$
and
$r(u)\equiv 1$
are allowed.
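The defining property of regular variation, namely that $r(yu)/r(u)\to y^{\beta}$ locally uniformly in y, is used repeatedly below (for instance, in the proof of Proposition 2). A quick numerical check, with the concrete choice $r(u)=u^{1/2}\log u$ (ours; any slowly varying factor would do):

```python
import math

beta = 0.5

def r(u):
    # One concrete regularly varying function with index beta:
    # r(u) = u**beta * l(u), where l(u) = log(u) is slowly varying.
    return u ** beta * math.log(u)

# The ratio r(y*u)/r(u) approaches y**beta as u grows, for each fixed y > 0;
# the slowly varying factor log contributes nothing in the limit.
for y in (0.5, 2.0, 10.0):
    ratio = r(y * 1e12) / r(1e12)
    assert abs(ratio / y ** beta - 1) < 0.1
```

The slowly varying factor decays out of the ratio only logarithmically, which is why the check uses a very large u.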
Definition 2. Fix a function r that is regularly varying at infinity. Let
$X\in\mathcal{N}_{10}$
be a stationary marked point process on
$\mathbb{E}$
with Palm version
$\tilde{X}$
and the score at the origin being
$\xi$
. If there exists a random counting measure
$Y\in\mathcal{N}$
such that
$Y(\{0\}\times (1,\infty))=1$
a.s. and
$\mathbb{P}\big({T_{r(u),u}}\tilde{X} \in\cdot \mid \xi>u\big) \overset{\mathrm{w}}{\longrightarrow} \mathbb{P}(Y\in\cdot\,) \quad \text{as } u\to\infty$ (3.3)
with respect to the
$\mathcal{B}_{11}$
-vague topology, then Y is called the tail configuration of X (with respect to the scaling function r).
Note that the tail configuration Y is assumed to be simple and it necessarily contains the point
$(0,\eta)$
with
$\mathbb{P}(\eta > 1)=1$
. We call the random variable
$\eta$
the tail score at the origin. While X and
$\tilde{X}$
are marked point processes and, thus, belong to
$\mathcal{N}_{10}$
, the tail configuration Y in general takes values in
$\mathcal{N}$
, which is a larger family.
Example 2. (Continuation of Example 1.) For the stationary point process X of our initial example, it is straightforward to see that its Palm version satisfies
$\smash{\tilde{X}\stackrel{d}{=} X + C_0}$
, where
$C_0$
is the Palm version of the typical cluster of X, independent of X; see [Reference Chiu, Stoyan, Kendall and Mecke10, Section 5.3]. More precisely,
$\smash{C_0 \stackrel{d}{=} \delta_{(0,\xi)} + I \delta_{(h_0,\xi)}}$
. Here I represents a Bernoulli random variable independent of the random pair
$(h_0,\xi)$
, which has the same distribution as any of the pairs
$(h_t,\zeta_t)$
. However, owing to the size biasing phenomenon, I has a different distribution from
$\varepsilon_t$
: in the case
$p=\frac{1}{2}$
, for instance,
$\mathbb{P}(I=1) = \smash{\frac{2}{3}}$
. Observe that, owing to the regular variation assumption, the distributions
$\mathbb{P}(\xi /u \in\cdot \mid \xi>u)$
converge to a Pareto distribution with parameter
$\alpha>0$
. Moreover, the independence of X and
$C_0$
implies that
$\smash{T_{1,u} X \overset{\mathrm{d}}{\longrightarrow} 0}$
even if we condition on
$\xi>u$
; therefore, for
$r(u) \equiv 1$
,
$\mathbb{P}\big({T_{1,u}}\tilde{X} \in\cdot \mid \xi>u\big) \overset{\mathrm{w}}{\longrightarrow} \mathbb{P}(Y\in\cdot\,),$
where
$Y \;:\!=\; \delta_{(0,\eta)} + I\,\delta_{(h_0,\eta)},$
and where
$h_0$
is independent of
$\eta$
, which has the Pareto distribution with index
$\alpha$
.
It is convenient to denote
$\mathcal{A}_y \;:\!=\; \{\mu\in\mathcal{N} \colon \mu(\{0\}\times(y,\infty)) \geq 1\}, \quad y>0.$ (3.4)
Observe that the tail configuration Y a.s. belongs to
$\mathcal{A}_1$
. For
$c>0$
,
$x\in \mathbb{R}^d$
, by
$B_c(x)$
we denote the open Euclidean ball around x of radius c, and set
$B_c\;:\!=\;B_c(0)$
.
Proposition 1. If (3.3) holds then the score at the origin
$\xi$
has a regularly varying tail, that is,
$\lim_{u\to\infty} \frac{\mathbb{P}(\xi > uy)}{\mathbb{P}(\xi>u)} = y^{-\alpha} \quad \text{for all } y>0,$ (3.5)
for some
$\alpha>0$
, and the tail score at the origin
$\eta$
is
$\mathrm{Pareto}(\alpha)$
distributed, that is,
$\mathbb{P}(\eta > y) = y^{-\alpha}, \quad y \geq 1.$ (3.6)
Proof. Assume that
$\mu,\mu_1,\mu_2,\dots $
are counting measures in
$\mathcal{A}_1$
such that
$\smash{\mu_n \overset{\mathrm{v}}{\longrightarrow} \mu}$
in
$\mathcal{B}_{11}$
. Since
$\mu \in \mathcal{A}_1$
, we can always find a bounded set
$B\in \mathcal{B}_{11}$
of the form
$B= B_\varepsilon \times(1+\varepsilon,\infty)$
such that
$\mu(B) = 1$
(i.e.
$(0,\mu(0))$
is the only point of
$\mu$
in B) and
$\mu (\partial B) = 0$
. Convergence
$\smash{\mu_n\overset{\mathrm{v}}{\longrightarrow} \mu}$
implies (see, e.g., [Reference Basrak and Planinić2, Proposition 2.8]) that
$\mu_n(0) \to\mu(0)$
. In other words, the score at the origin is a continuous function on
$\mathcal{N} \cap \mathcal{A}_1$
. By a continuous mapping argument,
$\xi/u = {T_{r(u),u}} \tilde{X} (0)$
, conditionally on
$\xi >u$
, converges in distribution to
$\eta = Y(0)$
as
$u \to\infty$
. More precisely,
$\lim_{u\to\infty}\mathbb{P}\big(\xi/u > y \mid \xi>u\big) = \mathbb{P}(\eta>y)$
for all
$y> 0$
that are continuity points for
$\eta$
. Standard arguments now yield that (3.5) holds for some
$\alpha>0$
(derived by analysing the tail of
$\xi$
), and (3.6) follows immediately.
The constant
$\alpha>0$
from (3.6) will be called the tail index of X,
$\xi$
, and
$\eta$
. Observe that the scaling index of the function r from (3.2) is not related to
$\alpha$
. Furthermore, the point process
$\Theta \;:\!=\; T_{\eta^{\beta},\,\eta}\, Y$
in
$\mathcal{N}$
is called the spectral tail configuration of X. By definition,
$\Theta$
a.s. contains the point (0,1).
Proposition 2. The spectral tail configuration is independent of
$\eta$
and satisfies

Proof. Recall the set
$\mathcal{A}_y$
from (3.4) and note that
$\mathcal{A}_1$
consists of all
$\mu\in\mathcal{N}$
such that
$\mu(\{0\}\times(1,\infty))\geq1$
. Consider the family of mappings
$H_u$
,
$u>0$
, defined by
$H_u(\mu)\;:\!=\;(s,r(su)/r(u), \mu)$
, where
$s=\mu(0)$
is the score at the origin of
$\mu\in\mathcal{A}_1$
.
Let
$\smash{\mu_u\overset{\mathrm{v}}{\longrightarrow} \mu}$
(in
$\mathcal{B}_{11}$
) as
$u\to\infty$
for some
$\mu_u$
and
$\mu$
from
$\mathcal{A}_1$
. Denote
$s_u\;:\!=\;\mu_u(0)$
and
$s\;:\!=\;\mu(0)$
. Then
$s_u\to s$
as
$u\to\infty$
. Since r is regularly varying with index
$\beta$
, the convergence
$r(yu)/r(u)\to y^{\beta}$
as
$u\to\infty$
holds locally uniformly in y on
$(0,\infty)$
; see [Reference Resnick26, Proposition 2.4], so that
$r(s_u u)/r(u)\to s^{\beta}$
as
$u\to\infty$
. Therefore,
$H_u(\mu_u)\to (s, s^{\beta}, \mu)$
as
$u\to\infty$
. The extended continuous mapping theorem applied to (3.3) (see [Reference Billingsley6, Theorem 5.5]) yields

on
$(1,\infty)\times (0,\infty)\times \mathcal{N}$
. Another application of the continuous mapping theorem yields

on
$(1,\infty)\times \mathcal{N}$
. This yields (3.8) and, together with (3.5),

for all
$y\ge 1$
and all Borel subsets
$B\subseteq \mathcal{N}$
such that
$\mathbb{P}(\Theta\in \partial B)=0$
. This implies that
$\eta$
and
$\Theta$
are independent, since the class of all such Bs (denoted by
$\mathcal{S}$
) is closed under finite intersections and generates the Borel
$\sigma$
algebra on
$\mathcal{N}$
. The latter fact follows, since the vague topology on
$\mathcal{N}$
is separable and metrizable, so we can represent every open subset of
$\mathcal{N}$
as a countable union of open balls that are elements of
$\mathcal{S}$
.
To conclude this section, we show that the invariance property (2.2) of the Palm distribution
$\tilde{X}$
induces a similar property of the tail configuration Y that, as in [Reference Planinić24, Section 2], could be called exceedance-stationarity; cf. also [Reference Last18].
Theorem 1. For every measurable
$h\colon\mathbb{R}^d\times\mathcal{N} \to [0,\infty)$
,
$\mathbb{E}\bigg[\sum_{(t,s)\in Y\colon s>1} h(t,\varphi_{t}Y)\bigg] = \mathbb{E}\bigg[\sum_{(t,s)\in Y\colon s>1} h({-}t,Y)\bigg].$ (3.9)
Proof. Since
${T_{r(a_{u}),u}}$
scales the scores with
$u^{-1}$
, a score in
${T_{r(a_{u}),u}}\tilde{X}$
exceeding 1 corresponds to a score in
$\tilde{X}$
exceeding u. Thus, by (2.2),

We aim to show that both sides (if normalized by
$\mathbb{P}(\xi>u)$
) converge to the corresponding sides of (3.9). However, a direct application of (3.3) is not possible since the functionals
$\mu \mapsto \sum_{(t,s)\in\mu\colon s>1} h(t,\varphi_{t}\mu) \quad\text{and}\quad \mu \mapsto \sum_{(t,s)\in\mu\colon s>1} h({-}t,\mu)$
for
$\mu\in \mathcal{A}_1$
(see (3.4)), are not bounded, even if h is bounded.
Fix a bounded continuous function
$h\colon\mathbb{R}^d\times\mathcal{N}\to[0,\infty)$
such that, for some
$c>0$
,
$h(t,\mu)=0$
for all
$t\notin B_c$
. Furthermore, fix
$k\in\mathbb{N}$
,
$a>2c$
, and consider the maps
$H_1,H_2\colon\mathcal{A}_{1}\to [0,\infty)$
given by

Both maps are bounded by
$k\sup h$
since
$H_1(\mu)=H_2(\mu)=0$
whenever
$\mu(B_c\times (1,\infty))>k$
. Moreover, we claim that
$H_1$
and
$H_2$
are continuous on all
$\mu \in \mathcal{A}_1$
such that
-
(i)
$\|t-x\|\neq a$ for all
$(t,s), (x,v) \in \mu$ , and
-
(ii)
$s\neq 1$ for all
$(t,s)\in \mu$ .
Denote by
$C_a$
the set of all such
$\mu$
s. If
$a>2c$
, the indicators
$\textbf{1}_{\{(\varphi_{t}\mu)(B_a\times(1,\infty))\leq k\}}$
,
$t\in B_c$
depend only on the points of
$\mu$
in
$B_{2a}\times (1,\infty)$
. Since for each
$\mu \in C_a$
there exists an
$\varepsilon>0$
such that
$\mu(\partial(B_{2a+\varepsilon} \times (1,\infty)))=0$
, and since
$B_{2a+\varepsilon}\times (1,\infty)$
is in
$\mathcal{B}_{11}$
, properties of
$\mathcal{B}_{11}$
-vague convergence (see [Reference Basrak and Planinić2, Proposition 2.8]) imply that the maps
$H_1$
and
$H_2$
are continuous on
$C_a$
.
Since Y has at most countably many points (like any other random element of
$\mathcal{N}$
), it is easy to show that, for all but at most countably many
$a>0$
, it a.s. holds that
$\|t-x\|\neq a$
for all
$(t,s), (x,v) \in Y$
. Furthermore, since
$Y \overset{\mathrm{d}}{=} T_{\eta^{-\beta},\,\eta^{-1}}\,\Theta,$
where
$\eta$
has a nonatomic distribution and is independent of the spectral tail configuration
$\Theta$
, it immediately follows that with probability zero Y contains (t, s) with
$s=1$
, equivalently,
$\Theta$
contains
$(t,\theta)$
with
$\theta=\eta^{-1}$
. Thus, we can find
$a>2c$
such that
$\mathbb{P}(Y\in C_a)=1$
.
Since
$\mathbb{E} [H_1({T_{r(a_{u}),u}}\tilde{X}) \mid \xi >u]= \mathbb{E}[H_2({T_{r(a_{u}),u}}\tilde{X}) \mid \xi >u]$
for all
$u>0$
(argue exactly as in the beginning of the proof), applying (3.3) we obtain
$\mathbb{E} H_1(Y)=\mathbb{E} H_2(Y)$
. Letting
$k\to\infty$
yields (3.9) for all nonnegative continuous bounded functions h that vanish for
$t\notin B_c$
. We can further remove the latter restriction by letting
$c\to\infty$
. We claim that this ensures that (3.9) holds for all nonnegative measurable functions.
Observe first that, when viewed as functions of h, both sides of (3.9) define a Borel measure on
$\mathbb{R}^d \times\mathcal{N}$
—denote them by
$\nu_1$
and
$\nu_2$
Since these two measures coincide on all nonnegative bounded continuous functions, they coincide on the
$\pi$
system of all open subsets of
$\mathbb{R}^d \times\mathcal{N}$
. Measures
$\nu_1$
and
$\nu_2$
are in general unbounded, but they take finite values on the sets of the form
$B_c\times C_{a,k},$
where
$C_{a,k}=\{\mu \in \mathcal{N} \colon\mu(B_a \times (1,\infty)) \leq k\}$
for
$c>0, a>2c, k\in \mathbb{N}$
. While the set
$C_{a,k}$
is not open in the
$\mathcal{B}_{11}$
-vague topology, its subset

is open, which implies that

for all
$c>0, a>2c, k\in \mathbb{N}$
. As already explained above, for every fixed
$c>0$
, we can find
$a=a(c)>2c$
such that
$\mathbb{P}(Y(\partial(B_a\times (1,\infty)))=0)=1$
. With this choice of a,
$\nu_1$
and
$\nu_2$
put zero mass on
$C_{a,k}\setminus C_{a,k}'$
for all
$k\in \mathbb{N}$
so, in particular,

for all
$k\in \mathbb{N}$
. Since
$\mu(B_c\times (1,\infty))<\infty$
for all
$\mu \in \mathcal{N}$
, we have
$ C_{a,k} \uparrow \mathcal{N}$
as
$k\to\infty$
. Thus, by letting
$k\to\infty$
and then
$c\to\infty$
, we find that
$\nu_1$
and
$\nu_2$
coincide on
$(\mathbb{R}^d\times \mathcal{N},\mathcal{B}(\mathbb{R}^d\times \mathcal{N}))$
, which proves the claim.
Remark 1. The exceedance-stationarity property (3.9) and the polar decomposition from Proposition 2 yield

for every measurable
$h\colon\mathbb{R}^d\times\mathcal{N} \to [0,\infty)$
; see [Reference Planinić24, Remark 2.11]. This property of the spectral tail configuration can be seen as the analogue of the time-change formula known to characterize the class of all spectral tail processes (and thus tail processes) of regularly varying time series; see [Reference Dombry, Hashorva and Soulier13], [Reference Janßen15].
4. Poisson approximation for extremal clusters
In what follows assume that X is a stationary marked point process on
$\mathbb{R}^d$
of unit intensity with marks (scores) in
$(0,\infty)$
, which admits a tail configuration Y in the sense of Definition 2. The main goal of this section is to describe the limiting behavior of scores of X in
$[0,\tau]^d$
that exceed a suitably chosen high threshold, as
$\tau$
and the threshold size tend to infinity.
4.1. Extremal blocks
Let
$(a_\tau)_{\tau>0}$
be a family of positive real numbers chosen such that
$\mathbb{E}\big[X([0,\tau]^d\times(a_\tau,\infty))\big] = \tau^d\, \mathbb{P}(\xi>a_\tau) \to 1 \quad \text{as } \tau\to\infty,$ (4.1)
where the first equality follows from the refined Campbell theorem (2.1). By (3.5),
$\tau^d\, \mathbb{P}(\xi>a_\tau\varepsilon) \to \varepsilon^{-\alpha} \quad \text{as } \tau\to\infty \text{ for all } \varepsilon>0.$ (4.2)
Let
$(b_\tau)_{\tau>0}$
be a family of positive real numbers such that
$b_\tau/\tau\to 0$
as
$\tau\to\infty$
. Divide the hypercube
$[0,\tau]^d$
into blocks of side length
$b_\tau$
defined as
$J_{\tau,\boldsymbol{i}} \;:\!=\; \big((i_1-1)b_\tau, i_1 b_\tau\big] \times\cdots\times \big((i_d-1)b_\tau, i_d b_\tau\big]$
for
$\boldsymbol{i}=(i_1,\dots,i_d) \in I_{\tau}\;:\!=\; \{1,\dots, k_\tau\}^d$
, where
$k_\tau \;:\!=\; \lfloor \tau/b_\tau\rfloor$
. Technically speaking, we are dividing the hypercube
$[0,k_{\tau}b_{\tau}]^d$
. However, in applications this edge effect is easily seen to be negligible. For every
$\boldsymbol{i} \in I_{\tau}$
, define
$X_{\tau,\boldsymbol{i}} \;:\!=\; X\big(\cdot \cap (J_{\tau,\boldsymbol{i}}\times(0,\infty))\big),$
which is the restriction of X to
$J_{\tau,\boldsymbol{i}}\times(0,\infty)$
.
For fixed
$\tau$
and
$\varepsilon$
, think of clusters of extremal scores of X as blocks
$X_{\tau,\boldsymbol{i}}$
that contain at least one score exceeding
$a_\tau \varepsilon$
. For every
$\boldsymbol{i}\in I_\tau$
, by (4.2) and since
$b_{\tau}/\tau \to 0$
,
$\mathbb{P}\big(X_{\tau,\boldsymbol{i}}(\mathbb{R}^d\times(a_\tau\varepsilon,\infty)) \geq 1\big) \leq \mathbb{E}\big[X_{\tau,\boldsymbol{i}}(\mathbb{R}^d\times(a_\tau\varepsilon,\infty))\big] = b_\tau^d\, \mathbb{P}(\xi>a_\tau\varepsilon) \to 0,$
where we bound the probability by the expectation and then use the refined Campbell theorem (2.1) and, finally, (4.1).
4.2. Space for extremal blocks
Recall that X is a random element of the space
$\mathcal{N}_{10}$
so that the blocks
$X_{\tau, \boldsymbol{i}}$
can be considered as elements of
$\mathcal{N}_{01}$
that consists of simple counting measures on
$\mathbb{E}$
with finite values on
$\mathbb{R}^d\times(\varepsilon,\infty)$
for all
$\varepsilon>0$
. Recall further that
$\mathcal{N}_{01}\subset\mathcal{N}=\mathcal{N}_{11}$
and that, on
$\mathcal{N}_{01}$
, the
$\mathcal{B}_{01}$
-vague topology is stronger than the
$\mathcal{B}_{11}$
-vague topology. We now define a metric
${\textsf{m}}$
, generating the
$\mathcal{B}_{01}$
-vague topology on
$\mathcal{N}_{01}$
.
Let
$\mu,\nu\in \mathcal{N}_{01}$
be such that
$\mu(\mathbb{E}),\nu(\mathbb{E})<\infty$
(i.e.
$\mu,\nu\in\mathcal{N}_{00}$
). If
$\mu(\mathbb{E})\neq \nu(\mathbb{E})$
set
${\textsf{m}}_0(\mu, \nu)=1$
, and if
$\mu(\mathbb{E})=\nu(\mathbb{E})=k\in\mathbb{N}_0$
and
$\mu=\textstyle\sum_{i=1}^k \delta_{(t_i,s_i)}$
,
$\nu=\textstyle\sum_{i=1}^k \delta_{(t_i',s_i')}$
, define

where the minimum is taken over all permutations
$\Pi$
of
$\{1,\dots,k\}$
. Note that
${\textsf{m}}_0$
is a metric generating the weak (that is,
$\mathcal{B}_{00}$
-vague) topology on
$\mathcal{N}_{00}$
; see [Reference Schuhmacher and Xia27, Proposition 2.3]. While the authors of [Reference Schuhmacher and Xia27] assume that the ground space is compact, an easy argument justifies the claim for the space
$\mathbb{E}=\mathbb{R}^d\times(0,\infty)$
. Observe also that
${\textsf{m}}_0$
is by construction bounded by 1 and shift invariant, that is,
${\textsf{m}}_0(\varphi_{y}\mu,\varphi_{y}\nu)={\textsf{m}}_0(\mu,\nu)$
for all
$y\in \mathbb{R}^d$
and
$\mu,\nu\in \mathcal{N}_{00}$
.
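To make the optimal-matching construction concrete, the following Python sketch implements a metric with the stated properties (equal to 1 when the total masses differ, bounded by 1, shift invariant). Since the display defining ${\textsf{m}}_0$ is not reproduced here, the averaging-of-truncated-distances convention below, in the spirit of [Reference Schuhmacher and Xia27], is our assumption, as is the list-of-pairs representation of counting measures.

```python
import itertools
import math

def m0(mu, nu):
    """Sketch of an optimal-matching metric m_0 on finite counting measures.

    mu and nu are lists of (position, score) pairs with positions as tuples.
    If the total masses differ, the distance is 1; otherwise it is the
    minimum, over all matchings (permutations), of the average truncated
    distance between matched points. Truncation keeps the metric bounded
    by 1, and using only position differences keeps it shift invariant."""
    if len(mu) != len(nu):
        return 1.0
    if not mu:
        return 0.0

    def dist(p, q):
        (t, s), (t2, s2) = p, q
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(t, t2)) + (s - s2) ** 2)
        return min(d, 1.0)

    return min(
        sum(dist(p, q) for p, q in zip(mu, perm)) / len(mu)
        for perm in itertools.permutations(nu)
    )
```

Shift invariance, ${\textsf{m}}_0(\varphi_{y}\mu,\varphi_{y}\nu)={\textsf{m}}_0(\mu,\nu)$, holds because only differences of positions enter the matching cost.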
For general
$\mu,\nu \in \mathcal{N}_{01}$
, set

where
$\mu^{1/u}$
(and similarly
$\nu^{1/u}$
) is the restriction of
$\mu$
on
$\mathbb{R}^d\times (1/u,\infty)$
. It can be shown that this is indeed a metric on
$\mathcal{N}_{01}$
that is bounded by 1, is shift invariant, and that it generates the
$\mathcal{B}_{01}$
-vague topology. The latter claim follows since
$\mu_n$
converges to
$\mu$
in
$\mathcal{N}_{01}$
with respect to this topology if and only if
$\mu_n^{1/u}$
converges weakly to
$\mu^{1/u}$
for Lebesgue almost all
$u\in(0,\infty)$
; see [Reference Daley and Vere-Jones11, Proposition A2.6.II] and [Reference Morariu-Patrichi19].
We actually work with a quotient space of
$\mathcal{N}_{01}$
. For
$\mu,\nu\in \mathcal{N}_{01}$
, set
$\mu\sim \nu$
if
$\varphi_{y}\mu=\nu$
for some
$y\in \mathbb{R}^d$
, and denote by
$\tilde{\mathcal{N}}_{01}$
the quotient space of shift-equivalent counting measures in
$\mathcal{N}_{01}$
. Denote by

the equivalence class of
$\mu\in \mathcal{N}_{01}$
. Define

Lemma 2. The function
$\tilde{{\textsf{m}}}$
is a metric on
$\tilde{\mathcal{N}}_{01}$
, and
$(\tilde{\mathcal{N}}_{01},\tilde{{\textsf{m}}})$
is a separable metric space. Moreover,
$\tilde{{\textsf{m}}}([\mu_n], [\mu])\to 0$
as
$n\to\infty$
(denoted by
$[\mu_n]\overset{\mathrm{v}}{\longrightarrow}[\mu]$
) for
$\mu_n,\mu\in \mathcal{N}_{01}$
if and only if there exist
$y_n\in \mathbb{R}^d$
,
$n\in \mathbb{N}$
, such that
$\varphi_{y_n}\mu_n\overset{\mathrm{v}}{\longrightarrow} \mu$
in
$\mathcal{N}_{01}$
.
Proof. Since
${\textsf{m}}$
is shift invariant,

for all
$\mu, \nu\in \mathcal{N}_{01}$
. It is now easy to show that (4.6) implies that
$\tilde{{\textsf{m}}}$
is a pseudo-metric on
$\tilde{\mathcal{N}}_{01}$
, and that
$(\tilde{\mathcal{N}}_{01},\tilde{{\textsf{m}}})$
is separable since
$(\mathcal{N}_{01}, {\textsf{m}})$
is separable. This follows by a direct application of [Reference Planinić23, Lemma 2.5.1]; the only nontrivial step is to show that
$\tilde{{\textsf{m}}}$
satisfies the triangle inequality.
To show that
$\tilde{{\textsf{m}}}$
is actually a metric, assume that
$\tilde{{\textsf{m}}}([\mu], [\nu])=0$
for some
$\mu,\nu\in\mathcal{N}_{01}$
. By (4.6), there exists a sequence
$(z_n)_{n\in\mathbb{N}} \subseteq \mathbb{R}^d$
such that
${\textsf{m}}(\varphi_{z_n}\mu, \nu)\to 0$
. If
$\nu$
is the null measure then it follows easily that
$\mu$
is also the null measure, so that
$[\mu]=[\nu]$
. If
$\nu$
is not the null measure, for
$\varepsilon>0$
small enough, we have
$1\leq \nu(\mathbb{R}^d \times (\varepsilon,\infty))<\infty$
. Since
$\varphi_{z_n}$
only translates positions, the sequence
$(z_n)_n$
must be bounded. Indeed, otherwise, for infinitely many n, we have
${\textsf{m}}_0((\varphi_{z_n}\mu)^{1/u}, \nu^{1/u})=1$
for
$1/u\leq \varepsilon$
and, in particular,
${\textsf{m}}(\varphi_{z_n}\mu,\nu) \geq \textrm{e}^{-1/\varepsilon} >0$
. Thus, there exists a subsequence
$(z_{n_k})_{k\in \mathbb{N}}$
such that
$z_{n_k} \to z \in \mathbb{R}^d$
as
$k\to\infty$
. Since
$\lim_{k\to\infty}{\textsf{m}}(\varphi_{z_{n_k}}\mu,\varphi_{z}\mu)= 0$
, and since
${\textsf{m}}$
is a metric, we conclude that
$\varphi_{z}\mu = \nu$
, i.e.
$[\mu]=[\nu]$
.
Next, recall the scaling operator
${T_{r(a_{u}),u}}$
defined at (3.1). Observe that

for all
$u>0$
,
$y\in \mathbb{R}^d$
and
$\mu\in \mathcal{N}_{01}$
. Since
$[{T_{r(a_{u}),u}}\mu]=[{T_{r(a_{u}),u}}\nu]$
whenever
$\mu\sim\nu$
, the scaling operator
${T_{r(a_{u}),u}} [\mu]\;:\!=\;[{T_{r(a_{u}),u}}\mu]$
is well defined on
$\tilde{\mathcal{N}}_{01}$
. For
$\mu\in\mathcal{N}_{01}$
, denote by

the maximal score of
$\mu$
, which is necessarily finite. Observe that the maximal score is shift invariant and, thus, it is well defined on
$\tilde{\mathcal{N}}_{01}$
.
Finally, for every
$\mu\in \mathcal{N}_{01}$
such that
$M(\mu)>0$
(that is,
$\mu$
is not the null measure), define

where the minimum is taken with respect to the lexicographic order on
$\mathbb{R}^d$
. Thus,
$A^{\mathrm{fm}}(\mu)$
is the position where the maximal score of
$\mu$
is attained; if there is a tie, the first position is chosen. This is well defined since
$\mu$
has at most finitely many scores exceeding any
$\varepsilon>0$
.
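For a finite configuration, the first maximum anchor is elementary to compute. The following sketch (with an assumed list-of-pairs representation of $\mu$ and a helper name of our choosing) illustrates the lexicographic tie-breaking.

```python
def first_max_anchor(mu):
    """A^fm(mu): the position where the maximal score of mu is attained,
    with ties broken by taking the lexicographically smallest position.
    mu is a nonempty list of (position, score) pairs, positions as tuples."""
    assert mu, "A^fm is defined only for non-null measures"
    top = max(s for _, s in mu)
    return min(t for t, s in mu if s == top)

# two positions attain the maximal score 3.0; the lexicographically
# first one, (-1.0, 2.0), is chosen
mu = [((1.0, 0.0), 2.0), ((0.0, 0.0), 3.0), ((-1.0, 2.0), 3.0)]
print(first_max_anchor(mu))
```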
4.3. Main result
For each
$\tau>0$
, consider the point process

of (rescaled) blocks together with their ‘positions’; observe that
$\boldsymbol{i} b_{\tau}$
is one of the corners of
$J_{\tau,\boldsymbol{i}}$
defined at (4.3). While the definition of
$N_\tau$
also depends on the choice of the block side length
$b_{\tau}$
, for simplicity, we do not include it in the notation. The exact choice of the position component is immaterial: subsequent results hold if it is arbitrarily chosen inside the block using any deterministic or randomized procedure.
Let
$\tilde{\mathcal{N}}_{01}^{*}$
be equal to the space
$\tilde{\mathcal{N}}_{01}$
, but without the null measure. Furthermore, let
$\boldsymbol{N}$
be the space of all counting measures on
$[0,1]^d \times\tilde{\mathcal{N}}_{01}^{*}$
that are finite on all Borel sets
$B\subseteq[0,1]^d \times \tilde{\mathcal{N}}_{01}^{*}$
such that, for some
$\varepsilon=\varepsilon(B)>0$
,
$M(\mu)>\varepsilon$
for all
$(t,\mu)\in B$
. Equip
$\boldsymbol{N}$
with the vague topology generated by the same family of sets.
Theorem 2. Let X be a stationary marked point process on
$\mathbb{E}$
that admits a tail configuration Y in the sense of Definition 2 with tail index
$\alpha>0$
and a scaling function r of scaling index
$\beta\in\mathbb{R}$
. Let
$\Theta$
be the corresponding spectral tail configuration defined in (3.7). Finally, fix a family of block side lengths
$(b_{\tau})_{\tau>0}$
.
If Assumptions 1, 2, and 3 below hold, then
$\mathbb{P}(Y \in\mathcal{N}_{01})= \mathbb{P}(\Theta \in \mathcal{N}_{01})=1$
and

in
$\boldsymbol{N}$
, where
$\{(U_i,\Gamma_i,Q_i),i\geq1\}$
are points of the Poisson process on
$[0,1]^d\times \mathbb{R}_+\times \mathcal{N}_{01}$
with the intensity measure being the product of the Lebesgue measure on
$[0,1]^d$
, the Lebesgue measure on
$\mathbb{R}_+$
scaled by

and the probability distribution of a random element Q in
$\mathcal{N}_{01}$
given by

The proof of Theorem 2 is presented in Section 6. The mark component in the limiting point process N from (4.9) can be viewed as the scaling transformation of the equivalence classes
$[Q_i]$
by multiplying the scores with
$\smash{y_i\;:\!=\;\Gamma_i^{-1/\alpha}}$
and positions with
$\smash{y_i^\beta}$
. Note that
$\{y_i,i\geq1\}$
form a Poisson process on
$(0,\infty)$
with intensity
${\vartheta} \alpha y^{\alpha-1}\mathrm{d} y$
. Furthermore, note that the point process Q with distribution (4.11) necessarily satisfies
$M(Q)=1$
a.s.
If
$X_{\tau, (j)}$
and
$\boldsymbol{i}_{(j)} b_{\tau}$
,
$j=1,2,\dots,k_{\tau}^d$
, denote the original blocks and their positions, relabelled so that

the continuous mapping theorem applied to (4.9) yields the convergence

in the space
$([0,1]^d\times \tilde{\mathcal{N}}_{01})^{k}$
for every fixed
$k\geq 1$
. In particular, for
$k=1$
, this (modulo some edge effects that are easily shown to be negligible) implies that

Thus, the limiting distribution of the rescaled maximal score of X in
$[0,\tau]^d$
is the nonstandard Fréchet distribution. Since the point process of locations in X is assumed to be a unit rate stationary process and since the marginal score satisfies (4.2), the value of
$\vartheta$
deserves to be called the extremal index of X.
The second ingredient of the limiting point process in (4.9) is the distribution of Q that can be seen as the asymptotic distribution of a normalized typical cluster of exceedances of X. In contrast, the tail configuration Y is not typical. Since it contains what can be intuitively understood as a uniformly selected exceedance in X, the distribution of Y is biased towards clusters with more exceedances. In fact, the relationship between the tail configuration and the typical cluster of exceedances of X is similar to the relationship between a stationary point process on
$\mathbb{R}^d$
and its Palm version, and this relationship is discussed in detail in [Reference Planinić24] for random fields over
$\mathbb{Z}^d$
.
Example 3. (Continuation of Example 1.) Recall that for the stationary point process X of our initial example, we found the tail configuration in the form
$Y = \delta_{(0,\eta)} + I\delta_{(h_0,\eta)}$
, with the distribution of
$I, h_0,$
and
$\eta$
described above. By (3.7) the spectral tail configuration has the form
$\Theta = \delta_{(0,1)} + I \delta_{(h_0,1)}$
. For
$p =\frac{1}{2}$
, we have
$\mathbb{P}(I=1) = \tfrac{2}{3}$
, therefore,

Finally, it is not difficult to show that
$\smash{Q = \delta_{(0,1)} + \varepsilon\delta_{(h^+_0,1)},}$
where
$\smash{h^+_0}$
has the distribution of
$h_0$
restricted to the points larger than 0 in the lexicographical order on
$\mathbb{R}^d$
and
$\varepsilon$
is an independent Bernoulli random variable with
$\mathbb{P}(\varepsilon=1)=\frac{1}{2}$
.
4.4. Assumptions of Theorem 2
Fix a family of positive real numbers
$(b_{\tau})_{\tau>0}$
that represent the block side lengths.
Assumption 1. (On scaling.) The family
$(b_{\tau})_{\tau>0}$
satisfies

By regular variation of the scaling function r, (4.13) yields that
$r(a_{\tau}\varepsilon)/b_\tau\to 0$
as
$\tau\to\infty$
for all
$\varepsilon>0$
as well. If the scaling function r is a constant (as is always the case for random fields over
$\mathbb{Z}^d$
), then necessarily
$b_{\tau}\to\infty$
. However, if
$r(a_{\tau})\to 0$
, we can take
$(b_{\tau})_{\tau}$
to be a constant or even such that
$b_{\tau}\to0$
.
Recall that, for
$B\in \mathcal{B}(\mathbb{R}^d)$
, we denote by
$\mu_B$
the restriction of
$\mu\in \mathcal{N}$
to
$B\times (0,\infty)$
. Furthermore, recall that
$\tilde{X}$
denotes a Palm version of X and
$\xi=\tilde{X}(0)$
.
Assumption 2. (On dependence within a block/anticlustering.) For all
$\varepsilon,\delta,c>0$
,

where

Assumption 2 concerns the maximum score of
$\tilde{X}$
in the annulus
$C_{\tau,u} = B_{b_\tau c}\setminus B_{r(a_\tau \varepsilon)u}$
, which, for a sufficiently large u, is far away from the origin and still relatively small compared to the size of
$[0,\tau]^d$
. The assumption simply states that if we condition on a large score at the origin, we are unlikely to see another large value in such an area. Roughly speaking, it prevents clustering of large scores beyond a certain distance. For a further illustration of this condition, see Example 4.
Our next and final assumption essentially requires that extremal blocks asymptotically behave as if they were independent. To state it, we introduce some additional notation. Let
$\mathcal{F}$
be the family of all shift-invariant measurable functions
$f\colon\mathcal{N}_{01} \to[0,\infty)$
such that
$f(0) = 0$
, and for which there exists some
$\delta>0$
such that, for all
$\mu\in \mathcal{N}_{01}$
,

where
$\mu^{\delta}$
denotes the restriction of
$\mu$
to
$\mathbb{R}^d\times (\delta,\infty)$
, that is, the value of f depends only on scores of
$\mu$
that are larger than
$\delta$
.
For a family of positive real numbers
$(l_{\tau})_{\tau}$
, for every
$\tau>0$
and
$\boldsymbol{i}=(i_1,\dots,i_d)\in I_{\tau}$
, cut off the edges of
$J_{\tau, \boldsymbol{i}}$
by
$l_{\tau}$
, that is, consider

and define the corresponding trimmed block

Assumption 3. (On dependence between extremal blocks.) There exists a family
$(l_{\tau})_{\tau>0}$
, satisfying

and such that

as
$\tau\to\infty$
for any family of functions
$f_{\tau, \boldsymbol{i}} \in \mathcal{F}$
,
$\tau>0, \boldsymbol{i} \in I_{\tau}$
, which satisfy (4.15) for the same
$\delta>0$
.
Informally speaking, (4.17) holds if extremal scores are only locally dependent. The crucial issue in (4.17) is that the value of
$f_{\tau,\boldsymbol{i}}({T_{r(a_{\tau}),a_{\tau}}}\widehat{X}_{\tau,\boldsymbol{i}})$
depends only on points
$(t,s)\in \widehat{X}_{\tau,\boldsymbol{i}}$
with
$s>a_{\tau}\delta$
, and that, for any
$(t,s)\in\widehat{X}_{\tau,\boldsymbol{i}}$
and
$(t',s')\in\widehat{X}_{\tau,\boldsymbol{i}'}$
for
$\boldsymbol{i}\neq\boldsymbol{i}'$
, we have
$|t-t'|\geq l_{\tau}$
.
Example 4. (Continuation of Example 1.) Recall that
$ X = \sum \big(\delta_{(t,\zeta_t)} +\varepsilon_t \delta_{(t+h_t,\zeta_t)}\big) $
where i.i.d. points
$(h_t,\varepsilon_t,\zeta_t)$
have independent components with the first having bounded support, i.e.
$\mathbb{P}(\| h\| \leq H) = 1$
for some constant
$H\geq 0$
. If we set
$r\equiv 1$
, Assumption 1 holds for any
$b_\tau \to\infty$
and
$b_\tau= o(\tau)$
. Assumption 2 is also easily verified in this context: simply recall that
$\tilde{X} = X+C_0$
with
${C_0 \stackrel{d}{=} \delta_{(0,\xi)} + I \delta_{(h_0,\xi)}}$
. Therefore, for any
$u>H$
, there is no point of
$C_0$
in the annulus
$C_{\tau,u}=B_{b_\tau c}\setminus B_{ u} $
and, thus,

as
$\tau \to\infty$
. Similarly, because any cluster fits in a ball of radius H, Assumption 3 holds immediately if we trim the blocks by
$l_\tau = H$
.
4.5. Alternative representations of
$\vartheta$
and Q
Assume that Y is the tail configuration of a stationary marked point process X on
$\mathbb{E}$
, with tail index
$\alpha>0$
and a scaling function r of scaling index
$\beta$
. In this subsection we give some Palm-like properties of the tail configuration under the assumption that
$Y\in \mathcal{N}_{01}$
a.s., which, e.g., holds under the anticlustering condition (4.14); see Proposition 10. All of the following results are based on the exceedance-stationarity property (3.9) of the tail configuration.
For any
$\mu\in \mathcal{N}_{01}$
, denote

By definition,
$0\in e(Y)$
a.s. A function
$A\colon\mathcal{N}_{01}\to\mathbb{R}^d$
will be called an anchoring function if, for all
$\mu \in \mathcal{N}_{01}$
such that
$e(\mu)\neq \varnothing$
,
-
(i)
$A(\mu)\in e(\mu)$ ;
-
(ii)
$A(\varphi_{z}\mu)=A(\mu)-z$ for all
$z\in \mathbb{R}^d$ (shift equivariance).
A typical example of an anchoring function is the first maximum anchor
$A^{\mathrm{fm}}$
from (4.7). Another one is the first exceedance anchor

where the minimum is taken with respect to the lexicographic order.
Lemma 3. If
$Y\in \mathcal{N}_{01}$
a.s. then

for any anchoring function A.
Proof. We adapt the proof of [Reference Basrak and Planinić3, Lemma 3.4]. Using (3.9) and the shift-equivariance property of A,

Note that
$\mathbb{P}(Y\in \mathcal{N}_{01})=1$
implies that
$\textstyle\sum_{(t,s)\in Y} \textbf{1}_{\{s>1\}}<\infty$
a.s. Thus,
$\mathbb{P}(A(Y)=0)>0$
, since otherwise the last expression above would vanish.
Proposition 3. If
$\mathbb{P}(Y\in \mathcal{N}_{01})=1$
then
$\vartheta_A$
and the distribution
$\mathbb{P}([Y] \in \cdot \mid A(Y)=0)$
on
$\tilde{\mathcal{N}}_{01}$
do not depend on the choice of the anchoring function A.
Proof. This result parallels [Reference Basrak and Planinić3, Lemma 3.5] and it can be proved in the same manner by using (3.9) instead of [Reference Basrak and Planinić3, Property (3.8)]. We omit the details.
The above results yield alternative representations for
$\vartheta$
in (4.10). Indeed, recall the spectral tail configuration
$\Theta={T_{\eta^\beta,\eta}} Y$
(where
$\eta=Y(0)$
) and observe that
$A^{\mathrm{fm}}(\Theta)=0$
if and only if
$A^{\mathrm{fm}}(Y)=0$
. Thus,
$\vartheta=\vartheta_{A^{\mathrm{fm}}}=\mathbb{P}(A^{\mathrm{fm}}(\Theta)=0)$
, which is further equal to
$\vartheta_A=\mathbb{P}(A(Y)=0)$
for an arbitrary anchoring function A. Since
$Y = T_{\eta^{-\beta}, \eta^{-1}} \Theta$
with
$\Theta$
independent of the
$\mathrm{Pareto}(\alpha)$
random variable
$\eta$
,

on
$\mathcal{N}_{01}$
, where Q has distribution (4.11) on
$\mathcal{N}_{01}$
. Using this fact, we can, as in [Reference Planinić24, Proposition 3.9], prove that Q from (4.11) satisfies

on
$\tilde{\mathcal{N}}_{01}$
. In particular,

and
$(Q_i)_{i\geq 1}$
in (4.9) can be chosen such that their common distribution on
$\mathcal{N}_{01}$
satisfies

5. Examples: tail configurations, extremal indices, and typical clusters
5.1. Small distance to the kth nearest neighbor
Below we consider the situation where each point of a stationary Poisson process is equipped with a score given by the reciprocal of the distance to its kth nearest neighbor. Large scores then identify points with small distances to their kth nearest neighbors, and the tail configuration describes the positions of the points in a cluster that are all located near each other. Note that the large distances to the nearest neighbor recently studied in [Reference Chenavier and Otto8] indicate isolated points that do not form clusters.
Fix a
$k\in \mathbb{N}$
. For a set
$I\subset \mathbb{R}^d$
that has at most a finite number of points in any bounded region and
$t\in\mathbb{R}^d$
, let
$\rho_k(t,I)$
denote the distance from t to its kth nearest neighbor in
$I\setminus\{t\}$
. Note that
$\rho_k(t,I)<a$
if and only if
$I\setminus\{t\}$
has at least k points in the open ball of radius a centered at t.
Let P be a homogeneous unit intensity Poisson process on
$\mathbb{R}^d$
, and let X be the marked point process obtained by attaching the score
$\rho_k(t,P)^{-1}$
to each of
$t\in P$
, so that the score is the reciprocal to the distance from
$t\in P$
to its kth nearest neighbor. Thus, we can write
$X = \Psi(P)$
with the scoring function
$\psi(s,\mu) =\rho_k(s,\mu)^{-1}$
; see Section 2.2. By Lemma 1, the Palm version
$\tilde{X}$
is obtained by the same procedure applied to
$P+\delta_0$
, so that the points in
$\tilde{X}$
are located at all points t from
$P+\delta_0$
and the score at t is given by
$s=\rho_k(t,P+\delta_0)^{-1}$
. In particular, the score of the Palm version at the origin is

Since the random variable
$P(B_{1/u})$
has Poisson distribution with mean
$C_d u^{-d}$
, where
$C_d$
is the volume of the unit ball in
$\mathbb{R}^d$
, and this mean goes to zero as
$u\to\infty$
, it is straightforward to see that

Thus,

i.e.
$\xi$
has a regularly varying tail with tail index
$\alpha=dk$
.
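The tail asymptotics can be checked numerically: since $\xi>u$ if and only if $P(B_{1/u})\geq k$, the tail of $\xi$ is an explicit Poisson tail. A minimal Python sketch (the helper names are ours):

```python
import math

def unit_ball_volume(d):
    """C_d, the volume of the unit ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def xi_tail(u, d, k):
    """P(xi > u) = P(P(B_{1/u}) >= k): at least k points of a unit-rate
    Poisson process fall in the ball of radius 1/u around the origin."""
    mean = unit_ball_volume(d) * u ** (-d)
    return 1.0 - sum(math.exp(-mean) * mean ** j / math.factorial(j)
                     for j in range(k))

# P(xi > u) ~ (C_d u^{-d})^k / k!, so the tail index is alpha = d * k
d, k = 2, 3
for u in (10.0, 50.0):
    approx = (unit_ball_volume(d) * u ** (-d)) ** k / math.factorial(k)
    print(u, xi_tail(u, d, k) / approx)
```

Doubling u multiplies the tail by roughly $2^{-dk}$, in line with regular variation of index $dk$.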
5.1.1. Tail configuration
Proposition 4. For every
$k\in \mathbb{N}$
, the tail configuration of X exists with the scaling function
$r(u)=u^{-1}$
(i.e.
$\beta=-1$
) and is given by

where
$\mathcal{Y}\;:\!=\;\{U_0,\dots, U_k\}$
,
$U_0=0$
, and
$U_1,\dots,U_k$
are i.i.d. uniform on
$B_1$
.
Remark 2. In this case the tail configuration Y is an element of
$\mathcal{N}_{10}\subset \mathcal{N}$
, and as the proof below shows, the convergence in (3.3) is valid even in the stronger
$\mathcal{B}_{10}$
-vague topology.
Proof of Proposition
4. Recall that
$X = \Psi(P)$
for the scoring function
$\psi(s,\mu) =\rho_k(s,\mu)^{-1}$
, and let
$\tilde{P}= \delta_0 + P$
be a Palm version of P. Observe that the conditional distribution of
$\tilde{X}$
given
$\xi>u$
coincides with the conditional distribution of
$\Psi(\tilde{P})$
given
$P(B_{1/u}) \geq k$
. In view of this and taking into account that
$\mathbb{P}(P(B_{1/u})\ge k) \sim \mathbb{P}(P(B_{1/u})=k)$
, it is easy to see that, for any nonnegative f, the Laplace functional of
$\tilde{X}$
conditional on
$\{\xi >u\}$
satisfies

as
$u \to\infty$
.
Furthermore,
$\tilde{P}$
conditionally on
$P(B_{1/u}) = k $
has the distribution of

where
$U_1,\dots,U_k$
on the right-hand side are uniformly distributed in
$B_1$
and independent of P, and
$P_{B_{1/u}^c}$
is the restriction of P to the set
$B_{1/u}^c$
. Since
$r(u) = 1/u$
and
$u\rho_k(t, A) = \rho_k(ut, uA)$
,

where

and
$Z^{(u)}$
is uP restricted to
$B_1^c$
. Thus, for every nonnegative f,

Take now
$f\colon\mathbb{E}\to \mathbb{R}_+$
whose support is contained in the set
$B_a\times (0,\infty)\in \mathcal{B}_{10}$
for some
$a>0$
. We can and will assume that
$a\geq 2$
. Observe that

so that
$\mathbb{P}(Z^{(u)} (B_a) = 0)\to 1$
as
$u \to\infty$
. Since
$f(t,s)=0$
for
$t\notin B_a$
, on the event
$\{Z^{(u)} (B_a) = 0\}$
we have

where the second equality follows since
$a\geq 2$
, and so the kth nearest neighbor of each
$U_i$
is necessarily in
$\mathcal{Y}$
. Thus, for any such f,

In particular, this holds for any nonnegative, continuous and bounded function f on
$\mathbb{E}$
whose support is in
$\mathcal{B}_{11}$
since
$\mathcal{B}_{11} \subset \mathcal{B}_{10}$
, and thus, (3.3) holds.
By Proposition 1, the tail score at the origin
$\eta$
is
$\mathrm{Pareto}(dk)$
distributed, which is easily checked since

By Proposition 2,
$\eta$
is independent of the spectral tail configuration, which is (since
$\beta=-1$
) given by

A direct calculation shows that the random set
$\{U_0/U^{*},\dots,U_k/U^{*}\}$
has the same distribution as
$\mathcal{Y}^*\;:\!=\;\{U_0,U_1,\dots,U_{k-1}, U'_k\}$
, where
$U'_k$
is uniformly distributed on
$\partial B_1$
and independent of
$U_0, U_1,\dots, U_{k-1}$
. In particular,

5.1.2. Point process convergence
First, (5.1) implies that the sequence of thresholds
$(a_{\tau})$
in (4.1) can be chosen as

Scaling scores with
$a_{\tau}^{-1}$
is (up to a transformation) equivalent to scaling distances
$\rho_k(t,P)$
with
$a_{\tau}$
.
Let
$A^{\mathrm{fm}}$
be defined as in (4.7). Then the extremal index (depending on k and dimension) is given by

and (4.11) implies that Q has the conditional distribution of
$\Theta$
given that
$A^{\mathrm{fm}}(\Theta)=0$
. Observe that
$A^{\mathrm{fm}}(\Theta)=0$
if and only if
$\mathcal{Y}^*$
is not contained in
$B_1(U_i)$
for all
$i=1,\dots, k-1$
, and
$U_k'$
is lexicographically larger than 0 if
$\mathcal{Y}^*$
is a subset of the closure of
$B_1(U_k')$
.
If
$k=1$
then
$\mathcal{Y}^*=\{0,U_1'\}$
, so that
$A^{\mathrm{fm}}(\Theta)=0$
if and only if
$U_1'$
is lexicographically larger than 0. Thus,
$\vartheta_{1,d}=\frac{1}{2}$
and

in all dimensions, where
$U_1''$
is uniform on
$\partial B_1 \cap\{x=(x_1,\dots, x_d)\in \mathbb{R}^d \colon x_1\geq 0\}$
. The value
$\vartheta_{1,d}=\frac{1}{2}$
is intuitively obvious since, asymptotically, large values always come in pairs (with exactly the same score). In dimension
$d=2$
,
$\vartheta_{1,2}=\frac{1}{2}$
was obtained in [Reference Chenavier and Robert9, Section 4.2] when analyzing the extremal properties of the inradius of a Poisson–Voronoi tessellation.
If
$k=2$
,
$A^{\mathrm{fm}}(\Theta)=0$
if and only if
$U_1\notin B_1(U_2')$
. Thus, in all dimensions,

where
$U_1'$
is, conditionally on
$U_2'$
, uniform on
$B_1\setminus B_1(U_2')$
. Furthermore, owing to rotational invariance,

where
$e_1$
is the first basis vector of
$\mathbb{R}^d$
and
${\rm Leb}(\!\cdot\!)$
is the Lebesgue measure. In particular,
$\vartheta_{2,1}=\frac{1}{2}$
,

and
$\vartheta_{2,3}=\tfrac{33}{48}$
. Since
$\Gamma(x+\alpha)\sim\Gamma(x)x^\alpha$
as
$x\to\infty$
,

i.e.
$1-\vartheta_{2,d}$
goes to zero exponentially fast as the dimension grows. We can obtain the asymptotics for the integral
$\smash{I_d=\int_0^{\pi/3}\sin^d u \;\mathrm{d} u}$
by showing that
$\lim_{d\to\infty}\big(({d+1})/{\sin^d(\pi/3)}\big) I_d = \tan(\pi/3)$
.
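The values $\vartheta_{2,1}=\frac{1}{2}$ and $\vartheta_{2,3}=\tfrac{33}{48}$ are easy to confirm numerically by simulating $\mathbb{P}(U_1\notin B_1(U_2'))$; by rotational invariance we may fix $U_2'=e_1$. A minimal Monte Carlo sketch (the helper name is ours):

```python
import random

def theta_2(d, n=100_000, seed=0):
    """Monte Carlo estimate of theta_{2,d} = P(U_1 not in B_1(U_2')),
    with U_2' fixed at e_1 by rotational invariance and U_1 sampled
    uniformly in the unit ball by rejection."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        while True:  # rejection-sample U_1 uniform in B_1
            x = [rng.uniform(-1.0, 1.0) for _ in range(d)]
            if sum(c * c for c in x) <= 1.0:
                break
        x[0] -= 1.0  # recentre at e_1
        if sum(c * c for c in x) > 1.0:  # U_1 outside B_1(e_1)
            hits += 1
    return hits / n

print(theta_2(1))  # exact value: 1/2
print(theta_2(3))  # exact value: 33/48 = 0.6875
```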
For
$d=1$
, we have already seen that
$\vartheta_{1,1}=\vartheta_{2,1}=\smash{\frac{1}{2}}$
. Interestingly, we can check that
$\vartheta_{k,1}=\smash{\frac{1}{2}}$
for all
$k\in \mathbb{N}$
. Indeed, assume that
$U'_k=1$
without loss of generality; the maximal score is then attained at zero if the unit ball around any of the
$j\in\{0,\dots,k-2\}$
points that fall in (0,1) does not cover
$k-j$
points uniformly distributed in
$(\!-\!1,0)$
. This probability can be calculated explicitly, and then the result follows by noting that j is binomially distributed.
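The claim $\vartheta_{k,1}=\frac{1}{2}$ can also be checked by direct simulation of the anchoring condition described above for the first maximum anchor. The sketch below encodes that condition for $d=1$ (the helper name is ours).

```python
import random

def theta_k_dim1(k, n=100_000, seed=2):
    """Monte Carlo check of theta_{k,1} = 1/2 in dimension d = 1.

    Y* = {0, U_1, ..., U_{k-1}, U'_k} with U_1, ..., U_{k-1} uniform in
    (-1, 1) and U'_k uniform on {-1, +1}. The first-maximum anchor sits at
    0 unless some U_i (i < k) has all of Y* strictly inside its unit ball,
    or U'_k has all of Y* inside its closed unit ball and U'_k < 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        inner = [rng.uniform(-1.0, 1.0) for _ in range(k - 1)]
        uk = rng.choice([-1.0, 1.0])
        ys = [0.0] + inner + [uk]
        if any(all(abs(y - u) < 1.0 for y in ys) for u in inner):
            continue
        if uk < 0.0 and all(abs(y - uk) <= 1.0 for y in ys):
            continue
        hits += 1
    return hits / n

print(theta_k_dim1(3))  # close to 1/2, as the binomial argument predicts
```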
The exact calculations of
$\vartheta_{k,d}$
become quite involved for
$k\geq3$
and
$d\geq 2$
.
Proposition 5. For arbitrary
$k, d\in \mathbb{N}$
, define X as in the beginning of Section 5.1 and
$(a_{\tau})_{\tau}$
as in (5.3). For any
$(b_{\tau})_{\tau}$
such that
$\tau^{-1/k}/b_{\tau}\to 0$
and
$b_{\tau}/\tau \to 0$
as
$\tau \to\infty$
, the assumptions of Theorem 2 hold (with
$\alpha=dk$
and
$r(u)=u^{-1}$
). Therefore, the convergence of the extremal blocks in (4.9) holds as well.
Proof. Assumption 1 holds, since
$r(u)=u^{-1}$
and
$a_\tau$
has the order
$\tau^{1/k}$
. For each
$\tau>0$
and
$\boldsymbol{i} \in I_{\tau}$
, define the block of indices
$J_{\tau,\boldsymbol{i}}$
by (4.3) and
$X_{\tau,\boldsymbol{i}}$
by (4.4). The key step in checking other assumptions of Theorem 2 is that, for every
$t\in P$
, the condition
$X(t)>y$
(i.e.
$\rho_k(t,P)<y^{-1}$
) is equivalent to
$P(B_{y^{-1}}(t))\geq k+1$
. Thus, given that
$X(t)>a_{\tau} \varepsilon$
, X(t) depends only on the points of P in
$B_{a_{\tau}^{-1} \varepsilon^{-1}}(t)$
, where
$a_{\tau}^{-1} \to 0$
as
$\tau\to\infty$
.
Fix
$\varepsilon,\delta,c>0$
and recall the notation from (4.14). The event
$\{\xi>a_\tau \varepsilon\}= \{\rho_k(0,\tilde{P})<(a_{\tau}\varepsilon)^{-1}\}$
depends only on
$\tilde{P}$
(that is, P) restricted to
$B_{(a_{\tau}\varepsilon)^{-1}}$
. Since
$C_{\tau,u}\subset B_{(a_{\tau}\varepsilon)^{-1}u}^c$
, as soon as
$u>1+\varepsilon/\delta$
the event
$\{M(\tilde{X}_{C_{\tau,u}})>a_{\tau} \delta\}=\{\min_{t\in \tilde{P}\cap C_{\tau,u}} \rho_k(t,\tilde{P})<(a_{\tau} \delta)^{-1}\}$
is determined only by
$\tilde{P}$
restricted to
$\smash{B_{(a_{\tau}\varepsilon)^{-1}}^c}$
(on this set
$P=\tilde{P}$
and, consequently,
$X=\tilde{X}$
). Since P is a Poisson process, this implies that, for all such u,

In the sixth step we used the refined Campbell’s theorem (2.1), and in the penultimate step we used (4.2) and the fact that
$b_{\tau}$
is chosen such that
$b_{\tau}/\tau \to 0$
. Thus, (4.14) holds, i.e. Assumption 2 is satisfied.
Now consider Assumption 3. Take
$(l_{\tau})_{\tau}$
such that
$l_{\tau}/b_{\tau}\to 0$
and
$a_{\tau}^{-1}/l_{\tau}\to 0$
as
$\tau \to\infty$
; since
$a_{\tau}^{-1}/b_{\tau}=r(a_{\tau})/b_{\tau}\to 0$
, we can, e.g., take
$l_{\tau}=\sqrt{b_{\tau}a_{\tau}^{-1}}$
. Let
$f_{\tau, \boldsymbol{i}}$
,
$\tau>0$
,
$\boldsymbol{i} \in I_{\tau}$
, be an arbitrary family of shift-invariant measurable functions from
$\mathcal{N}_{01}$
to
$[0,\infty)$
such that, for some
$\delta>0$
and for all
$\tau>0$
,
$\boldsymbol{i} \in I_{\tau}$
and
$\mu\in \mathcal{N}_{01}$
,

where
$\mu^{\delta}$
denotes the restriction of
$\mu$
to
$\mathbb{R}^d\times (\delta,\infty)$
.
Similarly as above, for
$t\in P$
, the random variable
$X(t)\textbf{1}_{\{X(t)>a_{\tau} \delta\}}$
depends only on P restricted to
$B_{(a_{\tau}\delta)^{-1}}(t)$
. Moreover, if
$(a_{\tau}\delta)^{-1}<l_{\tau}$
then

recall that
$\hat{J}_{\tau,\boldsymbol{i}}$
in (4.16) are obtained from the original blocks
$J_{\tau,\boldsymbol{i}}$
by trimming the edges by
$l_{\tau}$
. Thus, since
$\widehat{X}_{\tau,\boldsymbol{i}}\;:\!=\;X_{\hat{J}_{\tau,\boldsymbol{i}}}$
, the value of
$f_{\tau,\boldsymbol{i}}({T_{r(a_{\tau}),a_{\tau}}}\widehat{X}_{\tau,\boldsymbol{i}})$
depends only on P restricted to
$J_{\tau,\boldsymbol{i}}$
for all
$\boldsymbol{i}\in I_{\tau}$
. For such
$\tau$
, since P is a Poisson process and the
$J_{\tau,\boldsymbol{i}}$
s are disjoint, the left-hand side of (4.17) vanishes; hence, Assumption 3 holds.
A simple consequence of Proposition 5 concerns the behavior of the minimal distance to the kth nearest neighbor in a Poisson configuration of points on an increasing hypercube
$[0,\tau]^d$
. For all
$\tau>0$
, denote
$m_{\rho,\tau} = \min \{\rho_k(t,P) \colon t \in P\cap[0,\tau]^d\}$
. Then, for
$a_\tau$
as in (5.3), (4.12) implies that

for any
$v>0$
, that is, the scaled minimum distance to the kth nearest neighbor in a Poisson configuration in
$[0,\tau]^d$
converges to a (scaled) Weibull distribution.
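This Weibull limit is cheap to check by simulation in the simplest case $d=k=1$: there $\alpha=1$, $\vartheta_{1,1}=\frac{1}{2}$, and $\mathbb{P}(\xi>u)\sim 2/u$, so that $a_\tau$ may be taken as $2\tau$ and $m_{\rho,\tau}$ is the minimal gap between consecutive points; hence $\mathbb{P}(a_\tau m_{\rho,\tau}>v)\to \mathrm{e}^{-v/2}$. A Monte Carlo sketch under these assumptions (edge effects are ignored and the helper name is ours):

```python
import math
import random

def scaled_min_nn(tau, rng):
    """One sample of a_tau * m_{rho,tau} for d = k = 1 (edge effects
    ignored): a unit-rate Poisson process on [0, tau] is built from i.i.d.
    Exp(1) gaps, the minimal nearest-neighbour distance equals the minimal
    gap between consecutive points, and a_tau = 2 * tau."""
    pts, x = [], rng.expovariate(1.0)
    while x <= tau:
        pts.append(x)
        x += rng.expovariate(1.0)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    return 2.0 * tau * min(gaps) if gaps else float("inf")

# empirical P(a_tau * m > 1) against the limit exp(-theta * v^alpha) = exp(-1/2)
rng = random.Random(7)
tau, reps = 500.0, 800
est = sum(scaled_min_nn(tau, rng) > 1.0 for _ in range(reps)) / reps
print(est, math.exp(-0.5))
```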
5.2. Moving maxima model
Moving maxima models are very popular in the study of time series. Below we put such models in a spatial context: starting from a marked Poisson point process, we adjust the scores by giving each point a score derived from a weighted maximum of the scores of its neighbors. This general construction was described in Section 2.2.
Let
$\zeta$
be a positive random variable whose tail is regularly varying of index
$\alpha>0$
, and denote the distribution of
$\zeta$
by m. Assume that
$P\in \mathcal{N}_{10}$
is a unit rate independently marked stationary Poisson process on
$\mathbb{R}^d$
with marks in
$(0,\infty)$
, whose intensity measure is the product of the Lebesgue measure on
$\mathbb{R}^d$
and the probability measure m, so that
$\zeta$
is the typical mark. In what follows, write
$P=\textstyle\sum \delta_{(t,\zeta_t)}$
, where
$(\zeta_t)_{t}$
,
$t\in\mathbb{R}^d$
, are i.i.d. with distribution m. While, by independence, the large values of
$\zeta_t$
come in isolation in P, the clustering of scores is easily modeled by considering, for instance,
$X=\textstyle\sum \delta_{(t,\max_{|t-s|<1} \zeta_s)}$
, where the ‘large’ scores in P propagate to all neighboring locations in the ball of radius 1.
Let
$\mathcal{N}_g$
be the space of all simple locally finite point measures on
$\mathbb{R}^d$
equipped with the usual
$\sigma$-algebra. Consider a measurable function
$\Phi\colon\mathbb{R}^d\times\mathcal{N}_g\to \mathcal{N}_g$
such that, for each
$\mu'\in\mathcal{N}_g$
, and each
$t\in \mu'$
,
$\Phi(t,\mu')$
is a finite subset of
$\mu'$
that contains t. It is useful to interpret
$\Phi(t,\mu')$
as a neighborhood of t in
$\mu'$
, and all points
$x\in \Phi(t,\mu')$
as neighbors of t. Furthermore, assume that
$\Phi$
is shift equivariant in the sense that
$\Phi(t-x, \varphi_{x}\mu')=\Phi(t,\mu')-x$
for all
$\mu'\in \mathcal{N}_g$
,
$t\in \mu'$
, and
$x\in \mathbb{R}^d$
. Finally, denote by
$N(t,\mu')$
the cardinality of
$\Phi(t,\mu')$
. It can be easily seen that the arguments below also work if
$\Phi$
depends on some external sources of randomness, but stays independent of the marks
$(\zeta_t)$
.
Consider a (deterministic) weight function
$w\colon\mathbb{R}^d\to[0,\infty)$
such that
$w(0)>0$
, and define the scoring function
$\psi\colon\mathbb{R}^d \times \mathcal{N}_{10}\to [0,\infty)$
by

where
$\mu'$
is the projection of
$\mu$
on
$\mathbb{R}^d$
. Without loss of generality, in what follows assume that
$w(0)=1$
. If the weight function is identically one, the score at t is the maximum of the scores of its neighbors.
Since
$\psi$
is shift invariant,
$X\;:\!=\;\Psi(P)$
is a stationary marked point process on
$\mathbb{R}^d$
. Its Palm version is given by
$\tilde{X}\;:\!=\;\Psi(\tilde{P})$
, where
$\tilde{P}=P+\delta_{(0,\zeta_0)}$
, with
$\zeta_0$
having distribution m and being independent of P. For notational convenience, denote
$\widetilde{\Phi}(t)=\Phi(t,\tilde{P}')$
, and let
$\tilde{N}(t)=N(t,\tilde{P}')$
be the cardinality of
$\widetilde{\Phi}(t)$
for all
$t\in \tilde{P}'$
. Clearly,
$X=\textstyle\sum \delta_{(t,\max_{|t-s|<1} \zeta_s)}$
is a special example of this construction with
$w\equiv 1$
and
$\Phi(t,\mu') = \textstyle\sum_{s\in \mu'} \delta_s \textbf{1}_{\{|s-t|<1\}}$
, i.e. for the neighborhoods that consist of all points
$s \in \mu'$
within a distance less than 1 to t.
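In this special case the scoring is straightforward to implement. The following sketch (list-of-pairs representation and helper name are ours) shows how a single large mark propagates to all locations within distance 1.

```python
import math

def moving_max_scores(points, radius=1.0):
    """Scores for w ≡ 1 and Phi(t, mu') = {s in mu' : |s - t| < radius}:
    each point gets the maximum mark over its neighbourhood, which always
    contains the point itself. `points` is a list of (position, mark)
    pairs with positions given as tuples."""
    return [
        (t, max(z for s, z in points if math.dist(t, s) < radius))
        for t, _ in points
    ]

# the large mark at the origin propagates to the point at distance 0.8,
# but not to the point at distance 3
pts = [((0.0, 0.0), 10.0), ((0.8, 0.0), 1.0), ((3.0, 0.0), 2.0)]
print(moving_max_scores(pts))
```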
5.2.1. Tail configuration
Recall that the random score
$\zeta$
is assumed to be regularly varying with tail index
$\alpha$
.
Theorem 3. Assume that w is a bounded function and that
$\mathbb{E} [\tilde{N}(0)]<\infty$
. Then the following statements hold.
-
(i) The score at the origin
$\xi$ satisfies
(5.4)\begin{align}\lim_{u\to\infty}\frac{\mathbb{P}(\xi>u)}{\mathbb{P}(\zeta>u)}= \mathbb{E}\Bigg[\sum_{t\in \widetilde{\Phi}(0)} w(t)^\alpha \Bigg]\;=\!:\;\kappa\in (0,\infty) .\end{align}In particular,
$\xi$ is regularly varying with tail index
$\alpha$ .
-
(ii) The tail configuration of X exists for a constant scaling function
$r\equiv 1$ (thus, with no scaling of the positions) and the distribution of its spectral part
$\Theta\in \mathcal{N}$ is given by
(5.5)\begin{align}\mathbb{E}[h(\Theta)]=\kappa^{-1} \mathbb{E}\Bigg[\sum_{x\in\widetilde{\Phi}(0)} h\Big(\Big\{\Big(t,\tfrac{w(x-t)}{w(x)}\Big) \colon t\in \tilde{P}',\; x\in \widetilde{\Phi}(t)\Big\}\Big)w(x)^{\alpha}\Bigg]\end{align}for all measurable
$h\colon\mathcal{N}\to \mathbb{R}_+$ , where the summand corresponding to x is understood to be 0 if
$w(x)=0$ .
Remark 3. For a point measure
$\mu$
on
$\mathbb{R}^d\times [0,\infty)$
whose restriction to
$\mathbb{E}=\mathbb{R}^d\times (0,\infty)$
is an element of
$\mathcal{N}$
, for any function h on
$\mathcal{N}$
, we define
$h(\mu)$
to be equal to the value of h on this restriction. In other words, we simply neglect the points of the form (t, 0). This is relevant for (5.5) since the weighting function w can in general attain the value 0.
Remark 4. Observe that the distribution of
$\Theta$
, in addition to
$w, \alpha,$
and
$\Phi$
, depends only on the distribution of
$\tilde{P}'= P' + \delta_0$
. It can be obtained as follows. First, let
$P^*$
be distributed as
$\tilde{P}'$
but from the tilted distribution

and denote
$\Phi^*(t)\;:\!=\;\Phi(t,P^*)$
for all
$t\in\mathbb{R}^d$
. Conditionally on
$P^{*}$
, let V be a
$\Phi^*(0)$
-valued random element such that V equals
$x\in \Phi^*(0)$
with probability proportional to
$w(x)^\alpha$
. Finally, let

restricted to
$\mathbb{E}$
.
Example 5. Here we use the notation of Remark 4.
-
(i) Assume that
$w(t)=0$ for all
$t\neq 0$ (and
$w(0)=1$ ), so that
$\psi(t, P)= \zeta_{t}$ . Then
$\smash{P^{*} \stackrel{d}{=} \tilde{P}'}$ and
$V=0$ a.s., so
$\Theta=\{(0,1)\}$ , i.e. the extreme scores of X appear in isolation.
-
(ii) Now let
$w(t)\equiv1$ , so that
\begin{align*}\psi(t, P) =\textstyle\max_{x\in \Phi(t,P')}\zeta_x ,\end{align*}
and recall the assumption
$\mathbb{E} \tilde{N}(0)<\infty$ . Then (5.6) implies that
$P^*$ is, compared to
$\tilde{P}'$ , biased towards configurations in which the origin has more neighbors. Furthermore,
$\Theta= \{(t, 1) \colon t\in P^*, V\in \Phi^*(t) \}$ , where V is uniform on
$\Phi^*(0)$ . Note that necessarily
$(0,1)\in \Theta$ . Observe also that (5.4) implies that
(5.7)\begin{align}\lim_{u\to\infty}\frac{\mathbb{P}(\!\textstyle\max_{x\in \widetilde{\Phi}(0)} \zeta_x >u)}{\mathbb{P}(\zeta>u)} = \mathbb{E}[\tilde{N}(0)] \, .\end{align}
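To make (5.7) concrete in the simplest situation, suppose the neighborhood of the origin contains a deterministic number N of points carrying i.i.d. Pareto($\alpha$) scores; then the tail of the maximum is available in closed form and the ratio indeed tends to N. This is a sketch under these simplifying assumptions, not the general Palm computation:

```python
# Exact tail ratio for the maximum of N i.i.d. Pareto(alpha) scores:
# P(max > u) = 1 - (1 - u^{-alpha})^N, hence P(max > u)/P(zeta > u) -> N,
# which matches (5.7) when N(0) = N is deterministic.
alpha, N = 2.0, 5

def tail(u):
    """P(zeta > u) for a standard Pareto(alpha) score, u >= 1."""
    return u ** (-alpha)

def max_tail(u):
    """P(max of N i.i.d. Pareto(alpha) scores > u)."""
    return 1.0 - (1.0 - tail(u)) ** N

for u in (10.0, 100.0, 1000.0):
    print(u, max_tail(u) / tail(u))  # ratios increase towards N = 5
```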
Proof of Theorem
3. Let
$f\colon \mathbb{E}\to [0,\infty)$
be an arbitrary continuous function such that

for some
$a,\varepsilon>0$
. We extend f to a continuous function on
$\mathbb{R}^d\times [0,\infty)$
by letting
$f(t,0)=0$
for all t. For notational convenience, set
$h(\mu)\;:\!=\; \textrm{e}^{-\mu(f)}$
for all
$\mu \in\mathcal{N}_{11}$
. Both (i) and (ii) would follow immediately if we show that

where
$Y\;:\!=\; T_{1,\eta^{-1}}\Theta$
for
$\Theta$
from (5.5) and
$\eta$
is
$\mathrm{Pareto}(\alpha)$
distributed and independent of
$\Theta$
.
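For concreteness (a standard convention in this literature, recorded here as our reading rather than quoted from the text): $\mathrm{Pareto}(\alpha)$ denotes the law with tail $\mathbb{P}(\eta>y)=y^{-\alpha}$ for $y\geq 1$, and the map $T_{1,\eta^{-1}}$ leaves positions unchanged while multiplying every score by $\eta$, so that
\begin{align*}
Y = T_{1,\eta^{-1}}\Theta = \{(t, \eta s) \colon (t,s)\in \Theta\} .
\end{align*}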
First, write
$\tilde{P}=\{(t_{i}, \zeta_{i}) \colon i\geq 1\}$
, where
$\{t_1,\dots, t_{\tilde{N}(0)}\} = \widetilde{\Phi}(0)$
. Given the projection
$\smash{\tilde{P}'}$
of
$\smash{\tilde{P}}$
, the score
$\smash{\psi(t,\tilde{P})}$
depends only on
$\zeta_x$
for
$\smash{x\in \widetilde{\Phi}(t)}$
. By (5.8),

where
$K_a=K_a(\tilde{P}')$
is the smallest nonnegative integer such that
$\{t_1,\dots, t_{K_a}\}$
contains
$\widetilde{\Phi}(t)$
for all
$t\in\tilde{P}'_{B_a}$
. Observe that
$K_a\geq 1$
. Since
$\tilde{P}'$
and
$(\zeta_1,\zeta_2,\ldots)$
are independent, conditioning on
$\smash{\tilde{P}'}$
yields

where, for all
$\mu'$
from the support of
$\tilde{P}'$
,

and
$(t_i)_{i\geq1}$
are deterministic and depend only on
$\mu'$
. Observe that, for every fixed
$\mu$
(write
$k\;:\!=\;K_a(\mu')$
and
$n\;:\!=\;N(0,\mu')$
, so
$k\geq n$
), the function under the expectation

is bounded (since h is bounded) and continuous except on the set

(since f, and hence h, is continuous). Furthermore, this function has support bounded away from the origin in
$\mathbb{R}^{k}$
, since it vanishes whenever

Since
$(\zeta_{i})_{i}$
are i.i.d. regularly varying with index
$\alpha$
, the vector
$(\zeta_{1},\dots, \zeta_{k})$
is multivariate regularly varying in
$\mathbb{R}^k_+\setminus\{0\}$
with the same index; see [Reference Resnick26, p. 192]. In particular,

for a certain measure
$\nu$
on
$\mathbb{R}_+^{k}\setminus\{0\}$
concentrated on the axes. More precisely, by [Reference Resnick26, p. 192], if
$\eta$
is
$\mathrm{Pareto}(\alpha)$
distributed, the right-hand side of (5.13) equals

where the ith summand on the right-hand side is set to be 0 if
$w(t_i)=0$
. Recalling that g was defined in (5.10), this equals

Recall that
$\{t_1,\dots, t_n\}=\Phi(0,\mu')$
and that
$n\leq k$
. Since
$f(t,0)=0$
for all
$t\in \mathbb{R}^d$
and since, for
$j>k=K_a(\mu')$
, we have
$t_j\notin B_a$
and
$f(t_j, y)=0$
regardless of y, the expression above (and, therefore, the right-hand side of (5.13)) actually equals

Going back to (5.11), (5.13) yields

where
$\eta$
and
$\tilde{P}'$
are independent, which is precisely (5.9). It remains to justify the interchange of the limit and expectation in (5.14).
Since g is bounded by 1 and
$w_*\;:\!=\;\sup_{t\in \mathbb{R}^d} w(t)<\infty$
, we have, for each
$\mu'$
and
$u>0$
,
$g_u(\mu') \leq N(0,\mu')\mathbb{P}(\zeta>u/w_*)$
. The regular variation property of
$\zeta$
yields

a.s. as
$u\to\infty$
. Moreover, since
$\mathbb{E}[\tilde{N}(0)w_*^\alpha]=\mathbb{E}[\tilde{N}(0)]w_*^\alpha $
is finite by assumption, Pratt’s extension of the dominated convergence theorem (see [Reference Pratt25, Theorem 1]) justifies the interchange

and this completes the proof.
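For the reader's convenience, the version of Pratt's lemma used in the last step can be stated as follows (a standard formulation, not quoted verbatim from [Reference Pratt25]): if measurable functions satisfy
\begin{align*}
|f_u| \le g_u, \qquad f_u \to f \ \text{a.s.}, \qquad g_u \to g \ \text{a.s.}, \qquad \mathbb{E}[g_u] \to \mathbb{E}[g] < \infty \quad \text{as } u\to\infty,
\end{align*}
then $\mathbb{E}[f_u]\to \mathbb{E}[f]$. In the proof above the dominating family is $g_u = N(0,\cdot)\,\mathbb{P}(\zeta>u/w_*)/\mathbb{P}(\zeta>u)$, which converges a.s. to $N(0,\cdot)\,w_*^{\alpha}$ with convergent expectations.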
5.2.2. Point process convergence
In the following we assume that the assumptions of Theorem 3 hold so that X admits a tail configuration Y whose spectral configuration
$\Theta$
is given by (5.5); recall the convention explained in Remark 3. In order to determine the ingredients of the limiting point process in Theorem 2, the following result is crucial; the approach used here is similar to that taken by Planinić [Reference Planinić24, Section 4] for random fields over
$\mathbb{Z}^d$
.
Lemma 4. Denote by W the random element in
$\mathcal{N}$
equal to

restricted to
$\mathbb{E}$
. Then the spectral tail configuration
$\Theta$
from (5.5) satisfies

for all
$h\colon\mathcal{N}\to [0,\infty)$
. In particular,

Proof. Observe that

where we used the shift equivariance of
$\Phi$
to obtain the second equality. The point stationarity of
$\tilde{P}'$
(see (2.2)) yields

which is precisely what we wanted to prove. Equation (5.17) follows by taking
$h\equiv 1$
.
By setting
$w \equiv 1$
, the two expressions for
$\kappa$
in (5.4) and (5.17) imply that

This implies that W, and therefore
$\Theta$
, a.s. has finitely many points in
$\mathbb{E}=\mathbb{R}^d\times (0,\infty)$
. In particular, we can regard W and
$\Theta$
as elements of
$\mathcal{N}_{01}$
, i.e.
$\mathbb{P}(W\in \mathcal{N}_{01})=\mathbb{P}(\Theta\in\mathcal{N}_{01})=1$
; the same holds for the tail configuration as well.
We now turn our attention to the process Q defined in (4.22). Recall that
$M(\mu)$
denotes the maximal score of
$\mu\in \mathcal{N}_{01}$
.
Proposition 6. The distribution of Q from (4.22) in
$\tilde{\mathcal{N}}_{01}$
is given by

Moreover,

Proof. Using (4.20) and since
$\beta=0$
, for an arbitrary shift invariant and bounded
$h\colon \mathcal{N}_{01}\to[0,\infty)$
, we have

Observe that
$\tilde{h}$
is shift invariant and homogeneous in the sense that
$\tilde{h}(T_{1,y}\mu)=\tilde{h}(\mu)$
for all
$\mu\in\mathcal{N}_{01}$
,
$y>0$
. By (5.16),

Owing to (5.17), taking
$h\equiv 1$
yields (5.19), while (5.18) follows since h is arbitrary.
We now give sufficient conditions under which the assumptions of Theorem 2 are satisfied. First, take a family
$(a_{\tau})_{\tau}$
such that (4.1) holds, and fix an arbitrary family
$(b_{\tau})_{\tau}$
such that
$b_{\tau} \to\infty$
and
$b_{\tau}/\tau \to 0$
as
$\tau \to\infty$
; since r is a constant function, this is equivalent to choosing
$(b_{\tau})$
such that Assumption 1 holds.
Assumption 4. Let
$\tilde{\mathcal{N}}_g$
be the space of all simple locally finite point measures
$\mu'$
on
$\mathbb{R}^d$
such that
$0\in \mu'$
. Assume that there exists a measurable function
$R\colon\tilde{\mathcal{N}}_g\to [0,\infty]$
such that
$R(\tilde{P}')<\infty$
a.s. and, for all
$\mu'\in\tilde{\mathcal{N}}_g$
,
-
(i) for all
$t\in \mu'$ such that
$t\notin B_{R(\mu')}$ ,
$\Phi(0,\mu') \cap\Phi(t,\mu')=\varnothing$ and
$\Phi(t,\mu') = \Phi(t,\mu'\setminus\{0\})$ ;
-
(ii)
$\Phi(0,\mu') = \Phi(0,\nu')$ for all
$\nu'\in \mathcal{N}_g$ such that
$\mu'$ and
$\nu'$ coincide on
$B_{R(\mu')}$ , i.e.
$\Phi(0,\mu')$ is unaffected by changing points in
$\mu'$ outside of
$B_{R(\mu')}$ .
Note that (ii) above necessarily implies that
$\Phi(0,\mu') =\Phi(0,\mu'\cap B_{R(\mu')}) \subseteq B_{R(\mu')}$
for all
$\mu'\in\tilde{\mathcal{N}}_g$
.
Example 6.
-
(a) Assume that, for some
$r_0>0$ ,
\begin{align*}{ \Phi(t,\mu') = \{x\in \mu' \colon |t-x|<r_0\} \quad\text{for all } t\in \mu' .}\end{align*}
Then Assumption 4 is satisfied with $R\equiv 2r_0$ .
-
(b) If
$\Phi(t,\mu')$ is the set containing t and the k nearest neighbors of t in
$\mu'\setminus \{t\}$ (with respect to the Euclidean distance), R satisfying Assumption 4 can be constructed as in the proof of [Reference Penrose and Yukich22, Lemma 6.1]; see also [Reference Eichelsbacher, Raič and Schreiber14, p. 104]. Observe that in this case taking
$R(\mu')$ to be the distance to the kth nearest neighbor of 0 in
$\mu'\setminus \{0\}$ is not sufficient for property (i), and this property is crucial to ensure that the anticlustering condition holds.
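A minimal numerical check of Example 6(a) (illustrative only; the helper `phi` and the test configuration are ours): if $|t|\geq 2r_0$, any common point of the two neighborhoods would lie within $r_0$ of both 0 and t, which is impossible, so $\Phi(0,\mu')\cap\Phi(t,\mu')=\varnothing$.

```python
import numpy as np

r0 = 1.0  # neighborhood radius from Example 6(a)

def phi(t, conf):
    """Phi(t, mu'): points of conf within distance < r0 of t."""
    conf = np.asarray(conf, dtype=float)
    dist = np.linalg.norm(conf - np.asarray(t, dtype=float), axis=1)
    return {tuple(p) for p in conf[dist < r0]}

# A point t with |t| >= 2*r0 has a neighborhood disjoint from that of 0,
# which is property (i) of Assumption 4 with R = 2*r0.
conf = [[0.0, 0.0], [0.4, 0.0], [2.1, 0.0], [2.5, 0.0]]
print(phi([0.0, 0.0], conf).isdisjoint(phi([2.1, 0.0], conf)))  # -> True
```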
Proof. Denote
$\tilde{R}\;:\!=\;R(\tilde{P}')$
and
$w_*\;:\!=\;\max_{x\in \mathbb{R}^d}w(x)$
. Recall that
$w_*\in [1,\infty)$
since
$w(0)=1$
and w is a bounded function. For notational convenience, assume that
$w_{*}=1$
; the proof below is easily extended to the general case. Then, for every
$t\in\tilde{P}'$
,

Owing to (5.4), for arbitrary
$\varepsilon,\delta,c>0$
, to show (4.14) it suffices to prove that

where

Fix
$u>0$
. If
$\tilde{R}<u$
, since
$C_{\tau,u} \subseteq B_u^c$
, Assumption 4(i) implies that the families
$\{\zeta_x \colon x\in \widetilde{\Phi}(t) \text{ for some } t\in\tilde{P}'\cap C_{\tau,u}\}$
and
$\{\zeta_x \colon x\in\widetilde{\Phi}(0)\}$
involve disjoint sets of the scores
$\zeta_x$
and are, hence, independent (given
$\tilde{P}'$
). Moreover, if
$\tilde{R}<u$
, since
$\tilde{P}'=P'\cup \{0\}$
, Assumption 4(i) also implies that

Therefore,

where the inequality relies on the fact that
$\tilde{P}'$
is independently marked. Furthermore,

Conditioning on
$\tilde{P}'$
yields

for all
$u,\tau>0$
. The second term on the right-hand side does not depend on
$\tau$
and vanishes as
$u\to\infty$
by the dominated convergence theorem since
$\mathbb{P}(\tilde{R}<\infty)=1$
and
$\mathbb{E}\tilde{N}(0)<\infty$
. For the first term, observe that, for each
$u>0$
and
$n\in \mathbb{N}$
,

Since
$\lim_{n\to\infty}\mathbb{E}[\tilde{N}(0)\textbf{1}_{\{\tilde{N}(0)>n\}}]=0$
due to
$\mathbb{E} \tilde{N}(0)<\infty$
, to show (5.20) it suffices to prove that

This holds since
$a_{\tau}$
is chosen so that (4.1) holds, while for all u,
$|C_{\tau,u}|/\tau^d \leq |B_{b_{\tau}c}|/\tau^d =\mathrm{const} \cdot \, b_{\tau}^d/\tau^d \to 0$
as
$\tau\to\infty$
by the choice of
$(b_\tau)_{\tau}$
. Indeed, the refined Campbell’s formula (2.1) gives

for all
$u,\tau>0$
. Owing to (4.2), (5.4), and (5.7),
$b_{\tau}/\tau\to 0$
implies that (5.21) holds.
Proposition 8. If Assumption 4 holds, Assumption 3 is satisfied for any
$(l_{\tau})_{\tau}$
such that
$l_{\tau}\to \infty$
and
$l_{\tau}/b_{\tau}\to 0$
as
$\tau\to\infty$
.
Proof. Recall that, for a fixed family
$(l_{\tau})_{\tau}$
, and each
$\tau>0$
and
$\boldsymbol{i} \in I_{\tau}$
, the block of indices
$J_{\tau,\boldsymbol{i}}$
is defined by (4.3), its trimmed version
$\hat{J}_{\tau,\boldsymbol{i}}$
by (4.16), and that
$\smash{\widehat{X}_{\tau,\boldsymbol{i}}\;:\!=\;X_{\hat{J}_{\tau,\boldsymbol{i}}}}$
. Now let
$f_{\tau, \boldsymbol{i}} \in\mathcal{F}$
,
$\tau>0, \boldsymbol{i} \in I_{\tau}$
, be an arbitrary family of functions that satisfy (4.15) for the same
$\delta>0$
. For notational convenience, write

for all
$\tau>0$
,
$\boldsymbol{i} \in I_{\tau}$
. To confirm Assumption 3, we need to show that

We first extend the definition of the radius R from Assumption 4 by setting
$R(t,\mu')\;:\!=\;R(\varphi_{t}\mu')$
for all
$\mu'\in \tilde{\mathcal{N}}_g$
,
$t\in\mu'$
. Using shift equivariance of
$\Phi$
, Campbell’s formula, and the assumption
$\mathbb{P}(R(\tilde{P}')<\infty)=1$
, it is not difficult to show that a.s., for all
$t\in P'$
,
$\Phi(t,P') = \Phi(t,\nu')$
for all
$\nu'$
that coincide with $P'$ on
$B_{R(t,P')}(t)$
.
Thus, for every
$t\in P'$
,
$R(t,P')<u$
implies that

Furthermore, the value of
$W_{\tau, \boldsymbol{i}}$
depends only on those
$t\in P'\cap \hat{J}_{\tau,\boldsymbol{i}}$
with score
$\psi(t,P)>a_{\tau}\delta$
. In particular, if

the random variable
$W_{\tau,\boldsymbol{i}}$
depends only on P restricted to
$J_{\tau,\boldsymbol{i}}$
.
Now construct
$k_{\tau}^d$
i.i.d. Poisson processes
$P_{(\boldsymbol{i})},\boldsymbol{i}\in I_{\tau}$
, with a common distribution equal to the distribution of P and such that, for each
$\boldsymbol{i} \in I_{\tau}$
, restrictions of
$P_{(\boldsymbol{i})}$
and P on the block
$J_{\tau,\boldsymbol{i}}$
coincide. Furthermore, for each
$\boldsymbol{i} \in I_{\tau}$
, let
$W_{\tau,\boldsymbol{i}}^{*}$
be constructed from
$P_{(\boldsymbol{i})}$
in the same way as
$W_{\tau,\boldsymbol{i}}$
is constructed from P. In particular, since the
$W_{\tau,\boldsymbol{i}}^{*}$
s are independent,

and, moreover,
$W_{\tau,\boldsymbol{i}} = W_{\tau,\boldsymbol{i}}^*$
whenever

Thus, since
$\cup_{\boldsymbol{i}\in I_{\tau}} \hat{J}_{\tau,\boldsymbol{i}}\subseteq [0,\tau]^d$
and
$0\leq W_{\tau,\boldsymbol{i}}\le 1$
,

Using shift invariance and the refined Campbell’s theorem (2.1), we obtain

for all
$\tau>0$
. Since

for
$w_{*}=\max_{t\in \mathbb{R}^d} w(t)\in [1,\infty)$
and
$\tilde{P}'$
is independent of the
$\zeta$
s, conditioning on
$\tilde{P}'$
yields

Owing to (5.4) and (4.2),
$\tau^d\mathbb{P}(\zeta>a_{\tau}\delta/w_{*})$
converges to a positive constant, so (5.22) follows by dominated convergence since
$l_{\tau}\to\infty$
,
$\mathbb{P}(R(0,\tilde{P}')<\infty)=1$
and
$\mathbb{E}\tilde{N}(0)<\infty$
.
The above arguments lead to the following conclusion.
Proposition 9. Let X be defined as in the beginning of Section 5.2. Assume that w is a bounded function,
$\mathbb{E} [\tilde{N}(0)]<\infty$
, and that there exists an R satisfying Assumption 4. Let
$(a_{\tau})_{\tau}$
be as in (4.1). Then any family
$(b_{\tau})_{\tau}$
, such that
$b_{\tau}\to \infty$
and
$b_{\tau}/\tau \to 0$
as
$\tau \to\infty$
, satisfies all of the assumptions of Theorem 2 (with
$\alpha$
being equal to the tail index of
$\zeta$
and
$r\equiv 1$
) and, therefore, the convergence of the extremal blocks in (4.9) holds.
6. Proof of Theorem 2
Recall that X is a stationary marked point process on
$\mathbb{E}$
that admits a tail configuration Y with tail index
$\alpha>0$
and a scaling function r of scaling index
$\beta\in \mathbb{R}$
. Moreover, assume that
$(a_{\tau})_{\tau>0}$
satisfies (4.1) and fix a family of block side lengths
$(b_{\tau})_{\tau>0}$
.
For notational convenience, for all
$\tau>0$
and
$y>0$
, denote in what follows

We first extend (3.3) to convergence in the space
$\mathcal{N}_{01}$
with the
$\mathcal{B}_{01}$
-vague topology.
Proposition 10. Assume that
$(b_\tau)_{\tau>0}$
satisfies Assumptions 1 and 2. Then

Furthermore, for all
$y>0$
,

on
$\mathcal{N}_{01}$
, where
$(D_\tau)_{\tau>0}$
is any family of subsets of
$\mathbb{R}^d$
such that

for some constants
$0<c_1< c_2$
.
Proof. Recall that
$\mathbb{P}(Y\in \mathcal{N}_{11})=1$
, i.e. Y a.s. has finitely many points in
$B\times (\delta,\infty)$
for all bounded
$B\subset\mathbb{R}^d$
and all
$\delta>0$
. To prove (6.1), we need to show that
$Y(\mathbb{R}^d\times (\delta,\infty))<\infty$
a.s. for all
$\delta>0$
. Fix
$u,\delta>0$
and observe that

By definition (3.3) of Y, for all but at most countably many
$u'>u$
,

Since
$b_{\tau}/r(a_{\tau})\to \infty$
by Assumption 1, the two previous relations imply that

Assumption 2 (with
$c=\varepsilon=1$
) yields

Thus,

and since
$\mathbb{P}(Y\in \mathcal{N}_{11})=1$
, we have
$Y(\mathbb{R}^d\times [\delta,\infty))<\infty$
a.s. Since
$\delta$
was arbitrary, this proves (6.1).
We now turn to (6.2). Fix
$y>0$
. For all but at most countably many
$u>0$
, the definition of the tail configuration implies that

in
$\mathcal{N}_{11}$
with the
$\mathcal{B}_{11}$
-vague topology. For measures in
$\mathcal{N}_{11}$
whose support is in
$B_u\times (0,\infty)$
,
$\mathcal{B}_{11}$
-vague topology is actually equivalent to the (in general stronger)
$\mathcal{B}_{01}$
-vague topology. Thus, (6.5) holds on
$\mathcal{N}_{01}$
with respect to the
$\mathcal{B}_{01}$
-vague topology as well. Furthermore, it is easy to see that, owing to (6.4),
$Y_{B_u} \to Y$
a.s. in
$\mathcal{N}_{11}$
as
$u\to\infty$
. In particular,

on
$\mathcal{N}_{11}$
with respect to the
$\mathcal{B}_{01}$
-vague topology. By the classical result on weak convergence of probability measures (see [Reference Billingsley6, Theorem 4.2]), to prove (6.2) it suffices to show that

where
${\textsf{m}}$
is the metric from (4.5) (which generates the
$\mathcal{B}_{01}$
-vague topology on
$\mathcal{N}_{01}$
). For this, observe that

Let
$u,\varepsilon>0$
be arbitrary. Due to the first inclusion in (6.3) and since
$b_{\tau}/r(a_{\tau} y)\to \infty$
by Assumption 1,
$B_{r(a_{\tau} y)u} \subseteq D_{\tau}$
for all sufficiently large
$\tau$
. Observe that if

then
$T^{\tau}_y (\tilde{X}_{B_{r(a_{\tau} y) u}})$
and
$T^{\tau}_{y} (\tilde{X}_{D_{\tau}})$
coincide when restricted to
$\mathbb{R}^d\times (1/r,\infty)$
with
$r < 1/\varepsilon$
. Since
${\textsf{m}}_0$
is bounded by 1,

If (6.8) does not hold, we use the fact that
${\textsf{m}}$
is bounded by 1. Thus,

where the last inequality holds, since
$D_{\tau}\subseteq B_{b_{\tau}c_2}$
. If we now let
$u\to\infty$
and then
$\varepsilon\to 0$
, Assumption 2 implies (6.7), and this proves (6.2).
For all
$\mu\in \mathcal{N}_{01}$
and
$y>0$
, let

where the minimum is taken with respect to the lexicographic order; if
$M(\mu)\leq y$
, set
$A_y(\mu)\;:\!=\;0$
. Note that
$A_y$
is well defined, since, for every
$\mu\in \mathcal{N}_{01}$
and
$y>0$
, there are at most finitely many
$(t,s)\in \mu$
with
$s>y$
. Observe also that
$A_y$
is equivariant under translations, that is, if
$M(\mu)>y$
,

In particular,
$A_1$
is precisely the first exceedance anchoring function
$A^{\mathrm{fe}}$
from (4.18). As shown in Lemma 3,
$\mathbb{P}(A_1(Y)=0)$
is positive whenever
$\mathbb{P}(Y\in \mathcal{N}_{01})=1$
, which holds, for instance, under Assumptions 1 and 2. Recall also that
$X_{\tau}\;:\!=\;X_{[0,b_\tau]^d}$
.
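The anchoring map $A_y$ just defined can be sketched as follows (an illustrative implementation with locations encoded as tuples; returning the origin when $M(\mu)\leq y$ follows the convention above):

```python
def anchor(mu, y, d=2):
    """First-exceedance anchor A_y(mu): the lexicographically smallest
    location among the points (t, s) of mu with score s > y; the origin
    if M(mu) <= y.  mu is a finite list of (location, score) pairs with
    locations given as d-tuples."""
    exceedances = [t for (t, s) in mu if s > y]
    if not exceedances:
        return (0.0,) * d
    return min(exceedances)  # tuple comparison is lexicographic

mu = [((1.0, 2.0), 5.0), ((-1.0, 0.0), 3.0), ((0.5, -2.0), 0.5)]
print(anchor(mu, 1.0))   # -> (-1.0, 0.0)
print(anchor(mu, 10.0))  # -> (0.0, 0.0), since no score exceeds 10
```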
Proposition 11. Assume that Assumptions 1 and 2 hold. Then, for every
$y>0$
,

and

Proof. Consider a function
$h\colon \mathcal{N}_{01}\to [0,\infty)$
, which is bounded, continuous, and shift invariant. First, observe that we can decompose

Since h is shift invariant and since
$T^{\tau}_{y}$
scales the positions with
$r(a_{\tau}y)^{-1}$
,

By the definition of
$A_{a_{\tau} y}$
,

Next,
$\{X(t)>a_{\tau} y\}=\{\varphi_{t}X(0)>a_{\tau}y\}$
. All of the above implies that

for an obvious choice of a function
$g\colon\mathbb{R}^d \times\mathcal{N}_{01} \to [0,\infty)$
. Refined Campbell’s theorem (2.1) yields

where for the last equality we used the substitution
$s=t/b_{\tau}$
and the fact that
$A_{a_{\tau} y}(\mu) = 0$
if and only if
$A_{1}(T^{\tau}_{y}\mu) = 0$
. In particular (recall that
$\xi=\tilde{X}(0)$
),

where
$f(\mu)\;:\!=\;h(\mu)\textbf{1}_{\{A_1(\mu)=0\}}$
. Since, for every fixed
$s\in (0,1)^d$
, the sets
$D_{\tau}\;:\!=\;[0,b_{\tau}]^{d}-b_{\tau}s$
,
$\tau>0$
, satisfy (6.3), (6.2) implies that

Observe here that f is not continuous on the whole
$\mathcal{N}_{01}$
, but since the probability that
$(t,1)\in Y$
for some
$t\in\mathbb{R}^d$
is zero (due to Propositions 1 and 2), it is continuous on the support of Y—this justifies the use of (6.2). By the dominated convergence theorem and since the limit in (6.14) does not depend on s, (6.13) yields

Convergence in (6.11) follows from (6.15) with
$h\equiv 1$
. To prove (6.12), we note that

Let
$\boldsymbol{M}^{*}$
be the space of all Borel measures on
$\tilde{\mathcal{N}}_{01}^{*}$
that are finite on all sets from the family

Equip
$\boldsymbol{M}^{*}$
with the vague topology generated by
$\mathcal{B}^*$
. Furthermore, let
$\boldsymbol{M}$
be the space of all Borel measures on
$[0,1]^d\times \tilde{\mathcal{N}}_{01}^*$
taking finite values on
$[0,1]^d\times B$
for all
$B\in \mathcal{B}^*$
, and equip it with the corresponding vague topology. Observe that the intensity measure

of the point process
$N_{\tau}$
from (4.8) is an element of
$\boldsymbol{M}$
for all
$\tau>0$
. Recall that
$\boldsymbol{N}\subset\boldsymbol{M}$
defined right before Theorem 2 is the subset of all counting measures.
Proposition 12. (Intensity convergence.) Assume that Assumptions 1 and 2 hold. Then

where

and
$Q\in \mathcal{N}_{01}$
has distribution (4.11). In particular,

in
$\boldsymbol{N}$
, where Leb is the Lebesgue measure on
$\mathbb{R}^d$
.
Proof. Let
$h\colon\boldsymbol{M}^{*}\to [0,\infty)$
be a bounded and continuous function such that, for some
$\varepsilon>0$
,
$h([\mu])=0$
whenever
$M([\mu])\leq \varepsilon$
. Then

Since
$k_{\tau}\sim \tau / b_{\tau}$
as
$\tau\to\infty$
, (4.1) implies that
$k_{\tau}^d \sim (b_{\tau}^d \mathbb{P}(\xi >a_{\tau}))^{-1}$
as
$\tau \to\infty$
. Thus, as
$\tau\to\infty$
,

By (6.11), the second term on the right-hand side of (6.20) converges to
$\mathbb{P}(A_1(Y)=0)$
, which in turn by Proposition 3 equals
$\vartheta=\mathbb{P}(A(Y)=0)$
. By the regular variation property (3.5), the third term tends to
$\varepsilon^{-\alpha}$
. For the first term, recall that
$T_{y}^{\tau}$
scales the positions with
$r(a_{\tau}y)^{-1}$
and scores with
$(a_{\tau}y)^{-1}$
. Thus,

By assumption, r is a regularly varying function with index
$\beta$
, so
$r(a_{\tau})/r(a_{\tau}\varepsilon)\to \varepsilon^{-\beta}$
as
$\tau\to\infty$
. In particular, (6.12) and the extended continuous mapping theorem (see [Reference Billingsley6, Theorem 5.5]) imply that

By Proposition 3 and (4.19), we can rewrite the limit as

In the above we used the substitution
$y=u\varepsilon$
to obtain the fourth equality. For the final equality, note that
$M(T_{y^{-\beta}, y^{-1}} Q)=y$
since
$M(Q)=1$
a.s. In particular,
$h([T_{y^{-\beta}, y^{-1}} Q])= 0$
a.s. whenever
$y\le\varepsilon$
by the properties of h stated in the beginning of the proof.
Bringing everything together, (6.19) and (6.20) imply that

where
$\nu$
is defined in (6.17). Since h is arbitrary, this proves (6.16).
We now prove (6.18). By [Reference Kallenberg16, Lemma 4.1], it suffices to prove that
$\mathbb{E}[N_{\tau}(g)] \to ({\rm Leb}\times \nu)(g)$
for all
$g\colon[0,1]^d\times \tilde{\mathcal{N}}_{01}^{*}\to [0,\infty)$
of the form
$g(t,[\mu]) = \textbf{1}_{(a,b]}(t) \textbf{1}_{A}([\mu])$
, where (a, b] is the parallelepiped in
$\mathbb{R}^d$
determined by
$a=(a_1,\dots, a_d), b=(b_1,\dots,b_d)\in [0,1]^d$
with
$a_j\leq b_j$
for all j, and
$A\in \mathcal{B}^{*}$
(with
$\mathcal{B}^{*}$
defined just before Proposition 12) such that
$\nu(\partial A)=0$
.
Recall that
$X_{\tau}=X_{[0,b_{\tau}]^d}$
, so
$X_{\tau}=X_{\tau,\textbf{1}}$
and due to stationarity of X,
$\mathbb{P}([X_{\tau}]\in \cdot \, )= \mathbb{P}([X_{\tau,\boldsymbol{i}}]\in \cdot \, )$
for all
$\boldsymbol{i}\in I_{\tau}=\{1,\dots,k_{\tau}\}^d$
, where
$k_{\tau}=\lfloor\tau/b_{\tau}\rfloor$
. Thus, as
$\tau \to\infty$
,

where in the penultimate step we applied (6.16).
To complete the proof of Theorem 2, we need the following technical lemma.
Lemma 5. Assume that
$(b_\tau)_{\tau>0}$
is such that Assumptions 1 and 2 hold. Then, Assumption 3 implies that

as
$\tau\to\infty$
for every
$f\colon [0,1]^d \times\tilde{\mathcal{N}}_{01} \to [0,\infty)$
such that
-
(i) for some
$\varepsilon>0$ ,
$M(\mu)\leq \varepsilon$ implies that
$f(t,[\mu])=0$ for all
$t\in [0,1]^d$ ;
-
(ii) f is Lipschitz, that is, for some
$c>0$ ,
\begin{align*}|f(t, [\mu])- f(s,[\nu])| \leq c\max\{|t-s|, \tilde{{\textsf{m}}}([\mu],[\nu]) \}\end{align*}
for all $t,s\in [0,1]^d$ and nontrivial measures
$\mu, \nu \in \mathcal{N}_{01}$ .
Observe that, for each
$\tau>0$
, the first term on the left-hand side of (6.21) is the Laplace functional
$L_f(N_{\tau})$
of the point process
$N_{\tau}$
, while the second term is the Laplace functional of the point process

where the blocks
$X_{\tau,\boldsymbol{i}}^*$
,
$\boldsymbol{i} \in I_{\tau}$
, are independent and, for each
$\boldsymbol{i}\in I_{\tau}$
,
$X_{\tau,\boldsymbol{i}}^*$
has the same distribution as the original block
$X_{\tau,\boldsymbol{i}}$
.
Proof of Lemma
5. Let f be an arbitrary function satisfying the assumptions of the lemma for some
$\varepsilon$
and c. Fix an arbitrary
$\delta<\varepsilon$
and define a function
$f^{\delta}\colon [0,1]^d \times \mathcal{N}_{01} \to[0,\infty)$
by
$f^\delta(t,\mu)\;:\!=\; f(t,[\mu^{\delta}])$
, where
$\mu^\delta$
is the restriction of
$\mu \in \mathcal{N}_{01}$
to
$\mathbb{R}^d\times(\delta,\infty)$
. For all
$t\in [0,1]^d$
and all
$\mu\in \mathcal{N}_{01}$
, the Lipschitz property of f implies that

see (6.9) for a similar argument that justifies the last inequality.
By the construction of
$f^\delta$
and properties of f, Assumption 3 implies that (4.17) holds for the family

We first show that in (4.17) it is possible to replace the trimmed blocks
$\hat{X}_{\tau,\boldsymbol{i}}$
with the original blocks
$X_{\tau,\boldsymbol{i}}$
. For this, observe that

where we used the elementary inequality

valid for all k and all
$a_i,b_i\in[0,1]$
,
$i=1,\dots, k$
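The elementary inequality invoked here is the standard telescoping bound $\big|\prod_{i=1}^k a_i-\prod_{i=1}^k b_i\big|\leq \sum_{i=1}^k |a_i-b_i|$, which follows from the identity
\begin{align*}
\prod_{i=1}^k a_i - \prod_{i=1}^k b_i
= \sum_{i=1}^{k} \Bigg(\prod_{j<i} a_j\Bigg)(a_i - b_i)\Bigg(\prod_{j>i} b_j\Bigg),
\end{align*}
since every partial product of the $a_j, b_j\in[0,1]$ is bounded by 1.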
. Recall the blocks of indices
$J_{\tau,\boldsymbol{i}}$
and
$\hat{J}_{\tau,\boldsymbol{i}}$
from (4.3) and (4.16), respectively. Observe that
$M(X_{J_{\tau,\boldsymbol{i}}\setminus \hat{J}_{\tau,\boldsymbol{i}}})\leq a_{\tau} \delta$
implies that
$f_{\tau,\boldsymbol{i}}(T^{\tau}_{1} {X}_{\tau,\boldsymbol{i}})=f_{\tau,\boldsymbol{i}}( T^{\tau}_{1}\hat{X}_{\tau,\boldsymbol{i}})$
. In particular, due to stationarity of X,

as
$\tau\to\infty$
. In the third step we used the refined Campbell’s formula, in the fourth the assumption
$l_{\tau}/b_{\tau}\to 0$
, in the fifth the fact that
$k_{\tau} \sim \tau / b_{\tau}$
, and, finally, (4.2). Thus,

After applying (6.23), the same arguments also imply that

Together with (4.17), this implies that, as
$\tau\to\infty$
,

Using (6.23) and the inequality
$|\textrm{e}^{-x}-\textrm{e}^{-y}|\leq |x-y|$
for
$x,y\geq 0$
, we obtain

Since
$M(\mu)\leq \varepsilon$
implies that
$f(t,[\mu])=f^{\delta}(t,\mu)=0$
, for all
$\boldsymbol{i}\in I_{\tau}$
,

where we used (6.22) in the third step. Thus, the right-hand side in (6.25) is bounded by

which by (4.2) and (6.24) tends to
$c \textrm{e}^{-1/\delta} \varepsilon^{\alpha}$
as
$\tau\to\infty$
. Since
$\delta\in (0,\varepsilon)$
is arbitrary, letting
$\delta\to 0$
finally yields (6.21).
We are finally in a position to prove Theorem 2.
Proof of Theorem
2. Since the family of Lipschitz continuous functions from Lemma 5 is convergence determining in the sense of [Reference Basrak and Planinić3, Definition 2.1] (see [Reference Basrak and Planinić2, Proposition 4.1]), the convergence of intensities (6.18) and the asymptotic independence of blocks (6.21) imply that, as
$\tau\to\infty$
,
$N_{\tau}$
converges in distribution to a Poisson point process N on
$\boldsymbol{N}=[0,1]^d\times \tilde{\mathcal{N}}_{01}^*$
whose intensity measure is
${\rm Leb}\times \nu$
; this is [Reference Basrak and Planinić3, Theorem 2.1], which is a consequence of the classical Grigelionis theorem (see [Reference Kallenberg16, Corollary 4.25]). Standard transformation results for Poisson processes now imply that N can be constructed as in (4.9), and this completes the proof of Theorem 2.
Acknowledgements
This work was supported by the Swiss Enlargement Contribution in the framework of the Croatian–Swiss Research Programme (project number IZHRZ0_180549), and by the Croatian Science Foundation project IP-2022-10-2277.
Funding information
There are no funding bodies to thank relating to the creation of this paper.
Competing interests
There were no competing interests to declare during the preparation or publication of this paper.