1. Introduction
Community structures are ubiquitous in graphs modelling natural and social phenomena. In natural sciences, atoms form molecules so that atoms in the same molecule have stronger connections compared to those in different molecules. In social sciences, individuals form groups in such a way that individuals in the same group have more communications compared to individuals in different groups. The main aim for community detection is to determine the specific groups that specific individuals belong to based on observations of (random) connections between individuals. Identifying different communities in the stochastic block model is a central topic in many fields of science and technology; see [Reference Abbe1] for a summary.
In this paper, we study the community detection problem for the Gaussian mixture model, in which there are
$n$
vertices belonging to
$k$
(
$k\geq 2$
) different communities. We observe a
$p\times 1$
vector for each one of the
$n$
vertices, perturbed by a
$p\times 1$
Gaussian vector with independent (but not necessarily identically distributed), mean-0 entries. More precisely, each entry of the
$p\times n$
perturbation matrix is obtained by a multiple of a standard Gaussian random variable, while the intensities of different entries are different. Given such an observation, we find the maximum likelihood estimation (MLE) for the community assignment and study the probability that the MLE equals the true community assignment as the number of vertices
$n\rightarrow \infty$
. If this probability tends to
$1$
as
$n\rightarrow \infty$
, we say exact recovery occurs. Heuristically, it is natural to conjecture that exact recovery may occur when the intensities of the perturbations are small but does not occur when these intensities are large. The major theme of the paper is to investigate how small the intensities of the perturbations are needed in order to ensure the exact recovery and how large the intensities are required to stop the occurrences of the exact recovery.
Clustering problems in the Gaussian mixture model have been studied extensively; see [Reference Dempster, Laird and Rubin5, Reference Fraley and Raftery7, Reference McQueen15, Reference Pollard18] for an incomplete list. We mention some recent related work here.
The Gaussian mixture model when all the entries of the perturbation matrix are i.i.d was studied in [Reference Chen and Yang4], in which a condition for the exact recovery of the semi-definite programming (SDP) is proved. When all the communities have the same size, a condition that exact recovery does not occur was also proved in [Reference Chen and Yang4] when the number of communities
$k\leq \log n$
. The case of unbalanced communities was investigated in [Reference Giraud and Verzelen8]. In this paper, we obtain conditions when the exact recovery happens and does not happen for the more general Gaussian mixture model when the entries of the perturbation matrix are not necessarily identically distributed. Our result can be applied to the special case when intensities of the Gaussian perturbations are all equal, and in particular, we obtain a condition that the exact recovery of MLE does not occur when the number of communities
$k$
is
$e^{o(\log n)}$
in the hypergraph model; see the explanations at the end of Example 4.5. We can also see from the sufficient condition (17) for the exact recovery and the sufficient condition (22) that the exact recovery does not occur that when
$k$
is
$e^{o(\log n)}$
, these two conditions match and there is a sharp phase transition. In Section 7, we see an example in which these necessary and sufficient conditions of the exact recovery give a sharp phase transition when the Gaussian perturbations are not i.i.d.
When
$p=n$
in our model, we may consider the rows and columns of the observation matrix are indexed by the
$n$
vertices, and each entry represents an edge. In this case, we obtain the community detection problem on a graph. When
$p=n^s$
with
$s\geq 2$
, we may consider the rows of the observation matrix are indexed by ordered
$s$
-tuples of vertices, and each entry of the observation matrix represents a
$(s+1)$
-hyperedge. In this case, we obtain the community detection problem on a hypergraph. Community detections on hypergraphs with Gaussian perturbations were studied in [Reference Kim, Bandeira and Goemans11], where the vertices are divided into two equal-sized communities, and a weight-1
$(d+1)$
-hyperedge exists if and only if all the vertices are in the same group. The results proved in this paper can be applied to the community detection problems on hypergraphs with Gaussian perturbation to obtain necessary and sufficient conditions for the exact recovery, in which the number of communities is arbitrary and communities are not necessarily equal-sized; moreover, the hyperedges have general weights as represented in the (unperturbed) observation matrix. Community detection problems on random graphs were also studied in [Reference Abbe and Sandon2, Reference Abbe, Bandeira and Hall3, Reference Dyer and Frieze6, Reference Holland, Laskey and Leinhardt9, Reference Massoulié14, Reference Mossel, Neeman and Sly16]. Algorithms to efficiently implement the community recovery in mixture models include the Expectation-Maximization (EM) algorithm ([Reference Dempster, Laird and Rubin5]) and the spectral method ([Reference Kannan, Salmasian and Vempala10, Reference Löffler, Zhang and Zhou12, Reference Vempalaa and Wang19]). The EM algorithm is a local search heuristic that can fail. The spectral method is based on the singular value decomposition of the dataset and then only use data related to a fixed finite number of largest singular values to achieve dimension reduction. The performance of the spectral method when the Gaussian perturbations consist of i.i.d. Gaussian random variables was discussed in [Reference Löffler, Zhang and Zhou12]. The major differences between results in this paper and those in [Reference Löffler, Zhang and Zhou12] are (1) this paper focuses on exact recovery, that is, the probability that the estimation is exactly equal to the true community assignment and no mislabel is allowed, while [Reference Löffler, Zhang and Zhou12] allows a small number of mislabelled vertices (almost exact recovery) and (2) [Reference Löffler, Zhang and Zhou12] focuses on Gaussian mixture model with isotropic covariance matrix; while in this paper, the covariances matrices for the model can be more general. Partial recovery was also discussed in [Reference Giraud and Verzelen8].
The implement of the MLE is usually very slow; in some cases, relaxing some constraints in MLE leads to the more efficient SDP. However, most SDPs require the partition to be balanced or that, at least, the size of each group is known in advance. The MLE optimisation discussed here has the advantage to attack the case of unbalanced sizes of groups, and the case when the size of each group is unknown. When the perturbations for the entries of the observation matrix are assumed to be i.i.d. Gaussian, and using the average value of the observations associated with each group to approximated the expectation of the observation to the group, in this case the MLE becomes the K-means estimation. The SDP relaxation for K-means have been studied extensively, see for example [Reference Peng and Wei17]. We expect a similar SDP relaxation for K-means in the dependent case in which the inner product in the objective function should be defined with respect to the inverse of the covariance matrix; yet in this paper, we shall focus on the more fundamental MLE and try to investigate its statistical limit.
The organisation of the paper is as follows. In Section 2, we review the definition of the Gaussian mixture models and hypergraphs and state the main results proved in this paper. In Section 3, we prove conditions for the exact recovery of the Gaussian mixture model when the number of vertices in each community is unknown. In Section 4, we apply the results proved in Section 3 to the exact recovery of the community detection in hypergraphs and also prove conditions when exact recovery does not occur in hypergraphs under the assumption that the number of vertices in each community is unknown. In Section 5, we prove conditions for the exact recovery of the Gaussian mixture model when the number of vertices in each community is known and fixed. In Section 6, we prove conditions when exact recovery does not occur in hypergraphs under the assumption that the number of vertices in each community is known and fixed. In Section 7, we give an example in which these necessary and sufficient conditions of the exact recovery give a sharp phase transition when the Gaussian perturbations are not i.i.d.; results of numerical experiments are also included to show the performance of MLE with difference parameters to illustrate the sharp phase transition. In Section A, we prove a lemma used to obtain the main results of the paper.
2. Backgrounds and main results
In this section, we review the definition of the Gaussian mixture models and hypergraphs and state the main results proved in this paper.
2.1. Gaussian mixture model
Let
$n\geq k\geq 2$
be positive integers. Let

be a set of
$n$
vertices divided into
$k$
different communities. Let

be the set of communities. A mapping
$x\; :\; [n]\rightarrow [k]$
which assigns a unique community represented by an integer in
$[k]$
to each one of the
$n$
vertices in
$[n]$
is called a community assignment mapping. Let
$\Omega$
be the set consisting of all the possible mappings from
$[n]$
to
$[k]$
, that is,

Each mapping in
$\Omega$
is a community assignment mapping.
We shall define a function
$\theta$
, which assigns to each vertex a
$p\times 1$
vector, depending on the community of this vertex. Indeed, we require that all the vertices in the same community correspond to the same
$p\times 1$
vector under
$\theta$
. Observing
$\theta$
perturbed by some Gaussian noise, our goal is to identify the community assignment and determine when exactly recovery occurs. It is natural to believe that those vertices have corresponding observations close to each other are in the same community, while those vertices whose corresponding observations are far away from each other are in different communities, but this intuition may be complicated by the existence of the Gaussian noise. See Sections 4.1, 4.2 and 4.3 for examples with specific
$\theta$
.
More precisely, let
$p\geq 1$
be a positive integer. Let

be a function on the set
$\Omega \times [p]\times [k]$
taking real values.
For a community assignment mapping
$x\in \Omega$
, let
$\mathbf{A}_x$
be a
$p\times n$
matrix whose entries are given by:

Let
$\Sigma$
be a
$p\times n$
matrix with positive real entries defined by:

Let
$P,Q$
be two
$p\times n$
matrices. Define the inner product of
$P,Q$
by

The norm
$\|P\|$
for a matrix
$P$
is defined by:

Let
$P*Q$
be a
$p\times n$
matrix defined by:

Define a random observation matrix
$\mathbf{K}_x$
by:

where
$\mathbf{W}$
is a random
$p\times n$
matrix with i.i.d. standard Gaussian entries. Note that if the entries of
$\Sigma$
are not all equal, the perturbation matrix
$\Sigma *W$
has independent but not identically distributed entries.
Let
$y\in \Omega$
be the true community assignment mapping. Given the observation
$\mathbf{K}_y$
, the goal is to determine the true community assignment mapping
$y$
. We shall apply the MLE.
Let
$n_1,\ldots,n_k$
be positive integers satisfying

and

that is,
$n_i$
is the number of vertices in community
$i$
for each
$i\in [k]$
under the mapping
$y$
.
Let

be the set of all the community assignment mappings such that there are exactly
$n_i$
vertices in the community
$i$
, for each
$i\in [k]$
.
For each real number
$c\in (0,1)$
, let

that is,
$\Omega _c$
consists of all the community assignment mappings such that the percentage of the numbers of vertices in each community is at least
$c$
.
Assume the true community assignment mapping
$y\in \Omega _c$
for some
$c\in (0,1)$
. Let
$\Phi$
be an
$p\times n$
matrix whose entries are given by:

in other words, the
$(i,j)$
-entry of
$\Phi$
is the reciprocal of the
$(i,j)$
-entry of
$\Sigma$
. Define

and

Then we have the following lemma
Lemma 2.1.
$\hat{y}$
is the MLE with respect to the observation
$\mathbf{K}_y$
in
$\Omega _{\frac{2c}{3}}$
.
$\check{y}$
is the MLE with respect to the observation
$\mathbf{K}_y$
in
$\Omega _{n_1,\ldots,n_k}$
.
Proof. By definition, the MLE with respect to the observation
$\mathbf{K}_y$
in
$\Omega _{\frac{2c}{3}}$
(resp.
$\Omega _{n_1,\ldots,n_k}$
) should maximise the probability density of the observation
$\mathbf{K}_y$
among all
$x\in \Omega _{\frac{2c}{3}}$
(resp.
$x\in \Omega _{n_1,\ldots,n_k}$
). If the true community assignment mapping
$y=x$
, we may consider
$\mathbf{K}_{y}$
as a random matrix with mean value
$\mathbf{A}_x$
and independent entries, such that variance of its
$(i,j)$
-entry is
$\sigma _{i,j}^2$
. Therefore, the probability density of
$\mathbf{K}_y$
is given by:

where the exponent is exactly

It is straightforward to check that the minimiser of
$\|\Phi *(\mathbf{K}_y-\mathbf{A}_x)\|^2$
is exactly the maximiser of the probability density. Then the lemma follows.
We shall investigate under which conditions we have
$\check{y}=y$
and
$\hat{y}=y$
with high probability.
To state the main theorems proved in this paper, we first introduce a few assumptions.
For
$x,y\in \Omega$
, let

For
$x\in \Omega$
, let

then
$n_i(x)$
is the number of vertices in community
$i$
under the community assignment mapping
$x$
. It is straightforward to check that

For
$i,j\in [k]$
and
$x,z\in \Omega$
, let
$t_{i,j}(x,z)$
be a non-negative integer given by:

That is,
$t_{i,j}(x,z)$
is the number of vertices in
$[n]$
which are in community
$i$
under the mapping
$x$
and in community
$j$
under the mapping
$z$
. Then

We now introduce an equivalence condition on
$\Omega$
.
Definition 1.
For
$x\in \Omega$
, let
$C(x)$
consist of all the
$x'\in \Omega$
such that
$x'$
can be obtained from
$x$
by a
$\theta$
-preserving bijection of communities. More precisely,
$x'\in C(x)\subset \Omega$
if and only if the following condition holds
-
• for
$i\in [p]$ and
$j\in [n]$ ,
$\theta (x,i,x(j))=\theta (x',i,x'(j))$ .
We define an equivalence relation on
$\Omega$
as follows: we say
$x,z\in \Omega$
are equivalent if and only if
$x\in C(z)$
. Let
$\overline{\Omega }$
be the set of all the equivalence classes in
$\Omega$
.
We see that if
$x$
and
$z$
are equivalent in the sense of Definition 1, the non-random parts of the observation matrices corresponding to
$x$
and
$z$
satisfy
$\mathbf{A}_x=\mathbf{A}_z$
. Hence, any algorithm based on this observation will not distinguish equivalent community assignments; and the best we expect is to exactly recover the equivalent class of the community assignment. We want to assume this
$\theta$
function to be able to distinguish community assignments up to the composition with a bijection of communities more precisely.
Assumption 2.2.
Let
$x,z\in \Omega$
. If for any
$i\in [p]$
and
$j\in [n]$
,

then there is a bijection
$\eta\; :\;[k]\rightarrow [k]$
, such that

where
$\circ$
denotes the composition of two mappings.
Note that (8) is equivalent of saying that for
$i,j\in [n]$
,
$x(i)=x(j)$
if and only if
$x'(i)=x'(j)$
. See Section 4.1 for examples.
Define a set

For
$\varepsilon \gt 0$
, define a set
${\mathscr B}_{\varepsilon }$
consisting of all the
$(t_{1,1},t_{1,2},\ldots,t_{k,k})\in{\mathscr B}$
satisfying all the following conditions:
-
1.
$\forall \ i\in [k],\ \max _{j\in [k]}t_{j,i}\geq n_i-n\varepsilon$ .
-
2. For
$i\in [k]$ , let
$t_{w(i),i}=\max _{j\in [k]}t_{j,i}$ . Then
$w$ is a bijection from
$[k]$ to
$[k]$ .
-
3.
$w$ is
$\theta$ -preserving, that is, for any
$x\in \Omega$ ,
$i\in [p]$ and
$a\in [k]$ , we have
\begin{eqnarray*} \theta (x,i,a)=\theta (w\circ x,i,w(a)). \end{eqnarray*}
We may assume
$\theta$
and
$\Sigma$
satisfy the following assumptions.
Assumption 2.3.
Suppose
$\varepsilon \in (0,\frac{2c}{3k})$
,
$x\in \Omega _{\frac{2c}{3}}$
and
$y\in \Omega _c$
. Then for all
$x,y\in \Omega$
, and

we have

Here
$R(n)$
is some function of
$n$
, such that (
11
) combined with (
15
) leads to the desired exact recovery result in Theorem
2.6
.
Assumption 2.4.
Assume
$\varepsilon \in (0,\frac{2c}{3k})$
,
$x\in \Omega _{\frac{2c}{3}}$
and
$y\in \Omega _c$
. Assume there exists
$\Delta \gt 0$
such that the following holds:
Let
$y_1,y_2\in \Omega _{\frac{2c}{3}}$
and
$a,b\in [k]$
and
$a\neq b$
. Let
$i,j\in [n]$
such that
$i\in y_1^{-1}(a)\cap x^{-1}(b)$
. Let
$y_2\;:\;[n]\rightarrow [k]$
be defined as follows:

When

such that for all
$i\in [k]$

$\varepsilon \in \left (0,\frac{2c}{3k}\right )$
, and

we have

where
$o(1)\rightarrow 0$
, as
$n\rightarrow \infty$
.
Assumptions 2.3 and 2.4 are technical. One may interpret Assumption 2.4 as follows: when a community assignment mapping
$y_1\in \Omega$
is sufficiently close to the community assignment mapping
$x$
in the sense of (12), change the community of exactly one vertex
$j$
from
$y_1(j)$
to
$x(j)$
and obtain a new community assignment mapping
$y_2(j)$
, then the distance between
$y_2$
and
$x$
, in the sense of
$L_{\Phi }$
, approximately decreases by
$\Delta$
from the distance between
$y_1$
and
$x$
. In other words, as
$y_1$
is close to
$x$
and approaching
$x$
,
$L_{\Phi }(y_1,x)$
decreases linearly with rate
$\Delta$
. As we see later, this will create a convergent geometric series when computing the lower bound for the probability of exact recovery.
Note that Assumption 2.3 can be guaranteed by the following stronger assumption with
$R(n)=\frac{T(n)}{B_1^2}$
; see Lemma 3.5.
Assumption 2.5.
-
1. There exists
$B_1\gt 0$ , such that for all
$i,j\in [p]\times n$ , we have
\begin{eqnarray*} |\sigma _{i,j}|\leq B_1. \end{eqnarray*}
-
2. Assume
$\varepsilon \in (0,\frac{2c}{3k})$ ,
$x\in \Omega _{\frac{2c}{3}}$ and
$y\in \Omega _c$ . Then for all
$x,y\in \Omega$ , and
(13)we have\begin{eqnarray} (t_{1,1}(x,y),t_{1,2}(x,y),\ldots,t_{k,k}(x,y))\in \mathscr{B}\setminus \mathscr{B}_{\varepsilon }, \end{eqnarray}
(14)\begin{eqnarray} \sum _{i\in [p],j\in [n]}(\theta (x,i,x(j))-\theta (y,i,y(j)))^2\geq T(n)\gt 0. \end{eqnarray}
Here
$T(n)$
is some function of
$n$
, such that (
14
) combined with (
16
) leads to the desired exact recovery result in Theorem
2.6
.
Assumption 2.5(1) gives an upper bound on the standard deviation of the Gaussian perturbations (which may depend on
$n$
); Assumption 2.5(2) may be interpreted as when
$x,y\in \Omega$
are “far away” from each other in the sense of (13), the corresponding
$\theta$
functions are “far away” from each other in the sense of (11).
Theorem 2.6.
Assume
$y\in \Omega _c$
is the true community assignment mapping. Suppose that Assumptions 2.5 and 2.4 hold. Let
$\varepsilon \in (0,\frac{2c}{3k})$
. Suppose one of the following cases occurs
-
1. Assumption 2.3 holds and
(15)\begin{eqnarray} \lim _{n\rightarrow \infty } n\log k-\frac{R(n)}{8}=-\infty, \end{eqnarray}
-
2. Assumption 2.5 holds and
(16)\begin{eqnarray} \lim _{n\rightarrow \infty } n\log k-\frac{T(n)}{8B_1^2}=-\infty . \end{eqnarray}
Moreover, suppose that for any constant
$\delta \gt 0$
independent of
$n$
,

then
$\lim _{n\rightarrow \infty }\Pr (\hat{y}\in C(y))=1$
.
Theorem 2.6 gives a sufficient condition for the exact recovery of MLE in the Gaussian mixture model. It is proved in Section 3. An application of Theorem 2.6 on the exact recovery of community detection on hypergraphs is discussed in Section 4.3.
We also obtain a condition for the exact recovery when the sample space of the MLE is restricted to
$\Omega _{n_1,\ldots,n_k}$
, that is, the number of vertices in each community is known and fixed.
Assumption 2.7.
Assume
$x,y_m,y_h\in \Omega$
such that
-
1.
$D_{\Omega }(y_m,y_h)=j$ , where
$j\geq 2$ is a positive integer; and
-
2. There exist
$u_1,\ldots,u_j\in [n]$ , such that
-
(a)
$y_m(v)=y_h(v)$ , for all
$v\in [n]\setminus \{u_1,\ldots,u_j\}$ ; and
-
(b)
$y_m(u_i)\neq y_h(u_i)=x(u_i)=y_m(u_{i-1})$ for all
$i\in [j]$ .
-
(c)
$(t_{1,1}(x,y_m),t_{1,2}(x,y_m),\ldots,t_{k,k}(x,y_m))\in \mathscr{B}_{\varepsilon }$ with
$\varepsilon \in \left (0,\frac{2c}{3k}\right )$ and
$w(i)=i$ .
-
Then

Theorem 2.8.
Suppose that Assumptions 2.5, 2.7
, (
15
) and (
17
) hold. Then
$\lim _{n\rightarrow \infty }\Pr (\check{y}\in C(y))=1$
.
Indeed, Assumption 2.4 implies Assumption 2.7; see Lemma 5.5. Theorem 2.8 is proved in Section (5).
2.2. Hypergraphs
A special case for the Gaussian mixture model is the hypergraph model. Let
$s,s_1,s_2$
be positive integers satisfying

A hypergraph
$H=(V,E)$
has vertex set
$V\;:\!=\;[n]$
and hyperedge set
$E$
defined as follows:

Let
$\phi \;:\;\cup _{s=s_1}^{s_2}[k]^{s}\rightarrow [0,\infty )$
be a function which assigns a unique real number
$\phi (c_1,\ldots,c_s)$
to each
$s$
-tuple of communities
$(c_1,\ldots,c_s)\in [k]^s$
, and
$s\in [s_1,s_2]$
.
For a community assignment mapping
$x$
, the weighted adjacency tensor
$\mathbf{A}_x$
is defined by:

and

Define a random tensor
$\mathbf{K}_x$
as in (2). Recall that
$y\in \Omega _c$
is the true community assignment mapping. Define
$\hat{y}$
and
$\check{y}$
as in (3) and (4).
Recall that
$y\in \Omega$
is the true community assignment mapping satisfying
$|y^{-1}(i)|=n_i$
, for all
$i\in [k]$
. Let
$a\in [n]$
. Let
$y^{(a)}\in \Omega$
be defined by:

such that

Theorem 2.9. Assume

Suppose that there exists a subset
$H\subset [n]$
satisfying all the following conditions:
-
1.
$|H|=h=o(n)$ ;
-
2.
$\lim _{n\rightarrow \infty }\frac{\log h}{\log n}=1$ ;
-
3. For each
$g\in H$ ,
\begin{eqnarray*} &&\sum _{s=s_1}^{s_2}\sum _{j=1}^{s}\sum _{(i_1,\ldots,\widehat{i}_j,\ldots,i_s)\in ([n]\setminus H)^{s-1}}\frac{1}{\sigma _{(i_1,\ldots,i_{j-1},g,i_{j+1},\ldots,i_s)}^2}\\ &&\times (\phi (y(i_1),\ldots,y^{(g)}(g),\ldots,y(i_s))-\phi (y(i_1),\ldots,y(g),\ldots,y(i_s)))^2\\ &=&(1+o(1))L_{\Phi }(y^{(g)},y) \end{eqnarray*}
-
4. there exists a constant
$\beta \gt 0$ independent of
$n$ , such that
\begin{eqnarray*} \frac{\max _{a\in H}L_{\Phi }(y^{(a)},y)}{\min _{a\in H} L_{\Phi }(y^{(a)},y)}\leq \beta ^2,\ \forall n. \end{eqnarray*}
If there exists a constant
$\delta \gt 0$
independent of
$n$
, such that

Then
$\lim _{n\rightarrow \infty }\Pr (\hat{y}\in C(y))=0$
.
Theorem 2.9 is proved in Section 4. An example is given in Section 4.2.
Let
$a,b\in [n]$
such that
$y(a)\neq y(b)$
. Let
$y^{(ab)}\in \Omega _{n_1,\ldots,n_k}$
be the community assignment mapping defined by:

In other words,
$y^{(ab)}$
is obtained from
$y$
by exchanging
$y(a)$
and
$y(b)$
.
We also prove a condition when the exact recovery does not occur if the sample space of the MLE is restricted in
$\Omega _{n_1,\ldots,n_k}$
.
Theorem 2.10. Assume

Suppose that there exist two subsets
$H_1,H_2\subset [n]$
satisfying all the following conditions:
-
1.
$|H_1|=|H_2|=h=o(n)$ ;
-
2.
$\lim _{n\rightarrow \infty }\frac{\log h}{\log n}=1$ ;
-
3. For any
$u_1,u_2\in H_1$ and
$v_1,v_2\in H_2$ ,
\begin{eqnarray*} y(u_1)=y(u_2)\neq y(v_1)=y(v_2); \end{eqnarray*}
-
4. For any
$u\in H_1$ and
$v\in H_2$
\begin{eqnarray*} &&\sum _{s=s_1}^{s_2}\sum _{j=1}^{s}\sum _{(i_1,\ldots,\widehat{i}_j,\ldots,i_s)\in ([n]\setminus (H_1\cup H_2))^{s-1}}\left (\frac{1}{\sigma _{(i_1,\ldots,i_{j-1},u,i_{j+1},\ldots,i_s)}^2}+\frac{1}{\sigma _{(i_1,\ldots,i_{j-1},v,i_{j+1},\ldots,i_s)}^2}\right )\\ &&(\phi (y(i_1),\ldots,y(v),\ldots,y(i_s))-\phi (y(i_1),\ldots,y(u),\ldots,y(i_s)))^2\\ &=&(1+o(1))L_{\Phi }(y^{(uv)},y) \end{eqnarray*}
-
5. For any
$g\in H_1\cup H_2$ , the quantity
\begin{eqnarray*} &&\sum _{s=s_1}^{s_2}\sum _{j=1}^{s}\sum _{(i_1,\ldots,\widehat{i}_j,\ldots,i_s)\in ([n]\setminus (H_1\cup H_2))^{s-1}}\frac{1}{\sigma _{(i_1,\ldots,i_{j-1},g,i_{j+1},\ldots,i_s)}^2}\\ &&(\phi (y(i_1),\ldots,y(b),\ldots,y(i_s))-\phi (y(i_1),\ldots,y(a),\ldots,y(i_s)))^2 \end{eqnarray*}
$g$ .
If there exists a constant
$\delta \gt 0$
independent of
$n$
, such that

$\lim _{n\rightarrow \infty }\Pr (\check{y}\in C(y))=0$
.
3. Community detection on K-community Gaussian mixture models
In this section, we consider the MLE when the number of vertices in each community is unknown. We shall obtain a sufficient condition for the occurrence of the exact recovery. The main goal is to prove Theorem 2.6.
Recall that we defined an equivalence relation on
$\Omega$
in Definition 1. It is straightforward to check that

Therefore, the MLE based on the observation
$\mathbf{K}_y$
can only recover the community assignment mapping up to equivalence.
The main idea to prove Theorem 2.6 is as follows:
-
We find an upper bound for the probability that the MLE is not in the equivalence class of the true community assignment; see (32).
-
We show this upper bound converges to zero as
$n\rightarrow \infty$ under the assumptions of Theorem 2.6. This is achieved by splitting the upper bounds into two parts: one part
$I_2$ is the sum over all the community assignment functions that differs from the true community assignment functions by at most
$n\varepsilon$ and other part
$I_1$ is the sum over all the community assignment functions that differs from the true community assignment functions by at least
$n\varepsilon$ . The fact that
$I_2$ converges to 0 as
$n\rightarrow \infty$ follows directly from Assumption 2.5. To prove that
$I_2$ converges to 0 as
$n\rightarrow \infty$ , we bound
$I_1$ by a geometric series that converges to 0 as
$n\rightarrow \infty$ .
Note that

In particular for each
$x\in \Omega$
, we have

Recall that
$y\in \Omega _{n_1,\ldots,n_k}$
is the true community assignment mapping. Note that

For each fixed observation
$\mathbf{K}_y$
,
$\|\Phi *\mathbf{K}_y\|^2$
is fixed and independent of
$x\in \Omega$
. Therefore,

For
$x\in \Omega$
, define

Then

where we use the identity

Then
$f(x)-f(y)$
is a Gaussian random variable with mean value

and variance

Lemma 3.1.
For
$x,z\in \Omega$
. If
$x\in C(z)$
, then

Proof. By Definition 1, if
$x\in C(z)$
, then for any
$i\in [p]$
and
$j,h\in [n]$
,
$x(j)=x(h)$
if and only if
$z(j)=z(h)$
and
$\theta (x,i,x(j))=\theta (z,i,z(j))$
, then
$\mathbf{A}_x=\mathbf{A}_z$
by (19). Moreover, since for any
$i\in [p]$
and
$j\in [n]$
,

we have

this implies

Then the lemma follows from (29).
Define

Then

Lemma 3.2.
Let
$x,y,x',y'\in \Omega$
, such that
$x'\in C(x)$
and
$y'\in C(y)$
, then

Proof. By (31), we obtain that when
$x'\in C(x)$
and
$y'\in C(y)$
,

Then the lemma follows from (5).
Lemma 3.3.
For
$x,y\in \Omega$
,
$L_{\Phi }(x,y)\geq 0$
. Moreover,
-
1. If
$x\in C(y)$ , then
$L_{\Phi }(x,y)=0$ .
-
2. If
$\theta$ satisfies Assumption 2.2 and
$L_{\Phi }(x,y)=0$ , then
$x\in C(y)$ .
Proof. From (5), it is straightforward to check that
$L_{\Phi }(x,y)\geq 0$
for any
$x,y\in \Omega$
. Moreover, from (5), we obtain

By the fact that
$\sigma _{i,j}\gt 0$
for all
$i\in [p],j\in [n]$
, we obtain that
$L_{\Phi }(x,y)=0$
if and only if

If
$x\in C(y)$
, then there exists a
$\theta$
-preserving bijection
$\eta \;:\;[k]\rightarrow [k]$
, such that
$x=\eta \circ y$
. Then (33) holds by the
$\theta$
-preserving property of
$\eta$
, then we obtain Part(1).
On the other hand, if
$L_{\Phi }(x,y)=0$
, we have (33) holds. Then
$x\in C(y)$
follows from Assumption 2.2.
Lemma 3.4.
Assume that
$y\in \Omega _c$
and
$x\in \Omega _{\frac{2c}{3}}$
. For
$i\in [k]$
, let

where
$w(i)\in [k]$
. When
$\varepsilon \in \left (0,\frac{2c}{3k}\right )$
and
$(t_{1,1}(x,y),t_{1,2}(x,y),\ldots, t_{k,k}(x,y))\in{\mathbb R}^{k^2}$
satisfies

$w$
is a bijection from
$[k]$
to
$[k]$
.
Proof. See Lemma 5.6 of [Reference Li13].
Definition 2.
Define the distance function
$D_{\Omega }\;:\;\Omega \times \Omega \rightarrow [n]$
as follows:

for
$x,y\in \Omega$
.
From Definition 2, it is straightforward to check that

Lemma 3.5.
Assume that
$\theta,\Sigma$
satisfies Assumptions 2.5
. Then for all the
$x,y\in \Omega$
such that (
10
) holds, we have

Proof. Note that

By Assumption 2.5(1), we have

Then the lemma follows from Assumption 2.5(2).
Proof of Theorem 2.6 . Note that

where

and

and
$\varepsilon \in \left (0,\frac{2c}{3k}\right )$
.
By Lemma 3.5, when Assumption 2.5 holds, we have

When (15) holds, we obtain

Now let us consider
$I_2$
. Let
$w$
be the bijection from
$[k]$
to
$[k]$
as defined in (34). Let
$y^*\in \Omega$
be defined by:

Then
$y^*\in C(y)$
since
$w$
is
$\theta$
-preserving by the definition of
$\mathscr{B}_{\varepsilon }$
. Moreover,
$x$
and
$y^*$
satisfies

We consider the following community changing process to obtain
$x$
from
$y^*$
:
-
1. If for all
$(j,i)\in [k]^2$ , and
$j\neq i$ ,
$t_{j,i}(x,y^*)=0$ , then
$x=y^*$ .
-
2. If (1) does not hold, find the least
$(j,i)\in [k]^2$ in lexicographic order such that
$j\neq i$ and
$t_{j,i}(x,y^*)\gt 0$ . Choose an arbitrary vertex
$u\in \left \{ x^{-1}(j)\cap (y^*)^{-1}(i)\right \}$ . Define
$y_1\in \Omega$ as follows:
\begin{eqnarray*} y_1(z)= \begin{cases}j&\mathrm{if}\ z=u\\ y^*(z)&\mathrm{if}\ z\in [n]\setminus \{u\} \end{cases} \end{eqnarray*}
Then we have


Therefore,
$x$
,
$y_1$
and
$y^*$
satisfy

for all
$l\in [k]$
.
From Assumption 2.4 and Lemma 3.2, we obtain

Therefore,

In general, if we have constructed
$y_l\in \Omega$
(
$r\geq 1$
) satisfying all the following conditions:

for all
$l\in [k]$
. We now construct
$y_{r+1}\in \Omega$
as follows:
-
(1) If for all
$(j,i)\in [k]^2$ , and
$j\neq i$ ,
$t_{j,i}(x,y_r)=0$ , then
$x=y_r$ ; then the construction process stops at this step.
-
(2) If (a) does not hold, find the least
$(j,i)\in [k]^2$ in lexicographic order such that
$j\neq i$ and
$t_{j,i}(x,y_r)\gt 0$ . Choose an arbitrary vertex
$u\in \left \{x^{-1}(j)\cap y_r^{-1}(i)\right \}$ . Define
$y_{r+1}\in \Omega$ as follows:
\begin{eqnarray*} y_{r+1}(z)= \begin{cases}j&\mathrm{if}\ z=u\\ y_r(z)&\mathrm{if}\ z\in [n]\setminus \{u\} \end{cases} \end{eqnarray*}
Then it is straightforward to check that

for all
$l\in [k]$
.
Then if (36) holds with
$y^*$
replaced by
$y_r$
, then (36) holds with
$y^*$
replaced by
$y_{r+1}$
. By Assumption 2.4, we obtain

Recall that the distance
$D_{\Omega }$
in
$\Omega$
is defined in Definition 2. From the constructions of
$y_{r+1}$
, we have

Therefore, there exists
$h\in [n]$
, such that
$y_h=x$
. By (39) and Assumption 2.4, we obtain

Since any
$x$
in
$\mathscr{B}_{\varepsilon }$
can be obtained from
$y$
by the community changing process described above, we have

The right-hand side of (41) is the sum of geometric series with both initial term and common ratio equal to

When (17) holds, we obtain

4. Community detection on k-community hypergraphs
In this section, we apply the results proved in Section 3 to the exact recovery of the community detection in hypergraphs and also prove conditions when exact recovery does not occur in hypergraphs under the assumption that the number of vertices in each community is unknown.
The main idea to prove Theorem 2.9 is as follows:
-
We obtain an lower bound for the probability that the MLE is not in the equivalent class of the true community assignment function; see (50).
-
We show this lower bound converges to 1 as
$n\rightarrow \infty$ . The proof utilises the inequalities regarding the distribution of the maximum of a number of Gaussian random variables as discussed in Section A.
In the case of a hypergraph, from (26), when

we obtain for
$x,z\in \Omega$

In particular,

Recall that
$y\in \Omega _c$
is the true community assignment mapping. Then

where
$f(x)$
is given by (29).
By (30), we obtain that in the hypergraph case

Then
$f(x)-f(y)$
is a Gaussian random variable with mean value
$L_{\Phi }(x,y)$
and variance
$4L_{\Phi }(x,y)$
, where
$L_{\Phi }(x,y)$
is defined by (5).
Proof of Theorem
2.9. When
$y^{(a)}\in \Omega$
is defined by (20),




and

Moreover,

Since any of the event
$\{f(y^{(a)})-f(y)\lt 0\}$
implies
$\hat{y}\neq y$
.
Let
$H\subset [n]$
be given as in the assumptions of the proposition. Under Assumption (3) of the proposition when
$a\in H$
, we have

Then from (45), we have

Then
$1-p(\hat{y};\ \sigma )$
is at least

Let
$(\mathscr{X},\mathscr{Y},\mathscr{Z})$
be a partition of
$\cup _{s=s_1}^{s_2}[n]^s$
defined by:

For
$\eta \in \{\mathscr{X},\mathscr{Y},\mathscr{Z}\}$
, define the random tensor
$\mathbf{W}_{\eta }$
from the entries of
$\mathbf{W}$
as follows:

For each
$a\in H$
, let

For
$s\in \{s_1,s_1+1,\ldots,s_2\}$
, let

Explicit computations show that

Claim 4.1. The followings are true:
-
1.
$\mathscr{X}_{a}=0$ for
$a\in H$ .
-
2. For each
$a\in H$ , the variables
$\mathscr{Y}_{a}$ and
$\mathscr{Z}_{a}$ are independent.
Proof. It is straightforward to check (1). (2) holds because
$\mathscr{Y}\cap \mathscr{Z}=\emptyset$
.
For
$g\in H$
, let
$\mathscr{Y}^{\,g}\subseteq \mathscr{Y}$
be defined by

Note that for
$g_1,g_2\in H$
and
$g_1\neq g_2$
,
$\mathscr{Y}^{\,g_1}\cap \mathscr{Y}^{\,g_2}=\emptyset$
. Moreover,
$\mathscr{Y}=\cup _{g\in H}\mathscr{Y}^{\,g}$
. Therefore,

Note also that
$\langle \mathbf{W}_{\mathscr{Y}^{\,g}},\Phi *(\mathbf{A}_{y^{(a)}}-\mathbf{A}_{y}) \rangle =0$
, if
$g\neq a$
. Hence,

So by (51), we obtain

Then
$\{\mathscr{Y}_g\}_{g\in H}$
is a collection of independent centred Gaussian random variables. Moreover, the variance of
$\mathscr{Y}_g$
is equal to

by Assumption (3) of the proposition.
By Claim 4.1, we obtain

Moreover,

Recall that

By Lemma A.1 about the tail bound result of the maximum of Gaussian random variables, if (A1) holds with
$N$
replaced by
$h$
, the event

has probability at least
$1-e^{-h^{\varepsilon }}$
, and the event

has probability
$1-h^{-\varepsilon }$
.
Moreover, by Assumption (3) of the Proposition and (52),

Define an event
$E$
by

Then
$E_1\cap E_2\subseteq E$
.
When
$n$
is large, and (22) holds

as
$n\rightarrow \infty$
. Then the proposition follows.
4.1. Examples of Assumption 2.2
We shall see some examples of the function
$\theta \; :\; \Omega \times [k]\times [k]\rightarrow{\mathbb R}$
satisfying Assumption 2.2. We first see an example when
$\theta$
can uniquely determine the community assignment mapping in
$\Omega$
.
Example 4.2.
Assume
$p=k$
. For
$a,b\in [k]$
,
$x\in \Omega$

Then if for all
$a\in [p]$
and
$j\in [n]$
, (
7
) holds, we have
$x(j)=a$
if and only if
$z(j)=a$
, then
$x=z$
.
We now see an example when
$\theta$
cannot uniquely determine the community assignment mapping but determines the community assignment mappings up to the equivalent class as defined in Definition 1.
Example 4.3.
Assume
$p=n$
.

Then if for all
$i,j\in [n]$
, (
7
) holds, we have
$x(j)=x(i)$
if and only if
$z(j)=z(i)$
, then
$x\in C(z)$
.
Example 4.4.
Assume
$p=n$
,
$a\in [k]$
and
$i\in [n]$
.

Then if for all
$i,j\in [n]$
, (
7
) holds, we have
$x(i)-x(j)=z(i)-z(j)$
. This implies that
$x(i)=x(j)$
if and only if
$z(i)=z(j)$
, therefore
$x\in C(z)$
. If both
$x$
and
$z$
are surjective onto
$[k]$
, then
$x=z$
.
4.2. Example of Theorem 2.9
Example 4.5.
Here we see an example about how to apply Theorem
2.9
to the exact recovery of community detection on hypergraphs. Let
$y\in \Omega _{n_1,\ldots,n_k}$
be the true community assignment mapping. Assume that for any
$s\in \{s_1,\ldots,s_2\}$
,
$(i_1,i_2,\ldots,i_s), (j_1,j_2,\ldots,j_s)\in [n]^s$
, we have

whenever

that is,
$\sigma _{(i_1,\ldots,i_s)}$
depends only on the communities of
$(i_1,\ldots,i_s)$
under the mapping
$y$
. In this case, we can define
$\overline{\sigma }: \cup _{s=s_1}^{s_2}[k]^s\rightarrow (0,\infty )$
, such that

Then for any
$a\in [n]$
,

Moreover, for any
$a,b\in [n]$
such that

we have

We consider

Assume that when
$y(a)=r_0$
,
$y^{(a)}(a)=r_1$
,
$L_{\Phi }(y^{(a)},y)$
achieves its minimum. Let
$H\subset y^{-1}(r_0)$
, then
$h=|H|\leq n_{r_0}$
. Assume

Then we may choose
$h=\frac{n_{r_0}}{\log n}$
such that Assumptions (1)(2) in Theorem
2.9
hold. Moreover, Assumption (4) in Theorem
2.9
holds because if for all
$a\in H$
, let
$y^{(a)}(a)=r_1$
, then
$L_{\Phi }(y^{(a)},y)$
takes the same value for all
$a\in H$
. There are many mappings
$\phi\; : \; \cup _{s=s_1}^{s_1}[k]^s\rightarrow{\mathbb R}$
to guarantee Assumption (3) in Theorem
2.9
. For example, one may choose

for
$s\in \{s_1,s_1+1,\ldots,s_2\}$
and
$b_1,\ldots,b_s\in [k]$
. Then from (
54
), we obtain

From ( 46 )–( 49 ) and ( 56 ), we obtain that the terms actually contributing to the sum must satisfy

or

Then we obtain

where

Assume

If in
$\{(d_1,b_1),\ldots,(d_s,b_s)\}$
, there exist more than one
$g\in [s]$
, such that
$(d_g,b_g)=(r_1,r_0)$
, then the sum of such terms will be of order
$o((n_{r_0})^{s-1})$
(resp.
$o((n_{r_1})^{s-1})$
) in
$L_{0,s}$
(resp.
$L_{1,s}$
). Therefore, we obtain

To analyse
$L_{1,s}$
, assume that there exists a positive constant
$C\gt 0$
independent of
$n$
, such that

Then we obtain

Then Assumption (3) of Theorem
2.9
follows from the fact that
$|H|=\frac{n_{r_0}}{\log n}=o(n_{r_0})$
.
In the special case when all the communities have equal size, we may obtain a sufficient condition that the exact recovery of MLE does not occur in the hypergraph case when the number of communites
$k=e^{o(\log n)}$
. Since in this case we have
$n_1=n_2=\ldots =n_k\geq e^{\log n-o(\log n)}$
, then (
21
) holds. Choose
$h=\frac{n_1}{\log n}$
, then Assumptions (1) and (2) of Theorem
2.9
hold.
4.3. Example of Theorem 2.6
Example 4.6.
We can also apply Theorem 2.6 to the case of exact recovery of community detection on hypergraphs. Again we consider the case when
$\sigma _{(i_1,\ldots,i_s)}$
depends only on
$(y(i_1),\ldots,y(i_s))$
. Hence, we may define
$\overline{\sigma }$
as in (
53
).
To check Assumption 2.4
, let
$y_m,y_{m+1},x\in \Omega$
be given as in the proof of Proposition
2.6
. For the simplicity of notation, we use
$y$
instead of
$y^*$
. By (
57
), we obtain

For
$j,p,q\in [k]$
and
$x,y,z\in \Omega$
, we define

Then

Recall that
$D_{\Omega }(y_m,y_{m+1})=1$
, and there exists
$u\in [n]$
such that

where
$i,j\in [k]$
and
$i\neq j$
, while
$y_m(v)=y_{m+1}(v)$
for all the
$v\in [n]\setminus \{u\}$
. This implies that if
$\{d_1,\ldots,d_s\}\cap \{i,j\}=\emptyset$
, then the corresponding summand in (
59
) is 0 and does not contribute to the sum. Under the assumption that
-
1.
$(t_{1,1}(x,y),t_{1,2}(x,y),\ldots,t_{k,k}(x,y))\in \mathscr{B}_{\varepsilon }$ with
$w\;:\;[k]\rightarrow [k]$ the identity map; and
-
2.
$n_1\geq n_2\geq \ldots \geq n_k$ ; and
-
3.
$\min _{(b_1,\ldots,b_s)\in [k]^s}|\overline{\sigma }(b_1,\ldots,b_s)|\geq B_3\gt 0$ ; and
-
4.
$\lim _{n\rightarrow \infty }\frac{n\varepsilon }{n_1}=0$ .
we obtain

The identity above can be interpreted as follows. We can classify the terms satisfying
$\{d_1,\ldots,d_s\}\cap \{i,j\}\neq \emptyset$
by the number

and obtain that the leading term of
$L_{\Phi }(x,y_m)-L_{\Phi }(x,y_{m+1})$
is given by the terms when
$N_{i,j}=1$
. Moreover, by Assumption (1) we have

Define

We further make the assumptions below:

Then

Then by Assumption 2.5(1), the exact recovery occurs with probability 1 when
$n\rightarrow \infty$
if (
15
) and (
17
) hold, and

when ( 10 ) holds.
There are a lot of functions
$\phi \;:\;\cup _{s=s_1}^{s_2}[k]^s\rightarrow{\mathbb R}$
satisfying (
60
) and (
61
) and Assumption 2.2
. For example, we may consider the function
$\phi$
as defined in (
56
). Assume

where
$r_0,r_1\in [k]$
and
$r_0\neq r_1$
. As in Example 4.5, we obtain that

where

where the inequality follows from Assumption 2.5(2). It is straightforward to check that when
$\phi$
is given by (
56
) and (
60
) hold if (
58
) holds.
To check ( 61 ), note that

When ( 10 ) holds, the following cases might occur
-
•
$w$ is not a bijection from
$[k]$ to
$[k]$ . In this case, there exists
$i,j\in [k]$ , such that
$w(i)=w(j)$ , then when ( 10 ) holds, we obtain
\begin{eqnarray*} t_{w(i),j}=t_{w(j),j}\geq \frac{n_{j}}{k} \end{eqnarray*}
-
•
$w$ is a bijection from
$[k]$ to
$[k]$ . However, there exists
$i\in [k]^2$ , such that
\begin{eqnarray*} t_{w(j),j}\leq n_i-\varepsilon n. \end{eqnarray*}
\begin{eqnarray*} i\;:\!=\;w^{-1}(\mathrm{argmax}_{l\in [k]\setminus \{w(j)\}}t_{l,j}), \end{eqnarray*}
$i\neq j$ and
\begin{eqnarray*} t_{w(i),j}(x,y)\geq \frac{\varepsilon n}{k-1} \end{eqnarray*}
When (
10
) holds and
$y\in \Omega _c$
, we have

Let

Then we obtain ( 61 ).
5. Community detection on Gaussian mixture models with fixed number of vertices in each community
In this section, we consider the MLE restricted to the sample space consisting of all the mappings satisfying the condition that the number of vertices in each community is the same as that of the true community assignment mapping
$y\in \Omega _{n_1,\ldots,n_k}$
. Again we shall prove a sufficient condition for the occurrences of exact recovery.
Let
$x\in \Omega _{n_1,\ldots,n_k}$
. By (28),

Recall that
$f(x)$
is defined as in (29). Recall also that
$f(x)-f(y)$
is a Gaussian random variable with mean value
$L_{\Phi }(x,y)$
and variance
$4L_{\Phi }(x,y)$
.
For each
$x\in \Omega _{n_1,\ldots,n_k}$
, let

that is,
$C^*(x)$
consists of all the community assignment mappings in
$\Omega _{n_1,\ldots,n_k}$
that are equivalent to
$x$
in the sense of Definition 1. Let

that is,
$\overline{\Omega }_{n_1,\ldots,n_k}$
consists of all the equivalence classes in
$\Omega _{n_1,\ldots,n_k}$
.
Lemma 5.1.
For
$x,z\in \Omega _{n_1,\ldots,n_k}$
. If
$x\in C^*(z)$
, then

Proof. The lemma follows from Lemma 3.1.
Define

Then

Lemma 5.2.
Let
$y\in \Omega _{n_1,\ldots,n_k}\cap \Omega _c$
be the true community assignment mapping. Let
$x\in \Omega _{n_1,\ldots,n_k}$
. For
$i\in [k]$
, let
$w(i)\in [k]$
be defined as in (
34
). Then
-
1. When
$\varepsilon \in \left (0,\frac{c}{k}\right )$ and
$(t_{1,1}(x,y),\ldots, t_{k,k}(x,y))\in \mathscr{B}_{\varepsilon }$ ,
$w$ is a bijection from
$[k]$ to
$[k]$ .
-
2. Assume there exist
$i,j\in [k]$ , such that
$n_i\neq n_j$ . If
(63)Then for any\begin{eqnarray} \varepsilon \lt \min _{i,j\in [k]:n_i\neq n_j}\left |\frac{n_i-n_j}{n}\right | \end{eqnarray}
$i\in [k]$ ,
(64)\begin{eqnarray} n_i=|y^{-1}(i)|=|y^{-1}(w(i))|=n_{w(i)}. \end{eqnarray}
Proof. See Lemma 6.6 of [Reference Li13].
Definition 3.
Let
$l\geq 2$
be a positive integer. Let
$x,y\in \Omega _{n_1,\ldots,n_k}$
. We say
$l$
distinct communities
$(i_1,\ldots,i_{l})\in [k]^l$
is an
$l$
-cycle for
$(x,y)$
, if
$t_{i_{s-1},i_s}(x,y)\gt 0$
for all
$2\leq s\leq l+1$
, where
$i_{l+1}\;:\!=\;i_1$
.
Lemma 5.3.
Let
$x,y\in \Omega _{n_1,\ldots,n_k}$
and
$x\neq y$
. Then there exists an
$l$
-cycle for
$(x,y)$
with
$2\leq l\leq k$
.
Proof. See Lemma 3.3 of [Reference Li13].
Lemma 5.4.
For any
$x,y\in \Omega _{n_1,\ldots,n_k}$
,
$L_{\Phi }(x,y)\geq 0$
, where the equality holds if and only if
$x\in C^*(y)$
.
Proof. The lemma follows from Lemma 3.3.
Proof. Let
$z_0=y_m$
and
$z_j=y_h$
. For
$i\in [j-1]$
, define
$z_i\in \Omega$
by

Then for any
$i\in [j]$
,

by Assumption 2.4, we obtain

summing over all the
$i\in [j]$
, we obtain (18).
Proof of Theorem 2.8. Let

By (62), it suffices to show that
$\lim _{n\rightarrow \infty }\Gamma =0$
.
Let

Note that

where

and

Under Assumption 2.5, by Lemma 3.5 we have

By (15), we have

Now let us consider
$\Gamma _2$
. Recall that
$y\in \Omega _{n_1,\ldots,n_k}\cap \Omega _c$
is the true community assignment mapping. Let
$w$
be the bijection from
$[k]$
to
$[k]$
as defined in (34). Let
$y^*\in \Omega$
be defined by:

Then
$y^*\in C(y)$
. By Part (2) of Lemma 5.2, we obtain that for
$i\in [k]$

therefore
$y^*\in \Omega _{n_1,\ldots,n_k}$
. Moreover,
$x$
and
$y^*$
satisfies

If
$x\neq y^*$
, by Lemma 5.3, there exists an
$l$
-cycle
$(i_1,\ldots,i_{l})$
for
$(x,y^*)$
with
$2\leq l\leq k$
. Then for each
$2\leq a\leq (l+1)$
, choose an arbitrary vertex
$u_m$
in
$S_{i_{m-1},i_m}(x,y^*)$
, and let
$y_1(u_m)=i_{m-1}$
, where
$i_{l+1}\;:\!=\;i_1$
. For any vertex
$z\in [n]\setminus \{u_{2},\ldots,u_{l+1}\}$
, let
$y_1(z)=y^*(z)$
.
Note that
$y_1\in \Omega _{n_1,\ldots,n_k}$
. Moreover, for
$1\leq m\leq l$
, we have

and

When
$\left (t_{1,1}(x,y),\ldots, t_{k,k}(x,y)\right )\in \mathscr{B}_{\varepsilon }$
, From Assumption 2.4 and Lemma 5.5, we obtain

Therefore,

If
$y_1\neq x$
, we find an
$l_2$
-cycle (
$2\leq l_2\leq k$
) for
$(x,y_1)$
, change community assignments along the
$l_2$
-cycle as above, and obtain another community assignment mapping
$y_2\in \Omega _{n_1,\ldots,n_k}$
, and so on. Let
$y_0\;:\!=\;y$
, and note that for each
$r\geq 1$
, if
$y_r$
is obtained from
$y_{r-1}$
by changing colours along an
$l_r$
cycle for
$(x,y_{r-1})$
, we have

Therefore, finally we can obtain
$x$
from
$y$
by changing colours along at most
$\left \lfloor \frac{n}{2} \right \rfloor$
cycles. By similar arguments as those used to derive (69), we obtain that for each
$r$

Therefore, if
$y_h=x$
for some
$1\leq h\leq \left \lfloor \frac{n}{2} \right \rfloor$
, we have

By Assumption 2.4 and Lemma 5.5, we obtain

Therefore,

Note also that for any
$r_1\neq r_2$
, in the process of obtaining
$y_{r_1}$
from
$y_{r_1-1}$
and the process of obtaining
$y_{r_2}$
from
$y_{r_2-1}$
, we change community assignments on disjoint sets of vertices. Hence, the order of these steps of changing community assignments along cycles does not affect the final community assignment mapping we obtain. From (66), we have

On the right-hand side of (70), when expanding the product, each summand has the form:

where the factor
$\left [(nk)^{2m_2}e^{-\frac{(1+o(1))\Delta 2 m_2}{8}}\right ]$
represents that we changed along 2-cycles
$m_2$
times, the factor
$\left [ (nk)^{3m_3}e^{-\frac{(1+o(1))\Delta 3 m_3}{8}}\right ]$
represents that we changed along 3-cycles
$m_3$
times, and so on. Moreover, each time we changed along an
$l$
-cycle, we need to first determine the
$l$
different colours involved in the
$l$
-cycle, and there are at most
$k^l$
different
$l$
-cycles; we then need to choose
$l$
vertices to change colours, and there are at most
$n^{l}$
choices. It is straightforward to check that if
$\sigma$
satisfies (17), then

Therefore, we have

when
$n$
is sufficiently large and
$\varepsilon$
is sufficiently small. Let

Since
$\log (1+x)\leq x$
for
$x\geq 0$
, we have

as
$n\rightarrow \infty$
. Then

6. Community detection on hypergraphs with fixed number of vertices in each community
In this section, we study community detection on hypergraphs under the assumption that the number of vertices in each community is known and fixed. We shall prove a condition when exact recovery does not occur.
Recall that
$y\in \Omega _{n_1,\ldots,n_k}$
is the true community assignment mapping.
Proof of Theorem
2.10
. When
$y^{(ab)}\in \Omega _{n_1,\ldots,n_k}$
is defined by (23):

and

Note that

since any of the event
$(f(y^{(ab)})-f(y)\lt 0)$
implies
$\check{y}\neq y$
. By (45), we obtain that
$f(y^{(ab)})-f(y)$
is a Gaussian random variable with mean value
$\|\Phi *(\mathbf{A}_{y^{(ab)}}-\mathbf{A}_y)\|^2$
and variance
$4\|\Phi *(\mathbf{A}_{y^{(ab)}}-\mathbf{A}_y)\|^2$
. So
$1-p(\check{y};\ \sigma )$
is at least

Let
$H_1$
and
$H_2$
be given as in the assumptions of the proposition. Then

Let
$(\mathscr{X},\mathscr{Y},\mathscr{Z})$
be a partition of
$[n]^s$
defined by:

For
$\eta \in \{\mathscr{X},\mathscr{Y},\mathscr{Z}\}$
, define a random tensor
$\mathbf{W}_{\eta }$
from the entries of
$\mathbf{W}$
as follows:

For each
$u\in H_{1}$
and
$v\in H_{2}$
, let

Lemma 6.1. The followings are true:
-
1.
$\mathscr{X}_{uv}=0$ for
$u\in H_{1}$ and
$v\in H_{2}$ .
-
2. For each
$u\in H_{1}$ and
$v\in H_{2}$ , the variables
$\mathscr{Y}_{uv}$ and
$\mathscr{Z}_{uv}$ are independent.
-
3. Each
$\mathscr{Y}_{uv}$ can be decomposed into
$Y_u+Y_v$ where
$\{Y_u\}_{u\in H_{1}}\cup \{Y_v\}_{v\in H_{2}}$ is a collection of i.i.d. Gaussian random variables.
Proof. Note that for
$J_s=(j_1,j_2,\ldots,j_s)\in [n]^s$
,

It is straightforward to check (1). (2) holds because
$\mathscr{Y}\cap \mathscr{Z}=\emptyset$
.
For
$g\in H_{1}\cup H_{2}$
, let
$\mathscr{Y}_g\subseteq \mathscr{Y}$
be defined by:

Note that for
$g_1,g_2\in H_{1}\cup H_{2}$
and
$g_1\neq g_2$
,
$\mathscr{Y}_{g_1}\cap \mathscr{Y}_{g_2}=\emptyset$
. Moreover,
$\mathscr{Y}=\cup _{g\in H_{1}\cup H_{2}}\mathscr{Y}_g$
. Therefore,

Note also that
$\langle \mathbf{W}_{\mathscr{Y}_g},\Phi *(\mathbf{A}_{y^{(ab)}}-\mathbf{A}_{y}) \rangle =0$
, if
$g\notin \{a,b\}$
. Hence,

So, we can define

By (72), we obtain

Similarly, define

Then
$\mathscr{Y}_{ab}=Y_a+Y_b$
and
$\{Y_g\}_{g\in H_{1}\cup H_{2}}$
is a collection of independent Gaussian random variables. Moreover, the variance of
$Y_g$
is

By Assumption (6) of the proposition, this is independent of
$g$
.
By Lemma 6.1, we obtain

Moreover,

By Lemma A.1, we obtain that when
$\varepsilon, h$
satisfy (A1) with
$N$
replaced by
$h$
, each one of the following two events

has probability at least
$1-e^{-h^{\varepsilon }}$
. Moreover, the event

occurs with probability at least
$1-h^{-2\varepsilon }$
. Then by Assumption (4) of the proposition, we have

By Assumption (5) of the proposition, for any
$u\in H_1$
and
$v\in H_2$
, we have

Moreover, by Assumption (4) of the proposition,

Hence, the probability of the event

is at least

When (25) holds, we have

as
$n\rightarrow \infty$
. Then the proposition follows.
7. An example
Assume
$n$
vertices are divided into group I and group II such that group I contains
$\lfloor \alpha n\rfloor$
vertices and group II contains
$n-\lfloor \alpha n \rfloor$
vertices, where
$\alpha \in (0,1)$
. Let
$y\;:\;[n]\rightarrow [2]$
be the true community assignment mapping. For any
$x\in \Omega$
, define

and

Then

Assume

Note that (73) holds whenever
$\delta _0=\delta _1=\delta _2$
, which corresponds to the case when the Gaussian perturbations for each pair of vertices are i.i.d.; however, (73) may also holds even when
$\delta _0,\delta _1,\delta _2$
are not all equal.
When (73) holds, one may check that Assumption 2.4 holds with

and Assumption 2.3 holds with

We obtain that if

then a.s. exact recovery occurs; if

Then a.s. exact recovery does not occur by Theorem 2.9.
In the following tables, we let
$(n_1,n_2)=(10,6)$
,
$n=16$
and
$\alpha =0.625$
. In this case, we have

For each triple
$(\delta _0,\delta _1,\delta _2)$
satisfying (73), we perform the MLE for
$m=200$
times to compute
$\hat{y}$
, and let
$m_{+}$
be the number of experiments with
$\hat{y}=y$
. The values
$\frac{m_+}{m}$
are shown in the following tables.




Financial support
ZL’s research is supported by National Science Foundation grant 1608896 and Simons Foundation grant 638143.
Competing interests
None.
Appendix
A. Maximum of Gaussian Random Variables
Lemma A.1.
Let
$G_1,\ldots, G_N$
be Gaussian random variables with mean
$0$
. Let
$\varepsilon \in (0,1)$
. Then

and moreover, if
$G_i$
’s are independent, and
$\varepsilon, N$
satisfy

Then

Proof. It is known that for a Gaussian random variable
$G_i$
and
$x\gt 0$
,

Let
$G_1,\ldots, G_N$
be
$N$
Gaussian random variables. Then by (A2), we have

If we further assume that
$G_i$
’s are independent, then

By (A2), we obtain

When (A1) holds, we have

Then the lemma follows.