1. Introduction
1.1 Motivation and problem description
Dominant (largest in module) eigenvalue of the adjacency matrix associated with a network plays an important role in the dissemination of an entity such as disease or information in both unidirectional and bidirectional networks. In particular, the dominant eigenvalue determines whether a dissemination process will become an epidemic (Wang et al., Reference Wang, Chakrabarti, Wang and Faloutsos2003; Prakash et al., Reference Prakash, Chakrabarti, Valler, Faloutsos and Faloutsos2012; Chen et al., Reference Chen, Tong, Prakash, Eliassi-Rad, Faloutsos and Faloutsos2016; Li et al., Reference Li, Wang and Van Mieghem2013; Van Mieghem & Van de Bovenkamp, Reference Van Mieghem and Van de Bovenkamp2013). Dissemination process of an entity can be affected by several factors including the intrinsic property of the entity and the network topology. In this article, it is assumed that one could modify only the network structure where the entity spreads on. Specifically, this article focuses on the problem of removing a fraction of links/edges from a network with the goal of containing the dissemination by minimizing the dominant eigenvalue of the network’s adjacency matrix. This problem can be interpreted as controlling the interaction between people or cities in a country in order to slow the spread of disease when a vaccine is not yet available. In addition, in practice it is also desirable to preserve the (strong) connectivity of a network. For example, strong connectivity of a network ensures that important information can still be passed to all the users/nodes in the network or goods can still be delivered between the cities.
The problem of removing a fraction of links from a network to minimize the dominant eigenvalue of the adjacency matrix poses several challenges. First, the problem is NP-hard due to its combinatorial nature (Van Mieghem et al., Reference Van Mieghem, Stevanović, Kuipers, Li, van de Bovenkamp, Liu and Wang2011). Second, in practice the global network structure may not be available or may be very hard to obtain in a centralized manner due to geographical constraint or privacy concerns (McDaniel & McLaughlin, Reference McDaniel and McLaughlin2009; Li et al., Reference Li, Zhang and Das2011). The absence of overall network topology makes the design of an efficient algorithm for removing a fraction of links very challenging. This article aims at developing link removal algorithms by addressing simultaneously both challenges described previously.
1.2 Related literature
Since the link removal problem is NP-hard, most of the work focuses on heuristics and approximations of the problem. That is, they focused on developing strategies to approximate and compute a suboptimal solution to this problem for both unidirectional and bidirectional networks, see for example (Bishop & Shames, Reference Bishop and Shames2011; Van Mieghem et al., Reference Van Mieghem, Stevanović, Kuipers, Li, van de Bovenkamp, Liu and Wang2011; Chen et al., Reference Chen, Tong, Prakash, Eliassi-Rad, Faloutsos and Faloutsos2016; Milanese et al., Reference Milanese, Sun and Nishikawa2010; Yang et al., Reference Yang, Li, Yang, Wu and Deng2016; Wu et al., Reference Wu, Zhang, Chen and Wang2017). Among these work, an effective and scalable algorithm based on eigenvalue sensitivity analysis is proposed in (Chen et al., Reference Chen, Tong, Prakash, Eliassi-Rad, Faloutsos and Faloutsos2016) to minimize the dominant eigenvalue of an adjacency matrix by removing a fraction of links from a directed network. To this end, a suboptimal solution is computed by solving an optimization problem which involves both the left and right eigenvectors associated with the dominant eigenvalue of the adjacency matrix. However, all the previously mentioned work including (Chen et al., Reference Chen, Tong, Prakash, Eliassi-Rad, Faloutsos and Faloutsos2016) suffer from the following limitations: (1) the global network structure is assumed to be available and known to the designer; (2) the strong connectivity of the resulting network is not guaranteed. It is worth to note that the authors in (Gusrialdi et al., Reference Gusrialdi, Qu and Hirche2019) proposed distributed algorithms based on eigenvalue sensitivity analysis which do not require knowledge of the overall network structure to remove a fraction of links from a network. Moreover, the distributed strategy also ensures the connectivity of the resulting network. However, application of the proposed strategy is only limited to bidirectional or undirected network and its extension to directed network is highly nontrivial.
1.3 Statement of contribution
The contribution of this article is the development of distributed algorithms to compute a suboptimal solution to the link removal problem in a directed network while preserving strong connectivity of the resulting network. Specifically, with the help of matrix perturbation theory, the problem of minimizing dominant eigenvalue of an adjacency matrix is reformulated as an optimization problem involving both the left and right eigenvectors corresponding to the dominant eigenvalue of the adjacency matrix. Novel distributed algorithms to estimate both the dominant left and right eigenvectors are then presented which will be used to decide the candidate links to be removed. In addition, distributed algorithm is proposed to verify whether the removal of a fraction of links will destroy or maintain strong connectivity of the resulting directed network. This article is an extensively extended version of the conference paper (Gusrialdi, Reference Gusrialdi, Benito, Cherifi, Cherifi, Moro, Rocha and Sales-Pardo2021). Specifically, in this article: (1) new distributed algorithms, including the one with reduced communication cost, for estimating the dominant eigenvectors are proposed; (2) a formal proof of the proposed distributed link removal algorithms is presented; (3) a new iterative link removal strategy is presented in details and its performance comparison with the simultaneous link removal strategy is evaluated on different type of networks. It is worth to note that even though in this article we focus on link removal problem, the proposed distributed algorithms can be readily applied to the link addition problem whose goal is to maximize dominant eigenvalue of the network’s adjacency matrix.
1.4 Organization of the article
This article is organized as follows: preliminaries followed by the problem formulation are presented in Section 2. The proposed distributed link removal algorithms are presented in Section 3. The proposed distributed algorithms are demonstrated and evaluated using several numerical simulations in Section 4. Finally, Section 5 concludes the article.
2. Problem statement
In this section, we provide a brief overview of graph theory and maximum/minimum consensus algorithms followed by the problem formulation.
2.1 Notation and preliminaries
Let $\mathbb{R}$ be the set of real numbers and vector $\textbf{1}_n \in \mathbb{R}^n$ denote the column vector of all ones. The number of the elements in the set $\mathcal{V}$ is denoted by $|\mathcal{V}|$ . The graph $\mathcal{G}=(\mathcal{V}, \mathcal{E})$ is a directed graph (digraph) with a set of nodes $\mathcal{V}=\{1,2,\cdots, n\}$ and a set of edges (links) $\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$ . Let $(j, i)\in \mathcal{E}$ represents a directed edge from node $j$ to node $i$ . When the graph $\mathcal{G}$ represents a communication network topology, the edge $(j, i)\in \mathcal{E}$ denotes that node $j$ can send information to node $i$ or node $i$ can receive information from node $j$ . The set of in-neighbors of node $i$ is denoted by $\mathcal{N}_{\mathcal{G},i}^{\text{in}} = \{j | (j, i) \in \mathcal{E}\}$ . Similarly, the set of out-neighbors of node $i$ is denoted by $\mathcal{N}_{\mathcal{G},i}^{\text{out}} = \{j | (i, j) \in \mathcal{E}\}$ . The directed graph $\mathcal{G}$ is strongly connected if every node can be reached from any other nodes by following a set of directed edges.
For a matrix $C\in \mathbb{R}^{n\times n}$ , let us denote its dominant (i.e., largest in module) eigenvalue as $\lambda (C)$ . The adjacency matrix associated with digraph $\mathcal{G}$ , denoted by $A(\mathcal{G}) \in \mathbb{R}^{n \times n}$ is defined as
where $[A]_{ij}$ denotes the element in the $i$ th row and $j$ th column of matrix $A$ . Matrix $C\in \mathbb{R}^{n\times n}$ is nonnegative (i.e., $C\geq 0$ ) if all its elements are nonnegative. A nonnegative matrix $C$ is irreducible if and only if $(I_n+C)^{n-1}\gt 0$ where $I_n$ denotes an identity matrix of size $n$ . Matrix $C$ is primitive if it is irreducible and has at least one positive diagonal element.
Finally, we review the maximum and minimum consensus algorithms which will be used in development of the distributed link removal algorithms. Consider a digraph $\mathcal{G}$ with $n$ nodes and assume that each node maintains a state or variable $x_i(t)\in \mathbb{R}$ at discrete time $t=\{0,1,2,\cdots \}$ . Let $T$ denote the maximum of the shortest path length between any pair of nodes in $\mathcal{G}$ . Furthermore, each node executes the following maximum consensus algorithm
It is shown in Nejad et al. (Reference Nejad, Attia and Raisch2009) that $x_i(t)=x_j(t)=\max _{k\in \mathcal{V}} x_k(0)$ for all $i,j\in \mathcal{V}$ , $\forall t\geq T$ , and any initial conditions $x_i(0)$ if and only if the digraph $\mathcal{G}$ is strongly connected. Analogously, for the minimum consensus algorithm, each node executes
and as a result the states of all nodes converge to the value $\min _{i\in \mathcal{V}} x_i(0)$ after $T$ steps if and only if the digraph $\mathcal{G}$ is strongly connected.
2.2 Problem formulation
Consider an $n$ node network whose connection is given by an unweighted strongly connected directed graph ${\mathcal{G}}_0=\{\mathcal{V}, \mathcal{E}_0\}$ . It is known that the dominant eigenvalue $\lambda (A(\mathcal{G}_0))$ is real, strictly positive, and simple (Bullo, Reference Bullo2018). Our objective is to remove at most $m_e$ number of links $\Delta \mathcal{E}^-$ from the set $\mathcal{E}_0$ such that dominant eigenvalue of the adjacency matrix of resulting graph ${\mathcal{G}}_{m_e}=\{\mathcal{V}, \mathcal{E}_0 \setminus \Delta \mathcal{E}^- \}$ is minimized while guaranteeing ${\mathcal{G}}_{m_e}$ remains to be strongly connected. The problem can be formally formulated as the following optimization problem:
In general, the global knowledge on the network topology $\mathcal{G}_0$ is required to solve the optimization (P1). However, the global network topology $\mathcal{G}_0$ is often unknown or not available in practice due to geographical constraint or privacy reasons. Motivated by this limitation, the following constraint is imposed for the remaining of the article.
Constraint 1. The overall network topology $\mathcal{G}_0$ is not available. Node $i$ can receive information via a communication network only from nodes in the set $\mathcal{N}_{\mathcal{G}_0,i}^{\text{in}}$ . Furthermore, node $i$ knows the set $\mathcal{N}_{\mathcal{G}_0,i}^{\text{out}}$ . In other words, node $i$ only knows the $i$ th row and column of matrix $A$ .
The absence of information on the overall network topology prevents us from solving (P1) in a centralized manner. In addition, optimization (P1) is a combinatorial problem whose complexity increases exponentially with the network size. Therefore, in this article, we are interested in developing a distributed strategy to compute a suboptimal solution to (P1) as stated in the following problem.
Problem 1. Assume that graph $\mathcal{G}_0$ is strongly connected. Find a suboptimal solution or an upper bound to the solution to optimization (P1) under Constraint 1 .
To this end, we assume that each individual node is equipped with both computational and data storage capabilities in addition to communication via a network whose structure is similar to $\mathcal{G}_0$ . Furthermore, for the sake of simplicity, it is assumed that the nodes know the network’s size $n$ . Alternatively, the network’s size can be estimated distributively using the methods proposed in the literature, see for example (Shames et al., Reference Shames, Charalambous, Hadjicostis and Johansson2012).
It is worth to note that one possible strategy to overcome Constraint 1 is by passing local information of each node (e.g., $\mathcal{N}_{\mathcal{G}_0,i}^{\text{in}}$ ) to the rest of the network so that in the end each node can construct the global network topology $\mathcal{G}_0$ . However, this flooding strategy is not locally adaptable to the changes in the network topology. More importantly, this strategy will also reveal the global network topology to all the nodes in the network, which is not desirable due to privacy concerns.
3. Main results: proposed distributed link removal algorithms
A suboptimal solution to optimization problem (P1) can be computed using the matrix perturbation theory presented in e.g., (Chen et al., Reference Chen, Tong, Prakash, Eliassi-Rad, Faloutsos and Faloutsos2016; Gusrialdi et al., Reference Gusrialdi, Qu and Hirche2019). Specifically, for a graph with a large spectral gap (i.e., difference between the largest and second largest eigenvalue in magnitude), the dominant eigenvalue $\lambda \!\left({A}\!\left({\mathcal{G}}_{m_e}\right)\right)$ can be written as
where $\Delta A^-$ denotes the adjacency matrix corresponding to the graph whose links are given by the set $\Delta \mathcal{E}^-$ . Furthermore, $\nu _0, w_0$ denote the dominant left and right eigenvectors corresponding to eigenvalue $\lambda (A(\mathcal{G}_0))$ , respectively. Since the spectral gap of the graph $\mathcal{G}_{0}$ is large, the higher order term in Equation (4) can be neglected and thus minimizing $\lambda \!\left({A}\!\left({\mathcal{G}}_{m_e}\right)\right)$ is equivalent to maximizing the term $\nu _0^T\Delta A^- w_0/\left(\nu _0^T w_0\right)$ . Therefore, a suboptimal solution to optimization problem (P1) is given by the set of edges $\Delta \mathcal{E}^-$ which solves the following optimization problem
where $\nu _{0,i}$ and $w_{0,i}$ , respectively, denote the $i$ th element of left eigenvector $\nu _0$ and $w_0$ associated with $\lambda (A(\mathcal{G}_0))$ . Analysis of the optimality gap between the solutions obtained by solving optimization problems (P2) and (P1) are discussed in Chen et al. (Reference Chen, Tong, Prakash, Eliassi-Rad, Faloutsos and Faloutsos2016).
Optimization problem (P2) involves the dominant left and right eigenvectors associated with the graph $\mathcal{G}_0$ . However, since the global network topology $\mathcal{G}_0$ is not available, the dominant eigenvectors $\nu _0, w_0$ cannot be directly computed and similarly, strong connectivity of the resulting directed graph $\mathcal{G}_{m_e}$ also cannot be directly verified. Therefore, as a first step in solving (P2), it is necessary to develop distributed algorithms performed at each node to estimate the dominant eigenvectors $\nu _0, w_0$ under Constraint 1. To this end, let us define the primitive matrix $Q_0$ as
For simplicity, in the remaining of the article we choose $c=1$ . Since matrix $Q_0$ is primitive, it is known that there exists a real dominant and simple eigenvalue of $Q_0$ , denoted by $\lambda (Q_0)$ satisfying $\lambda (Q_0) \gt |\mu |$ for all the other eigenvalues $\mu$ of $Q_0$ (Bullo, Reference Bullo2018). Hence, we have the following relationship:
Furthermore, it can also be seen that both matrices $Q_0$ and $A(\mathcal{G}_0)$ share the same set of left and right eigenvectors (i.e., $\nu _0, w_0$ ) which are both positive, up to rescaling (Bullo, Reference Bullo2018). In order to distributively estimate the dominant eigenvectors $\nu _0, w_0$ , we will work with matrix $Q_0$ instead of $A(\mathcal{G}_0)$ . The proposed strategy can be summarized as follows:
-
1. estimate in a distributed manner the dominant eigenvalue $\lambda (Q_0)$
-
2. using the estimated dominant eigenvalue $\lambda (Q_0)$ , the dominant right and left eigenvectors are then computed by solving distributively the following linear equations
(7) \begin{equation} \begin{aligned} \left(Q_0-\lambda (Q_0)I_n\right)\!w_0&=0,\\ \left(Q^T_0-\lambda (Q_0)I_n\right)\!\nu _0&=0. \end{aligned} \end{equation}
Details of the proposed strategy are described below.
3.1 Distributed algorithm for estimating dominant eigenvalue
In order to distributively estimate the dominant eigenvectors, each node is first required to estimate the dominant eigenvalue $\lambda (Q_0)$ . To this end, let us assign to each node a scalar variable ${h}_i \in \mathbb{R}$ with initial value ${h}_i(0) \gt 0$ . Each node then updates their variables ${h}_i$ according to the following rule
In addition, let us define
We then have the following relations between scalars $\underline{\lambda }(t), \overline{\lambda }(t)$ , and dominant eigenvalue $\lambda (Q_0)$ .
Since matrix $Q_0$ is non-negative and primitive, it is known that $\underline{\lambda }(t)$ and $\overline{\lambda }(t)$ will converge to $\lambda (Q_0)$ (Wood & O’Neill, Reference Wood and O’Neill2004).
It can be observed that the update law (8) can be performed at each node in a distributed manner using only local information available to individual node. Furthermore, the scalar variables $\underline{\lambda }(t)$ (resp. $\overline{\lambda }(t)$ ) can also be computed in a distributed manner using minimum (resp. maximum) consensus algorithm (3) (resp. (2)). For example, in order to compute $\overline{\lambda }(1)$ distributively, each node maintains a variable $x_i(t)$ which is the local estimate of $\overline{\lambda }(1)$ and sets its initial value as $x_i(0)=\frac{{h}_{i}(2)}{{h}_{i}(1)}$ . The local estimate $x_i(t)$ is then updated according to Equation (2) and after $n$ steps, all the local estimates are equal to $\overline{\lambda }(1)$ .
It is worth to note that the update rule given in Equation (8) is a power method and it is shown that $h_i(t)$ will converge to the right dominant eigenvector $w_0$ from which the dominant eigenvalue can be computed (Golub & Van Loan, Reference Golub and Van Loan1996). However, instead of estimating the dominant right eigenvector directly using the power method, in this article we first estimate the dominant eigenvalue since, as can be observed from Equation (10), both the estimates $\underline{\lambda }(t)$ and $\overline{\lambda }(t)$ provide a bound on the estimation error of $\lambda (Q_0)$ . By taking advantage of this error bound, the designer can then provide a desired accuracy which will be used as a stopping criteria for the estimation of the dominant eigenvalue $\lambda (Q_0)$ . That is, for a sufficiently small threshold $\varepsilon \gt 0$ , the nodes computes (8), (9) until
3.2 Distributed algorithm for estimating dominant left and right eigenvectors
Given the estimated dominant eigenvalue $\lambda (Q_0)$ , the next step is to distributively estimate the right and left dominant eigenvectors by solving linear equations (7) in a distributed manner. Before proceeding, let us define the following matrices:
Hence, linear equations in (7) can be written as
The matrices $\overline{Q}$ and $\widetilde{Q}$ can be partitioned as
where $\overline{Q}_i\in \mathbb{R}^{1\times n}, \widetilde{Q}_i\in \mathbb{R}^{1\times n}$ for $i\in \{1,2,\cdots,n\}$ . Since the dominant eigenvalue $\lambda (Q_0)$ has been estimated and from the Constraint 1, the $i$ th node knows both vectors $\overline{Q}_i$ and $\widetilde{Q}_i$ .
First, let us look at the distributed estimation of the right dominant eigenvector $w_0$ using only local information available to each node. To this end, each node maintains a variable $\hat{w}^i_0(t)\in \mathbb{R}^n$ which is a local estimate of $w_0$ at node $i$ . The local estimate of each node is then updated according to the following rule by exchanging information with its in-neighbors set $\mathcal{N}_{\mathcal{G}_0,i}^{\text{in}}$ :
where matrix $\overline{P}_i$ is defined as
and the initial condition $\hat{w}^i_0(0)$ is chosen to satisfy $\overline{Q}_i\hat{w}^i_0(0)=0$ and $\hat{w}^i_0(0)\neq 0$ . Under update rule (14), all the local estimates $\hat{w}_0^i$ will converge exponentially fast to the same solution to $\overline{Q}w_0=0$ , i.e., the right dominant eigenvector $w_0$ as shown in Mou et al. (Reference Mou, Liu and Morse2015).
Similarly, in order to estimate the dominant left eigenvector by solving distributively linear equations (12b), each node maintains a local estimate $\hat{\nu }^i_0(t)\in \mathbb{R}^n$ updated according to the following rule
where matrix $\widetilde{P}_i$ is defined as
and the initial condition $\hat{\nu }^i_0(0)$ is chosen to satisfy $\widetilde{Q}_i\hat{w}^i_0(0)=0$ and $\hat{\nu }^i_0(0)\neq 0$ . Note that both distributed algorithms (14), (15) can be performed using the same strongly connected communication network topology $\mathcal{G}_0$ .
In contrast to distributed estimation algorithms presented in (Gusrialdi & Qu, Reference Gusrialdi and Qu2017) which requires each node to send $n^2$ number of values to its neighbors, using distributed algorithms (14), (15) each node only needs to send $n$ number of values, that is the local estimate $\hat{w}_0(t)$ or $\hat{\nu }_0(t)$ to its neighbors. However, when the network’s size $n$ is large, it either will be communication-costly for individual node to send the entire estimate vector $\hat{w}_0^i(t)$ or $\hat{\nu }_0^i(t)$ to its neighbors at each time or it is not possible to do so if the node has limited communication capacity. To address this issue, we next present an alternative distributed algorithm based on the method proposed (Liu & Anderson, Reference Liu and Anderson2020) to solve linear equations (12b), (12a) with reduced communication cost.
First, the local estimate $\hat{w}_0^i(t)$ and $\hat{\nu }_0^i(t)$ are partitioned as
where $\ell \gt 1$ is a positive integer and $y_{i\sigma}(t), z_{i\sigma}(t)$ with $\sigma \in \{1,2,\cdots,\ell \}$ are vectors. Note that the local estimates can be partitioned in an arbitrary manner, that is $\ell$ can be chosen to be any positive integer and $y_{i\sigma }(t), z_{i\sigma }(t)$ can have arbitrary size depending on the node’s communication. Furthermore, each discrete-time time $t\geq 1$ can be uniquely written in the following form
Next, let us define
At each time $t\geq 1$ , each node broadcasts the vectors $y_{i\sigma (t)}(t)$ and $z_{i\sigma (t)}(t)$ to its neighbors and updates its local estimates according to
and
It is shown in (Liu & Anderson, Reference Liu and Anderson2020) that under update law (17) (resp. (18)), the local estimate $\hat{w}_0^i(t)$ (resp. $\hat{\nu }_0^i(t)$ ) for all nodes converge exponentially fast to the same solution to linear equations (12a) (resp. (12b)), that is the dominant eigenvector $w_0$ (resp. $\nu _0$ ).
Intuitively, distributed algorithms (14), (15) converge faster to the solution to linear equations (12a) and (12b) compared to distributed algorithms (17), (18) since the full entries of vectors $\hat{w}_0^i(t)$ and $\hat{\nu }_0^i(t)$ are being exchanged in Equations (14), (15) at each time $t$ . Therefore, the designer could choose the size of vectors $y_{i\sigma }(t), z_{i\sigma }(t)$ and integer $\ell \gt 1$ by taking into account the trade-off between the communication capacity and convergence speed. In addition, under distributed algorithms (17) and (18), each node can exchange the vectors $y_{i\sigma (t)}(t)$ and $z_{i\sigma (t)}(t)$ at the same time so that the dominant right and left eigenvectors can be estimated simultaneously.
3.3 Distributed algorithm for verifying digraph’s strong connectivity
After developing distributed algorithms for estimating both the right and left dominant eigenvectors, the next step is to develop distributed algorithm to verify the strong connectivity of a digraph when a link $(j^*, i^*)\in \mathcal{E}_0$ is being removed. In other words, given a candidate link to be removed $(j^*, i^*)\in \mathcal{E}_0$ , we want to verify in a distributed manner whether the resulting network $\mathcal{G}_{1}=\{\mathcal{V},\mathcal{E}_{0}\backslash (j^*, i^*)\}$ remains to be strongly connected.
To this end, all the nodes execute maximum consensus protocol (2) on the graph $\mathcal{G}_{1}=\{\mathcal{V},\mathcal{E}_{0}\backslash (j^*, i^*)\}$ , i.e., node $j^*$ does not send its information to node $i^*$ when executing the update law (2). The initial values of $x_i(t)$ are chosen as
The following result reveals the relationship between the final values of $x_i(t)$ and the strong connectivity of graph $\mathcal{G}_{1}$ .
Proposition 1. Given a strongly connected digraph $\mathcal{G}_0$ and a link $(j^*, i^*)\in \mathcal{E}_0$ . Each node executes max-consensus protocol (2) on the graph $\mathcal{G}_{1}=\{\mathcal{V},\mathcal{E}_{0}\backslash (j^*, i^*)\}$ with initial values $x_{j^*}(0)=1$ and $x_{m}(0)=0$ for all $m\neq j^*$ . The graph $\mathcal{G}_{1}=\{\mathcal{V},\mathcal{E}_{0}\backslash (j^*, i^*)\}$ is strongly connected if and only if $x_i(n)=1$ for all $i\in \mathcal{V}$ .
Proof. Proof.For showing the necessity $({\Longrightarrow})$ , observe that since the graph $\mathcal{G}_{1}=\{\mathcal{V},\mathcal{E}_{0}\backslash (j^*, i^*)\}$ is strongly connected, that max-consensus protocol (2) makes all the states $x_i(t)$ converge (after $n$ steps) to $\max _i x_i(0)$ which is equal to $1$ . In order to show the sufficiency $({\Longleftarrow})$ , note that the removal of link $(j^*, i^*)$ may result in that there exists no direct or indirect path from node $j^*$ to node $i^*$ . However, since we have $x_i(n)=1$ under update law (2) for all nodes $i$ in the network, this means that there exists at least one indirect path from node $j^*$ to $i^*$ . Hence, it can be concluded that the resulting graph $\mathcal{G}_{1}=\{\mathcal{V},\mathcal{E}_{0}\backslash (j^*, i^*)\}$ remains to be strongly connected.
After each node executes update law (2) for $n$ iterations with initial values described in Proposition 1, node $i^*$ then checks whether $x_{i^*}(n)=1$ . If $x_{i^*}(n)=1$ , it needs to notify node $j^*$ that the network remains to be strongly connected after removing link $(j^*, i^*)$ . This can be done by again executing the maximum consensus algorithm (2) on graph $\mathcal{G}_0$ whose initial values are chosen as
If $x_{j^*}(n)=1$ (resp. $x_{j^*}(n)=0$ ) then node $j^*$ will know that the graph $\mathcal{G}_{1}$ remains to be strongly connected (resp. will not be strongly connected) when the link $(j^*, i^*)$ is being removed. Algorithm 1 summarizes distributed verification of the link removal. Note that the strong connectivity of the resulting graph under multiple links removal can be verified by applying iteratively Algorithm 1 for each individual candidate link to be removed.
3.4 The complete distributed algorithms for link removal
In this subsection, we present distributed algorithms for solving optimization problem (P2) using the ingredients developed in the previous subsections. Before proceeding, let us introduce the following notation. For a given set $\mathcal{I}_i$ , its smallest and largest elements, denoted, respectively, by $\underline{k}_i$ and $\overline{k}_i$ , are defined as
Furthermore, let $\ell _{(j,i)}\in \{1,2,\cdots,|\mathcal{E}_0|\}$ be the label of the edges $(j,i)$ of graph $\mathcal{G}_0$ in a descending order according to the value $\nu _{0,i}w_{0,j}$ of the edges. That is, if $\nu _{0,i}w_{0,j}\gt \nu _{0,p}w_{0,q}$ then $\ell _{(j,i)}\lt \ell _{(q,p)}$ and if two edges have the same value of $\nu _{0,i}w_{0,j}$ , e.g., $ \nu _{0,i}w_{0,j}=\nu _{0,p}w_{0,q}$ , then the label can be assigned as $\ell _{(j,i)}=\ell _{(q,p)}+1$ or $\ell _{(q,p)}=\ell _{(j,i)}+1$ . Hence, an edge with label $\ell _{(j,i)}$ also means that the edge $(j,i)$ has the $\ell _{(j,i)}$ th largest value of $\nu _{0,i}w_{0,j}$ . First, it can be observed that the solution to optimization problem (P2) in the absence of connectivity preserving constraint is given by a set of links with $m_e$ largest value of $\nu _{0,i}w_{0,j}$ labeled by $\ell _{(j,i)}=1,2,\cdots, m_e$ . The solution can be easily obtained using Algorithm 2 and by ignoring steps 6–14 in it. However, the introduction of additional constraint to preserve strong connectivity of the network makes solving optimization problem (P2) more challenging.
In the following we describe in a less-formal way the idea of the proposed distributed algorithms for solving optimization problem (P2). Details of the algorithms are summarized in Algorithm 3. Broadly speaking, the algorithms aim at finding link candidate sets $\mathcal{I}_i$ for $i=1,2,\cdots,$ where $|\mathcal{I}_i|=m_e$ and one of the sets $\mathcal{I}_i$ is the solution to (P2). This strategy is similar to a brute-force search, but instead of looking at all possible link combinations we only look at a small number of combinations. Specifically, the distributed algorithms consist of the following main steps.
-
1. Find links given by the set $\mathcal{I}_1$ with $m_e$ largest value of $\nu _{0,i}w_{0,j}$ whose removal do not destroy strong connectivity of the network
-
2. Compute the edge’s label $\ell ^*_{(j,i)}$ which is the solution to the following optimization
(22) \begin{equation} \begin{aligned} & \min _{\ell _{(j,i)}\in \mathcal{I}_1} & & \ell _{(j,i)}\\ & \; \text{s.t.} & & \Big|\ell _{(j,i)}-\overline{\ell }^1_{(j,i)}\Big|\leq m_e,\\ \end{aligned} \end{equation}where the labels $\overline{\ell }^1_{(j,i)}$ is defined according to (21). -
3. After finding the first link candidate set $\mathcal{I}_1$ , the other link candidate sets $\mathcal{I}_k$ with $k=2,3, \cdots$ are computed by repeating step 1 as long as the resulting $\underline{\ell }^k_{(j,i)}$ of the set $\mathcal{I}_k$ , defined according to (21), satisfies $\underline{\ell }^k_{(j,i)}\leq \ell _{(j,i)}^*$ and for each $k=2,3, \cdots$ we set the value of $\nu _{0,i}w_{0,j}$ equal to zero for the links labeled by $\underline{\ell }^s_{(j,i)}$ with $s=1,\cdots, k-1$ . This step in principle provides information on how far we should explore possible candidate optimal links.
-
4. The solution to optimization (P2) is then given by the set $\mathcal{I}_{i^*}$ whose links have the largest value of $\sum _{(j,i)\in \mathcal{I}_k} \nu _{0,i}w_{0,j}$
The following theorem shows that the set of links computed by Algorithm 3 is the solution to optimization problem (P2).
Proof. Proof.Assume that the link candidate sets computed from Algorithm 3 is given by $\mathcal{I}_k$ with $k=1,2,\cdots,\alpha$ . The set of links, denoted by $\mathcal{I}_{\alpha +1}$ with $m_e$ largest value of $\nu _{0,i}w_{0,j}$ whose smallest label equals to $\ell ^*_{(j,i)}+1$ is given by
Hence, it is sufficient to prove that
for the set $\mathcal{I}_1$ with smallest value of $\sum _{(j,i)\in \mathcal{I}_1} \nu _{0,i}w_{0,j}$ . This is because
for the optimal solution set of links $\mathcal{I}_{i^*}$ as can be seen in step 12 of Algorithm 3. To this end, for a given label $\overline{\ell }^1_{(j,i)}$ the set $\mathcal{I}_1$ with possible smallest value $\Big($ i.e., when $\Big|\ell ^*_{(j,i)}-\overline{\ell }^1_{(j,i)}\Big|=m_e$ and $\underline{\ell }^1_{(j,i)}=\ell ^*_{(j,i)}\Big)$ of $\sum _{(j,i)\in \mathcal{I}_1} \nu _{0,i}w_{0,j}$ is given by
Recalling that an edge with label $\ell _{(j,i)}$ also means that the edge $(j,i)$ has the $\ell _{(j,i)}$ th largest value of $\nu _{0,i}w_{0,j}$ , it can be seen that (24) is satisfied for the sets given in (23) and (25). This completes the proof.
4. Numerical simulations
In this section, the proposed distributed algorithms are demonstrated and evaluated using several simulations.
4.1 Comparison with the optimal solution
First, we are interested in evaluating the optimality gap between the proposed distributed algorithms and the brute-force search. To this end, we consider a strongly connected graph consisting of 10 nodes as shown in Figure 1. Choosing a small size network allows us to compare the solutions obtained by the proposed distributed algorithms with the ones obtained from the brute-force search (given that the global network topology is available) which is in general NP-hard. Interested reader is referred to (Chen et al., Reference Chen, Tong, Prakash, Eliassi-Rad, Faloutsos and Faloutsos2016) for the performance evaluation of Algorithm 3 on real large graphs without connectivity constraint.
The number of links to be removed $m_e$ in the numerical simulation is varied between 1 and 6. We then apply Algorithm 3 to solve optimization problem (P2). First, each node distributively estimates the dominant eigenvalue of matrix $Q_0$ in Equation (5) based on update rule (8), (9) whose estimation is depicted in Figure 2.
It can be observed that the estimation converges to the dominant eigenvalue within few time steps. Using the estimated dominant eigenvalue, the nodes then distributively estimate both the dominant right and left eigenvectors using update rule (14) (or (17) for reduced communication cost with $\ell =5$ ) and (15) (or (18) for reduced communication cost with $\ell =5$ ), respectively. Estimation of the dominant eigenvectors by nodes 1 and 2 are illustrated in Figures 3–6. It can be observed that the nodes could successfully estimate both dominant eigenvectors. Furthermore, the estimation based on update rule (14) and (15) converge faster in comparison to the ones based on update rule (17) and (18) where only two elements of the estimated eigenvectors are being exchanged with the other nodes at each time step.
After distributively estimating the dominant eigenvectors, the nodes then solve optimization problem (P2) using Algorithm 3. The solutions to optimization (P2) obtained from Algorithm 3 and the ones obtained by solving optimization (P1) using the brute-force search (assuming the knowledge of overall network topology) for different values of $m_e$ are summarized in Table 1. As can be observed, when $m_e=1$ both Algorithm 3 and brute-force search result in the same solution, i.e., zero optimality gap. However, as $m_e$ increases the gap between the solutions obtained from Algorithm 3 and brute-force search becomes larger. It is demonstrated in (Gusrialdi et al., Reference Gusrialdi, Qu and Hirche2019) for the case of undirected network that iteratively solving optimization (P2), i.e., removing one link at a time followed by re-estimating the dominant eigenvectors of the resulting network to compute the next link to be removed, may result in a better performance in comparison to removing $m_e$ simultaneously as done in Algorithm 3. Motivated by this observation, next we apply the iterative link removal strategy to solve optimization (P2) whose details are given in Algorithm 4. The results are summarized in Table 1. It can be observed that for $m_e=1, 2, 5, 6$ the iterative link removal strategy result in the same solution to the ones obtained from the brute-force search, i.e., the optimality gap is zero. In addition, for $m_e=3, 4$ the gap with the optimal solution is smaller for the iterative link removal strategy (i.e., Algorithm 4) compared to the simultaneous link removal strategy (i.e., Algorithm 3). It should be noted that in contrast to simultaneous link removal, the iterative link removal strategy requires a longer time and higher communication cost for computing the solutions as the dominant eigenvectors need to be re-estimated after removal of each individual link from the network.
4.2 Comparison between simultaneous and iterative link removal strategies
Next, we compare the performance of both simultaneous and iterative link removal strategies for different types of network. To this end, we consider four types of networks illustrated in Figure 7, namely: (a) modular small-world network which is a random, directed network with a specified number of fully connected modules linked together by randomly distributed between-module connections; (b) random network with specified number of nodes and edges; (c) ring lattice network (with toroidal boundary conditions); and (d) non-ring lattice network (without toroidal boundary conditions). For the simulations, the number of links to be removed is varied between 1 and 20% of the total number of links.
The results of applying Algorithm 3 and Algorithm 4 to the networks described previously are summarized in Figures 8–11. From the results, we can make the following observations:
-
For the cases of random, ring lattice and nonring lattice networks, the iterative link removal strategy yields better performance (i.e., lower value of $\lambda (A(\mathcal{G}_{m_e}))$ means better performance) compared to simultaneous link removal strategy. Furthermore, the performance gap between both strategies tend to increase as the value of $m_e$ increases. Note that for random and ring-lattice networks, both strategies performed similarly when the number of links to be removed is less than 5% as can be seen in Figure 9 and Figure 10.
-
For nonring lattice network, the simultaneous link removal strategy could not reduce the dominant eigenvalue that much even though around 20% of total links have been removed from the network, see Figure 11. Hence, for this type of network, it is recommended to use iterative link removal strategy which comes at a price of higher communication cost.
-
For the modular small-world network, the performance of both algorithms depend on the cluster size for a fixed number of nodes and edges. As can be observed in Figure 8, when the cluster size is small the iterative link removal strategy performed better compared to simultaneous link removal strategy. However, as the cluster size increases, the simultaneous link removal strategy outperformed the iterative link removal strategy for some values of $m_e$ .
5. Conclusions
This article proposed two novel distributed algorithms, namely simultaneous and iterative link removal strategies to compute a suboptimal solution to link removal problem while maintaining network’s strong connectivity whose objective is to minimize the dominant eigenvalue of the adjacency matrix associated with the network’s structure. Key ingredients include distributed algorithms for estimating the dominant eigenvectors of an adjacency matrix together with distributed algorithm to verify network’s strong connectivity after the removal of a fraction of links. It was shown in simulations on different type of networks that there was a trade-off between the performance and communication cost of both strategies. Specifically, in general the iterative link removal strategy yields a better performance in comparison to simultaneous link removal strategy. Furthermore, the performance gap becomes larger as the number of links to be removed increases. However, the iterative link removal comes at a price of a higher communication cost in contrast to the simultaneous link removal strategy. As a future work, it is desirable to develop distributed estimation algorithms which converge in finite time in order to effectively apply the proposed idea to large-size networks.
Funding
The work was supported by the Academy of Finland under academy project decision number 330073.
Competing interests
None.
A preliminary version of this paper appeared as Gusrialdi, A. (2021). Distributed Algorithm for Link Removal in Directed Networks. In International Conference Complex Networks & Their Applications (pp. 509–521). Cham: Springer. https://doi.org/10.1007/978-3-030-65347-7_42