Vital nodes identification in complex networks
Authors: Linyuan Lü, Duanbing Chen, Xiao-Long Ren, Qian-Ming Zhang,
abstract
The vital nodes identification
1. Introduction
network science
Recently, the focus of network science has been shifting from discovering macroscopic statistical regularities to unfolding mesoscopic structural organization.
The roles of different nodes in the structure and function of a network may differ greatly.
Identifying vital nodes is not a trivial task.
- Firstly, the criteria for vital nodes are diverse.
- Secondly, finding a good tradeoff between local and global indices, or between parameter-free and multi-parameter indices, is a challenge.
- Thirdly, most known methods were essentially designed to identify individual vital nodes rather than a set of vital nodes, although the latter is more relevant to real applications.
- Lastly, designing efficient and effective methods for some new types of networks is a novel task in this research domain.
Motivation for writing this review:
- Firstly, a systematic review in this direction is lacking.
- Secondly, we intend to make extensive empirical comparisons with well-known methods on disparate real networks under different objective functions.
- Thirdly, we carefully choose language that is accessible to both computer scientists and physicists.
- Fourthly, we would like to highlight some open challenges for future studies in this domain.
2. Structural centralities
structural centralities
-->neighborhood-based centralities
-->path-based centralities
centrality
The concept of centrality was proposed to answer the question of how to characterize a node’s importance according to the network structure.
structural centralities can be obtained based solely on structural information.
importance
A node’s influence is highly correlated to its capacity to impact the behaviors of its surrounding neighbors.
How to compute the degree centrality?
- directly count the number of a node’s immediate neighbors
- LocalRank algorithm
- ClusterRank
- k-core decomposition
- H-index
result: the degree centrality, H-index and coreness can be considered as the initial, intermediate and steady states of a sequence driven by a discrete operator
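The relation between degree, H-index and coreness can be illustrated with a small sketch. It assumes that the discrete operator returns, for a node, the largest h such that at least h of its neighbors currently have a value of at least h; the notes do not spell the operator out, so this definition, the function names `h_operator` and `h_sequence`, and the toy path graph are all illustrative assumptions.

```python
def h_operator(values):
    """Largest h such that at least h of the given values are >= h."""
    h = 0
    for rank, x in enumerate(sorted(values, reverse=True), start=1):
        if x >= rank:
            h = rank
        else:
            break
    return h


def h_sequence(adj):
    """Apply the operator to the neighbors' values until nothing changes.

    adj maps each node to the set of its neighbors (undirected simple graph).
    Returns the list of states: states[0] is the degree, states[1] the
    H-index, and states[-1] the steady state.
    """
    current = {v: len(nbrs) for v, nbrs in adj.items()}
    states = [current]
    while True:
        nxt = {v: h_operator(current[u] for u in adj[v]) for v in adj}
        if nxt == current:
            return states
        states.append(nxt)
        current = nxt


# Hypothetical toy graph: the path 0-1-2-3-4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
states = h_sequence(adj)
```

On this path, the middle node starts with degree 2, keeps value 2 after one application, and drops to 1 in the steady state, which is the coreness of every node on a path.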
From the viewpoint of information dissemination, a node that has the potential to spread information faster and more widely is more vital.
- eccentricity centrality
- closeness centrality
- betweenness centrality
- Katz centrality
2.1. Neighborhood-based centralities
2.1.1. Degree centrality
Degree centrality is the simplest index for identifying nodes’ influences: the more connections a node has, the greater its influence.
$DC(i) = \frac{k_i}{n-1}$
where $k_i$ is the degree of $v_i$, $n = |V|$ is the number of nodes in G and $n-1$ is the largest possible degree.
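As a minimal sketch of the normalized degree centrality (node degree divided by n - 1), assuming the network is stored as a dictionary that maps each node to the set of its neighbors; the function name and the toy graph are illustrative only.

```python
def degree_centrality(adj):
    """Return DC(i) = k_i / (n - 1) for each node of an undirected simple graph.

    adj maps each node to the set of its neighbors.
    """
    n = len(adj)
    if n < 2:
        # With fewer than two nodes the normalization n - 1 is zero.
        return {v: 0.0 for v in adj}
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


# Hypothetical toy graph: hub 0 linked to 1, 2, 3, plus the edge 1-2.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
dc = degree_centrality(adj)
```

Here node 0 is connected to all other nodes, so its degree centrality reaches the maximum value of 1.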
2.1.2. LocalRank
Chen et al. proposed an effective local-information-based algorithm: LocalRank, which fully considers the information contained in the fourth-order neighbors of each node.
$LR(i) = \sum_{j\in\Gamma_i} Q(j)$
$Q(j) = \sum_{k\in\Gamma_j} R(k)$
where $\Gamma_i$ is the set of the nearest neighbors of $v_i$ and $R(k)$ is the number of the nearest and the next nearest neighbors of $v_k$.
Computational complexity: $O(n\langle k\rangle^2)$, where $\langle k\rangle$ is the average degree.
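The two nested sums of LocalRank can be sketched as follows. This is an illustrative reading of the definitions, assuming an undirected simple graph stored as a dictionary of neighbor sets and assuming that R(k) counts the distinct nodes within two hops of a node, excluding the node itself; the function name `local_rank` is hypothetical.

```python
def local_rank(adj):
    """Compute LR(i) = sum_{j in N(i)} Q(j), with Q(j) = sum_{k in N(j)} R(k).

    adj maps each node to the set of its neighbors.
    R(k) is taken as the number of distinct nodes within two hops of k.
    """
    R = {}
    for v, nbrs in adj.items():
        within_two = set(nbrs)
        for u in nbrs:
            within_two |= adj[u]
        within_two.discard(v)  # the node itself is not counted
        R[v] = len(within_two)
    Q = {v: sum(R[u] for u in adj[v]) for v in adj}
    return {v: sum(Q[u] for u in adj[v]) for v in adj}


# Hypothetical toy graph: the path 0-1-2-3-4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
lr = local_rank(adj)
```

On this path the middle node receives the highest score, since its two-step sums reach every other node.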
2.1.3. ClusterRank
ClusterRank not only considers the number of the nearest neighbors, but also takes into account the interactions among them.
ClusterRank is defined for directed networks as
$CR(i) = f(c_i)\sum_{j\in\Gamma_i}(k_j^{out}+1)$
where $f(c_i)$ is a function of the clustering coefficient $c_i$ of the node $v_i$ in the directed network D, which is defined as
$c_i = \frac{|\{(j\rightarrow k) \mid j,k\in \Gamma^{out}_i\}|}{k^{out}_i(k^{out}_i-1)}$
where $k^{out}_i$ is the out-degree of $v_i$ and $\Gamma^{out}_i$ is the set of the nearest out-neighbors of $v_i$.
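A small sketch of the ClusterRank score follows. It makes two assumptions that the notes do not fix: the neighbor set in the sum is taken to be the out-neighbors of the node, and f defaults to the decreasing function f(c) = 10^(-c). Both are illustrative choices, as are the function names, and f can be swapped through the parameter `f`.

```python
def clustering_out(out_adj, i):
    """Directed clustering coefficient of i over its out-neighbors."""
    nbrs = out_adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two out-neighbors: no possible links among them
    # Count directed links j -> t where both j and t are out-neighbors of i.
    links = sum(1 for j in nbrs for t in out_adj[j] if t in nbrs)
    return links / (k * (k - 1))


def cluster_rank(out_adj, f=lambda c: 10.0 ** (-c)):
    """CR(i) = f(c_i) * sum over out-neighbors j of (out-degree of j + 1).

    out_adj maps each node to the set of its out-neighbors.
    The default f is an assumed choice, not taken from the notes.
    """
    scores = {}
    for i, nbrs in out_adj.items():
        c_i = clustering_out(out_adj, i)
        scores[i] = f(c_i) * sum(len(out_adj[j]) + 1 for j in nbrs)
    return scores


# Hypothetical toy directed graph.
out_adj = {"a": {"b", "c"}, "b": {"c"}, "c": set(), "d": {"a"}}
cr = cluster_rank(out_adj)
```

Node "a" has two out-neighbors that are themselves linked, so its sum is damped by f, while node "d" keeps its full sum because its single out-neighbor gives a clustering coefficient of zero.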
other factors:
- the number of communities the node connects with
- structural holes.
2.1.4. Coreness
The location of a node is more significant than the number of its immediate neighbors in evaluating its spreading influence. Coreness is a better indicator of a node’s spreading influence, and it can be obtained by applying the k-core decomposition to the network.
k-core decomposition
Given an undirected simple network G, the coreness $c_i$ of every isolated node $v_i$ (i.e., $k_i = 0$) is initially defined as $c_i = 0$, and these nodes are removed before the k-core decomposition.
Then, in the first step of the k-core decomposition, all nodes with degree $k = 1$ are removed. This reduces the degrees of the remaining nodes. Nodes whose residual degree is $k \leq 1$ are removed repeatedly until every remaining node has residual degree $k > 1$. All nodes removed in the first step of the decomposition form the 1-shell, and their coreness $k_s$ equals 1.
In the second step, all remaining nodes with degree $k = 2$ are removed first. Nodes whose residual degree is $k \leq 2$ are then removed iteratively until every remaining node has residual degree $k > 2$. The nodes removed in the second step of the decomposition form the 2-shell, and their coreness $k_s$ equals 2.
The decomposition process continues until all nodes are removed. In the end, the coreness of a node $v_i$ equals the index of the shell it belongs to.
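The peeling procedure described above can be sketched directly. This is a straightforward, unoptimized illustration that assumes an undirected simple graph stored as a dictionary of neighbor sets; the function name `coreness` and the toy graph are hypothetical.

```python
def coreness(adj):
    """Assign each node its shell index by iterative peeling.

    adj maps each node to the set of its neighbors.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    # Isolated nodes get coreness 0 and are set aside first.
    core = {v: 0 for v in adj if degree[v] == 0}
    remaining = {v for v in adj if degree[v] > 0}
    k = 1
    while remaining:
        # Step k: peel until every remaining node has residual degree > k.
        to_remove = [v for v in remaining if degree[v] <= k]
        while to_remove:
            for v in to_remove:
                core[v] = k
                remaining.discard(v)
            # Removing a node lowers the residual degree of its neighbors.
            for v in to_remove:
                for u in adj[v]:
                    if u in remaining:
                        degree[u] -= 1
            to_remove = [v for v in remaining if degree[v] <= k]
        k += 1
    return core


# Hypothetical toy graph: triangle 0-1-2, a tail 0-3-4, and isolated node 5.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}, 5: set()}
core = coreness(adj)
```

In this example the tail nodes 3 and 4 fall into the 1-shell, even though node 3 has degree 2, because removing node 4 lowers its residual degree to 1 within the first step.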