A measure of betweenness centrality based on random walks

Common centrality measures:


- degree: Degree is, in some sense, a measure of the popularity of an actor.
- closeness: Closeness can be regarded as a measure of how long it will take information to spread from a given vertex to others in the network.
- betweenness: The betweenness of a vertex $i$ is defined to be the fraction of shortest paths between pairs of vertices in a network that pass through $i$.


To be precise, suppose that $g_i^{(st)}$ is the number of geodesic paths from vertex $s$ to vertex $t$ that pass through $i$, and suppose that $n_{st}$ is the total number of geodesic paths from $s$ to $t$.
The betweenness of vertex $i$ is then

$$b_i = \frac{\sum_{s<t} g_i^{(st)} / n_{st}}{\tfrac{1}{2} n(n-1)}$$

where $n$ is the total number of vertices in the network.
Betweenness centrality can be regarded as a measure of the extent to which an actor has control over information flowing between others.
Betweenness can be calculated for all vertices in time O(mn) for a network with m edges and n vertices.
In most cases, however, a realistic betweenness measure should include non-geodesic paths in addition to geodesic ones.
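
The shortest-path definition above can be sketched in a few lines of plain Python. This is only an illustration: the function names `bfs_counts` and `shortest_path_betweenness` are made up for this note, the graph is assumed to be undirected, unweighted and given as an adjacency dictionary, and the endpoints $s$ and $t$ are assumed not to count as lying "on" their own paths (conventions differ on this). It uses a simple cubic-time loop, not the $O(mn)$ algorithm mentioned above.

```python
from collections import deque

def bfs_counts(adj, s):
    """Return (dist, sigma): distances from s and number of geodesics from s."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:              # first time we reach v
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u is a predecessor of v on a geodesic
                sigma[v] += sigma[u]
    return dist, sigma

def shortest_path_betweenness(adj):
    """b_i = sum_{s<t} g_i^(st) / n_st, divided by n(n-1)/2 (endpoints excluded)."""
    nodes = sorted(adj)
    n = len(nodes)
    info = {s: bfs_counts(adj, s) for s in nodes}
    b = {i: 0.0 for i in nodes}
    for a, s in enumerate(nodes):
        dist_s, sigma_s = info[s]
        for t in nodes[a + 1:]:
            if t not in dist_s:            # s and t lie in different components
                continue
            dist_t, sigma_t = info[t]
            for i in nodes:
                if i == s or i == t or i not in dist_s:
                    continue
                # i lies on a geodesic from s to t iff the distances add up
                if dist_s[i] + dist_t[i] == dist_s[t]:
                    b[i] += sigma_s[i] * sigma_t[i] / sigma_s[t]
    norm = 0.5 * n * (n - 1)
    return {i: b[i] / norm for i in nodes}

# A path 0 - 1 - 2: only the middle vertex lies between another pair.
path = {0: [1], 1: [0, 2], 2: [1]}
b_path = shortest_path_betweenness(path)
```

On the three-vertex path only the pair (0, 2) has an interior vertex, so the middle vertex gets 1 out of the 3 possible pairs and the end vertices get 0.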


Consider the network sketched in Fig. 1, for instance, in which two large groups are bridged by connections among just a few of their members. Vertices A and B will certainly get high betweenness scores in this case, since all shortest paths between the two communities must pass through them. Vertex C, on the other hand, will get a low score, since none of those shortest paths pass through it, taking instead the direct route from A to B. It is plausible, however, that in many real-world situations C would have quite a significant role to play in information flows.
[Fig. 1: a network in which two large groups are bridged by connections among a few of their members, including vertices A, B and C.]


New betweenness measure: random-walk betweenness
Roughly speaking, the random-walk betweenness of a vertex i is equal to the number of times that a random walk starting at s and ending at t passes through i along the way, averaged over all s and t.


Random walk:

Imagine a “message,” which could be information of almost any kind, that originates at a source vertex s on a network. The message is intended for some target t, but the message, or those passing it, have no idea where t is, so the message simply gets passed around at random until it finds itself at t. Thus, on each step of its travels, the message moves from its current position on the network to one of the adjacent vertices, chosen uniformly at random from the possibilities.
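
To make the picture concrete, here is a minimal simulation of such a message. It is a sketch under simple assumptions: the network is an undirected graph stored as an adjacency dictionary, the target is reachable from the source, and the helper name `absorbing_walk` is invented for this note.

```python
import random

def absorbing_walk(adj, s, t, rng):
    """Return the vertices visited by a walk from s until it first reaches t."""
    path = [s]
    while path[-1] != t:
        # move to a neighbour of the current vertex, chosen uniformly at random
        path.append(rng.choice(adj[path[-1]]))
    return path

# A path 0 - 1 - 2, with the message travelling from vertex 0 to vertex 2.
adj = {0: [1], 1: [0, 2], 2: [1]}
walk = absorbing_walk(adj, 0, 2, random.Random(0))
```

Averaging the number of visits to each vertex over many such walks gives an estimate of the quantities that the matrix derivation computes exactly.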

Absorbing Random Walk:

A walk that starts at vertex $s$ and makes random moves around the network until it finds itself at vertex $t$, and then stops. If at some point in this walk we find ourselves at vertex $j$, then the probability that we will find ourselves at $i$ on the next step is given by the matrix element

$$M_{ij} = \frac{A_{ij}}{k_j} \quad \text{for } j \neq t \qquad (1)$$

where once again $A_{ij}$ is an element of the adjacency matrix, and $k_j = \sum_i A_{ij}$ is the degree of vertex $j$.


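In matrix notation we can write $M = A D^{-1}$, where $D$ is, as before, the diagonal matrix with elements $D_{ii} = k_i$. (Note: this is the same definition as Eq. (1), just written differently, so there should be nothing puzzling here; multiplying by $D^{-1}$ on the right simply divides column $j$ by $k_j$.)
The only exception to Eq. (1) is for $j = t$. Since this is an absorbing random walk, we never leave $t$ once we get there, so $M_{it} = 0$ for all $i$. (Note: in plain terms, when $j = t$ the random walk terminates and stays at vertex $t$. The walk from the source vertex $s$ has reached the target $t$ and there is no next step, so the probability of arriving at any vertex $i$ on the next step is 0.)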
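
As a small check of this definition, the sketch below builds $M = A D^{-1}$ for a three-vertex path and then zeroes column $t$ to make $t$ absorbing. The example graph and the variable names (`M_abs` in particular) are my own choices for illustration; NumPy is assumed to be available.

```python
import numpy as np

# Adjacency matrix of the path 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

k = A.sum(axis=0)      # degrees k_j = sum_i A_ij
M = A / k              # broadcasting divides column j by k_j, i.e. M = A D^{-1}

t = 2                  # target (absorbing) vertex
M_abs = M.copy()
M_abs[:, t] = 0.0      # M_it = 0 for all i: the walk never leaves t
```

Each column of `M` sums to 1 because column $j$ holds the probabilities of the next step taken from vertex $j$; after the change, column $t$ of `M_abs` is all zeros.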


We can also remove row $t$ and column $t$ from the matrix without affecting transitions between any other vertices. Let us denote by $M_t = A_t D_t^{-1}$ the matrix with these elements removed, and similarly for $A_t$ and $D_t$.
(Note: $A_t$, $D_t$ and $M_t$ all denote the matrices with row $t$ and column $t$ removed.)
Now for a walk starting at $s$, the probability that we find ourselves at vertex $j$ after $r$ steps is given by $[M_t^r]_{js}$, and the probability that we then take a step to an adjacent vertex $i$ is $k_j^{-1}[M_t^r]_{js}$. (The subscript $t$ marks the reduced matrices defined above.) Summing over all values of $r$ from 0 to ∞ and using $\sum_{r=0}^{\infty} M_t^r = (I - M_t)^{-1}$, the total number of times we go from $j$ to $i$, averaged over all possible walks, is $k_j^{-1}[(I - M_t)^{-1}]_{js}$.
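
The sum over $r$ is a geometric series of matrices, and its closed form $(I - M_t)^{-1}$ can be checked numerically. The following sketch does this for a three-vertex path with target $t = 2$; the example graph, the cut-off of 200 terms and the variable names are assumptions made for illustration only.

```python
import numpy as np

# Path 0 - 1 - 2 with target t = 2; rows and columns 0 and 1 are kept.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
t = 2
keep = [i for i in range(len(A)) if i != t]

k = A.sum(axis=0)                      # degrees in the full graph
M_t = (A / k)[np.ix_(keep, keep)]      # M = A D^{-1} with row and column t removed

# Partial sum  I + M_t + M_t^2 + ...  (200 terms)
partial = np.zeros_like(M_t)
power = np.eye(len(keep))
for _ in range(200):
    partial += power
    power = power @ M_t

closed = np.linalg.inv(np.eye(len(keep)) - M_t)
```

Entry `closed[j, s]` is the expected number of visits to vertex $j$ by a walk that starts at $s$ and stops at $t$; for this graph a walk starting at vertex 0 visits vertices 0 and 1 twice each on average.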


In matrix notation we can write this as an element of the vector

$$\mathbf{V} = D_t^{-1} (I - M_t)^{-1} \mathbf{s} = (D_t - A_t)^{-1} \mathbf{s} \qquad (2)$$

where the source vector $\mathbf{s}$ has elements

$$s_i = \begin{cases} +1 & \text{for } i = s, \\ -1 & \text{for } i = t, \\ 0 & \text{otherwise.} \end{cases}$$

(The element $s_t = -1$ is not strictly necessary: we could give $s_t$ any value we like, since row $t$ is removed from the equations anyway. We make this particular choice in order to demonstrate that our random-walk betweenness is the same as the current-flow betweenness.)
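
The two forms in Eq. (2) can be compared numerically. The sketch below assumes a source vector with $+1$ at the source, $-1$ at the target and 0 elsewhere, drops row and column $t$, and evaluates both expressions on a three-vertex path; the graph and all variable names are illustrative choices, not taken from the paper.

```python
import numpy as np

# Path 0 - 1 - 2, source s = 0, target t = 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
s, t = 0, 2
n = len(A)
keep = [i for i in range(n) if i != t]

A_t = A[np.ix_(keep, keep)]
D_t = np.diag(A.sum(axis=0)[keep])     # degrees are taken from the full graph
M_t = A_t @ np.linalg.inv(D_t)

src = np.zeros(n)                      # assumed source vector: +1 at s, -1 at t
src[s], src[t] = 1.0, -1.0
src_t = src[keep]                      # entry t is dropped with row t

# Left-hand form of Eq. (2): D_t^{-1} (I - M_t)^{-1} s
V_series = np.linalg.inv(D_t) @ np.linalg.inv(np.eye(len(keep)) - M_t) @ src_t
# Right-hand form of Eq. (2): (D_t - A_t)^{-1} s, solved without an explicit inverse
V_direct = np.linalg.solve(D_t - A_t, src_t)
```

The two agree because $D_t^{-1}(I - A_t D_t^{-1})^{-1} = \left[(I - A_t D_t^{-1}) D_t\right]^{-1} = (D_t - A_t)^{-1}$. The value assigned to `src[t]` never enters the computation, which is the point of the remark about $s_t$.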


Now the net flow of the random walk along the edge from $j$ to $i$ is given by the absolute difference $|V_i - V_j|$, and the net flow through vertex $i$ is half the sum of the flows on the incident edges, just as in Eq. (9).

The rest of the derivation follows through as before, and the final net flow of random walks through vertex $i$ is given by Eq. (11):

$$b_i = \frac{\sum_{s<t} I_i^{(st)}}{\tfrac{1}{2} n(n-1)} \qquad (11)$$


To summarize, the prescription for calculating random-walk betweenness, which is the expected net number of times a random walk passes through vertex $i$ on its way from a source $s$ to a target $t$, averaged over all $s$ and $t$, is as follows for each separate component of the graph of interest.

  1. Construct the matrix $D - A$, where $D$ is the diagonal matrix of vertex degrees and $A$ is the adjacency matrix.
  2. Remove any single row, and the corresponding column. For example, one could remove the last row and column.
  3. Invert the resulting matrix and then add back in a new row and column consisting of all zeros in the position from which the row and column were previously removed (e.g., the last row and column). Call the resulting matrix $T$, with elements $T_{ij}$.
  4. Calculate the betweenness from Eq. (11), using the values of $I_i^{(st)}$ from Eqs. (9) and (10).
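
The four steps above can be sketched with NumPy as follows. Eqs. (9) and (10) are not reproduced in these notes, so the code assumes the forms given in Newman's paper: $I_i^{(st)} = \tfrac{1}{2}\sum_j A_{ij}\,|T_{is} - T_{it} - T_{js} + T_{jt}|$ for $i \neq s, t$, and $I_s^{(st)} = I_t^{(st)} = 1$. The function name `random_walk_betweenness` is made up here, and the input is assumed to be the adjacency matrix of a single connected, undirected component.

```python
import numpy as np

def random_walk_betweenness(A):
    """Random-walk betweenness of every vertex of one connected component."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]

    # Step 1: construct D - A.
    L = np.diag(A.sum(axis=0)) - A

    # Steps 2 and 3: drop the last row and column, invert,
    # then pad the result back with a row and column of zeros.
    T = np.zeros((n, n))
    T[:-1, :-1] = np.linalg.inv(L[:-1, :-1])

    # Step 4: accumulate I_i^(st) over all pairs s < t.
    b = np.zeros(n)
    for s in range(n):
        for t in range(s + 1, n):
            V = T[:, s] - T[:, t]                     # V_i = T_is - T_it
            # assumed Eq. (9): half the sum of |V_i - V_j| over incident edges
            flow = 0.5 * (A * np.abs(V[:, None] - V[None, :])).sum(axis=1)
            flow[s] = flow[t] = 1.0                   # assumed Eq. (10)
            b += flow
    return b / (0.5 * n * (n - 1))                    # Eq. (11)

# Path 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
b = random_walk_betweenness(A)
```

For the three-vertex path the middle vertex takes part in every pair and gets betweenness 1, while each end vertex gets 2/3: it is an endpoint of two of the three pairs and carries no net flow for the third. The double loop over pairs is written for clarity rather than speed.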

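Applications and an implementation will follow in a later post. If you spot any mistakes, please point them out; I would be very grateful!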
