The emergence of hierarchy in large-scale human cooperation

Abstract

Online communities are becoming increasingly important as platforms for large-scale human cooperation. In these communities, users seek and share professional skills, spreading knowledge along the chain of skill levels. To investigate how users communicate and cooperate to complete a large number of tasks, we analyze StackExchange, one of the largest question-and-answer systems in the world. We construct expertise networks that include all pairs of help-seeking interactions and measure the skill levels of users from their positions in the networks using a novel indicator, the "average flow distance". We explain the hierarchy discovered in these networks, in particular its maximum length and the distribution of users across hierarchical levels.

1. Introduction

2. Method

2.1 Constructing expertise networks
Figure 1

As shown in Figure 1, question answering is a collective action that involves at least two types of users: the asker and the successful answerer, whose answer is accepted by the asker. For a majority of questions there is also a third type of user, the failed answerers, whose answers were not accepted. To focus on real contributions, we only draw edges from askers to successful answerers, who share their professional skills to solve the problems (Jun Zhang et al., 2007).
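The edge rule above can be illustrated with a minimal sketch. The record layout (the `asker` and `accepted_answerer` keys) and the user names are hypothetical and do not reflect the actual StackExchange data dump; skipping self-accepted answers is also an assumption of this sketch, not something the text specifies.

```python
from collections import defaultdict


def build_expertise_network(questions):
    """Return {asker: {answerer: count}} built from accepted answers only.

    Failed answerers are ignored, following the rule that edges run
    from the asker to the successful answerer.
    """
    edges = defaultdict(lambda: defaultdict(int))
    for q in questions:
        accepted = q.get("accepted_answerer")
        # Assumption: questions without an accepted answer, and
        # self-accepted answers, contribute no edge.
        if accepted is None or accepted == q["asker"]:
            continue
        edges[q["asker"]][accepted] += 1
    return {asker: dict(targets) for asker, targets in edges.items()}


# Hypothetical toy data: four questions, one of them unanswered.
questions = [
    {"asker": "alice", "accepted_answerer": "bob"},
    {"asker": "alice", "accepted_answerer": "bob"},
    {"asker": "bob", "accepted_answerer": "carol"},
    {"asker": "dave", "accepted_answerer": None},
]
network = build_expertise_network(questions)
print(network)  # {'alice': {'bob': 2}, 'bob': {'carol': 1}}
```

The edge weight counts how many accepted answers one user gave to another, which keeps repeated help-seeking between the same pair as a single weighted edge.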

2.2 Calculating average flow distances
method.png

3. Findings

3.1 The hierarchy of expertise networks

We construct an expertise network using data from physics.stackexchange.com and investigate its topology (Figure 1). We observe a divide between askers and answerers: the population of askers is 1.5 times as large as that of answerers, but only 28% of askers also answer questions. A similar "bow-tie" structure was observed by Andrei Broder et al. in 2000 and Jun Zhang et al. in 2007.
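The two population statistics quoted above (the asker-to-answerer ratio and the share of askers who also answer) can be computed from an edge list as in the following sketch. The function name and the toy edges are illustrative assumptions, and the numbers it produces are not the physics.stackexchange.com values.

```python
def role_statistics(edges):
    """Summarize asker and answerer populations from (asker, answerer) edges.

    Assumes a non-empty edge list, so both populations are non-empty.
    """
    askers = {asker for asker, _ in edges}
    answerers = {answerer for _, answerer in edges}
    both = askers & answerers  # users who ask and also answer
    return {
        "askers": len(askers),
        "answerers": len(answerers),
        "asker_to_answerer_ratio": len(askers) / len(answerers),
        "share_of_askers_who_answer": len(both) / len(askers),
    }


# Hypothetical toy edges, each pointing from asker to accepted answerer.
edges = [("alice", "bob"), ("bob", "carol"), ("dave", "carol"), ("erin", "bob")]
stats = role_statistics(edges)
print(stats["asker_to_answerer_ratio"])      # 2.0
print(stats["share_of_askers_who_answer"])   # 0.25
```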

Figure 3: The hierarchy of the Math expertise network. The nodes represent users and the arcs represent the help-seeking relationships between users. The X coordinates are randomly generated between 0 and 1 and the Y coordinates show the flow level *L_i* of users. The color of the arcs shows the difficulty of questions calculated by the TrueSkill algorithm.

We calculate the flow level L_i of all users and find that the askers and answerers are separated (Figure 2). The flow level of askers equals one, and that of answerers is equal to or greater than two. Users who both ask and answer questions have a variety of flow levels, depending on the levels of the users receiving their help.
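The text does not spell out the formula for the flow level, so the following is only a simplified stand-in that reproduces the qualitative behavior described above. It assumes that a user who never answers has level 1 and that an answerer's level is 1 plus the average level of the askers they helped. The function name, the iteration scheme, and the toy edges are all assumptions of this sketch.

```python
from collections import defaultdict


def flow_levels(edges, tol=1e-9, max_iter=1000):
    """Iteratively solve L_u = 1 + mean(L_a for askers a helped by u).

    Users who helped nobody keep level 1. This is a simplified
    assumption, not the paper's definition of average flow distance.
    """
    helped = defaultdict(list)  # answerer -> askers they helped
    users = set()
    for asker, answerer in edges:
        helped[answerer].append(asker)
        users.update((asker, answerer))
    if not users:
        return {}
    levels = {u: 1.0 for u in users}
    for _ in range(max_iter):
        new_levels = {}
        for u in users:
            askers = helped.get(u)
            if askers:
                new_levels[u] = 1.0 + sum(levels[a] for a in askers) / len(askers)
            else:
                new_levels[u] = 1.0
        change = max(abs(new_levels[u] - levels[u]) for u in users)
        levels = new_levels
        if change < tol:
            return levels
    # Happens when a group of users only help each other (a closed cycle).
    raise ValueError("flow levels did not converge")


# Hypothetical toy network: bob both asks and answers.
edges = [("alice", "bob"), ("bob", "carol"), ("dave", "carol")]
levels = flow_levels(edges)
print(levels["alice"], levels["bob"], levels["carol"])  # 1.0 2.0 2.5
```

In this toy network the pure askers sit at level 1, and the answerers sit at level 2 or above, with carol's level depending on the levels of the two users she helped.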

We observe that question difficulty is related to the flow levels of the asker and the answerer.

Figure 4. The distribution of TrueSkill scores of questions and users (left) and the distribution of flow level gaps L_j - L_i over all pairs of connected nodes i and j (right).

The TrueSkill scores of users separate the askers (whose scores are around 10) from the answerers (whose scores are around 30). The distribution of flow level gaps shows that, in the majority of cases, the answerer needs a higher skill level than the asker (a flow level gap of about 1.2) to give a satisfying (accepted) answer.
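As a small illustration of the gap statistic, the sketch below computes L_j - L_i for every edge from asker i to answerer j. The flow levels and edges are made-up toy values, not measurements from StackExchange, and the function name is hypothetical.

```python
from statistics import mean


def flow_level_gaps(edges, levels):
    """Return the gap L_j - L_i for each edge from asker i to answerer j."""
    return [levels[answerer] - levels[asker] for asker, answerer in edges]


# Made-up flow levels and edges for illustration only.
levels = {"alice": 1.0, "bob": 2.0, "carol": 2.5, "dave": 1.0}
edges = [("alice", "bob"), ("bob", "carol"), ("dave", "carol")]

gaps = flow_level_gaps(edges, levels)
positive_share = sum(g > 0 for g in gaps) / len(gaps)
print(gaps)            # [1.0, 0.5, 1.5]
print(mean(gaps))      # 1.0
print(positive_share)  # 1.0
```

A positive gap means the accepted answer came from a user higher in the hierarchy than the asker; a histogram of `gaps` corresponds to the kind of distribution described above.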

Figure 5. The comparison between four different measures of skill level, including degree, PageRank score, TrueSkill score, and flow level.

We compare four different measures of skill level: degree, PageRank score, TrueSkill score, and flow level. It turns out that the PageRank score is trivially correlated with the degree of nodes. The TrueSkill score, while it separates the askers from the answerers as efficiently as flow level,

3.2 Cascade Model for Attention Competition
The limitation of hierarchical levels
Figure 4.

We find that the cascade model explains the limitation of the flow hierarchy in expertise networks. In particular, the flow distance L_i of the i-th node in the model satisfies:

![][1]
[1]:http://latex.codecogs.com/svg.latex?L_i=1+\frac{1}{n-i}(L_1+L_2+...+L_{i-1})

![][2]
[2]:http://latex.codecogs.com/svg.latex?\left\{\begin{array}{ll}L_i=1+\frac{1}{n-i}(L_1+L_2+...+L_{i-1})&:i\leq\frac{n}{2}\\L_i=1+\frac{1}{i-1}(L_1+L_2+...+L_{i-1})&:n\geq{i}>\frac{n}{2}\\L_i=1+\frac{4}{n^2}\sum_{k=1}^{n/2}(n-2k+1)L_{n+1-k}&:i=n+1,\;n\;\text{even}\\L_i=1+\frac{4}{n^2-1}\sum_{k=1}^{(n-1)/2}(n-2k+1)L_{n+1-k}&:i=n+1,\;n\;\text{odd}\end{array}\right.
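The recurrence can be evaluated numerically with a short sketch. It rests on our reading of the piecewise formula, which is partly an assumption: the first branch applies for i at most n/2, the second (dividing by i - 1) for i above n/2 and up to n, and the last node i = n + 1 takes a weighted average with weights (n - 2k + 1), normalized by n^2/4 for even n and (n^2 - 1)/4 for odd n. The function name is hypothetical.

```python
def cascade_flow_levels(n):
    """Return [L_1, ..., L_{n+1}] under the assumed piecewise recurrence."""
    if n < 2:
        raise ValueError("n must be at least 2")
    L = [0.0] * (n + 2)  # 1-indexed; L[0] is unused
    prefix = 0.0         # running sum L_1 + ... + L_{i-1}
    for i in range(1, n + 1):
        if 2 * i <= n:
            L[i] = 1.0 + prefix / (n - i)   # branch for i at most n/2
        else:
            L[i] = 1.0 + prefix / (i - 1)   # branch for i above n/2, up to n
        prefix += L[i]
    half = n // 2  # n/2 for even n, (n-1)/2 for odd n
    norm = n * n if n % 2 == 0 else n * n - 1
    # The weights (n - 2k + 1) sum to norm / 4, so this is a weighted average.
    weighted = sum((n - 2 * k + 1) * L[n + 1 - k] for k in range(1, half + 1))
    L[n + 1] = 1.0 + 4.0 * weighted / norm
    return L[1:]


print(cascade_flow_levels(2))  # [1.0, 2.0, 3.0]
print(cascade_flow_levels(3))  # [1.0, 2.0, 2.5, 3.5]
```

Under this reading, the factor 4/n^2 (or 4/(n^2 - 1) for odd n) is exactly the reciprocal of the sum of the weights, so the last node sits one level above a weighted average of the top half of the hierarchy.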

Figure 4. Analysis of L_i on i when n/2

See the following figure for simulation

Figure 4.

The distribution of users across hierarchical levels

We compare the model against StackExchange data.


Figure 4.
Helping gap as a function of user level
Figure 4.

ER random network

StackExchange communities
