Study Notes (2): Pseudocode Implementation of the Label Propagation Algorithm

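Algorithm name: Label Propagation Algorithm (LPA) for community detection
Input: adjacency matrix AdjacentMatrix of an undirected, unweighted graph; number of nodes VerticeNum
Output: array Community storing the label of each node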

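//Initialize the label of every node
For i <- 0 to VerticeNum-1 Do
    Community[i] <- i
    //Find all neighbors of node i and store them in Neighbor[i]
    FindMaetexNonZero(i,AdjacentMatrix,Neighbor[i])
While stopping criterion not met and iteration limit not exceeded Do
    RandomSort(SS)//Generate a random node ordering SS
    For i <- 0 to VerticeNum-1 Do
        //Visit the nodes in the random order given by SS
        v <- SS[i]
        //Collect the most frequent label(s) among the neighbors of v into lable
        VectorFrequency(Neighbor[v], lable)
        //If a single label is most frequent, assign it directly
        if lable.size() = 1 then
            Community[v] <- lable[0]
        //If several labels tie for most frequent, pick one of them at random
        else
            Community[v] <- lable[random]
return Community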
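
The pseudocode above can be sketched in Python roughly as follows. This is a minimal sketch, not the author's implementation: the function names lpa and most_frequent_labels, the max_iter and seed parameters, and the stopping rule (every node already carries one of the most frequent labels among its neighbors) are assumptions added for illustration.

```python
import random
from collections import Counter

def most_frequent_labels(labels):
    """Return every label that reaches the highest count (plays the role of VectorFrequency)."""
    counts = Counter(labels)
    best = max(counts.values())
    return [lab for lab, c in counts.items() if c == best]

def lpa(adj, max_iter=100, seed=None):
    """Asynchronous label propagation on a 0/1 adjacency matrix given as a list of lists."""
    rng = random.Random(seed)
    n = len(adj)
    # Neighbor lists, as FindMaetexNonZero would produce them
    nbrs = [[j for j, a in enumerate(row) if a != 0] for row in adj]
    community = list(range(n))      # every node starts with its own label
    order = list(range(n))
    for _ in range(max_iter):       # iteration limit
        rng.shuffle(order)          # random node ordering SS
        for v in order:
            if not nbrs[v]:
                continue            # an isolated node keeps its own label
            cands = most_frequent_labels(community[u] for u in nbrs[v])
            community[v] = cands[0] if len(cands) == 1 else rng.choice(cands)
        # Assumed stopping rule: each node holds a most frequent neighbor label
        if all(not nbrs[v]
               or community[v] in most_frequent_labels(community[u] for u in nbrs[v])
               for v in range(n)):
            break
    return community
```

On a graph made of two disconnected triangles, each triangle ends up with a single label of its own, since labels cannot cross between components.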

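Improved algorithms:
1. Initialization with non-overlapping triangles:
Algorithm name: Non-overlapping triangle initialization for label propagation (Findunitriangle)
Input: adjacency matrix AdjacentMatrix of an undirected graph; number of nodes VerticeNum; neighbor set Neighbor of each node; label array Community
Output: array Community storing the label of each node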

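//Mark every node as unvisited
For i <- 0 to VerticeNum-1 Do isVisited[i] <- False
//Community label counter
c <- 0
//Find non-overlapping triangles and give their three nodes the same label as an initial community
For i <- 0 to VerticeNum-1 Do
    For a <- 0 to Neighbor[i].size-1 Do
        j <- Neighbor[i][a]
        //Find the neighbors of j (a neighbor of i) and store them in NeighborTemp
        FindMaetexNonZero(j,AdjacentMatrix,NeighborTemp)
        For b <- 0 to NeighborTemp.size-1 Do
            k <- NeighborTemp[b]
            //If i, j, k form a triangle and none of them has been visited, initialize them
            if AdjacentMatrix[k][i] = 1 and isVisited[i], isVisited[j], isVisited[k] are all False then
                Community[i], Community[j], Community[k] <- c
                isVisited[i], isVisited[j], isVisited[k] <- True
                //Advance the label only after a triangle has been labeled
                c <- c+1
return Community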
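
A possible Python sketch of the triangle initialization is shown below. It works on neighbor lists rather than on the adjacency matrix, and the function name triangle_init is hypothetical. One further assumption: nodes that do not end up in any triangle receive fresh labels after the triangle labels, so that they cannot collide with a triangle's label.

```python
def triangle_init(nbrs):
    """Give the three nodes of each non-overlapping triangle a shared initial label.

    nbrs is a list of neighbor lists. Nodes left outside every triangle get
    fresh labels afterwards (an assumption made to avoid label collisions).
    """
    n = len(nbrs)
    nbr_sets = [set(a) for a in nbrs]
    community = [None] * n          # None plays the role of isVisited = False
    c = 0                           # community label counter
    for i in range(n):
        if community[i] is not None:
            continue
        for j in nbrs[i]:
            if j == i or community[j] is not None:
                continue
            # Look for an unvisited node k adjacent to both i and j
            k = next((k for k in nbrs[j]
                      if k != i and k != j
                      and k in nbr_sets[i]
                      and community[k] is None), None)
            if k is not None:
                community[i] = community[j] = community[k] = c
                c += 1
                break               # i is now used; move on to the next node
    # Remaining nodes keep singleton communities with unused labels
    for v in range(n):
        if community[v] is None:
            community[v] = c
            c += 1
    return community
```

The resulting array can then be passed to the propagation loop in place of the Community[i] <- i initialization.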

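2. Generating a partially random queue based on label entropy:
Algorithm name: Label-entropy partially random queue generation (EntropySort_Random)
Input: adjacency matrix AdjacentMatrix of an undirected graph; number of nodes VerticeNum; label array Community
Output: partially random queue Ssort used by the label-entropy algorithm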

//Initialize the queue of structs (S: label entropy, px: node index)
For i <- 0 to VerticeNum-1 Do
    Ssort[i].S <- 0.0
    Ssort[i].px <- i
//Compute the label entropy of every node
For i <- 0 to VerticeNum-1 Do
    //Reset the counter arrays so that counts do not carry over from the previous node
    For k <- 0 to VerticeNum-1 Do
        t[k] <- 0
        lableNum[k] <- 0
    //Find all neighbors of node i and store them in Neighbor
    FindMaetexNonZero(i,AdjacentMatrix,Neighbor)
    For j <- 0 to Neighbor.size-1 Do
        //Count each label carried by the neighbors of node i (k ranges over all possible labels)
        For k <- 0 to VerticeNum-1 Do
            if Community[Neighbor[j]] = k then lableNum[k]++
        //Count the labels of i's neighbors within L{v,N(v)}
        For k <- 0 to Neighbor.size-1 Do
            if Community[Neighbor[k]] = Community[Neighbor[j]] then t[Community[Neighbor[j]]]++
    //If node i shares its label with none of its neighbors, count i's own label separately
    if t[Community[i]] = 0 then t[Community[i]]++
    For j <- 0 to VerticeNum-1 Do
        if t[j] != 0 and lableNum[j] != 0 then
            t[j] <- t[j]/lableNum[j]
            pl <- t[j]/(Neighbor.size()+1.0)
            Ssort[i].S += pl*abs(log(pl))
//Sort Ssort by label entropy S using quicksort
qsort(Ssort)
//Split the queue into 3 parts and shuffle each part separately
temp0 <- Ssort
RandomSort(temp0,VerticeNum/3)
temp1 <- Ssort+VerticeNum/3
RandomSort(temp1,VerticeNum/3)
temp2 <- Ssort+(VerticeNum/3)*2
RandomSort(temp2,VerticeNum-(VerticeNum/3)*2)

return Ssort
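
The ordering step can be illustrated with a short Python sketch. It is a simplified reading rather than a line-by-line translation: the entropy is taken as the Shannon entropy of the label distribution over node v together with its neighbors, and the sort is assumed to be ascending (the notes do not state the direction). The names label_entropy and entropy_sort_random, and the seed parameter, are hypothetical.

```python
import math
import random
from collections import Counter

def label_entropy(v, nbrs, community):
    """Shannon entropy of the labels found on v and its neighbors (assumed definition)."""
    labels = [community[v]] + [community[u] for u in nbrs[v]]
    total = len(labels)
    return -sum((c / total) * math.log(c / total)
                for c in Counter(labels).values())

def entropy_sort_random(nbrs, community, seed=None):
    """Sort nodes by label entropy, then shuffle each third of the queue separately."""
    rng = random.Random(seed)
    n = len(nbrs)
    # Ascending order is an assumption; sorted() is stable for equal entropies
    order = sorted(range(n), key=lambda v: label_entropy(v, nbrs, community))
    third = n // 3
    # Same split as temp0, temp1, temp2: the last part absorbs the remainder
    parts = [order[:third], order[third:2 * third], order[2 * third:]]
    for part in parts:
        rng.shuffle(part)
    return [v for part in parts for v in part]
```

Shuffling within each third keeps the coarse entropy ordering while still randomizing the visiting order of nodes with similar entropy.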

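3. Label propagation based on the label distribution of the neighbors' neighbors
Algorithm name: Label propagation algorithm with a modified random step (LPA_R)
Input: adjacency matrix AdjacentMatrix of an undirected, unweighted graph; number of nodes VerticeNum
Output: array Community storing the label of each node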

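//Initialize the label of every node
For i <- 0 to VerticeNum-1 Do
    Community[i] <- i
    //Find all neighbors of node i and store them in Neighbor[i]
    FindMaetexNonZero(i,AdjacentMatrix,Neighbor[i])
While stopping criterion not met and iteration limit not exceeded Do
    RandomSort(SS)//Generate a random node ordering SS
    For i <- 0 to VerticeNum-1 Do
        //Visit the nodes in the random order given by SS
        v <- SS[i]
        //Collect the most frequent label(s) among the neighbors of v into lable
        VectorFrequency(Neighbor[v], lable)
        //If a single label is most frequent, assign it directly
        if lable.size() = 1 then
            Community[v] <- lable[0]
        //If several labels tie, examine the label distribution among the neighbors of v's neighbors
        else
            For j <- 0 to Neighbor[v].size-1 Do
                //Neighbors of the j-th neighbor of v
                NeighborTemp <- Neighbor[Neighbor[v][j]]
                For k <- 0 to NeighborTemp.size-1 Do
                    //Collect the most frequent label(s) among the neighbors of node NeighborTemp[k] into lablet
                    VectorFrequency(Neighbor[NeighborTemp[k]], lablet)
                    //If a single label is most frequent, assign it directly
                    if lablet.size() = 1 then
                        Community[v] <- lablet[0]
                    //If several labels still tie, pick one of them at random
                    else
                        Community[v] <- lablet[random]
return Community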
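
The tie-breaking idea can be sketched in Python as follows. This is one possible reading, not the exact loop of the pseudocode: when several labels tie among the direct neighbors, the sketch counts the labels of the neighbors' neighbors and keeps the tied label that is most frequent there, falling back to a random choice only if the tie persists. The function name pick_label and the rng parameter are hypothetical.

```python
import random
from collections import Counter

def pick_label(v, nbrs, community, rng):
    """Choose a new label for v, breaking ties with the second-order neighborhood.

    Assumes v has at least one neighbor.
    """
    counts = Counter(community[u] for u in nbrs[v])
    best = max(counts.values())
    tied = [lab for lab, c in counts.items() if c == best]
    if len(tied) == 1:
        return tied[0]              # a unique most frequent label: assign directly
    # Labels carried by the neighbors of v's neighbors, excluding v itself
    second = Counter(community[w]
                     for u in nbrs[v]
                     for w in nbrs[u] if w != v)
    # Counter returns 0 for labels that never occur in the second-order neighborhood
    best2 = max(second[lab] for lab in tied)
    tied2 = [lab for lab in tied if second[lab] == best2]
    return tied2[0] if len(tied2) == 1 else rng.choice(tied2)
```

For example, if the two neighbors of v carry labels 10 and 20 and only label 10 appears again one hop further out, the function returns 10 instead of drawing at random.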

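Algorithm name: Label propagation algorithm for overlapping community detection (Copra)
Input: adjacency matrix AdjacentMatrix of an undirected, unweighted graph; number of nodes VerticeNum
Output: communities coms, derived from the label set of each node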

//Initialize the label set of every node
For i <- 0 to VerticeNum-1 Do
    label_old[i] <- {(i,1)}
    label_new[i] <- {}
    //Find all neighbors of node i and store them in Neighbor[i]
    FindMaetexNonZero(i,AdjacentMatrix,Neighbor[i])
//(1) Synchronous update:
asynchronous = False
/* (2) Asynchronous update: asynchronous = True */
While stopping criterion not met (stop once min = oldmin) Do
    RandomSort(SS)//Generate a random node ordering SS
    /* (3) Alternating synchronous and asynchronous updates: If asynchronous: asynchronous = False Else: asynchronous = True */
    //Update the label set of every node
    For i <- 0 to VerticeNum-1 Do
        Propagate(SS[i], label_old, label_new, Neighbor, VerticeNum, asynchronous)
    //Track the number of labels after each iteration
    If get_ids(label_old) = get_ids(label_new):
        min <- mc(min, count(label_new))
    Else:
        min <- count(label_new)
    If min != oldmin:
        label_old <- label_new
        oldmin <- min
//Group the nodes into communities according to each node's label set
For each vertex x:
    ids <- get_ids(label_old[x])
    For eachc in ids:
        If eachc in coms && eachc in sub:
            coms[eachc].append(x)
            sub.update({eachc: set(sub[eachc]) & set(ids)})
        Else:
            coms.update({eachc:[x]})
            sub.update({eachc:ids})
//Remove communities that are subsets of larger communities
Remove_sub(coms,sub)
//Split communities that are not internally connected
Split_discon_communities(coms)
return coms
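
The grouping and subset-removal steps at the end of Copra can be sketched in Python as below. The sketch replaces the sub bookkeeping of the pseudocode with a direct set comparison, so it is a simpler stand-in rather than the same procedure; the function names communities_from_labels and remove_subsets are hypothetical, and labels are assumed to be comparable values such as integers.

```python
def communities_from_labels(labels):
    """Group nodes by label.

    labels maps node -> {label: belonging coefficient}; a node with several
    labels is placed in several (overlapping) communities.
    """
    coms = {}
    for x, label_set in labels.items():
        for c in label_set:
            coms.setdefault(c, set()).add(x)
    return coms

def remove_subsets(coms):
    """Drop every community contained in another one (duplicates keep one copy)."""
    # Visit larger communities first so that any superset is already kept
    items = sorted(coms.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    kept = {}
    for c, members in items:
        if not any(members <= other for other in kept.values()):
            kept[c] = members
    return kept
```

Splitting disconnected communities (Split_discon_communities) is not covered here; it would additionally need the neighbor lists to find connected components inside each community.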
Appendix:
Propagate(x, old, new, neighbours, v, asynchronous):
    //Shuffle the neighbors to ensure randomness
    random.shuffle(neighbours[x])
    //Update the node from the label sets of its neighbors
    For eachpoint in neighbours[x]:
        For eachlable in old[eachpoint]:
            b = old[eachpoint][eachlable]
            If eachlable in new[x]:
                new[x][eachlable] += b
            Else:
                new[x].update({eachlable: b})
            //Asynchronous update
            If asynchronous:
                old = copy.deepcopy(new)
    //Normalize the label set of the node
    Normalize(new[x])
    //Remove candidates below 1/v; if all of them are below 1/v, keep only the one with the largest b, otherwise renormalize
    maxb = 0
    For each in a copy of new[x]:
        If new[x][each] < 1/float(v):
            //Remember the largest removed candidate before deleting it
            If new[x][each] > maxb:
                maxb = new[x][each]
                maxc = each
            new[x] -= each
    If len(new[x]) == 0:
        new[x][maxc] = 1
    Else:
        Normalize(new[x])
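
A minimal Python sketch of the synchronous case of Propagate is given below. It returns the new label set instead of mutating a shared new dictionary, it omits the asynchronous branch and the neighbor shuffling, and it resolves ties for the largest coefficient by taking the first label encountered; all of these are simplifying assumptions, as are the function names normalize and propagate.

```python
def normalize(weights):
    """Scale the belonging coefficients in place so that they sum to 1."""
    total = sum(weights.values())
    for lab in weights:
        weights[lab] /= total

def propagate(x, old, nbrs, v):
    """Compute the new label set of x from the old label sets of its neighbors.

    old maps node -> {label: belonging coefficient}; 1/v is the cut-off below
    which a candidate label is discarded.
    """
    new = {}
    for u in nbrs[x]:
        for lab, b in old[u].items():
            new[lab] = new.get(lab, 0.0) + b
    if not new:
        return dict(old[x])         # isolated node: keep its labels (assumption)
    normalize(new)
    threshold = 1.0 / v
    kept = {lab: b for lab, b in new.items() if b >= threshold}
    if not kept:
        # Every candidate is below 1/v: keep only the label with the largest b
        best = max(new, key=new.get)
        return {best: 1.0}
    normalize(kept)
    return kept
```

Building kept as a separate dictionary avoids deleting entries from a dictionary while iterating over it, which Python does not allow.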
