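Community Detection with the SLPA Algorithm

This post briefly introduces the principle of the SLPA label propagation algorithm and a Python implementation of it.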

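The idea behind SLPA (Speaker-Listener LPA)

Input parameters: the number of iterations T, and the threshold r, i.e. the minimum fraction of a node's memory that a label must occupy to be kept as a community.

Output: the community membership of each node.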

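  1. First, the memory of every node is initialised with a unique label.
  2. Then the following steps are repeated until the maximum number of iterations T is reached:

a. Select one node as the listener;

b. Each neighbor of the selected node acts as a speaker: it randomly picks a label from its own memory, with probability proportional to that label's frequency there, and sends the chosen label to the listener;

c. The listener adds the most popular of the received labels to its memory.

  3. Finally, post-processing based on the labels in the memories and the threshold r is applied to output the communities.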
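
The three steps above can be sketched on a toy graph. The following is a minimal illustration written for Python 3 using only the standard library; it is not the implementation given in this post. The helper names speak, listen and post_process, the four-node graph, and the random tie-breaking among equally popular labels are all assumptions made for the example.

```python
import random
from collections import Counter


def speak(memory, rng):
    """Speaker rule: draw one label with probability proportional to its count."""
    labels = list(memory)
    weights = [memory[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def listen(memories, neighbors, listener, rng):
    """Listener rule: collect one label per neighbor, store the most popular."""
    received = Counter(speak(memories[nb], rng) for nb in neighbors[listener])
    top = max(received.values())
    # Assumption: ties among the most frequent labels are broken at random.
    winner = rng.choice(sorted(l for l, c in received.items() if c == top))
    memories[listener][winner] += 1
    return winner


def post_process(memory, r):
    """Keep only the labels whose share of the memory is at least r."""
    total = sum(memory.values())
    return {label: count for label, count in memory.items() if count >= r * total}


# Hypothetical toy graph: a triangle 1-2-3 with node 4 attached to node 3.
neighbors = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
memories = {node: Counter({node: 1}) for node in neighbors}  # step 1
rng = random.Random(0)

for _ in range(20):  # step 2 with T = 20
    order = list(neighbors)
    rng.shuffle(order)  # every node is the listener once per iteration
    for listener in order:
        listen(memories, neighbors, listener, rng)

# Step 3 with r = 0.1
result = {node: post_process(mem, 0.1) for node, mem in memories.items()}
print(result)
```

After T iterations every memory holds T + 1 labels (the initial one plus one per iteration), so the threshold r is compared against a count of r * (T + 1).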

Python implementation

# -*- coding: UTF-8 -*-

"""
Created on 17-11-30

@summary: Implementation of SLPA (Speaker-listener Label Propagation Algorithm)

@author: dreamhome
"""
import networkx as nx
import numpy as np


def read_graph_from_file(path):
    """
    Read an undirected graph from an edge-list file.
    :param path: path of the edge-list file
    :return: Graph graph
    """
    # Build the graph
    graph = nx.Graph()
    # Collect the edges in edges_list
    edges_list = []
    # Read the edges; lines whose first two fields are not integers are skipped
    with open(path) as fp:
        for line in fp:
            edge = line.split()
            if len(edge) >= 2 and edge[0].isdigit() and edge[1].isdigit():
                edges_list.append((int(edge[0]), int(edge[1])))
    # Add the edges to the graph
    graph.add_edges_from(edges_list)

    # Give every node an initial label equal to its own ID
    for node, data in graph.nodes(data=True):
        data['label'] = node

    return graph


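def slpa(path, threshold, iteration):
    """
    SLPA algorithm
    :param path: path of the graph file
    :param threshold: minimum fraction of a node's memory a label must occupy to be kept
    :param iteration: number of iterations
    :return: list of label-count dicts; entry i holds the labels kept for node i + 1
    """
    graph = read_graph_from_file(path)

    # Initialise each node's memory with its own label
    # (assumes the nodes are numbered 1..N without gaps)
    node_memory = []
    for n in range(graph.number_of_nodes()):
        node_memory.append({n+1: 1})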

    # Main iteration
    for t in range(iteration):
        # Visit every node once as the listener, in random order
        order = [int(x) + 1 for x in np.random.permutation(graph.number_of_nodes())]
        for i in order:
            label_list = {}
            # Each neighbor (speaker) sends one label to the listener
            for j in graph.neighbors(i):
                labels = list(node_memory[j-1].keys())
                counts = list(node_memory[j-1].values())
                sum_label = sum(counts)
                # Sample a label with probability proportional to its count
                label = labels[np.random.multinomial(
                    1, [float(c) / sum_label for c in counts]).argmax()]
                label_list[label] = label_list.setdefault(label, 0) + 1
            # The listener adds the most popular received label to its memory
            selected_label = max(label_list, key=label_list.get)
            node_memory[i-1][selected_label] = node_memory[i-1].setdefault(selected_label, 0) + 1

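    # Remove the labels whose count falls below the threshold
    for memory in node_memory:
        sum_label = sum(memory.values())
        threshold_num = sum_label * threshold
        # Iterate over a copy so that entries can be deleted safely
        for k, v in list(memory.items()):
            if v < threshold_num:
                del memory[k]
    # Return the resulting partition
    return node_memory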


if __name__ == "__main__":
    path = "/home/dreamhome/network-datasets/dolphins/out.dolphins"
    # print(read_graph_from_file(path))
    print(slpa(path, 0.1, 20))

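slpa() returns one label dictionary per node rather than a list of communities. One possible way to turn that into overlapping communities is to group the nodes by the labels they kept. The sketch below assumes the list layout used by slpa() above (index i corresponds to node i + 1); the function name memories_to_communities and the sample input are hypothetical and not part of the original script.

```python
from collections import defaultdict


def memories_to_communities(node_memory):
    """Group node IDs by the labels that survived thresholding."""
    communities = defaultdict(set)
    # Index i of the list corresponds to node i + 1, as in slpa().
    for index, memory in enumerate(node_memory):
        for label in memory:
            communities[label].add(index + 1)
    return dict(communities)


# Hypothetical slpa() output for a 4-node graph: node 3 kept two labels,
# so it belongs to two communities at once.
example = [{1: 15}, {1: 12}, {1: 8, 4: 7}, {4: 14}]
print(memories_to_communities(example))
```

A node that keeps more than one label appears in more than one community, which is how SLPA expresses overlap.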
