DeepWalk

Original paper

"DeepWalk: Online Learning of Social Representations"

Highlights

In this paper we introduce deep learning (unsupervised feature learning) techniques, which have proven successful
in natural language processing, into network analysis for the first time.

Goal

DeepWalk takes a graph as input and produces a latent representation as an output.

We seek learning social representations with the following characteristics:

  • Adaptability - Real social networks are constantly evolving; new social relations should not require repeating the learning process all over again.
  • Community aware - The distance between latent dimensions should represent a metric for evaluating social similarity between the corresponding members of the network. This allows generalization in networks with homophily.
  • Low dimensional - When labeled data is scarce, low-dimensional models generalize better, and speed up convergence and inference.
  • Continuous - We require latent representations to model partial community membership in continuous space. In addition to providing a nuanced view of community membership, a continuous representation has smooth decision boundaries between communities which allows more robust classification.

Similarity between real-world networks and natural language

[Figure 1]

Language model

The goal of language modeling is to estimate the likelihood of a specific sequence of words appearing in a corpus.
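
To make "likelihood of a sequence" concrete: by the chain rule, the probability of a word sequence factorizes into next-word predictions, so a language model can be trained by predicting each word from the words before it. The symbols $w_0, \dots, w_n$ are notation introduced for this note, not taken from the text above.

```latex
\Pr(w_0, w_1, \dots, w_n) = \prod_{i=0}^{n} \Pr(w_i \mid w_0, \dots, w_{i-1})
```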

Turning the network problem into an NLP problem

In this work, we present a generalization of language modeling to explore the graph through a stream of short random walks. These walks can be thought of as short sentences and phrases in a special language.
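
Carried over to graphs, the vertices visited by a walk play the role of words. The following is a sketch of that formulation as restated from the DeepWalk paper, not a derivation: $\Phi$ is assumed to map each vertex to a $d$-dimensional vector, $w$ is the window size, and $v_1, v_2, \dots$ are the vertices of one walk.

```latex
% Language-model view of a walk: predict the next vertex from the ones visited so far.
\Pr\bigl(v_i \mid \Phi(v_1), \Phi(v_2), \dots, \Phi(v_{i-1})\bigr)

% Skip-gram relaxation: predict the vertices in a window around v_i
% from the representation of v_i alone.
\min_{\Phi} \; -\log \Pr\bigl(\{v_{i-w}, \dots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\bigr)
```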

Random walk with restart

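  # Method of a dict-like Graph class (node -> list of neighbors).
  # Requires `import random` at module level.
  def random_walk(self, path_length, alpha=0, rand=random.Random(), start=None):
    """ Returns a truncated random walk.

        path_length: Length of the random walk.
        alpha: probability of restarts.
        rand: random number generator used for all sampling.
        start: the start node of the random walk.
    """
    G = self
    # Compare against None so that a falsy node id such as 0 is still honored.
    if start is not None:
      path = [start]
    else:
      # Sampling is uniform w.r.t V, and not w.r.t E
      path = [rand.choice(list(G.keys()))]

    while len(path) < path_length:
      cur = path[-1]
      if len(G[cur]) > 0:
        if rand.random() >= alpha:
          # With probability 1 - alpha, step to a uniformly chosen neighbor.
          path.append(rand.choice(G[cur]))
        else:
          # With probability alpha, restart: jump back to the first node.
          path.append(path[0])
      else:
        # Dead end: the current node has no neighbors, so the walk is cut short.
        break
    return [str(node) for node in path]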

Result of the walks:
[Figure 2: sample walk output]
See how much these look like sentences in a corpus!
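
A single walk is one "sentence"; a corpus needs many of them. The sketch below is one plausible way to collect walks, assuming the graph is a plain dict of adjacency lists and that every node starts the same number of walks. The names `build_corpus`, `num_walks`, and `toy_graph` are hypothetical and chosen for illustration; restarts are left out to keep it short.

```python
import random


def random_walk(graph, start, path_length, rand):
    """Uniform truncated random walk over an adjacency-list dict (no restarts)."""
    path = [start]
    while len(path) < path_length:
        neighbors = graph[path[-1]]
        if not neighbors:  # dead end: stop early
            break
        path.append(rand.choice(neighbors))
    return [str(node) for node in path]


def build_corpus(graph, num_walks, path_length, seed=0):
    """Make num_walks passes; each pass starts one walk from every node, in shuffled order."""
    rand = random.Random(seed)
    nodes = list(graph.keys())
    corpus = []
    for _ in range(num_walks):
        rand.shuffle(nodes)
        for node in nodes:
            corpus.append(random_walk(graph, node, path_length, rand))
    return corpus


# Hypothetical toy graph: a triangle 0-1-2 with a tail 2-3.
toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
corpus = build_corpus(toy_graph, num_walks=2, path_length=5)
for sentence in corpus:
    print(" ".join(sentence))
```

Each printed line is a space-separated sequence of node ids, which is the shape of input a word embedding trainer expects.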

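With the "corpus" generated by the random walks above, off-the-shelf word embedding methods from NLP can be used to learn representations of the network's nodes.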
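
To see what a word embedding method actually consumes, the sketch below turns one walk into skip-gram style (center, context) training pairs. It illustrates the data preparation step only, not the training itself; `skipgram_pairs` and `window` are hypothetical names introduced here.

```python
def skipgram_pairs(walk, window):
    """Return (center, context) pairs for every node within `window` steps of the center."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a node is not its own context
                pairs.append((center, walk[j]))
    return pairs


walk = ["3", "2", "0", "1"]
pairs = skipgram_pairs(walk, window=1)
print(pairs)
```

In practice the walks would more likely be handed directly to an existing trainer (for example, gensim's Word2Vec accepts a list of token lists), which builds such pairs internally.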

word2vec

In 2013, Google open-sourced word2vec, a tool for computing word vectors. Introductions to word2vec:
[1] https://www.cnblogs.com/guoyaohua/p/9240336.html
[2] https://www.jianshu.com/p/39e3b771da3a
[3] http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/
