Network Embedding: A Reading Report

author: Kuncheng Xie

note: Owing to the author's limited ability and time, this article may contain errors and omissions.

What is network embedding

Network embedding maps the vertices of a network into a low-dimensional vector space. The resulting vectors are dense and continuous, and they model the relationships between vertices in a meaningful way.

What is it used for

Things in nature are often discrete. How should we represent discrete objects such as words? One naive way is to represent them by index numbers: 1, 2, 3, … But such a representation does not work well in neural networks, since the numbers impose an arbitrary order and magnitude on the words. A better representation is one-hot encoding, a vector with exactly one 1 and 0 everywhere else, for example representing the word ‘apple’ with the vector [0, 0, ⋯, 0, 1, 0, ⋯, 0].

However, this kind of representation has two big problems:

  • Sparsity. Each vector has as many dimensions as the vocabulary has words, usually tens of thousands, and it is very inefficient to store such a vector when only a single entry is 1.
  • Meaninglessness. The representations of different words are entirely unrelated. For example, the words ‘apples’ and ‘apple’ are very close in meaning, but their vectors have nothing in common.
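
The two problems above can be seen in a minimal sketch. The four-word vocabulary and the helper names `one_hot` and `dot` are assumptions made for illustration only; a real vocabulary would be far larger.

```python
def one_hot(index, size):
    """Return a list of length `size` with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical toy vocabulary; real ones have tens of thousands of words.
vocab = ["apple", "apples", "banana", "car"]
word_to_index = {word: i for i, word in enumerate(vocab)}

apple = one_hot(word_to_index["apple"], len(vocab))
apples = one_hot(word_to_index["apples"], len(vocab))

# Sparsity: only one entry out of len(vocab) is non-zero.
nonzero_entries = sum(apple)

# Meaninglessness: any two distinct words have dot product 0,
# so 'apple' is exactly as far from 'apples' as it is from 'car'.
similarity = dot(apple, apples)
```

Here `nonzero_entries` stays at 1 no matter how large the vocabulary grows, and `similarity` is 0 for every pair of distinct words, however related they are.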

Word embedding resolves both problems. It relies on the distributional hypothesis[1], i.e. the assumption that words appearing in similar contexts (the surrounding words) have similar meanings. It uses unsupervised learning to exploit large text corpora. The learned representation vectors can then be used in downstream tasks such as sentence sentiment analysis, reading comprehension and so on.

The same problems arise when representing vertices in networks, such as users in a social network. Network embedding follows the ideas of word embedding and turns network information into dense, low-dimensional, real-valued vectors. The learned vertex vectors can be used as input features for existing machine learning algorithms[2] in tasks such as vertex classification, community discovery, recommendation, etc.

How to learn it

There are many kinds of algorithms for learning network representations. I will introduce some representative modern models that I know.

DeepWalk[3] was the first to bring the technique behind word embedding, neural network learning, into network embedding. Through truncated random walks, we can sample many vertex sequences and feed them to a neural network model such as SkipGram[4], treating each sequence like a sentence. It outperforms conventional methods by a large margin on some tasks, demonstrating the benefit of exploiting the relations between vertices in a network and encoding them as dense, low-dimensional vectors.
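
The sampling step can be sketched roughly as follows. This is only an illustration of truncated random walks and of the (center, context) pairs a SkipGram-style model would train on, not the authors' implementation. The toy graph `adj`, the function names, and the parameter values are all assumptions.

```python
import random


def truncated_random_walk(adj, start, walk_length, rng):
    """Walk uniformly at random from `start`, stopping after `walk_length` vertices."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


def generate_walks(adj, walks_per_vertex, walk_length, seed=0):
    """Start `walks_per_vertex` walks from every vertex, in shuffled order."""
    rng = random.Random(seed)
    vertices = list(adj)
    walks = []
    for _ in range(walks_per_vertex):
        rng.shuffle(vertices)
        for v in vertices:
            walks.append(truncated_random_walk(adj, v, walk_length, rng))
    return walks


def skipgram_pairs(walk, window):
    """Return (center, context) pairs within `window` positions of each other."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Hypothetical undirected toy graph stored as adjacency lists.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

walks = generate_walks(adj, walks_per_vertex=2, walk_length=5)
pairs = [pair for walk in walks for pair in skipgram_pairs(walk, window=2)]
```

Each walk plays the role of a sentence and each vertex the role of a word, so `pairs` could be passed to any SkipGram-style trainer.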

Afterwards, many kinds of network representation learning algorithms sprang up, each focusing on different aspects of networks.

LINE[5] doesn’t use neural networks to learn the embeddings. Instead, it tries to minimize the distance, usually measured with KL-divergence, between the empirical distribution and a carefully designed model distribution. It also makes use of both the first-order and the second-order proximities to learn a good embedding. It scales to large networks and outperforms DeepWalk on some tasks.
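
For reference, the two objectives can be written out as below. The notation is not taken from this report and follows the LINE paper as I recall it: $w_{ij}$ is the weight of edge $(i,j)$, $\vec{u}_i$ is the embedding of vertex $v_i$, and $\vec{u}'_j$ is the embedding of $v_j$ when it acts as a context. Constants that do not depend on the embeddings are dropped after substituting the KL-divergence.

```latex
\begin{align*}
% First-order proximity: joint probability of an observed edge
p_1(v_i, v_j) &= \frac{1}{1 + \exp\!\left(-\vec{u}_i^{\top} \vec{u}_j\right)}, &
O_1 &= -\sum_{(i,j) \in E} w_{ij} \log p_1(v_i, v_j), \\
% Second-order proximity: probability of context v_j given vertex v_i
p_2(v_j \mid v_i) &= \frac{\exp\!\left(\vec{u}_j'^{\top} \vec{u}_i\right)}
                         {\sum_{k=1}^{|V|} \exp\!\left(\vec{u}_k'^{\top} \vec{u}_i\right)}, &
O_2 &= -\sum_{(i,j) \in E} w_{ij} \log p_2(v_j \mid v_i).
\end{align*}
```

Minimizing $O_1$ pulls directly connected vertices together, while minimizing $O_2$ makes vertices with similar neighborhoods end up with similar embeddings.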

Soon after, node2vec[6] was proposed, with a more flexible notion of a node’s network neighborhood that lets it learn richer representations. The algorithm designs a biased random walk procedure whose parameters control how the walk interpolates between breadth-first search (BFS) and depth-first search (DFS).
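
A minimal sketch of such a biased walk follows, assuming an unweighted graph stored as adjacency lists. In the node2vec paper the two parameters are the return parameter `p` and the in-out parameter `q`. The function name, the toy graph, and the chosen values are illustrative assumptions, and the sketch omits the precomputed sampling tables used for efficiency.

```python
import random


def node2vec_walk(adj, start, walk_length, p, q, rng):
    """Second-order biased walk: the next step depends on the previous vertex."""
    walk = [start]
    while len(walk) < walk_length:
        cur = walk[-1]
        neighbors = adj[cur]
        if not neighbors:  # dead end: stop early
            break
        if len(walk) == 1:
            # No previous vertex yet, so the first step is uniform.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for x in neighbors:
            if x == prev:
                weights.append(1.0 / p)  # step back to the previous vertex
            elif x in adj[prev]:
                weights.append(1.0)      # stay near the previous vertex (BFS-like)
            else:
                weights.append(1.0 / q)  # move further away (DFS-like)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical undirected toy graph stored as adjacency lists.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}

rng = random.Random(0)
# q < 1 favors outward, DFS-like exploration; q > 1 keeps the walk local.
walk = node2vec_walk(adj, start=0, walk_length=6, p=1.0, q=0.5, rng=rng)
```

With `p = q = 1` every neighbor gets the same weight, so the procedure reduces to the uniform random walk used by DeepWalk.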

Information networks, where each vertex also carries rich external information such as text and labels, have drawn attention recently. TADW[7] uses matrix factorization to incorporate text information. CANE[8] focuses on the different aspects a vertex shows when interacting with different neighbor vertices, and uses a mutual attention mechanism to obtain context-aware text information. It combines the structure embedding and the text embedding to form embeddings for each edge in the network. The proposed objective function and basic ideas are followed by X. Zhang (2018)[9]. They use a subtler structure learning strategy, the diffusion process, to capture more representative information.

Future work

Network embedding has progressed a lot since DeepWalk. However, there are still some aspects that need to be improved, as proposed by Tu et al.[2]:

  • Knowledge-driven network representation learning. Current network embedding methods don’t take external information such as knowledge graphs into consideration, although it could improve their inference ability. I think this is a shortcoming of representation learning in general.
  • Large-scale network representation learning. Current algorithms can’t achieve both efficient training and good task performance. It is difficult to fit real-world networks at the scale of millions or billions.
  • Network embedding for specific downstream tasks. What current methods have in common is that they try to fit the true structure of the network, which may not be useful for some specific downstream tasks.


