Original in WeChat article.
Machine learning algorithms are tuned for continuous data, which is why embeddings always map into a continuous vector space.
Graph embedding is an approach used to transform nodes, edges, and their features into a lower-dimensional vector space while maximally preserving properties like graph structure and information.
https://towardsdatascience.com/overview-of-deep-learning-on-graph-embeddings-4305c10ad4a4
Graph representation:
There are a variety of ways to embed graphs, each with a different level of granularity. Embeddings can be performed at the node level, at the sub-graph level, or through strategies like graph walks.
DeepWalk belongs to the family of graph embedding techniques that use walks, a concept from graph theory that enables the traversal of a graph by moving from one node to another, as long as the two nodes are connected by a common edge.
Below is the code for a random walk on an example graph, revised according to the repository on GitHub.
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
import random

# Create graph
G = nx.Graph()

# Add nodes 0 through 9
G.add_nodes_from(range(0, 10))

# Add edges
G.add_edge(0, 1)
G.add_edge(1, 2)
G.add_edge(1, 6)
G.add_edge(6, 4)
G.add_edge(4, 3)
G.add_edge(4, 5)
G.add_edge(6, 7)
G.add_edge(7, 8)
G.add_edge(7, 9)
G.add_edge(2, 3)
G.add_edge(2, 4)
G.add_edge(2, 5)

# Draw graph
#nx.draw(G)
#plt.show()

# Define red and green nodes (not used by the walk below)
red_vertices = [0, 2, 3, 9]
green_vertices = [5, 8]

# Count the total number of steps taken
nsuccess = 0

# Run the walk procedure once (increase the range to repeat it, e.g. 1 million times)
for step in range(1, 2):
    # Start each walk from node 1
    vertexid = 1
    # Dictionary that associates each node with the number of times it was visited
    visited_vertices = {}
    # Store and print the path
    path = [vertexid]
    print("Step: %d" % (step))
    # Execute a random walk of 9 steps (increase the range for longer walks)
    for counter in range(1, 10):
        # Extract the neighborhood of the current vertex
        vertex_neighbors = [n for n in G.neighbors(vertexid)]
        # Uniform probability of moving to each neighbor
        probability = [1. / len(vertex_neighbors)] * len(vertex_neighbors)
        # Choose the next vertex from the neighborhood
        vertexid = np.random.choice(vertex_neighbors, p=probability)
        # Accumulate the number of times each vertex is visited
        if vertexid in visited_vertices:
            visited_vertices[vertexid] += 1
        else:
            visited_vertices[vertexid] = 1
        # Append to the path
        path.append(vertexid)
        nsuccess = nsuccess + 1
    # Sort the vertices by visit count, in descending order
    mostvisited = sorted(visited_vertices, key=visited_vertices.get, reverse=True)
    print("Path: ", path)
    # Show up to the 10 most visited vertices
    print("Most visited nodes: ", mostvisited[:10])
The approach taken by DeepWalk is to complete a series of random walks and model, at each step, the probability:

$$P\left(v_i \mid \Phi(v_1), \Phi(v_2), \ldots, \Phi(v_{i-1})\right)$$

The goal is to estimate the likelihood of observing node $v_i$ given the embeddings $\Phi$ of all the previous nodes visited so far in the random walk.
Next, that matrix representation of the graph is fed to a neural network to make a prediction about a node, e.g. a feature or a classification. The method used to make predictions is skip-gram.
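As a minimal sketch of that pipeline: generate random walks, treat them as "sentences", and feed them to a skip-gram model. This assumes gensim 4.x is installed; the helper random_walks and all hyperparameter values here are illustrative, not from the original article.

import random
import networkx as nx
from gensim.models import Word2Vec  # assumes gensim 4.x

# Hypothetical helper: uniform random walks starting from every node
def random_walks(G, num_walks=10, walk_length=20):
    walks = []
    for _ in range(num_walks):
        for start in G.nodes():
            walk = [start]
            while len(walk) < walk_length:
                neighbors = list(G.neighbors(walk[-1]))
                walk.append(random.choice(neighbors))
            # skip-gram expects sequences of string tokens
            walks.append([str(v) for v in walk])
    return walks

G = nx.karate_club_graph()  # small connected toy graph
walks = random_walks(G)
# sg=1 selects the skip-gram architecture, as DeepWalk does
model = Word2Vec(walks, vector_size=64, window=5, min_count=0, sg=1)
print(model.wv["0"][:5])  # first five dimensions of node 0's embedding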
Node2vec's idea: use flexible, biased random walks that can trade off between local and global views of the network.
The difference between Node2vec and DeepWalk is subtle but significant. Node2vec features a walk bias variable $\alpha$, which is parameterized by $p$ and $q$. The parameter $p$ prioritizes a breadth-first-search (BFS) procedure, while the parameter $q$ prioritizes a depth-first-search (DFS) procedure. The decision of where to walk next is therefore influenced by the probabilities $1/p$ or $1/q$.
$$\alpha_{pq}(t, x) = \begin{cases} \frac{1}{p} & \text{if } d_{tx} = 0 \\ 1 & \text{if } d_{tx} = 1 \\ \frac{1}{q} & \text{if } d_{tx} = 2 \end{cases}$$

where $t$ is the previous node, $x$ is the candidate next node, the walk currently resides at node $v$, and $d_{tx}$ is the shortest-path distance between $t$ and $x$.
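Below is a minimal sketch of one biased step under this definition (the function name biased_step and its default parameter values are illustrative):

import random
import networkx as nx

def biased_step(G, t, v, p=1.0, q=1.0):
    # Pick the next node x among v's neighbors, weighted by alpha_{pq}(t, x)
    neighbors = list(G.neighbors(v))
    weights = []
    for x in neighbors:
        if x == t:                  # d_tx = 0: stepping back to the previous node
            weights.append(1.0 / p)
        elif G.has_edge(t, x):      # d_tx = 1: x is also a neighbor of t
            weights.append(1.0)
        else:                       # d_tx = 2: moving away from t
            weights.append(1.0 / q)
    # random.choices accepts relative weights, so no normalization is needed
    return random.choices(neighbors, weights=weights, k=1)[0]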
As the visualization implies, BFS is ideal for learning local neighborhoods, while DFS is better for learning global structure.
A modification of node2vec, graph2vec essentially learns to embed a graph's sub-graphs.
Using an analogy with word2vec: if a document is made of sentences (which are in turn made of words), then a graph is made of sub-graphs (which are in turn made of nodes).
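As a rough sketch of that analogy (not graph2vec's exact pipeline), Weisfeiler-Lehman sub-graph hashes can stand in for the "words" of each graph "document". This assumes networkx >= 2.6 (for weisfeiler_lehman_subgraph_hashes) and gensim 4.x; the toy graphs and hyperparameters are illustrative.

import networkx as nx
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Collect WL sub-graph hashes as the "words" of a graph "document"
def graph_to_words(G, iterations=2):
    hashes = nx.weisfeiler_lehman_subgraph_hashes(G, iterations=iterations)
    return [h for per_node in hashes.values() for h in per_node]

graphs = [nx.cycle_graph(6), nx.path_graph(6), nx.complete_graph(6)]
docs = [TaggedDocument(words=graph_to_words(g), tags=[i])
        for i, g in enumerate(graphs)]
model = Doc2Vec(docs, vector_size=32, min_count=0, epochs=50)
print(model.dv[0][:5])  # embedding of the first toy graph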
Three steps: