Contents
background
(1) heterogeneity of graph
(2) semantic-level attention
(3) Node-level attention
(4) HAN
contributions
2. Related Work
2.1 GNN
2.2 Network Embedding
3. Preliminary
background
4. Proposed Model
4.1 Node-level attention
ideas: challenge: converting a heterogeneous graph into a homogeneous graph loses much of the semantic and structural information.
4.2 Semantic-level attention
ideas:semantics
4.3 Analysis of the proposed model
5.4 classification
5.5 Analysis
GNN, a powerful graph representation technique
problem: GNNs have not been fully explored for heterogeneous graphs, which contain different types of nodes and links.
solution: HAN (Heterogeneous graph Attention Network)
-> the model generates node embeddings by aggregating features from meta-path based neighbors in a hierarchical manner.
GAT: applies the attention mechanism to homogeneous graphs, which contain only one type of node and link.
problem: an intrinsic property of heterogeneous graph -> different types of nodes have different traits and their features may fall in different feature spaces.
-> how to handle such complex structured information and preserve the diverse feature information simultaneously is an urgent problem that needs to be solved.
background: Different meta-paths in the heterogeneous graph may extract diverse semantic information.
problem: how to select the most meaningful meta-paths and fuse the semantic information for the specific task.
solution: semantic-level attention aims to learn the importance of each meta-path and assign proper weights to them.
problem: treating different meta-paths equally is impractical and will weaken the semantic information <- some useful meta-paths
the model needs to distinguish the subtle differences among these neighbors and select the informative ones.
solution: node-level attention aims to learn the importance of meta-path based neighbors and assign different attention values to them.
e.g. The Terminator movie <-> meta-path relation
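To make "meta-path based neighbors" concrete, here is a minimal sketch on a toy movie graph. The three-movie graph, the incidence matrices `MA` / `MD` and the helper `metapath_neighbors` are illustrative assumptions, not data or code from the paper. Multiplying an incidence matrix by its transpose counts the length-2 paths Movie-X-Movie, so every non-zero entry marks a meta-path based neighbor (each movie is also its own neighbor).

```python
import numpy as np

movies = ["The Terminator", "Terminator 2", "Titanic"]
actors = ["Arnold Schwarzenegger", "Leonardo DiCaprio"]
directors = ["James Cameron"]

# Movie-Actor incidence matrix: MA[m, a] = 1 if actor a appears in movie m.
MA = np.array([[1, 0],
               [1, 0],
               [0, 1]])
# Movie-Director incidence matrix: all three movies share one director.
MD = np.array([[1],
               [1],
               [1]])

def metapath_neighbors(incidence):
    """For a Movie-X-Movie meta-path, return each movie's neighbor set."""
    reach = incidence @ incidence.T  # reach[i, j] = number of M-X-M paths from i to j
    return [{int(j) for j in np.flatnonzero(row)} for row in reach]

mam = metapath_neighbors(MA)  # Movie-Actor-Movie
mdm = metapath_neighbors(MD)  # Movie-Director-Movie

for i, name in enumerate(movies):
    print(name, "| MAM:", sorted(mam[i]), "| MDM:", sorted(mdm[i]))
```

The two meta-paths give The Terminator different neighbor sets: the shared actor links it only to Terminator 2, while the shared director also links it to Titanic. This is why the neighbors under each meta-path carry different semantics.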
problem: a model is needed that can discover the subtle differences among neighbors and learn their weights properly.
solution: HAN
-> our model can get the optimal combination of neighbors and multiple meta-paths in a hierarchical manner, which enables the learned node embeddings to better capture the complex structure and rich semantic information in a heterogeneous graph.
Network Representation Learning (NRL) -> proposed to embed a network into a low-dimensional space while preserving the network structure and properties, so that the learned embeddings can be applied to downstream network tasks.
Heterogeneous graph embedding mainly focuses on preserving the meta-path based structural information.
Aim-1: needs to conduct grid search to find the optimal weights of meta-paths.
semi-supervised GNN for heterogeneous graphs.
(1) node-level attention -> learns the weights of meta-path based neighbors and aggregates them to get the semantic-specific node embedding.
for node i, compute the weights of its neighbors under a single meta-path (i.e. one semantics).
(2) semantic-level attention -> can tell the meta-paths apart and get the optimal weighted combination of the semantic-specific node embeddings.
for node i, the weights of the different meta-paths.
problem: due to the heterogeneity of nodes, different types of nodes have different feature spaces.
solution: design a type-specific transformation matrix for each node type to project the features of different types of nodes into the same feature space.
<- the type-specific transformation matrix depends on the node type rather than the edge type.
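The projection step can be sketched as one matrix per node type. The node types, feature sizes, `hidden_dim` and the random matrices below are assumptions made for illustration; in a trained model the matrices would be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw features: each node type lives in its own feature space.
raw = {
    "movie": rng.normal(size=(3, 5)),  # 3 movies, 5-dim features
    "actor": rng.normal(size=(2, 4)),  # 2 actors, 4-dim features
}

hidden_dim = 8  # shared target space (assumed size)

# One transformation matrix per node type (not per edge type).
W = {t: rng.normal(size=(x.shape[1], hidden_dim)) for t, x in raw.items()}

def project(features, weights):
    """Map every node type into the shared feature space."""
    return {t: features[t] @ weights[t] for t in features}

projected = project(raw, W)
for t, x in projected.items():
    print(t, x.shape)
```

After this step all nodes have vectors of the same length, so attention scores between nodes of different types become well defined.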
the attention weights are generated for a single meta-path, so they are semantic-specific and capture one kind of semantic information.
-> multi-head attention: repeat the node-level attention K times and concatenate the learned embeddings as the semantic-specific embedding.
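A rough sketch of node-level attention for one meta-path, with multi-head concatenation. This follows the usual GAT-style recipe (score each neighbor pair, softmax over the neighbors, weighted sum); the LeakyReLU on the scores, the tanh on the output, and all names and sizes are assumptions for illustration rather than the paper's exact choices.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def node_level_attention(h, neighbors, a):
    """One attention head for one meta-path.

    h:         (N, d) projected node features
    neighbors: list of index lists; neighbors[i] includes i itself
    a:         (2d,) attention vector for this meta-path
    """
    z = np.zeros_like(h)
    weights = []
    for i, nbrs in enumerate(neighbors):
        # Concatenate [h_i || h_j] for every meta-path based neighbor j.
        pairs = np.concatenate(
            [np.repeat(h[i][None, :], len(nbrs), axis=0), h[nbrs]], axis=1
        )
        e = leaky_relu(pairs @ a)      # unnormalized importance of j for i
        alpha = np.exp(e - e.max())    # softmax over the neighbors of i
        alpha /= alpha.sum()
        z[i] = np.tanh(alpha @ h[nbrs])  # weighted aggregation
        weights.append(alpha)
    return z, weights

def multi_head(h, neighbors, heads):
    """Repeat node-level attention once per head and concatenate the results."""
    return np.concatenate(
        [node_level_attention(h, neighbors, a)[0] for a in heads], axis=1
    )

rng = np.random.default_rng(0)
h = rng.normal(size=(3, 4))
neighbors = [[0, 1], [0, 1], [2]]               # toy meta-path neighbor lists
heads = [rng.normal(size=8) for _ in range(2)]  # K = 2 heads

z_single, alphas = node_level_attention(h, neighbors, heads[0])
z_multi = multi_head(h, neighbors, heads)
print(z_single.shape, z_multi.shape)
```

With K heads the semantic-specific embedding is K times wider than a single head's output, which is what the concatenation in the note above refers to.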
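the multiple semantics revealed by the different meta-paths need to be fused.
note: "semantics" here is only the narrow, word-order notion in the traditional NLP sense; in the broader sense, semantics also covers attributes that mark an object as distinctive, such as shape, image, timbre, and color.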
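Fusing the semantic-specific embeddings can be sketched as follows. As I understand the HAN formulation, each meta-path gets a score by passing its embeddings through a small nonlinear layer, comparing against a query vector `q`, and averaging over all nodes, so one weight per meta-path is shared by every node. The parameter names `W`, `b`, `q` and all sizes are assumptions, and the parameters are random here instead of learned.

```python
import numpy as np

def semantic_level_attention(Z_list, W, b, q):
    """Fuse per-meta-path embeddings into one embedding per node.

    Z_list: list of (N, d) semantic-specific embeddings, one per meta-path
    W, b:   (d, d_att) weight and (d_att,) bias of the scoring layer
    q:      (d_att,) semantic-level attention (query) vector
    """
    # Importance of each meta-path, averaged over all nodes.
    scores = np.array([np.mean(np.tanh(Z @ W + b) @ q) for Z in Z_list])
    beta = np.exp(scores - scores.max())  # softmax over meta-paths
    beta /= beta.sum()
    fused = sum(beta_p * Z_p for beta_p, Z_p in zip(beta, Z_list))
    return fused, beta

rng = np.random.default_rng(0)
Z_list = [rng.normal(size=(3, 4)) for _ in range(2)]  # e.g. MAM and MDM embeddings
W = rng.normal(size=(4, 6))
b = np.zeros(6)
q = rng.normal(size=6)

Z, beta = semantic_level_attention(Z_list, W, b, q)
print("meta-path weights:", beta)
```

Because the weights sum to 1, `beta` can be read directly as the relative importance of each meta-path for the task.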
problem: the variance of graph-structured data can be quite high.
solution: repeat the process 10 times and report the average.
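The averaging protocol can be sketched as a small helper. `toy_evaluate` is a hypothetical stand-in for "train and score the model with this seed"; it just draws a noisy number so the example runs on its own.

```python
import numpy as np

def average_over_runs(evaluate, n_runs=10):
    """Call evaluate(seed) for n_runs seeds and return (mean, std) of the scores."""
    scores = np.array([evaluate(seed) for seed in range(n_runs)])
    return scores.mean(), scores.std()

def toy_evaluate(seed):
    # Hypothetical placeholder: a real version would train and test the model.
    rng = np.random.default_rng(seed)
    return 0.9 + 0.01 * rng.standard_normal()

mean_score, std_score = average_over_runs(toy_evaluate)
print(f"score over 10 runs: {mean_score:.4f} +/- {std_score:.4f}")
```

Reporting the standard deviation next to the mean shows how large the run-to-run variance actually is.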
HAN -> designed for heterogeneous graphs; it captures the rich semantics successfully and shows its superiority.