While reproducing the paper Graph Convolutional Networks for Text Classification with DGL, my model's accuracy stayed stuck at around 62-64%, and nothing I tried would push it any higher. The model was:
import torch.nn
from torch.nn import Module, ReLU, Dropout
from dgl.nn.pytorch import GraphConv

class GCN_Model(Module):
    def __init__(self, input_dim, hidden_dim, label_size):
        super().__init__()
        self.dropout = Dropout(0.4)
        self.dropout2 = Dropout(0.4)
        self.conv1 = GraphConv(input_dim, hidden_dim)
        self.conv2 = GraphConv(hidden_dim, label_size)
        self.relu = ReLU()

    def forward(self, graph, feature):
        feature1 = self.relu(self.dropout(self.conv1(graph, feature)))
        result = self.dropout2(self.conv2(graph, feature1))
        return result
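To rule out a mistake in my own graph construction, I switched to the dataset shipped with the original paper, and the accuracy stayed the same. After about three days of tweaking, I concluded that DGL's GraphConv had no built-in weight and that weight=True had to be set for it to behave as a GCN; without a weight it is only GraphSAGE-style neighbor aggregation. (The current DGL documentation lists weight=True as the default for GraphConv, so check the signature of your installed version before relying on this.)
After the change, accuracy rose to about 65%, still well short of the 68% reported in the paper.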
import torch.nn
from torch.nn import Module, ReLU, Dropout
from dgl.nn.pytorch import GraphConv

class GCN_Model(Module):
    def __init__(self, input_dim, hidden_dim, label_size):
        super().__init__()
        self.dropout = Dropout(0.4)
        self.dropout2 = Dropout(0.4)
        self.conv1 = GraphConv(input_dim, hidden_dim, weight=True)
        self.conv2 = GraphConv(hidden_dim, label_size, weight=True)
        self.relu = ReLU()

    def forward(self, graph, feature):
        feature1 = self.relu(self.dropout(self.conv1(graph, feature)))
        result = self.dropout2(self.conv2(graph, feature1))
        return result
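To make the role of the weight matrix concrete, here is a minimal NumPy sketch of a single graph convolution step. It is not DGL code: the function name gcn_layer, the toy adjacency matrix and the feature sizes are all assumptions chosen for illustration. Without a weight matrix the step only mixes neighbor features and cannot change the feature dimension; with one, it becomes the learnable projection a GCN layer needs.

```python
import numpy as np

def gcn_layer(adj_norm, features, weight=None):
    """One propagation step: aggregate neighbors, then optionally project."""
    aggregated = adj_norm @ features      # neighbor aggregation only
    if weight is None:
        return aggregated                 # no learnable projection
    return aggregated @ weight            # aggregation followed by a linear map

# Hypothetical 3-node graph with an already normalized adjacency matrix.
adj_norm = np.array([[0.5, 0.5, 0.0],
                     [0.5, 0.0, 0.5],
                     [0.0, 0.5, 0.5]])
features = np.eye(3)                      # one-hot node features
rng = np.random.default_rng(0)
weight = rng.normal(size=(3, 2))          # stands in for a trained parameter

no_weight_out = gcn_layer(adj_norm, features)
weighted_out = gcn_layer(adj_norm, features, weight)

print(no_weight_out.shape)                # feature dimension unchanged
print(weighted_out.shape)                 # projected to 2 output features
```

In a real model the weight would be a trainable parameter updated by backpropagation rather than a fixed random matrix.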
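I racked my brains for a long time, wondering what else I had overlooked, and then went through the GraphConv source code. There were no extra constructor parameters to try, but I did find something horrifying: the layer does not pick up edge features automatically, so the edge weights stored in my input graph were never used. I then found that DGL 0.6 (the version I am using) added an edge_weight argument. The documentation I had been reading defaulted to version 0.4, so it was not mentioned there. After passing the edge weights in, accuracy rose to 67%, still an ACL paper's worth of gap away from the paper.
This time things went smoothly. I found that the normalization applied by GraphConv differs from the one used in the paper's original code. Setting norm to none and running with the paper's normalize_adj raised accuracy to 68.4%; the remaining 0.4% I will tentatively put down to the model.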
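from torch.nn import Module, ReLU, Dropout
from dgl.nn.pytorch import GraphConv

class GCN_Model(Module):
    def __init__(self, input_dim, hidden_dim, label_size):
        super().__init__()
        self.dropout = Dropout(0.5)
        self.dropout2 = Dropout(0.5)
        self.Relu = ReLU()
        # Note: the text describes norm='none' with pre-normalized edge weights;
        # this listing uses norm='right'.
        self.conv1 = GraphConv(input_dim, hidden_dim, weight=True, allow_zero_in_degree=True, norm='right')
        self.conv2 = GraphConv(hidden_dim, label_size, weight=True, allow_zero_in_degree=True, norm='right')

    def forward(self, graph, feature):
        # Dropout is now applied to each layer's input, not to its output.
        feature = self.dropout(feature)
        # Edge weights are read from graph.edata['w'] and passed in explicitly.
        feature1 = self.Relu(self.conv1(graph, feature, edge_weight=graph.edata['w'].float()))
        result = self.conv2(graph, self.dropout2(feature1), edge_weight=graph.edata['w'].float())
        return result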
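The post does not show normalize_adj or how graph.edata['w'] was filled, so the following is only a sketch under assumptions. It assumes normalize_adj is the usual symmetric normalization D^{-1/2} A D^{-1/2}, written here with dense NumPy arrays on a made-up 3-node graph. It then checks that summing weighted messages over an edge list gives the same result as multiplying by the normalized matrix, which is the idea behind precomputing normalized edge weights and turning the layer's own normalization off.

```python
import numpy as np

def normalize_adj(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (dense, assumed form)."""
    adj = np.asarray(adj, dtype=np.float64)
    degree = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    d_inv_sqrt[nonzero] = degree[nonzero] ** -0.5   # isolated nodes stay at 0
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Hypothetical symmetric weighted adjacency matrix with self-loops.
adj = np.array([[1.0, 2.0, 0.0],
                [2.0, 1.0, 1.0],
                [0.0, 1.0, 1.0]])
adj_norm = normalize_adj(adj)

# Edge-list view: one (src, dst, weight) triple per nonzero entry.
dst, src = np.nonzero(adj_norm)          # row = destination, column = source
edge_weight = adj_norm[dst, src]         # what would be stored per edge

features = np.arange(6, dtype=np.float64).reshape(3, 2)

# Message passing: every edge sends weight * features[src] to its destination.
aggregated = np.zeros_like(features)
np.add.at(aggregated, dst, edge_weight[:, None] * features[src])

# Same result as the dense product with the normalized adjacency matrix.
print(np.allclose(aggregated, adj_norm @ features))
```

With weights prepared this way, any further normalization inside the layer would rescale the messages a second time, which is why the layer's own norm option matters.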