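While setting up PyTorch Geometric this time I ran into some problems: both the commonly suggested dependency install commands and the command from the official site failed with errors. With a classmate's help, I switched to a different package source and the installation went through:
(run in Anaconda Prompt)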
pip install torch==1.9.0+cpu torchvision==0.10.0+cpu torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.9.0+cpu.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-1.9.0+cpu.html
pip install torch-cluster -f https://data.pyg.org/whl/torch-1.9.0+cpu.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-1.9.0+cpu.html
pip install torch-geometric
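The `-f` URLs in the commands above encode the torch version and the compute platform (`1.9.0` and `cpu` here). As a small sketch of that pattern, the hypothetical helper below rebuilds the same commands from those two values. The URL template is inferred only from the commands above; whether a wheel index exists for any other version or platform tag is an assumption you would need to check.

```python
# Hypothetical helper: the URL template is inferred from the commands above.
PYG_DEPENDENCIES = [
    "torch-scatter",
    "torch-sparse",
    "torch-cluster",
    "torch-spline-conv",
]


def pyg_wheel_index(torch_version, platform_tag="cpu"):
    """Build the wheel index URL for a given torch version and platform tag."""
    return f"https://data.pyg.org/whl/torch-{torch_version}+{platform_tag}.html"


def pyg_install_commands(torch_version, platform_tag="cpu"):
    """Return the pip commands in the order they should be run."""
    index = pyg_wheel_index(torch_version, platform_tag)
    commands = [f"pip install {dep} -f {index}" for dep in PYG_DEPENDENCIES]
    # torch-geometric itself is installed from the default package index.
    commands.append("pip install torch-geometric")
    return commands


for command in pyg_install_commands("1.9.0"):
    print(command)
```

The point is that the version in the URL should match the torch version installed in the first step; a mismatch is a likely source of import errors in the compiled extensions.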
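This also made me realize how important it is to set up a virtual environment, and I will pay more attention to that as I keep learning machine learning.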
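One quick way to confirm that a given interpreter really belongs to the intended environment, and that the packages installed above are visible to it, is a small check like the sketch below. It uses only the standard library; the `REQUIRED` list is my assumption of the import names that correspond to the pip packages above.

```python
import importlib.util
import sys

# Import names assumed to correspond to the pip packages installed above.
REQUIRED = [
    "torch",
    "torch_scatter",
    "torch_sparse",
    "torch_cluster",
    "torch_spline_conv",
    "torch_geometric",
]


def missing_packages(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Interpreter:", sys.executable)
print("Environment prefix:", sys.prefix)
print("Missing packages:", missing_packages(REQUIRED) or "none")
```

If the printed interpreter path is not inside the environment you created, the IDE or shell is using a different Python than the one the packages were installed into.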
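In PyCharm, select this virtual environment as the project interpreter.
PyTorch Geometric already ships with many common benchmark datasets, including Cora from the Planetoid collection, which the following script loads to train a two-layer GCN: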
import torch
import torch.nn.functional as F
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops, degree
# dataset
from torch_geometric.datasets import Planetoid
dataset = Planetoid(root='/tmp/Cora', name='Cora')
class GCNConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        # Step 4: Aggregate incoming messages by summation ("add").
        super(GCNConv, self).__init__(aggr='add')
        self.lin = torch.nn.Linear(in_channels, out_channels)

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]; edge_index has shape [2, E].
        # Step 1: Add self-loops to the adjacency matrix.
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))
        # Step 2: Linearly transform the node feature matrix.
        x = self.lin(x)
        # Steps 3-5: Start propagating messages.
        return self.propagate(edge_index, size=(x.size(0), x.size(0)), x=x)

    def message(self, x_j, edge_index, size):
        # x_j has shape [E, out_channels].
        # Step 3: Normalize node features.
        row, col = edge_index
        deg = degree(row, size[0], dtype=x_j.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]
        return norm.view(-1, 1) * x_j

    def update(self, aggr_out):
        # aggr_out has shape [N, out_channels].
        # Step 5: Return new node embeddings.
        return aggr_out
class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = GCNConv(dataset.num_node_features, 16)
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, data):
        x, edge_index = data.x, data.edge_index
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, training=self.training)
        x = self.conv2(x, edge_index)
        return F.log_softmax(x, dim=1)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
data = dataset[0].to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

model.train()
for epoch in range(200):
    optimizer.zero_grad()
    out = model(data)
    loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()

model.eval()
_, pred = model(data).max(dim=1)
correct = float(pred[data.test_mask].eq(data.y[data.test_mask]).sum().item())
acc = correct / data.test_mask.sum().item()
print('Accuracy: {:.4f}'.format(acc))
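To make the propagation steps in the GCNConv layer above more concrete, here is a minimal pure-Python sketch of the same computation on a tiny graph. It is only an illustration under simplifying assumptions: each node has a single scalar feature, the linear transformation (Step 2) is skipped, and the function name `gcn_propagate` and the three-node example graph are made up for this example.

```python
import math


def gcn_propagate(num_nodes, edges, x):
    """One GCN propagation step on scalar node features (no linear layer).

    edges: list of (source, target) pairs; x: one float per node.
    """
    # Step 1: add a self-loop to every node.
    edges = list(edges) + [(i, i) for i in range(num_nodes)]

    # Degree is counted on the source index, like degree(row, ...) above.
    deg = [0] * num_nodes
    for src, _ in edges:
        deg[src] += 1

    # Steps 3-4: scale each message by 1 / sqrt(deg[src] * deg[dst])
    # and sum the messages arriving at each target node.
    out = [0.0] * num_nodes
    for src, dst in edges:
        norm = 1.0 / math.sqrt(deg[src] * deg[dst])
        out[dst] += norm * x[src]

    # Step 5: the aggregated values are the new node embeddings.
    return out


# Path graph 0 - 1 - 2, stored as directed edges in both directions.
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
out = gcn_propagate(3, edges, [1.0, 2.0, 3.0])
print([round(value, 4) for value in out])
```

After self-loops are added, nodes 0 and 2 have degree 2 and node 1 has degree 3, so for example node 0 receives 1.0 / 2 from its own self-loop plus 2.0 / sqrt(6) from node 1.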
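The dataset has to be downloaded from GitHub, but my connection was unstable and the download failed with a timeout error.
To work around this, edit line 48 of planetoid.py, changing
url = 'https://github.com/kimiyoung/planetoid/raw/master/data'
to
url = 'https://gitee.com/jiajiewu/planetoid/raw/master/data'
(Gitee is easier to reach from within mainland China.)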
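An alternative that avoids editing the installed library is to fetch the raw files yourself before constructing the dataset. The sketch below is hedged on two assumptions that you should verify against your installed version: that a Planetoid dataset consists of the eight `ind.<name>.<suffix>` files listed in `SUFFIXES`, and that the loader looks for them in `<root>/<name>/raw`. The function names are hypothetical.

```python
import os
import shutil
import urllib.request

# Mirror taken from the text above; any mirror with the same layout would do.
MIRROR_URL = "https://gitee.com/jiajiewu/planetoid/raw/master/data"

# Assumed suffixes of the raw files that make up one Planetoid dataset.
SUFFIXES = ["x", "tx", "allx", "y", "ty", "ally", "graph", "test.index"]


def planetoid_raw_urls(name, base_url=MIRROR_URL):
    """Return (file_name, url) pairs for the raw files of dataset `name`."""
    files = [f"ind.{name.lower()}.{suffix}" for suffix in SUFFIXES]
    return [(file_name, f"{base_url}/{file_name}") for file_name in files]


def prefetch_planetoid(root, name, base_url=MIRROR_URL, timeout=30):
    """Download missing raw files into <root>/<name>/raw (assumed layout)."""
    raw_dir = os.path.join(root, name, "raw")
    os.makedirs(raw_dir, exist_ok=True)
    for file_name, url in planetoid_raw_urls(name, base_url):
        target = os.path.join(raw_dir, file_name)
        if os.path.exists(target):
            continue  # already present, nothing to download
        # Write to a temporary name first so an interrupted download
        # does not leave a truncated file under the final name.
        partial = target + ".part"
        with urllib.request.urlopen(url, timeout=timeout) as response:
            with open(partial, "wb") as handle:
                shutil.copyfileobj(response, handle)
        os.replace(partial, target)
    return raw_dir
```

With those assumptions, calling `prefetch_planetoid('/tmp/Cora', 'Cora')` before `Planetoid(root='/tmp/Cora', name='Cora')` should let the loader find the files locally and skip its own download.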