Three major categories of applications:
# Install the PGL library
!pip install pgl
Successfully installed pgl-1.2.1 redis-3.5.3 redis-py-cluster-2.1.0
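Suppose we have the graph below, which contains 10 nodes and 14 edges.
Our goal is to train a graph model that can distinguish the green nodes from the yellow nodes in this graph. We can build the graph with the following code.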
import pgl
from pgl import graph  # import the graph module from PGL
import paddle.fluid as fluid  # import the PaddlePaddle framework
import numpy as np

def build_graph():
    # Number of nodes in the graph; each node is identified by an integer
    num_nodes = 10
    # Edge set of the graph
    edge_list = [(2, 0), (2, 1), (3, 1), (4, 0), (5, 0), (6, 0), (6, 4), (6, 5), (7, 0), (7, 1),
                 (7, 2), (7, 3), (8, 0), (9, 7)]
    # Randomly initialize the node features, with feature dimension d
    d = 16
    feature = np.random.randn(num_nodes, d).astype("float32")
    # Randomly assign a weight to each edge
    edge_feature = np.random.randn(len(edge_list), 1).astype("float32")
    # Create the graph object; it takes at most four arguments
    g = graph.Graph(num_nodes=num_nodes,
                    edges=edge_list,
                    node_feat={'feature': feature},
                    edge_feat={'edge_feature': edge_feature})
    return g

g = build_graph()
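To sanity-check the edge list without PGL, here is a minimal NumPy sketch. It assumes each pair is read as (source, destination), which is how the message-passing code later in this tutorial treats edges. The names `src`, `dst` and `indegree` are illustrative and not part of the PGL API.

```python
import numpy as np

num_nodes = 10
# Same edge list as in build_graph(); each pair is assumed to be (source, destination)
edge_list = [(2, 0), (2, 1), (3, 1), (4, 0), (5, 0), (6, 0), (6, 4), (6, 5),
             (7, 0), (7, 1), (7, 2), (7, 3), (8, 0), (9, 7)]

edges = np.array(edge_list, dtype="int64")
src, dst = edges[:, 0], edges[:, 1]

# Count how many edges point at each node
indegree = np.bincount(dst, minlength=num_nodes)

print(edges.shape)  # (14, 2): 14 edges, two endpoints each
print(indegree)     # [6 3 1 1 1 1 0 1 0 0]
```

Nodes 6, 8 and 9 have no incoming edges under this reading, so they receive no messages when features are aggregated along edges.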
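Define the graph model
# Define a simple model layer that propagates both node features and edge weights
def model_layer(gw, nfeat, efeat, hidden_size, name, activation):
    '''
    gw: GraphWrapper, a graph data container used while defining the model; the real data is fed in later during training
    nfeat: node features
    efeat: edge weights
    hidden_size: hidden dimension of the layer
    name: name of the fully connected layer
    activation: activation function to use
    '''
    # Define the send function
    def send_func(src_feat, dst_feat, edge_feat):
        # Send the source node features, scaled by the edge weight, as the message
        return src_feat['h'] * edge_feat['e']

    # Define the recv function
    def recv_func(feat):
        # The destination node receives the messages from its source nodes and aggregates them by summation
        return fluid.layers.sequence_pool(feat, pool_type='sum')

    # Trigger message passing
    msg = gw.send(send_func, nfeat_list=[('h', nfeat)], efeat_list=[('e', efeat)])
    output = gw.recv(msg, recv_func)
    output = fluid.layers.fc(output, size=hidden_size, bias_attr=False, act=activation, name=name)
    return output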
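The send/recv pair above can be illustrated outside PaddlePaddle. The following is a minimal NumPy sketch of the same idea, not PGL's implementation: `message_passing_sum` is a hypothetical helper, and it assumes that nodes without incoming edges end up with an all-zero vector.

```python
import numpy as np

def message_passing_sum(num_nodes, edges, nfeat, efeat):
    """Hypothetical NumPy stand-in for the send/recv step in model_layer."""
    src, dst = edges[:, 0], edges[:, 1]
    # send: the message on each edge is the source node feature times the edge weight
    # (efeat has shape (num_edges, 1) and broadcasts over the feature dimension)
    msg = nfeat[src] * efeat
    # recv: sum all messages arriving at each destination node
    out = np.zeros((num_nodes, nfeat.shape[1]), dtype=nfeat.dtype)
    np.add.at(out, dst, msg)
    return out

# Tiny example: 3 nodes, edges 1 -> 0 and 2 -> 0
edges = np.array([(1, 0), (2, 0)])
nfeat = np.array([[1.0, 1.0], [2.0, 3.0], [4.0, 5.0]], dtype="float32")
efeat = np.array([[0.5], [2.0]], dtype="float32")

out = message_passing_sum(3, edges, nfeat, efeat)
print(out)  # node 0 receives 0.5*[2, 3] + 2.0*[4, 5] = [9, 11.5]; nodes 1 and 2 stay at zero
```

In `model_layer`, the aggregated result is then passed through a fully connected layer without bias, which this sketch leaves out.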
Model definition
class Model(object):
    def __init__(self, graph):
        """
        graph: the graph we created earlier
        """
        # Create the GraphWrapper, a graph data container used while defining the model; the real data is fed in later during training
        self.gw = pgl.graph_wrapper.GraphWrapper(name='graph',
                                                 node_feat=graph.node_feat_info(),
                                                 edge_feat=graph.edge_feat_info())
        # Plays the same role as the GraphWrapper, here as a container for the node labels
        self.node_label = fluid.layers.data("node_label", shape=[None, 1],
                                            dtype="float32", append_batch_size=False)

    def build_model(self):
        # Stack two model_layer layers
        output = model_layer(self.gw,
                             self.gw.node_feat['feature'],
                             self.gw.edge_feat['edge_feature'],
                             hidden_size=8,
                             name='layer_1',
                             activation='relu')
        output = model_layer(self.gw,
                             output,
                             self.gw.edge_feat['edge_feature'],
                             hidden_size=1,
                             name='layer_2',
                             activation=None)
        # For a binary classification task, the following API can be used to compute the loss
        loss = fluid.layers.sigmoid_cross_entropy_with_logits(x=output,
                                                              label=self.node_label)
        # Compute the mean loss
        loss = fluid.layers.mean(loss)
        # Compute the accuracy
        prob = fluid.layers.sigmoid(output)
        pred = fluid.layers.cast(prob > 0.5, dtype="float32")
        correct = fluid.layers.equal(pred, self.node_label)
        correct = fluid.layers.cast(correct, dtype="float32")
        acc = fluid.layers.reduce_mean(correct)
        return loss, acc
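The loss and accuracy computed in `build_model` can be reproduced on plain arrays. The sketch below uses the standard numerically stable form of sigmoid cross-entropy on logits; the helper names and the example logits are made up for illustration and do not come from the trained model.

```python
import numpy as np

def sigmoid_cross_entropy_with_logits(x, z):
    # Stable form of -z*log(sigmoid(x)) - (1 - z)*log(1 - sigmoid(x))
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

def accuracy(x, z):
    prob = 1.0 / (1.0 + np.exp(-x))            # sigmoid
    pred = (prob > 0.5).astype("float32")      # threshold at 0.5
    return float(np.mean(pred == z))

# Hypothetical logits for four nodes, with their binary labels
logits = np.array([[2.0], [-1.0], [0.5], [-3.0]], dtype="float32")
labels = np.array([[1.0], [0.0], [0.0], [0.0]], dtype="float32")

loss = float(np.mean(sigmoid_cross_entropy_with_logits(logits, labels)))
acc = accuracy(logits, labels)
print(loss, acc)  # acc is 0.75: only the third node (logit 0.5, label 0) is misclassified
```

Thresholding the probability at 0.5 is equivalent to checking whether the logit is positive.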
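Preparation before training
# Whether to run on GPU or CPU
use_cuda = False
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()

# Define the programs
startup_program = fluid.Program()  # used to initialize the model parameters
train_program = fluid.Program()    # main program used for training; contains the forward pass and the backward gradient computation
test_program = fluid.Program()     # program used for testing; contains only the forward pass

with fluid.program_guard(train_program, startup_program):
    model = Model(g)
    # Build the model and compute the loss
    loss, acc = model.build_model()
    # Use the Adam optimizer with a learning rate of 0.01
    adam = fluid.optimizer.Adam(learning_rate=0.01)
    adam.minimize(loss)  # compute the gradients and run backpropagation

# Clone train_program to obtain test_program; the difference is that it needs no gradient computation or backward pass.
test_program = train_program.clone(for_test=True)

# Define an Executor on place (CPU here) to run the programs
exe = fluid.Executor(place)
# Initialize the parameters
exe.run(startup_program)

# Get the real graph data
feed_dict = model.gw.to_feed(g)
# Get the real label data
# Since this is a node classification task, we can simply use 0 and 1 for the node classes: yellow nodes are labeled 0 and green nodes are labeled 1.
y = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1]
label = np.array(y, dtype="float32")
label = np.expand_dims(label, -1)
feed_dict['node_label'] = label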
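The label preparation above can be checked in isolation. This small NumPy sketch shows why `expand_dims` is needed (the label placeholder is declared with shape `[None, 1]`) and how the two classes are distributed; `num_green` and `num_yellow` are illustrative names only.

```python
import numpy as np

# Same labels as above: 0 = yellow node, 1 = green node
y = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1]
label = np.array(y, dtype="float32")   # shape (10,)
label = np.expand_dims(label, -1)      # shape (10, 1), one row per node

num_green = int(label.sum())
num_yellow = label.shape[0] - num_green
print(label.shape, num_green, num_yellow)  # (10, 1) 5 5
```

Because the two classes are balanced, always predicting a single class would score an accuracy of 0.5 on these ten nodes.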
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/pgl/graph_wrapper.py:151: UserWarning: The edge features in argument `efeat_list` should be fetched from a instance of `pgl.graph_wrapper.GraphWrapper`, because we have sorted the edges and the order of edges is changed.
Therefore, if you use external edge features, the order of features of each edge may not match its edge, which can cause serious errors.
If you use the `efeat_list` correctly, please ignore this warning.
"The edge features in argument `efeat_list` should be fetched "
Start training
for epoch in range(30):
    train_loss = exe.run(train_program, feed=feed_dict, fetch_list=[loss], return_numpy=True)[0]
    print('Epoch %d | Loss: %f' % (epoch, train_loss))
Epoch 0 | Loss: 0.906734
Epoch 1 | Loss: 0.839262
Epoch 2 | Loss: 0.777020
Epoch 3 | Loss: 0.722640
Epoch 4 | Loss: 0.678117
Epoch 5 | Loss: 0.642708
Epoch 6 | Loss: 0.621016
Epoch 7 | Loss: 0.607005
Epoch 8 | Loss: 0.597986
Epoch 9 | Loss: 0.592153
Epoch 10 | Loss: 0.588311
Epoch 11 | Loss: 0.585515
Epoch 12 | Loss: 0.583296
Epoch 13 | Loss: 0.581387
Epoch 14 | Loss: 0.579625
Epoch 15 | Loss: 0.577935
Epoch 16 | Loss: 0.576279
Epoch 17 | Loss: 0.574633
Epoch 18 | Loss: 0.572992
Epoch 19 | Loss: 0.571337
Epoch 20 | Loss: 0.569672
Epoch 21 | Loss: 0.568003
Epoch 22 | Loss: 0.566335
Epoch 23 | Loss: 0.564675
Epoch 24 | Loss: 0.563316
Epoch 25 | Loss: 0.562035
Epoch 26 | Loss: 0.560765
Epoch 27 | Loss: 0.559507
Epoch 28 | Loss: 0.558265
Epoch 29 | Loss: 0.557040
Model testing
test_acc = exe.run(test_program, feed=feed_dict, fetch_list=[acc], return_numpy=True)[0]
print("Test Acc: %f" % test_acc)
Test Acc: 0.700000
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py:613: UserWarning: The variable graph/edges_dst is not found in program. It is not declared or is pruned.
% name)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py:613: UserWarning: The variable graph/indegree is not found in program. It is not declared or is pruned.
% name)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py:613: UserWarning: The variable graph/graph_lod is not found in program. It is not declared or is pruned.
% name)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py:613: UserWarning: The variable graph/num_graph is not found in program. It is not declared or is pruned.
% name)