A PyTorch computation-graph bug encountered when running the PPR-GCN version
The error was:
Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling .backward() or autograd.grad() the first time
After adding the suggested parameter:
loss.backward(retain_graph=True)
a different error appeared:
Exception has occurred: RuntimeError
one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [64, 64]],
which is output 0 of TBackward, is at version 2;
expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient,
with torch.autograd.set_detect_anomaly(True).
Reference link: retain_graph=True example
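The second error can be reproduced in isolation. The sketch below is only an illustration, not the PPR code: gcn_model and seq_model are small hypothetical stand-ins built from nn.Linear layers, and a single shared SGD optimizer is assumed. The forward pass through gcn_model runs once, the graph is kept alive with retain_graph=True, and optimizer.step() then changes the weights in place, so the next backward pass finds saved weights at a newer version than the one it recorded.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the two models discussed in this post.
gcn_model = nn.Sequential(nn.Linear(4, 4), nn.Tanh(), nn.Linear(4, 4))
seq_model = nn.Linear(4, 1)

# Assumption: one optimizer updates the parameters of both models.
optimizer = torch.optim.SGD(
    list(gcn_model.parameters()) + list(seq_model.parameters()), lr=0.1
)

x = torch.randn(8, 4)
hidden = gcn_model(x)  # forward pass through gcn_model happens only once

error_message = None
try:
    for step in range(2):
        loss = seq_model(hidden).pow(2).mean()
        optimizer.zero_grad()
        # retain_graph=True keeps the old gcn_model graph alive ...
        loss.backward(retain_graph=True)
        # ... but step() rewrites the weights that graph saved, in place.
        optimizer.step()
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
```

The first iteration succeeds; the second one fails because the retained graph still refers to the weights as they were before optimizer.step(). So retain_graph=True only hides the first error, and the stale forward pass is the real problem.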
Reference link
To visualize the computation graph, two packages need to be installed:
pip install torchviz
pip install graphviz
Reference: make_dot(prediction)
But calling it raised an error:
ExecutableNotFound("failed to execute ['dot', '-Kdot', '-Tpdf', '-O', 'Digraph.gv'], make sure the Graphviz executables are on your systems' PATH",)
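Graphviz has to be installed separately (see the reference).
make_dot relies on Graphviz, so pay attention to how Graphviz is installed: the Python package in the experiment environment is not enough. Graphviz itself must also be installed on the system, and its executables must be added to the system PATH environment variable.
Download it from the official website.
Once that is set up, open a command prompt and run dot -version to check the configuration. If Graphviz version information is printed, the setup succeeded.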
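The same check can be done from Python inside the experiment environment, which is where make_dot will actually look for the executable. This is a small sketch using only the standard library; the helper name find_dot_executable is made up for this example.

```python
import shutil


def find_dot_executable():
    """Return the full path of the Graphviz `dot` binary, or None if it is not on PATH."""
    return shutil.which("dot")


dot_path = find_dot_executable()
if dot_path is None:
    print("dot was not found on PATH; install Graphviz on the system and update PATH")
else:
    print(f"dot found at: {dot_path}")
```

If this prints a path while the command prompt check fails (or the other way round), the two environments are probably seeing different PATH values, for example because the terminal or IDE was not restarted after changing the environment variable.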
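from torchviz import make_dot  # make_dot(prediction) draws the autograd graph
# First argument: the model output; second: the model parameters as a name -> tensor dict
graph = make_dot(model_output, params=dict(model.named_parameters()))
# First argument: the file name (graphviz appends ".pdf" itself); second: the directory to save into
graph.view('model_structure', '.\\figure\\')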
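The drawback is that the result is a static PDF file: you cannot click on a node to inspect its weights and so on, which is inconvenient.
Reference link
An alternative is to write the graph with TensorBoard and open the generated log in a browser: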
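from torch.utils.tensorboard import SummaryWriter  # assumed import; tensorboardX provides a class with the same name
# Note: comment has no effect when log_dir is given explicitly
writer = SummaryWriter(log_dir=args.input_path + 'log_file/', comment='seq_model')
with writer:
    writer.add_graph(seq_model, (b_x,))  # seq_model: model instance; b_x: model input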
tensorboard --logdir=C:\Users\lenovo\Desktop\PPR-master\dataset\toyset\log_file # the directory where the generated files are stored
TensorBoard reference link
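Later I reorganized the program a bit and the error went away. The cause was probably that the two models, gcn_model and seq_model, were not organized well: seq_model was updated 10 times for every single update of gcn_model, or something along those lines.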
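One way such a 10-to-1 update schedule can be arranged without retain_graph=True is sketched below. This is a guess at the shape of the fix, not the actual reorganized PPR code: the models are hypothetical nn.Linear stand-ins, the data and loss are placeholders, and each model is assumed to have its own optimizer. The key points are that every backward() call is preceded by a fresh forward pass, and that the gcn_model output is computed without a graph while only seq_model is being updated.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins; the real gcn_model and seq_model are more complex.
gcn_model = nn.Sequential(nn.Linear(4, 4), nn.Tanh(), nn.Linear(4, 4))
seq_model = nn.Linear(4, 1)
gcn_opt = torch.optim.SGD(gcn_model.parameters(), lr=0.1)
seq_opt = torch.optim.SGD(seq_model.parameters(), lr=0.1)

x = torch.randn(8, 4)        # placeholder input
target = torch.zeros(8, 1)   # placeholder target
loss_fn = nn.MSELoss()

SEQ_STEPS_PER_GCN_STEP = 10  # the 10:1 ratio mentioned above

for outer in range(3):
    # Phase 1: update seq_model only. The gcn_model output is computed
    # without a graph, so nothing from gcn_model is reused by backward().
    with torch.no_grad():
        hidden = gcn_model(x)
    for _ in range(SEQ_STEPS_PER_GCN_STEP):
        seq_opt.zero_grad()
        loss = loss_fn(seq_model(hidden), target)
        loss.backward()
        seq_opt.step()

    # Phase 2: update gcn_model once, using a fresh forward pass through
    # both models so the graph matches the current weights.
    gcn_opt.zero_grad()
    seq_opt.zero_grad()
    joint_loss = loss_fn(seq_model(gcn_model(x)), target)
    joint_loss.backward()
    gcn_opt.step()

final_loss = joint_loss.item()
print(final_loss)
```

With this layout no graph outlives the optimizer step that follows it, so neither of the two errors at the top of this post can occur.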