pva-faster-rcnn (Caffe): plotting training loss and lr curves

To monitor how well the network is training, we plot the loss curve and read off the trend of the model over time. To see how changes in the learning rate affect the loss, we draw the learning-rate curve in the same figure.

1 Save the training log

Create a shell script with the following content:

#!/usr/bin/env sh
LOG="/home/lthpc/pva-faster-rcnn2/experiments/logs-`date +%Y-%m-%d-%H-%M-%S`.log"
exec &> >(tee -a "$LOG")
echo Logging output to "$LOG"
python tools/train_net.py --solver models/pvanet/example_train_384/solver.prototxt  --weights models/pvanet/example_train_384/original.model --cfg models/pvanet/cfgs/train.yml

Here, LOG is the path where the log file will be saved, and the last line is the command that launches training.
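The key line is the bash-only process substitution `exec &> >(tee -a "$LOG")`: from that point on, the stdout and stderr of every later command are appended to the log while still reaching the terminal. A minimal stand-alone demonstration, using a throwaway log name chosen here for illustration:

```shell
#!/usr/bin/env bash
# Demo of the redirection trick used in the training script above.
LOG="redir-demo.log"
: > "$LOG"                 # start with an empty log file
exec &> >(tee -a "$LOG")   # duplicate stdout+stderr into the log
echo "stdout line"
echo "stderr line" >&2     # stderr is captured too
sleep 1                    # give the background tee time to flush
```

After the script runs, both lines appear on the terminal and in redir-demo.log; without the `&>`, stderr would bypass the log.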

2 Plot the loss and lr curves

The resulting training log looks like this:

I0704 16:56:08.642755 48434 solver.cpp:238] Iteration 410, loss = 0.515833
I0704 16:56:08.642813 48434 solver.cpp:254]     Train net output #0: cls_loss = 0.117412 (* 1 = 0.117412 loss)
I0704 16:56:08.642822 48434 solver.cpp:254]     Train net output #1: loss_bbox = 0.000175524 (* 1 = 0.000175524 loss)
I0704 16:56:08.642828 48434 solver.cpp:254]     Train net output #2: rpn_cls_loss = 0.0477779 (* 1 = 0.0477779 loss)
I0704 16:56:08.642833 48434 solver.cpp:254]     Train net output #3: rpn_loss_bbox = 0.0296075 (* 1 = 0.0296075 loss)
I0704 16:56:08.642839 48434 sgd_solver.cpp:138] Iteration 410, lr = 0.001
I0704 16:56:13.248749 48434 solver.cpp:238] Iteration 420, loss = 0.509081
I0704 16:56:13.248798 48434 solver.cpp:254]     Train net output #0: cls_loss = 0.294207 (* 1 = 0.294207 loss)
I0704 16:56:13.248808 48434 solver.cpp:254]     Train net output #1: loss_bbox = 0.0603984 (* 1 = 0.0603984 loss)
I0704 16:56:13.248813 48434 solver.cpp:254]     Train net output #2: rpn_cls_loss = 0.0187975 (* 1 = 0.0187975 loss)
I0704 16:56:13.248819 48434 solver.cpp:254]     Train net output #3: rpn_loss_bbox = 0.0398177 (* 1 = 0.0398177 loss)
I0704 16:56:13.248826 48434 sgd_solver.cpp:138] Iteration 420, lr = 0.001
I0704 16:56:17.768085 48434 solver.cpp:238] Iteration 430, loss = 0.510896

From this we can observe: every line containing both "] Iteration " and "loss = " carries the iteration number and the total training loss, and every line containing both "] Iteration " and "lr = " carries the iteration number and the learning rate. The script below uses these two patterns to extract the data and plot the curves.
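As a quick sanity check, independent of the full script, the extraction can be tried on a single sample line from the log:

```python
import re

# One total-loss line copied from the training log above.
line = "I0704 16:56:08.642755 48434 solver.cpp:238] Iteration 410, loss = 0.515833"

# Pull the iteration number out of "Iteration 410," and the loss value
# from the text after the last " = ".
iteration = int(re.findall(r'Iteration (\d+),', line)[0])
loss = float(line.strip().split(' = ')[-1])
print(iteration, loss)  # 410 0.515833
```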

#!/usr/bin/env python
import re
import matplotlib.pyplot as plt

train_iterations = []   # iterations at which the total loss was logged
train_loss = []         # total loss values
lr_iterations = []      # iterations at which the lr was logged
learningrate = []       # learning-rate values

with open('train-2018-07-09-18-10-18.log', 'r') as fp:
    for ln in fp:
        # e.g. "... solver.cpp:238] Iteration 410, loss = 0.515833"
        if '] Iteration ' in ln and 'loss = ' in ln:
            train_iterations.append(int(re.findall(r'Iteration (\d+),', ln)[0]))
            train_loss.append(float(ln.strip().split(' = ')[-1]))
        # e.g. "... sgd_solver.cpp:138] Iteration 410, lr = 0.001"
        if '] Iteration ' in ln and 'lr = ' in ln:
            lr_iterations.append(int(re.findall(r'Iteration (\d+),', ln)[0]))
            learningrate.append(float(ln.strip().split(' = ')[-1]))

# The lr is plotted against its own iteration list: a truncated log may end
# with a loss line that has no matching lr line, which would otherwise make
# the two lists different lengths and crash the second plot call.
a1 = plt.subplot(211)
a1.set_title("loss")
a1.plot(train_iterations, train_loss)

a2 = plt.subplot(212)
a2.set_title("learning rate")
a2.plot(lr_iterations, learningrate)

plt.tight_layout()
plt.show()
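The log also records the four component losses (cls_loss, loss_bbox, rpn_cls_loss, rpn_loss_bbox) on the "Train net output" lines, and it can be useful to plot them separately to see which one dominates. A hedged sketch of that extension, shown here on two inline sample lines rather than the real log file:

```python
import re
from collections import defaultdict

# Collect each named component loss into its own list.
component_losses = defaultdict(list)

# Two "Train net output" lines copied from the log format shown earlier.
sample_log = """\
I0704 16:56:08.642813 48434 solver.cpp:254]     Train net output #0: cls_loss = 0.117412 (* 1 = 0.117412 loss)
I0704 16:56:08.642822 48434 solver.cpp:254]     Train net output #1: loss_bbox = 0.000175524 (* 1 = 0.000175524 loss)
"""

# Match "Train net output #k: <name> = <value>".
pattern = re.compile(r'Train net output #\d+: (\w+) = ([\d.eE+-]+)')
for line in sample_log.splitlines():
    m = pattern.search(line)
    if m:
        component_losses[m.group(1)].append(float(m.group(2)))

print(dict(component_losses))
```

In the real script you would apply the same regex inside the file-reading loop and add one `plot` call per component.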

Make sure the path to the log file is correct; running the script produces the curves shown below.

[Figure: training loss (top) and learning rate (bottom) versus iteration]

The curves show that the learning rate drops too quickly, so the loss stops decreasing in the later iterations. The learning-rate policy therefore needs to be re-tuned; for the available options, see "How to set the different lr_policy parameters in Caffe".
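For reference, two of the common Caffe schedules can be sketched in a few lines: "step" multiplies the base lr by gamma every `stepsize` iterations, while "multistep" applies one gamma factor at each iteration listed in `stepvalue`. The numbers below are illustrative, not taken from this training run:

```python
def step_lr(base_lr, gamma, stepsize, it):
    # Caffe "step" policy: lr = base_lr * gamma ^ floor(it / stepsize)
    return base_lr * gamma ** (it // stepsize)

def multistep_lr(base_lr, gamma, stepvalues, it):
    # Caffe "multistep" policy: one gamma factor per step boundary passed
    n = sum(1 for s in stepvalues if it >= s)
    return base_lr * gamma ** n

print(step_lr(0.001, 0.1, 50000, 60000))                # one decay after 50k iters
print(multistep_lr(0.001, 0.1, [40000, 80000], 90000))  # two decays by 90k iters
```

Spreading the decay steps further apart (a larger `stepsize`, or later `stepvalue` entries) keeps the lr high for longer and can let the loss keep falling.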
