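In Paddle static-graph training, the training set and the test set both score well, but new validation data scores poorly
- Suppose a Paddle training script is set up like this: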
main_program = fluid.default_main_program()
cost = fluid.layers.cross_entropy(input=model, label=label)
avg_cost = fluid.layers.mean(cost)
acc = fluid.layers.accuracy(input=model, label=label)
...
train_cost, train_acc = exe.run(program=main_program,
                                feed=feeder.feed(data),
                                fetch_list=[avg_cost, acc])
- Inside the epoch loop, training and evaluation are structured like this:
...
# Train on every batch of the training set
for batch_id, data in enumerate(train_reader()):
    train_cost, train_acc = exe.run(program=main_program,
                                    feed=feeder.feed(data),
                                    fetch_list=[avg_cost, acc])
    if batch_id % 50 == 0:
        print("pass:%d, batch:%d, cost:%0.5f, acc:%0.5f" % (pass_id, batch_id, train_cost[0], train_acc[0]))
# Evaluate on the test set after each pass
eval_accs = []
eval_costs = []
for batch_id, data in enumerate(eval_reader()):
    eval_cost, eval_acc = exe.run(program=main_program,
                                  feed=feeder.feed(data),
                                  fetch_list=[avg_cost, acc])
    eval_accs.append(eval_acc[0])
    eval_costs.append(eval_cost[0])
eval_avgcost = (sum(eval_costs) / len(eval_costs))
eval_avgacc = (sum(eval_accs) / len(eval_accs))
...
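- If nothing else is wrong, you may see the cost on both the training set and the test set fall steadily while acc keeps rising. But when you validate on completely new test data, the result is very poor. That looks like overfitting, so why does the model perform so well on both the training set and the test set?
- The cause is a bug in the code above: evaluation should run on a separate copy of the program, one that does not perform backpropagation. Running main_program on the test set lets the optimizer update the parameters on test data, so the model effectively learns the test set and its test metrics improve along with the training metrics. The fix is a test_program cloned with for_test=True, which indicates that its results are not used to update parameters. [Note: create test_program after avg_cost and acc are defined. It is a copy of main_program, so a clone made before avg_cost and acc exist will not contain them. The Paddle documentation also recommends cloning before calling the optimizer's minimize.]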
main_program = fluid.default_main_program()
cost = fluid.layers.cross_entropy(input=model, label=label)
avg_cost = fluid.layers.mean(cost)
acc = fluid.layers.accuracy(input=model, label=label)
# Forward-only copy for evaluation; cloned after avg_cost and acc are defined
test_program = fluid.default_main_program().clone(for_test=True)
...
train_cost, train_acc = exe.run(program=main_program,
                                feed=feeder.feed(data),
                                fetch_list=[avg_cost, acc])
- During evaluation, pass test_program as the program:
...
# Train on every batch of the training set
for batch_id, data in enumerate(train_reader()):
    train_cost, train_acc = exe.run(program=main_program,
                                    feed=feeder.feed(data),
                                    fetch_list=[avg_cost, acc])
    if batch_id % 50 == 0:
        print("pass:%d, batch:%d, cost:%0.5f, acc:%0.5f" % (pass_id, batch_id, train_cost[0], train_acc[0]))
# Evaluate with the forward-only test_program so the test set never updates parameters
eval_accs = []
eval_costs = []
for batch_id, data in enumerate(eval_reader()):
    eval_cost, eval_acc = exe.run(program=test_program,
                                  feed=feeder.feed(data),
                                  fetch_list=[avg_cost, acc])
    eval_accs.append(eval_acc[0])
    eval_costs.append(eval_cost[0])
eval_avgcost = (sum(eval_costs) / len(eval_costs))
eval_avgacc = (sum(eval_accs) / len(eval_accs))
...
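The effect of evaluating with a program that still updates parameters can be reproduced without Paddle. What follows is a minimal framework-free sketch, not Paddle code. ToyModel, train_step, and eval_step are hypothetical names: train_step stands in for running main_program (forward, backward, update), and eval_step stands in for running a for_test clone (forward only). The data and learning rate are made up for illustration.

```python
# Toy stand-in for a static-graph program: a one-weight linear model y = w * x.
# ToyModel, train_step and eval_step are hypothetical names, not Paddle APIs.

class ToyModel:
    def __init__(self, w=0.0, lr=0.1):
        self.w = w
        self.lr = lr

    def loss(self, batch):
        # Mean squared error over (x, y) pairs; forward pass only.
        return sum((self.w * x - y) ** 2 for x, y in batch) / len(batch)

    def train_step(self, batch):
        # Analogous to running the training program: forward, backward, update.
        loss = self.loss(batch)
        grad = sum(2 * (self.w * x - y) * x for x, y in batch) / len(batch)
        self.w -= self.lr * grad
        return loss

    def eval_step(self, batch):
        # Analogous to running a for_test clone: forward only, w is untouched.
        return self.loss(batch)


train_batch = [(1.0, 2.0), (2.0, 4.0)]    # follows y = 2x
eval_batch = [(1.0, -3.0), (2.0, -6.0)]   # follows y = -3x, a different rule

leaky = ToyModel()
clean = ToyModel()
for _ in range(50):
    leaky.train_step(train_batch)
    leaky.train_step(eval_batch)   # bug: "evaluation" also updates w
    clean.train_step(train_batch)
    clean.eval_step(eval_batch)    # correct: evaluation is read-only

print("leaky w:", leaky.w, "eval loss:", leaky.eval_step(eval_batch))
print("clean w:", clean.w, "eval loss:", clean.eval_step(eval_batch))
```

In this toy setup the leaky model reports a much lower loss on eval_batch than the clean one, and only because it was updated on that batch. Its eval score therefore says nothing about how it would do on data it has never seen.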
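- With this change, if the training set scores well but the test set scores poorly, the model is overfitting. Possible remedies include regularization, dropout, adjusting the network, and augmenting or adding data. Only now does the test set serve its intended purpose.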
pass:0, batch:0, cost:2.95071, acc:0.06250
pass:0, batch:50, cost:3.77059, acc:0.15625
eval:0, cost:3.10242, acc:0.25000
pass:1, batch:0, cost:0.07284, acc:0.96875
pass:1, batch:50, cost:0.08076, acc:0.96875
eval:1, cost:3.05008, acc:0.28795
...
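One way to read a log like the one above is to compare, for each pass, the last logged training accuracy with the eval accuracy. The sketch below does this for the six log lines shown. The helper name accuracy_gaps and the 0.2 threshold are illustrative assumptions, not part of Paddle or of the original script.

```python
import re

# The six log lines shown above, copied verbatim.
LOG = """\
pass:0, batch:0, cost:2.95071, acc:0.06250
pass:0, batch:50, cost:3.77059, acc:0.15625
eval:0, cost:3.10242, acc:0.25000
pass:1, batch:0, cost:0.07284, acc:0.96875
pass:1, batch:50, cost:0.08076, acc:0.96875
eval:1, cost:3.05008, acc:0.28795
"""

def accuracy_gaps(log_text):
    """Return {pass_id: last logged train acc - eval acc} (hypothetical helper)."""
    train_acc = {}
    gaps = {}
    for line in log_text.splitlines():
        m = re.match(r"(pass|eval):(\d+),.*acc:([\d.]+)", line)
        if not m:
            continue
        kind, pass_id, acc = m.group(1), int(m.group(2)), float(m.group(3))
        if kind == "pass":
            train_acc[pass_id] = acc          # keep the most recent training batch
        elif pass_id in train_acc:
            gaps[pass_id] = train_acc[pass_id] - acc
    return gaps

MAX_GAP = 0.2  # arbitrary illustrative threshold, tune for your task

gaps = accuracy_gaps(LOG)
for pass_id, gap in gaps.items():
    status = "possible overfitting" if gap > MAX_GAP else "ok"
    print("pass %d: train-eval acc gap %.5f (%s)" % (pass_id, gap, status))
```

A single training batch is a noisy estimate of training accuracy, so a check like this is only a rough signal. Averaging training accuracy over the whole pass would be more reliable.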