Getting Started with PaddlePaddle 03: Single-Machine Training with Fluid

Notes organized from the course 《90分钟PaddlePaddle快速上手》 (90-Minute PaddlePaddle Quick Start).

Introduction to Neural Networks

  • Network structure
  • Model parameters

Configuring the Model Structure

import paddle
import paddle.fluid as fluid
import numpy as np

#input layers

image = fluid.layers.data(name='pixel', shape=[1,28,28], dtype='float32')#(channels, height, width)
label = fluid.layers.data(name='label', shape=[1], dtype='int64')

#model
conv1 = fluid.layers.conv2d(input=image, filter_size=5, num_filters=20)
relu1 = fluid.layers.relu(conv1)
pool1 = fluid.layers.pool2d(input=relu1, pool_size=2, pool_stride=2)
conv2 = fluid.layers.conv2d(input=pool1, filter_size=5, num_filters=50)
relu2 = fluid.layers.relu(conv2)
pool2 = fluid.layers.pool2d(input=relu2, pool_size=2, pool_stride=2)

predict = fluid.layers.fc(input=pool2, size=10, act='softmax')

#loss
cost = fluid.layers.cross_entropy(input=predict, label=label)
avg_cost = fluid.layers.mean(cost)
batch_acc = fluid.layers.accuracy(input=predict, label=label)
#optimizer
opt = fluid.optimizer.AdamOptimizer()
opt.minimize(avg_cost)
#initialize
place = fluid.CPUPlace()
exe = fluid.Executor(place)
#Parameters must be initialized before training, and the initialization only needs to run once
#startup_program stores the initialization operations for the model parameters
#main_program stores the network structure of the model
exe.run(fluid.default_startup_program())

#After initialization, the parameters live in fluid.global_scope(); a parameter can be fetched from that scope by its name
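
For example, a minimal sketch of reading a weight back from the scope (the parameter name 'conv2d_0.w_0' is an assumption; the actual names depend on how the layers above were created):

param_tensor = fluid.global_scope().find_var('conv2d_0.w_0').get_tensor()
param_array = np.array(param_tensor)  #copy into a NumPy array for inspection
print(param_array.shape)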


Single-Card Training

For single-card training, simply use the run method of fluid.Executor() to run the fluid.Program.

train_reader = paddle.batch(paddle.dataset.mnist.train(), batch_size=128)

for epoch_id in range(5):
	for batch_id, data in enumerate(train_reader()):
		img_data = np.array([x[0].reshape([1,28,28]) for x in data]).astype('float32')
		y_data = np.array([x[1] for x in data]).reshape([len(img_data),1]).astype('int64')
		loss, acc = exe.run(fluid.default_main_program(), feed={'pixel':img_data, 'label':y_data}, fetch_list=[avg_cost, batch_acc])
		print("epoch:%d, batch:%d, loss:%.5f, acc:%.5f"%(epoch_id, batch_id, loss[0], acc[0]))

Multi-Card Training

Data parallelism: the data is split into n shares and trained on different cards, and the results are aggregated at the end.

from paddle.fluid import compiler
#Convert the built Program into a data-parallel Program
compiled_program = compiler.CompiledProgram(fluid.default_main_program())
compiled_program = compiled_program.with_data_parallel(loss_name=avg_cost.name)

#training
...
exe.run(compiled_program, ...)
...
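
Putting the pieces together, a minimal sketch of the multi-card training loop (it reuses train_reader, avg_cost and batch_acc from the single-card example; with data parallelism each fetched result holds one value per card, so the mean is reported):

for epoch_id in range(5):
	for batch_id, data in enumerate(train_reader()):
		img_data = np.array([x[0].reshape([1,28,28]) for x in data]).astype('float32')
		y_data = np.array([x[1] for x in data]).reshape([len(img_data),1]).astype('int64')
		loss, acc = exe.run(compiled_program, feed={'pixel':img_data, 'label':y_data}, fetch_list=[avg_cost, batch_acc])
		#each fetched value contains one entry per card; average them for logging
		print("epoch:%d, batch:%d, loss:%.5f, acc:%.5f"%(epoch_id, batch_id, np.mean(loss), np.mean(acc)))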

  • For CPU training, set the environment variable export CPU_NUM=4 to have the model train with multiple threads
  • When one program contains multiple models, switch between Programs:
#Define Program1
main_program_1 = fluid.Program()
startup_program_1 = fluid.Program()

with fluid.program_guard(main_program_1, startup_program_1):
	im_data_1, label_1, loss_1 = model1()

exe.run(startup_program_1)

for batch_id, data in enumerate(train_reader_1()):
	img_data, y_data = ...
	loss = exe.run(main_program_1, feed={im_data_1.name: img_data, label_1.name: y_data}, fetch_list=[loss_1])
	print("...")

#Define Program2
...
  • Sharing parameters across multiple Programs
    Paddle distinguishes variables by their names, and variable names are generated automatically from a counter in the unique_name module; the counter is incremented by 1 each time a name is generated.
    fluid.unique_name.guard() resets the counter in the unique_name module, which guarantees that networks configured under repeated uses of fluid.unique_name.guard() get identical names for corresponding variables and therefore share parameters; a sketch is given below.
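
A minimal sketch of sharing parameters between two Programs (build_net is a hypothetical helper that configures the same network structure in both):

train_program = fluid.Program()
test_program = fluid.Program()
startup_program = fluid.Program()

with fluid.program_guard(train_program, startup_program):
	with fluid.unique_name.guard():
		image, label, loss = build_net(is_train=True)

with fluid.program_guard(test_program, startup_program):
	#resetting the counter gives the variables the same names as in train_program,
	#so both Programs end up sharing the same parameters
	with fluid.unique_name.guard():
		image_t, label_t, loss_t = build_net(is_train=False)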
