First, here is the code; I've added comments throughout:
# -*- coding: utf8 -*-
import paddle.v2 as paddle
import paddle.v2.dataset.uci_housing as uci_housing
# --use_gpu
#
# Whether to use the GPU during training; set to True for GPU mode, otherwise CPU mode.
# Type: bool (default: 1).
#
# --trainer_count
#
# Specifies the number of threads used on one machine. For example, trainer_count=4 means
# using 4 GPUs in GPU mode, or 4 threads in CPU mode. Each thread (or GPU) is assigned a
# quarter of the samples in the current batch. That is, if batch_size is set to 512 in the
# training configuration, each thread gets 128 samples to train on.
# Type: int32 (default: 1).
paddle.init(use_gpu=False, trainer_count=1)
# paddle.data_type.dense_vector stands for a dense vector.
# For example, the vector (5.2, 0.0, 5.5) is stored as [5.2, 0.0, 5.5] as a dense vector,
# and as [3, [0, 2], [5.2, 5.5]] as a sparse vector:
# the first number is the length of the vector, the second list holds the indices of the
# entries that are not 0, and the third list holds the values of those entries.
# If a vector contains many zeros, a sparse vector is helpful.
# paddle.v2.data_type.dense_array(dim, seq_type=0)
# dim(int) – dimension of this vector.
# seq_type(int) – sequence type of input.
# the name parameter of paddle.layer.data is this layer's nickname (identifier)
x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(13))
# paddle.activation.Linear() stands for the linear activation function.
# Parameters:
#
# name (basestring) – The name of this layer. It is optional.
# input (paddle.v2.config_base.Layer | list | tuple) – The input of this layer.
# size (int) – The layer dimension.
# act (paddle.v2.activation.Base) – Activation Type. paddle.v2.activation.Tanh is the default activation.
# param_attr (paddle.v2.attr.ParameterAttribute) – The Parameter Attribute|list.
# bias_attr (paddle.v2.attr.ParameterAttribute | None | bool | Any) – The bias attribute. If the parameter is set to False or an object whose type is not paddle.v2.attr.ParameterAttribute, no bias is defined. If the parameter is set to True, the bias is initialized to zero.
# layer_attr (paddle.v2.attr.ExtraAttribute | None) – Extra Layer config.
#
# Returns: paddle.v2.config_base.Layer object.
# Return type: paddle.v2.config_base.Layer
# fully connected layer
y_predict = paddle.layer.fc(input=x, size=1, act=paddle.activation.Linear())
# Data layer
y = paddle.layer.data(name='y', type=paddle.data_type.dense_vector(1))
# sum of square error cost:
#
# Parameters:
#
# name (basestring) – The name of this layer. It is optional.
# input (paddle.v2.config_base.Layer) – The first input layer.
# label (paddle.v2.config_base.Layer) – The input label.
# weight (paddle.v2.config_base.Layer) – The weight layer defines a weight for each sample in the mini-batch. It is optional.
# coeff (float) – The weight of the gradient in the back propagation. 1.0 is the default value.
# layer_attr (paddle.v2.attr.ExtraAttribute) – The extra layer attribute. See paddle.v2.attr.ExtraAttribute for details.
#
# Returns: paddle.v2.config_base.Layer object.
# Return type: paddle.v2.config_base.Layer
cost = paddle.layer.square_error_cost(input=y_predict, label=y)
Now let's walk through this code. What it does is actually quite simple.
First come the imports, then:
paddle.init(use_gpu=False, trainer_count=1)
This initializes PaddlePaddle; you can think of it as passing arguments to the framework. The demo is configured to use only the CPU, with a single thread.
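For example, following the --trainer_count doc comment quoted above, on a machine with four GPUs you would instead write:
paddle.init(use_gpu=True, trainer_count=4)  # GPU mode with 4 GPUs; each GPU gets a quarter of each batch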
x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(13))
This line defines a data layer, i.e. a layer that only stores data. The layer is named x, its data type is dense vector, and the vector has 13 dimensions.
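To make the dense/sparse distinction from the comments above concrete, here is a plain-Python sketch of the two representations (illustration only, no PaddlePaddle involved):
dense = [5.2, 0.0, 5.5]           # dense vector: every element is stored
sparse = (3, [0, 2], [5.2, 5.5])  # sparse vector: (length, non-zero indices, non-zero values)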
Then we create a fully connected layer whose activation function is linear:
y_predict = paddle.layer.fc(input=x, size=1, act=paddle.activation.Linear())
In this layer the input is x, which we know from the previous line is a data layer, and the output of the fully connected layer is 1-dimensional.
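Concretely, a fully connected layer with a linear activation computes y = W·x + b. A small numpy sketch of the idea (my own illustration, not the PaddlePaddle implementation):
import numpy as np
W = np.random.randn(1, 13)      # weights: output size 1, input dimension 13
b = np.zeros(1)                 # bias
x_sample = np.random.randn(13)  # one 13-dimensional input sample
y_hat = W.dot(x_sample) + b     # the linear activation is the identity, so this is the output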
y = paddle.layer.data(name='y', type=paddle.data_type.dense_vector(1))
This line works the same way as the one above: it defines a data layer named y, whose data type is a dense vector with 1 dimension.
cost = paddle.layer.square_error_cost(input=y_predict, label=y)
This line computes the squared-error cost between the prediction y_predict and the label y.
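Continuing the numpy sketch above, the per-sample cost is just the squared difference between prediction and label (the paddle docs include a 1/2 factor; the constant does not change the optimum):
y_true = np.array([21.6])                         # an illustrative ground-truth price
cost_value = 0.5 * np.sum((y_hat - y_true) ** 2)  # squared-error cost for one sample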
# Save the inference topology to protobuf.
inference_topology = paddle.topology.Topology(layers=y_predict)
with open("inference_topology.pkl", 'wb') as f:
    inference_topology.serialize_for_inference(f)
This saves the network topology to protobuf; the with…as syntax is simple enough that I won't cover it here.
# Parameters manages all the learnable parameters in a neural network. It stores parameters’ information in an OrderedDict.
# The key is the name of a parameter, and value is a parameter’s configuration (in protobuf format), such as initialization
# mean and std, its size, whether it is a static parameter, and so on.
# Parameters:
#
# __param_conf__ (OrderedDict) – store the configurations of learnable parameters in the network in an OrderedDict.
# Parameter is added one by one into the dict by following their created order in the network: parameters of the previous
# layers in a network are created first. You can visit the parameters from bottom to top by iterating over this dict.
# __gradient_machines__ (list) – all of the parameters in a neural network are appended to a PaddlePaddle gradient
# machine, which is used internally to copy parameter values between C++ and Python end.
# __tmp_params__ (dict) – a dict to store dummy parameters if no __gradient_machines__ is appended to Parameters.
parameters = paddle.parameters.create(cost)
Create the parameters.
Note that the parameters stored here must be learnable and their configurations are kept in protobuf format; you can visit them by iterating over the dict, and a gradient machine is used to copy parameter values between the C++ and Python ends.
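As a quick sanity check you can iterate over what was just created; this sketch assumes the paddle.v2 Parameters interface exposes names() and get() (returning parameter names and their numpy values), which is how I remember the API:
for name in parameters.names():
    print(name, parameters.get(name).shape)  # e.g. the fc layer's 13x1 weight and its bias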
# momentum(float) – the momentum factor.
# sparse(bool) – with sparse support or not, False by default.
optimizer = paddle.optimizer.Momentum(momentum=0)
This is an acceleration technique: momentum can speed up convergence, and it has other benefits as well, such as reducing sensitivity to the initial values.
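For intuition, here is the classic momentum update rule as a small sketch (my own illustration, not the PaddlePaddle internals; lr and mu are illustrative names for the learning rate and the momentum factor):
def momentum_step(w, v, grad, lr, mu):
    # the velocity v accumulates an exponentially decaying sum of past gradients
    v = mu * v - lr * grad
    w = w + v
    return w, v
Since this demo passes momentum=0, the optimizer here behaves like plain SGD: w = w - lr * grad.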
# cost (paddle.v2.config_base.Layer) – Target cost that neural network should be
# optimized.
# parameters (paddle.v2.parameters.Parameters) – The parameters dictionary.
# update_equation (paddle.v2.optimizer.Optimizer) – The optimizer object.
# extra_layers (paddle.v2.config_base.Layer) – Some layers in the neural network
# graph are not in the path of cost layer.
# is_local (bool) – Whether training locally.
# pserver_spec (string) – comma string for pserver location,
# eg:127.10.0.10:3000,127.10.0.11:3000, and this parameter is only used for fault
# tolerant mode cluster training.
# use_etcd (bool) – Whether to use an etcd pserver.
trainer = paddle.trainer.SGD(cost=cost,
                             parameters=parameters,
                             update_equation=optimizer)
Create a trainer.
With that, the basic structure has been fully broken down. What remains is reading the data, printing progress and so on, which I won't go through here.
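For completeness, here is roughly what that remaining part looks like, based on the official fit_a_line demo (the buf_size, batch_size and num_passes values are the demo's defaults as far as I remember; treat them as illustrative):
feeding = {'x': 0, 'y': 1}  # map the data layers' names to positions in each reader tuple

def event_handler(event):
    # print the cost every 100 mini-batches
    if isinstance(event, paddle.event.EndIteration):
        if event.batch_id % 100 == 0:
            print("Pass %d, Batch %d, Cost %f" % (
                event.pass_id, event.batch_id, event.cost))

trainer.train(
    reader=paddle.batch(
        paddle.reader.shuffle(uci_housing.train(), buf_size=500),
        batch_size=2),
    feeding=feeding,
    event_handler=event_handler,
    num_passes=30)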