参考链接(1)。
(5)训练次数
(2)model函数
该函数负责神经网络的前向计算,即已知输入层的输入来计算输出层的输出。h2的维度为(图像样本数目,100),w_o的维度为(100,10),p_y_given_x的维度为(图像样本数目,10),输出为模型预测的各个位的概率。
P.S. 维度的结果为推导的结果,如有错误请指出~
(3)模型预测
概率最大位的索引为y。y的维度为(图像样本数,1)。(4)权重更新
损失cost的计算见前篇“Logistic回归”部分。权重更新用momentum函数。(5)momentum函数
该函数有4个输入:损失cost,参数params,学习率learning_rate,动量momentum。其中参数包括所有的权重和偏置。训练和测试的上层过程不赘述了。
import theano
import theano.tensor as T
import numpy as np
import load_cifar
# 加载数据
x_train, t_train, x_test, t_test = load_cifar.cifar10(dtype=theano.config.floatX)
labels_test = np.argmax(t_test, axis=1)
# 定义标识性Theano变量
x = T.matrix()
t = T.matrix()
# 定义模型:神经网络
def floatX(x):
return np.asarray(x, dtype=theano.config.floatX)
def init_weights(shape):
return theano.shared(floatX(np.random.randn(*shape) * 0.1))
def momentum(cost, params, learning_rate, momentum):
grads = theano.grad(cost, params)
updates = []
for p, g in zip(params, grads):
mparam_i = theano.shared(np.zeros(p.get_value().shape, dtype=theano.config.floatX))
v = momentum * mparam_i - learning_rate * g
updates.append((mparam_i, v))
updates.append((p, p + v))
return updates
def model(x, w_h1, b_h1, w_h2, b_h2, w_o, b_o):
h1 = T.maximum(0, T.dot(x, w_h1) + b_h1)
h2 = T.maximum(0, T.dot(h1, w_h2) + b_h2)
p_y_given_x = T.nnet.softmax(T.dot(h2, w_o) + b_o)
return p_y_given_x
w_h1 = init_weights((32 * 32, 100))
b_h1 = init_weights((100,))
w_h2 = init_weights((100, 100))
b_h2 = init_weights((100,))
w_o = init_weights((100, 10))
b_o = init_weights((10,))
params = [w_h1, b_h1, w_h2, b_h2, w_o, b_o]
p_y_given_x = model(x, *params)
y = T.argmax(p_y_given_x, axis=1)
cost = T.mean(T.nnet.categorical_crossentropy(p_y_given_x, t))
updates = momentum(cost, params, learning_rate=0.001, momentum=0.9)
# 编译theano函数
train = theano.function([x, t], cost, updates=updates)
predict = theano.function([x], y)
# train model
batch_size = 50
for i in range(50):
for start in range(0, len(x_train), batch_size):
x_batch = x_train[start:start + batch_size]
t_batch = t_train[start:start + batch_size]
cost = train(x_batch, t_batch)
predictions_test = predict(x_test)
accuracy = np.mean(predictions_test == labels_test)
print "iteration %d - accuracy: %.5f" % (i + 1, accuracy)
(1)神经网络介绍:略