Environment: Ubuntu 16.04, Python 3, TensorFlow-gpu v1.9.0, CUDA 9.0
The parameters of a neural network are the weights w on the connections between neurons, such as w1 and w2 in the figure below:
The usual practice is to generate the parameters randomly first:
w = tf.Variable(tf.random_normal([2, 3], stddev=2, mean=0, seed=1))
Other ways to generate tensors:
tf.zeros([3, 2], tf.int32)  # generates [[0, 0], [0, 0], [0, 0]]
tf.ones([3, 2], tf.int32)   # generates [[1, 1], [1, 1], [1, 1]]
tf.fill([3, 2], 4)          # generates [[4, 4], [4, 4], [4, 4]]
tf.constant([1, 2, 3])      # generates [1, 2, 3]
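The NumPy counterparts behave the same way, which gives a quick way to check the expected shapes and values without starting a session (np.zeros, np.ones, np.full, and np.array here are only stand-ins for the tf functions above):

```python
import numpy as np

# NumPy stand-ins for the TensorFlow tensor generators above
zeros = np.zeros((3, 2), dtype=np.int32)  # [[0 0], [0 0], [0 0]]
ones = np.ones((3, 2), dtype=np.int32)    # [[1 1], [1 1], [1 1]]
fours = np.full((3, 2), 4)                # [[4 4], [4 4], [4 4]]
const = np.array([1, 2, 3])               # [1 2 3]

print(zeros.shape, int(ones.sum()), int(fours[0, 0]), const.tolist())
```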
Forward propagation is the process of building the model and running inference. The following walks through building a fully connected network as an example.
Suppose a batch of parts is produced, and the volume $X_1$ and weight $X_2$ are fed into the NN as features; the NN then outputs a single value. The figure below uses $X_1 = 0.7$, $X_2 = 0.5$ as an example:
From the network as built, the hidden-layer node $a_{11} = X_1 w_{11} + X_2 w_{21} = 0.29$; $a_{12}$ and $a_{13}$ are obtained the same way, and finally $y$, which completes forward propagation.
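The 0.29 in the text can be reproduced with, for example, the hypothetical weights $w_{11} = 0.2$ and $w_{21} = 0.3$ (the figure's actual weight values are not given here, so these numbers are purely illustrative):

```python
x1, x2 = 0.7, 0.5          # input features: volume and weight
w11, w21 = 0.2, 0.3        # hypothetical weights on the edges into node a11
a11 = x1 * w11 + x2 * w21  # forward pass for one hidden node
print(round(a11, 2))       # 0.29
```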
Described in TensorFlow:
Layer 1:
Layer 2:
4. $W^{(2)}$ connects three nodes in front to one node behind, so it is a 3×1 matrix: $W^{(2)} = \begin{bmatrix} w_{1,1}^{(2)} \\ w_{2,1}^{(2)} \\ w_{3,1}^{(2)} \end{bmatrix}$
Multiplying each layer's input by the weights $w$ on the connections, i.e. a matrix multiplication, computes $y$:
a = tf.matmul(X, W1)
y = tf.matmul(a, W2)
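A minimal NumPy sketch of the same two matmuls, to make the shape bookkeeping explicit ((1, 2)·(2, 3) → (1, 3), then (1, 3)·(3, 1) → (1, 1)); the all-ones weight values are arbitrary placeholders:

```python
import numpy as np

X = np.array([[0.7, 0.5]])  # shape (1, 2): one sample, two features
W1 = np.ones((2, 3))        # shape (2, 3): input layer -> hidden layer
W2 = np.ones((3, 1))        # shape (3, 1): hidden layer -> output
a = X @ W1                  # shape (1, 3): hidden-layer activations
y = a @ W2                  # shape (1, 1): final output
print(a.shape, y.shape)
```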
Notes when writing the program:
1. Computation must run inside a session:
with tf.Session() as sess:
    sess.run(...)
2. Variables must be initialized before they are used:
init_op = tf.global_variables_initializer()
sess.run(init_op)
3. Pass the node to be evaluated into sess.run(), e.g. sess.run(y).
4. tf.placeholder() reserves a slot for the input; the actual data is fed via feed_dict in sess.run().
# Feed one group of data
x = tf.placeholder(tf.float32, shape=(1, 2))  # feed one group of data, 2 features per group
sess.run(y, feed_dict={x: [[0.7, 0.5]]})
# Feed multiple groups of data
x = tf.placeholder(tf.float32, shape=(None, 2))  # feed any number of groups, 2 features per group
sess.run(y, feed_dict={x: [[0.1, 0.2], [0.2, 0.3], [0.3, 0.4]]})
#coding:utf-8
import tensorflow as tf
# Define the input and parameters
x = tf.constant([[0.7, 0.5]])
w1 = tf.Variable(tf.random_normal([2, 3], stddev=1, seed=1))
w2 = tf.Variable(tf.random_normal([3, 1], stddev=1, seed=1))
# Define the forward propagation
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)
# Compute the result
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print(sess.run(y))
Output: [[3.0904665]]
#coding:utf-8
import tensorflow as tf
from keras import backend as K
K.clear_session()  # clear any graph state left over from a previous run
# Define the input and parameters
x = tf.placeholder(tf.float32, shape=(1, 2))  # define the input with a placeholder
w1 = tf.Variable(tf.random_normal([2, 3], stddev=1, seed=1))
w2 = tf.Variable(tf.random_normal([3, 1], stddev=1, seed=1))
# Define the forward propagation
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)
# Compute the result
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print(sess.run(y, feed_dict={x: [[0.7, 0.5]]}))  # feed one group of x
Output: [[3.0904665]]
#coding:utf-8
import tensorflow as tf
# Define the input and parameters
x = tf.placeholder(tf.float32, shape=(None, 2))  # define the input with a placeholder
w1 = tf.Variable(tf.random_normal([2, 3], stddev=1, seed=1))
w2 = tf.Variable(tf.random_normal([3, 1], stddev=1, seed=1))
# Define the forward propagation
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)
# Compute the result
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print("y:\n", sess.run(y, feed_dict={x: [[0.7, 0.5], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5], [0.5, 0.6]]}))  # feed multiple groups of x
    print("w1:\n", sess.run(w1))
    print("w2:\n", sess.run(w2))
Output:
y:
[[3.0904665]
[1.2236414]
[1.7270732]
[2.2305048]
[2.7339368]]
w1:
[[-0.8113182 1.4845988 0.06532937]
[-2.4427042 0.0992484 0.5912243 ]]
w2:
[[-0.8113182 ]
[ 1.4845988 ]
[ 0.06532937]]
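As a sanity check, the printed w1 and w2 can be plugged into plain NumPy matmuls to reproduce the first row of y (this mirrors the graph's two tf.matmul calls; it is an assumption that the dense float math matches, which it does here to the printed precision):

```python
import numpy as np

# Weights as printed by the program above
w1 = np.array([[-0.8113182, 1.4845988, 0.06532937],
               [-2.4427042, 0.0992484, 0.5912243]])
w2 = np.array([[-0.8113182], [1.4845988], [0.06532937]])

x = np.array([[0.7, 0.5]])  # the first fed sample
y = (x @ w1) @ w2           # same two matmuls as the graph
print(y)                    # close to [[3.0904665]]
```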