At its core, TensorFlow lets you build a computation graph in Python and then run that graph with optimized C++ code, which makes it well suited to large-scale machine learning and distributed computation.
Building a graph:
import tensorflow as tf

# Construction phase: build the graph
x = tf.Variable(3, name='x')
y = tf.Variable(4, name='y')
f = x*x + x + y + 2

# 1: run each initializer explicitly on a session object
sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
result = sess.run(f)
print(result)
sess.close()

# 2: the with block installs the session as default, so .run()/.eval() work directly
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()

# 3: a single init node initializes all variables at once
init = tf.global_variables_initializer()  # prepare an init node
with tf.Session() as sess:
    init.run()  # actually run
    result = f.eval()

# 4: InteractiveSession installs itself as the default session
sess = tf.InteractiveSession()
init.run()
result = f.eval()
sess.close()
The above shows four ways to run the graph in a session.
In practice, though, you usually run only one or a few specific nodes; TensorFlow traces the nodes they depend on and evaluates those automatically. Note that node values are not kept between runs (each call re-evaluates the subgraph), which fits the distributed-computing mindset, since the nodes may not even live on the same server. State that should persist must be stored in a variable; variable state is maintained by the session. (Sessions are also isolated from one another.)
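A minimal sketch of this dependency tracking (the names w, x, y, z are illustrative): evaluating y and z separately re-evaluates their shared dependency x twice, while a single run() call over both evaluates it once.
import tensorflow as tf

w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3

with tf.Session() as sess:
    print(sess.run(y))       # evaluates w and x, then y -> 10
    print(sess.run(z))       # evaluates w and x again, then z -> 15
    print(sess.run([y, z]))  # evaluates w and x only once -> [10, 15]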
Before testing the code, first normalize the dataset. Here we use the common linear (min-max) transform, which maps the data into the [0,1] interval. A small digression: this keeps the relative importance (weight) of each feature while putting them all on one scale for comparison, which speeds up convergence (e.g., an elliptical loss contour becomes circular, so gradient descent moves faster). It does change the distribution to some extent; the counterpart that preserves the distribution is standardization (a sketch of it follows the normalization code below).
Min-max normalization code:
import numpy as np

# dataSet is a list; map each column to [0, 1]
def normList2Data(dataSet):
    dataSet = np.array(dataSet)
    minColVals = dataSet.min(0)   # per-column minimum
    maxColVals = dataSet.max(0)   # per-column maximum
    ranges = maxColVals - minColVals
    m = dataSet.shape[0]
    normDataSet = dataSet - np.tile(minColVals, (m, 1))
    normDataSet = normDataSet / np.tile(ranges, (m, 1))
    normDataSet = np.nan_to_num(normDataSet)  # constant columns: 0/0 -> NaN -> 0
    return normDataSet

def normalize(d):
    # d is an (n x dimension) float np array; modified in place
    d -= np.min(d, axis=0)
    d /= np.ptp(d, axis=0)
    d = np.nan_to_num(d)
    return d
normList2Data() is the hand-written version; np.tile() expands an array to a given shape. normalize() is more concise: np.ptp() (peak-to-peak, max minus min) returns the required range directly.
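As a companion sketch (not in the original), the standardization mentioned above could look like this for a float NumPy array:
import numpy as np

def standardize(d):
    # d is an (n x dimension) float np array; zero mean, unit variance per column
    d = d - np.mean(d, axis=0)
    d = d / np.std(d, axis=0)
    return np.nan_to_num(d)  # constant columns give std 0 -> NaN -> 0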
Used below: a normalization method chosen so that the transform on the test data stays invertible,
tf.math.l2_normalize
output = x / sqrt(max(sum(x**2), epsilon))
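A quick sanity check of that formula against plain NumPy (a sketch; assumes TF 1.x and column-wise normalization, axis=0, as used below):
import numpy as np
import tensorflow as tf

x = np.array([[3.0, 0.0],
              [4.0, 0.0]])
manual = x / np.sqrt(np.maximum(np.sum(x**2, axis=0), 1e-12))
with tf.Session() as sess:
    tf_out = sess.run(tf.math.l2_normalize(x, axis=0))
print(np.allclose(tf_out, manual))  # True; the epsilon guards the all-zero column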
Gradient descent for linear regression. As a baseline, the first block below solves the closed-form Normal Equation:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing
from normalize import *
from sklearn import preprocessing

# Load the dataset
housing = fetch_california_housing()
m, n = housing.data.shape
# np.c_ concatenates along the second axis: the bias column of ones is
# stacked side by side with the feature columns
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]
print('housing_data_orig: {}'.format(housing_data_plus_bias[:5]))
'''not invertible, so not used:
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(housing_data_plus_bias)
'''
# housing_data_plus_bias = normalize(housing_data_plus_bias)
# tf.nn.l2_normalize: output = x / sqrt(max(sum(x**2), epsilon))
housing_data_plus_bias = tf.nn.l2_normalize(housing_data_plus_bias, axis=0)
# X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name='X')
# l2_normalize already returned a tensor, so just cast its dtype
X = tf.cast(housing_data_plus_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
XT = tf.transpose(X)
# Normal Equation: theta = (X^T X)^{-1} X^T y
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)
with tf.Session() as sess:
    x_values = sess.run(X)
    theta_value = theta.eval()
print(x_values[:10])
print(theta_value)
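As a sanity check (not part of the original), the same closed-form solution can be reproduced in pure NumPy; the column-wise division mirrors tf.nn.l2_normalize with axis=0:
import numpy as np
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
m, n = housing.data.shape
X_np = np.c_[np.ones((m, 1)), housing.data]
X_np = X_np / np.linalg.norm(X_np, axis=0)  # column-wise L2 normalization
y_np = housing.target.reshape(-1, 1)
theta_np = np.linalg.inv(X_np.T @ X_np) @ X_np.T @ y_np
print(theta_np)  # should match theta_value above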
Numerical differentiation is clearly still unwieldy; fortunately, TF provides autodiff (automatic differentiation).
import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing

n_epochs = 1000
learning_rate = 0.01
housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]
housing_data_plus_bias = tf.nn.l2_normalize(housing_data_plus_bias, axis=0)
# X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name='X')
X = tf.cast(housing_data_plus_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')
# manual gradient (not autodiff): gradients = 2 / m * tf.matmul(tf.transpose(X), error)
# tf.gradients(ys, xs) returns d(ys)/d(xs); theta plays the role of the
# x-axis in the 2D picture, i.e. the variables the partial derivatives
# (the gradients) are taken with respect to
gradients = tf.gradients(mse, [theta])[0]
training_op = tf.assign(theta, theta - learning_rate * gradients)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, " MSE = ", mse.eval())
        sess.run(training_op)
    best_theta = theta.eval()
The key line above,
gradients = tf.gradients(mse, [theta])[0]
uses tf.gradients, whose signature is:
tf.gradients(
    ys,
    xs,
    grad_ys=None,
    name='gradients',
    colocate_gradients_with_ops=False,
    gate_gradients=False,
    aggregation_method=None,
    stop_gradients=None
)
ys and xs are each a Tensor or a list of tensors, so it returns a list (one gradient per tensor in xs), hence the [0].
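A tiny illustration of the list return value (derivatives worked out by hand: df/da = 2ab = 24 and df/db = a^2 + 1 = 10 at a=3, b=4):
import tensorflow as tf

a = tf.Variable(3.0)
b = tf.Variable(4.0)
f = a * a * b + b
grads = tf.gradients(f, [a, b])  # list: [df/da, df/db]
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grads))  # [24.0, 10.0]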
Other optimizers are also available: replace the gradients=... and training_op=... lines with the following two lines:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
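For instance, tf.train also provides a momentum optimizer; a sketch of the swap (momentum=0.9 is a common choice, not from the original):
# momentum often converges faster than plain gradient descent
optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
training_op = optimizer.minimize(mse)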
A common approach for supplying data at run time is to create a placeholder node. A placeholder node has no computation step of its own; it simply outputs the value it is fed. Creating a placeholder node:
A = tf.placeholder(tf.float32, shape=(None, 3))
B = A + 5
with tf.Session() as sess:
    B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
    B_val_2 = B.eval(feed_dict={A: [[2, 3, 4]]})
print(B_val_1)
print(B_val_2)
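The typical use of placeholders (a sketch under assumptions, not code from the original) is to feed mini-batches into the gradient-descent loop above: X and y become placeholders, fetch_batch() is a hypothetical helper that samples a batch, housing_data_plus_bias is assumed to still be the NumPy array, and m, n, n_epochs, init, training_op and theta come from the earlier section.
import numpy as np

X = tf.placeholder(tf.float32, shape=(None, n + 1), name='X')
y = tf.placeholder(tf.float32, shape=(None, 1), name='y')
batch_size = 100
n_batches = int(np.ceil(m / batch_size))

def fetch_batch(epoch, batch_index, batch_size):
    # hypothetical helper: draw a random batch from the training data
    rng = np.random.RandomState(epoch * n_batches + batch_index)
    indices = rng.randint(m, size=batch_size)
    return housing_data_plus_bias[indices], housing.target.reshape(-1, 1)[indices]

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
    best_theta = theta.eval()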
Saving nodes: create a Saver node at the end of the construction phase; then, after running the training nodes, write the state to a checkpoint:
saver = tf.train.Saver()
# ... inside the session, after training:
save_path = saver.save(sess, './my_model_final.ckpt')
Restoring nodes: at the start of the execution phase, call restore() instead of running the init node:
saver.restore(sess, './my_model_final.ckpt')
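A sketch of how the Saver fits into the training loop above (checkpoint paths are illustrative; n_epochs, init, training_op and theta come from the gradient-descent section):
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            save_path = saver.save(sess, './my_model.ckpt')  # periodic checkpoint
        sess.run(training_op)
    best_theta = theta.eval()
    save_path = saver.save(sess, './my_model_final.ckpt')   # final model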