The tf.estimator framework makes it easy to construct and train machine learning models via its high-level Estimator API. Estimator offers classes you can instantiate to quickly configure common model types such as regressors and classifiers:
But what if none of tf.estimator's predefined model types meets your needs? Perhaps you need more granular control over model configuration, such as the ability to customize the loss function used for optimization, or specify different activation functions for each layer. Or maybe you're implementing a ranking or recommendation system, and neither a classifier nor a regressor is appropriate for generating predictions.
This document covers how to use tf.estimator to create your own Estimator, which predicts the ages of abalones based on their physical measurements:
First, you'll need to be familiar with tf.estimator API basics, such as feature columns, input functions, and the train()/evaluate()/predict() operations. If you've never used them before, see:
The age of an abalone is typically estimated by the number of rings on its shell. However, because this task requires cutting, staining, and viewing the shell through a microscope, it's worthwhile to find other measurements that can predict age.
Physical measurements of abalone:
Datasets:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import sys
import tempfile
# Import urllib
from six.moves import urllib
import numpy as np
import tensorflow as tf
FLAGS = None
# Enable logging.
tf.logging.set_verbosity(tf.logging.INFO)
# Define a function that downloads the datasets if needed.
def maybe_download(train_data, test_data, predict_data):
"""Maybe downloads training data and returns train and test file names."""
if train_data:
train_file_name = train_data
else:
train_file = tempfile.NamedTemporaryFile(delete=False)
urllib.request.urlretrieve(
"http://download.tensorflow.org/data/abalone_train.csv",
train_file.name)
train_file_name = train_file.name
train_file.close()
print("Training data is downloaded to %s" % train_file_name)
if test_data:
test_file_name = test_data
else:
test_file = tempfile.NamedTemporaryFile(delete=False)
urllib.request.urlretrieve(
"http://download.tensorflow.org/data/abalone_test.csv", test_file.name)
test_file_name = test_file.name
test_file.close()
print("Test data is downloaded to %s" % test_file_name)
if predict_data:
predict_file_name = predict_data
else:
predict_file = tempfile.NamedTemporaryFile(delete=False)
urllib.request.urlretrieve(
"http://download.tensorflow.org/data/abalone_predict.csv",
predict_file.name)
predict_file_name = predict_file.name
predict_file.close()
print("Prediction data is downloaded to %s" % predict_file_name)
return train_file_name, test_file_name, predict_file_name
# Create a main() function that loads the train/test/predict datasets.
def main(unused_argv):
  # Load datasets
  abalone_train, abalone_test, abalone_predict = maybe_download(
      FLAGS.train_data, FLAGS.test_data, FLAGS.predict_data)

  # Training examples
  training_set = tf.contrib.learn.datasets.base.load_csv_without_header(
      filename=abalone_train, target_dtype=np.int, features_dtype=np.float64)

  # Test examples
  test_set = tf.contrib.learn.datasets.base.load_csv_without_header(
      filename=abalone_test, target_dtype=np.int, features_dtype=np.float64)

  # Set of 7 examples for which to predict abalone ages
  prediction_set = tf.contrib.learn.datasets.base.load_csv_without_header(
      filename=abalone_predict, target_dtype=np.int, features_dtype=np.float64)
if __name__ == "__main__":
  parser = argparse.ArgumentParser()
  parser.register("type", "bool", lambda v: v.lower() == "true")
  parser.add_argument(
      "--train_data", type=str, default="", help="Path to the training data.")
  parser.add_argument(
      "--test_data", type=str, default="", help="Path to the test data.")
  parser.add_argument(
      "--predict_data",
      type=str,
      default="",
      help="Path to the prediction data.")
  FLAGS, unparsed = parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
When defining a model using one of tf.estimator's provided classes, such as DNNClassifier, you supply all the configuration parameters in the constructor:
my_nn = tf.estimator.DNNClassifier(feature_columns=[age, height, weight],
                                   hidden_units=[10, 10, 10],
                                   activation_fn=tf.nn.relu,
                                   dropout=0.2,
                                   n_classes=3,
                                   optimizer="Adam")
You don't need to write any further code to instruct TensorFlow how to train the model, calculate loss, or return predictions; that logic is already baked into the DNNClassifier.
By contrast, when you create your own estimator from scratch, the constructor accepts just two high-level parameters for model configuration: model_fn and params.
nn = tf.estimator.Estimator(model_fn=model_fn, params=model_params)
Note: Just like tf.estimator's predefined regressors and classifiers, the Estimator initializer also accepts the general configuration arguments model_dir and config.
For the abalone age predictor, the model will accept one hyperparameter: the learning rate. Define LEARNING_RATE as a constant at the beginning of your code.
tf.logging.set_verbosity(tf.logging.INFO)
# Learning rate for the model
LEARNING_RATE = 0.001
Note: Here, LEARNING_RATE is set to 0.001, but you can tune this value as needed to achieve the best results during model training.
Then, add the following code to main(), which creates a model_params dict containing the learning rate and uses it to instantiate the Estimator:
# Set model params
model_params = {"learning_rate": LEARNING_RATE}
# Instantiate Estimator
nn = tf.estimator.Estimator(model_fn=model_fn, params=model_params)
The basic skeleton of a model function for the Estimator API looks like this:
def model_fn(features, labels, mode, params):
  # Logic to do the following:
  # 1. Configure the model via TensorFlow operations
  # 2. Define the loss function for training/evaluation
  # 3. Define the training operation/optimizer
  # 4. Generate predictions
  # 5. Return predictions/loss/train_op/eval_metric_ops in EstimatorSpec object
  return EstimatorSpec(mode, predictions, loss, train_op, eval_metric_ops)
model_fn accepts three arguments: features (a dict containing the features passed to the model via the input function), labels (a Tensor containing the labels), and mode (a tf.estimator.ModeKeys value indicating whether the model is being trained, evaluated, or used for prediction).
model_fn may also accept a params argument containing a dict of hyperparameters used for training (as described above).
The body of the function performs the tasks listed in the skeleton: configuring the model, defining the loss, defining the training operation, and generating predictions.
model_fn must return a tf.estimator.EstimatorSpec object, which can contain the following values: mode, predictions, loss, train_op, and eval_metric_ops.
Constructing a neural network entails creating and connecting the input layer, the hidden layers, and the output layer.
The input layer is a series of nodes (one for each feature in the model) that accept the feature data passed to model_fn in the features argument. If features contains an n-dimensional Tensor with all your feature data, then it can serve as the input layer. If features contains a dict of feature columns passed to the model through an input function, you can convert it to an input-layer Tensor with the tf.feature_column.input_layer function:
input_layer = tf.feature_column.input_layer(
    features=features, feature_columns=[age, height, weight])
As shown above, input_layer() takes two required arguments: features (a mapping from string keys to the Tensors containing the corresponding feature data) and feature_columns (a list of all the feature columns in the model).
The input layer of the neural network then must be connected to one or more hidden layers via an activation function, which performs a nonlinear transformation on the data from the previous layer. The last hidden layer is then connected to the output layer. tf.layers provides the tf.layers.dense function for constructing fully connected layers; the activation is controlled by its activation argument. Some options to pass to activation are tf.nn.relu (used for the hidden layers below) and None (a linear layer, used for the output layer):
hidden_layer = tf.layers.dense(
    inputs=input_layer, units=10, activation=tf.nn.relu)
second_hidden_layer = tf.layers.dense(
    inputs=hidden_layer, units=20, activation=tf.nn.relu)
output_layer = tf.layers.dense(
    inputs=second_hidden_layer, units=3, activation=None)
Other activation functions (such as sigmoid) are possible as well, e.g.:
output_layer = tf.layers.dense(inputs=second_hidden_layer,
                               units=10,
                               activation=tf.sigmoid)
The code above creates the neural network layer output_layer, which is fully connected to second_hidden_layer with a sigmoid activation function (tf.sigmoid). For a list of predefined activation functions available in TensorFlow, see the API docs.
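Conceptually, each tf.layers.dense call computes activation(inputs · W + b) for a learned weight matrix W and bias vector b. Here is a minimal NumPy sketch of that computation; the helper name, weights, and input values are made up for illustration and are not TensorFlow internals:

```python
import numpy as np

def dense(inputs, weights, bias, activation=None):
    """What a fully connected layer computes: activation(inputs @ W + b)."""
    z = inputs @ weights + bias
    return activation(z) if activation is not None else z

relu = lambda z: np.maximum(z, 0.0)

x = np.array([[1.0, -2.0, 0.5]])   # one example with 3 features
W = np.ones((3, 2)) * 0.1          # 3 inputs -> 2 units
b = np.zeros(2)

hidden = dense(x, W, b, activation=relu)
print(hidden.shape)  # (1, 2)
```

With activation=None the same call yields a linear layer, which is exactly how the output layer above is configured.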
Putting it all together, the code below constructs the full network for the abalone predictor:
def model_fn(features, labels, mode, params):
"""Model function for Estimator."""
# Connect the first hidden layer to input layer
# (features["x"]) with relu activation
first_hidden_layer = tf.layers.dense(features["x"], 10, activation=tf.nn.relu)
# Connect the second hidden layer to first hidden layer with relu
second_hidden_layer = tf.layers.dense(
first_hidden_layer, 10, activation=tf.nn.relu)
# Connect the output layer to second hidden layer (no activation fn)
output_layer = tf.layers.dense(second_hidden_layer, 1)
# Reshape output layer to 1-dim Tensor to return predictions
predictions = tf.reshape(output_layer, [-1])
predictions_dict = {"ages": predictions}
...
Here, because you'll be passing the abalone Datasets to the model using numpy_input_fn, features is a dict of the form {"x": data_tensor}, so features["x"] serves as the input layer. The network contains two hidden layers, each with 10 nodes and a ReLU activation function. The output layer has no activation function, and is reshaped via tf.reshape to a one-dimensional tensor to capture the model's predictions, which are stored in predictions_dict.
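To see how the shapes flow through this architecture, here is a NumPy sketch of the same forward pass with made-up random weights (an illustration of the layer sizes only, not TensorFlow's actual computation): a (batch, 7) feature matrix passes through two 10-unit ReLU layers and a 1-unit linear layer, and the final reshape squeezes the (batch, 1) output into a 1-D vector of predicted ages.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = 4
x = rng.normal(size=(batch, 7))               # 7 abalone features per example

def relu(z):
    return np.maximum(z, 0.0)

# Made-up weights standing in for the learned tf.layers.dense parameters.
W1, b1 = rng.normal(size=(7, 10)), np.zeros(10)
W2, b2 = rng.normal(size=(10, 10)), np.zeros(10)
W3, b3 = rng.normal(size=(10, 1)), np.zeros(1)

first_hidden = relu(x @ W1 + b1)              # (batch, 10)
second_hidden = relu(first_hidden @ W2 + b2)  # (batch, 10)
output = second_hidden @ W3 + b3              # (batch, 1)
predictions = output.reshape(-1)              # (batch,) -- the tf.reshape step
print(predictions.shape)  # (4,)
```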
The EstimatorSpec returned by model_fn must contain loss: a Tensor representing the loss value, which quantifies how far the model's predictions deviate from the label values during training and evaluation. The tf.losses module provides convenience functions for calculating loss:
The following example adds a definition of loss to model_fn using mean_squared_error():
def model_fn(features, labels, mode, params):
"""Model function for Estimator."""
# Connect the first hidden layer to input layer
# (features["x"]) with relu activation
first_hidden_layer = tf.layers.dense(features["x"], 10, activation=tf.nn.relu)
# Connect the second hidden layer to first hidden layer with relu
second_hidden_layer = tf.layers.dense(
first_hidden_layer, 10, activation=tf.nn.relu)
# Connect the output layer to second hidden layer (no activation fn)
output_layer = tf.layers.dense(second_hidden_layer, 1)
# Reshape output layer to 1-dim Tensor to return predictions
predictions = tf.reshape(output_layer, [-1])
predictions_dict = {"ages": predictions}
# Calculate loss using mean squared error
loss = tf.losses.mean_squared_error(labels, predictions)
...
Evaluation metrics can be added to an eval_metric_ops dict. The following code defines an rmse metric, which calculates the root mean squared error for the model's predictions.
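The rmse metric is simply the square root of the mean squared error used as the loss. As a quick sanity check on what the two quantities mean, here is a pure-NumPy illustration; the label and prediction values are hypothetical, not drawn from the abalone data:

```python
import numpy as np

labels = np.array([7.0, 9.0, 10.0])        # hypothetical true ages
predictions = np.array([6.0, 9.5, 12.0])   # hypothetical predicted ages

mse = np.mean((labels - predictions) ** 2)  # the quantity MSE measures
rmse = np.sqrt(mse)                         # the quantity RMSE measures

print(round(mse, 4), round(rmse, 4))  # 1.75 1.3229
```

RMSE is often preferred for reporting because it is expressed in the same units as the labels (here, rings/years).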
The training op defines the optimization algorithm TensorFlow will use when fitting the model to the training data. Typically when training, the goal is to minimize loss. A simple way to create the training op is to instantiate a tf.train.Optimizer subclass and call its minimize method.
The following code defines a training op for model_fn using the loss value calculated above, the learning rate passed in params, and the gradient descent optimizer. For global_step, the convenience function tf.train.get_global_step takes care of generating an integer variable:
optimizer = tf.train.GradientDescentOptimizer(
    learning_rate=params["learning_rate"])
train_op = optimizer.minimize(
    loss=loss, global_step=tf.train.get_global_step())
For a full list of optimizers, see the API guide.
Finally, here is the complete model_fn. The following code configures the neural network; defines loss and the training op; and returns EstimatorSpec objects containing mode, predictions, loss, train_op, and eval_metric_ops:
def model_fn(features, labels, mode, params):
"""Model function for Estimator."""
# Connect the first hidden layer to input layer
# (features["x"]) with relu activation
first_hidden_layer = tf.layers.dense(features["x"], 10, activation=tf.nn.relu)
# Connect the second hidden layer to first hidden layer with relu
second_hidden_layer = tf.layers.dense(
first_hidden_layer, 10, activation=tf.nn.relu)
# Connect the output layer to second hidden layer (no activation fn)
output_layer = tf.layers.dense(second_hidden_layer, 1)
# Reshape output layer to 1-dim Tensor to return predictions
predictions = tf.reshape(output_layer, [-1])
# Provide an estimator spec for `ModeKeys.PREDICT`.
if mode == tf.estimator.ModeKeys.PREDICT:
return tf.estimator.EstimatorSpec(
mode=mode,
predictions={"ages": predictions})
# Calculate loss using mean squared error
loss = tf.losses.mean_squared_error(labels, predictions)
# Calculate root mean squared error as additional eval metric
eval_metric_ops = {
"rmse": tf.metrics.root_mean_squared_error(
tf.cast(labels, tf.float64), predictions)
}
optimizer = tf.train.GradientDescentOptimizer(
learning_rate=params["learning_rate"])
train_op = optimizer.minimize(
loss=loss, global_step=tf.train.get_global_step())
# Provide an estimator spec for `ModeKeys.EVAL` and `ModeKeys.TRAIN` modes.
return tf.estimator.EstimatorSpec(
mode=mode,
loss=loss,
train_op=train_op,
eval_metric_ops=eval_metric_ops)
You've instantiated an Estimator and defined its behavior in model_fn; all that's left to do is train, evaluate, and make predictions.
Add the following code to the end of main() to fit the neural network to the training data and evaluate accuracy:
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.array(training_set.data)},
    y=np.array(training_set.target),
    num_epochs=None,
    shuffle=True)

# Train
nn.train(input_fn=train_input_fn, steps=5000)

# Score accuracy
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.array(test_set.data)},
    y=np.array(test_set.target),
    num_epochs=1,
    shuffle=False)

ev = nn.evaluate(input_fn=test_input_fn)
print("Loss: %s" % ev["loss"])
print("Root Mean Squared Error: %s" % ev["rmse"])
Note: The above code uses input functions to feed the feature (x) and label (y) Tensors into the model for both training (train_input_fn) and evaluation (test_input_fn). For more details on defining input functions, see input_fn.
Then run the code. You should see output like the following:
...
INFO:tensorflow:loss = 4.86658, step = 4701
INFO:tensorflow:loss = 4.86191, step = 4801
INFO:tensorflow:loss = 4.85788, step = 4901
...
INFO:tensorflow:Saving evaluation summary for 5000 step: loss = 5.581
Loss: 5.581
The loss score reported, which is the mean squared error returned from model_fn, is output when run on the test set.
To predict ages for the prediction set, add the following code to main():
# Print out predictions
predict_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": prediction_set.data},
    num_epochs=1,
    shuffle=False)
predictions = nn.predict(input_fn=predict_input_fn)
for i, p in enumerate(predictions):
  print("Prediction %s: %s" % (i + 1, p["ages"]))
Here, the predict() function returns the results as an iterable; the for loop enumerates and prints them. Rerun the code, and you should see output similar to the following:
...
Prediction 1: 4.92229
Prediction 2: 10.3225
Prediction 3: 7.384
Prediction 4: 10.6264
Prediction 5: 11.0862
Prediction 6: 9.39239
Prediction 7: 11.1289