Estimator
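- Estimator is a high-level TensorFlow API.
- A project built on Estimator involves the following steps:
  - Create one or more input functions to put the input data into the expected format. Each function must return two values: a dictionary mapping feature names to feature tensors, and the label values (or a label tensor). The Dataset API is recommended for this.
  - Define the feature columns. A feature column is an object that describes the format of the model's input data and fixes the interface between the data and the model. Each tf.feature_column names one feature and specifies its input type, size, and similar properties. Numeric features are declared with tf.feature_column.numeric_column().
  - Instantiate an Estimator. TensorFlow provides several pre-made Estimator classifiers, including:
    - tf.estimator.DNNClassifier, a deep model for multi-class classification.
    - tf.estimator.DNNLinearCombinedClassifier, for wide and deep models.
    - tf.estimator.LinearClassifier, a classifier based on a linear model.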
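The input-function contract from the first step can be illustrated without TensorFlow. The sketch below is plain Python, and the name `toy_input_fn` is hypothetical. It only shows the shape of the return value (a dictionary keyed by feature name, plus one label per row), not how the Dataset API produces it.

```python
def toy_input_fn():
    """Return (features, labels) in the shape an input function is expected to use."""
    # Feature name -> column of values; each key would match one feature column.
    features = {
        "SepalLength": [6.4, 5.0, 4.9],
        "SepalWidth": [2.8, 2.3, 2.5],
    }
    # One label per row, aligned by position with the feature values.
    labels = [2, 1, 2]
    return features, labels


features, labels = toy_input_fn()
# Every feature column must supply exactly one value per label.
row_counts = {name: len(values) for name, values in features.items()}
print(row_counts)
print(len(labels))
```

In a real input function the lists would be tensors, but the pairing of a feature dictionary with a label collection stays the same.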
Estimator in practice
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
import pandas as pd
CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
SPECIES = ['Setosa', 'Versicolor', 'Virginica']
train_path = tf.keras.utils.get_file(
    "iris_training.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv")
test_path = tf.keras.utils.get_file(
    "iris_test.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv")
train = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)
test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)
train.head()
|   | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
|---|---|---|---|---|---|
| 0 | 6.4 | 2.8 | 5.6 | 2.2 | 2 |
| 1 | 5.0 | 2.3 | 3.3 | 1.0 | 1 |
| 2 | 4.9 | 2.5 | 4.5 | 1.7 | 2 |
| 3 | 4.9 | 3.1 | 1.5 | 0.1 | 0 |
| 4 | 5.7 | 3.8 | 1.7 | 0.3 | 0 |
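The Species column holds integer class ids rather than names. A small sketch of how those ids map onto the `SPECIES` list defined earlier, using the five label values shown in the output above (the variable names `label_ids` and `label_names` are illustrative only):

```python
SPECIES = ['Setosa', 'Versicolor', 'Virginica']

# Species values of the first five training rows, copied from the table above.
label_ids = [2, 1, 2, 0, 0]

# Each id is simply an index into SPECIES.
label_names = [SPECIES[i] for i in label_ids]
print(label_names)
```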
train_y = train.pop('Species')
test_y = test.pop('Species')
train.head()
|   | SepalLength | SepalWidth | PetalLength | PetalWidth |
|---|---|---|---|---|
| 0 | 6.4 | 2.8 | 5.6 | 2.2 |
| 1 | 5.0 | 2.3 | 3.3 | 1.0 |
| 2 | 4.9 | 2.5 | 4.5 | 1.7 |
| 3 | 4.9 | 3.1 | 1.5 | 0.1 |
| 4 | 5.7 | 3.8 | 1.7 | 0.3 |
def input_fn(features, labels, training=True, batch_size=256):
    """An input function for training or evaluating"""
    # Convert the inputs to a Dataset of (feature dict, label) pairs.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Shuffle and repeat indefinitely when training.
    if training:
        dataset = dataset.shuffle(1000).repeat()
    return dataset.batch(batch_size)
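To make the effect of the shuffle, repeat, and batch chain more concrete, here is a rough pure-Python imitation. It is only a sketch: `toy_batches` is a hypothetical helper, and it assumes the shuffle buffer is at least as large as the data, so each pass is a full reshuffle. It is not how tf.data is implemented.

```python
import itertools
import random


def toy_batches(rows, batch_size, training=True, seed=0):
    """Yield lists of rows, loosely mimicking shuffle().repeat().batch()."""
    rng = random.Random(seed)

    def stream():
        if not training:
            # Evaluation: a single pass over the data in its original order.
            yield from rows
            return
        while True:
            # Training: reshuffle on every pass and never stop.
            epoch = list(rows)
            rng.shuffle(epoch)
            yield from epoch

    it = stream()
    while True:
        batch = list(itertools.islice(it, batch_size))
        if not batch:
            return
        yield batch


rows = list(range(5))  # five stand-in examples
train_batches = list(itertools.islice(toy_batches(rows, 4, training=True), 3))
eval_batches = list(toy_batches(rows, 4, training=False))
print([len(b) for b in train_batches])  # every batch is full because the stream repeats
print(eval_batches)                     # the last batch is short
```

Because the training stream never ends, the number of training steps has to be capped by the caller, which is the role of the steps argument passed to train.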
my_feature_columns = []
for key in train.keys():
    print(key)
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
print(my_feature_columns)
SepalLength
SepalWidth
PetalLength
PetalWidth
[NumericColumn(key='SepalLength', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), NumericColumn(key='SepalWidth', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), NumericColumn(key='PetalLength', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None), NumericColumn(key='PetalWidth', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[30, 10],
    n_classes=3)
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: C:\Users\Smile\AppData\Local\Temp\tmphgc259yl
INFO:tensorflow:Using config: {'_model_dir': 'C:\\Users\\Smile\\AppData\\Local\\Temp\\tmphgc259yl', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
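As a rough sanity check on the size of the network configured above, the number of trainable parameters can be worked out by hand. The sketch assumes four numeric inputs and that every layer is a plain fully connected layer with one bias per unit. The helper name `dense_param_count` is hypothetical, and optimizer state is not counted.

```python
def dense_param_count(n_inputs, hidden_units, n_classes):
    """Count the weights and biases of a fully connected network."""
    sizes = [n_inputs, *hidden_units, n_classes]
    # Each layer has (inputs x outputs) weights plus one bias per output.
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


# 4 features -> 30 units -> 10 units -> 3 classes
param_count = dense_param_count(4, [30, 10], 3)
print(param_count)  # 493
```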
classifier.train(
    input_fn=lambda: input_fn(train, train_y, training=True),
    steps=5000)
WARNING:tensorflow:From E:\Anaconda\anaconda\lib\site-packages\tensorflow_core\python\ops\resource_variable_ops.py:1635: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From E:\Anaconda\anaconda\lib\site-packages\tensorflow_core\python\training\training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:Layer dnn is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because it's dtype defaults to floatx.
If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.
To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.
WARNING:tensorflow:From E:\Anaconda\anaconda\lib\site-packages\tensorflow_core\python\keras\optimizer_v2\adagrad.py:103: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into C:\Users\Smile\AppData\Local\Temp\tmphgc259yl\model.ckpt.
INFO:tensorflow:loss = 2.7596803, step = 0
INFO:tensorflow:global_step/sec: 417.249
INFO:tensorflow:loss = 1.5864722, step = 100 (0.240 sec)
INFO:tensorflow:global_step/sec: 682.968
INFO:tensorflow:loss = 1.3686752, step = 200 (0.159 sec)
INFO:tensorflow:global_step/sec: 691.149
INFO:tensorflow:loss = 1.2175257, step = 300 (0.132 sec)
INFO:tensorflow:global_step/sec: 695.377
INFO:tensorflow:loss = 1.1447594, step = 400 (0.144 sec)
INFO:tensorflow:global_step/sec: 694.894
INFO:tensorflow:loss = 1.1203682, step = 500 (0.144 sec)
INFO:tensorflow:global_step/sec: 649.194
INFO:tensorflow:loss = 1.073328, step = 600 (0.157 sec)
INFO:tensorflow:global_step/sec: 737.162
INFO:tensorflow:loss = 1.0393159, step = 700 (0.133 sec)
INFO:tensorflow:global_step/sec: 688.04
INFO:tensorflow:loss = 1.0136361, step = 800 (0.145 sec)
INFO:tensorflow:global_step/sec: 647.965
INFO:tensorflow:loss = 0.99417055, step = 900 (0.154 sec)
INFO:tensorflow:global_step/sec: 712.095
INFO:tensorflow:loss = 0.9741398, step = 1000 (0.140 sec)
INFO:tensorflow:global_step/sec: 702.392
INFO:tensorflow:loss = 0.94853187, step = 1100 (0.142 sec)
INFO:tensorflow:global_step/sec: 689.783
INFO:tensorflow:loss = 0.94446087, step = 1200 (0.145 sec)
INFO:tensorflow:global_step/sec: 644.138
INFO:tensorflow:loss = 0.9349041, step = 1300 (0.157 sec)
INFO:tensorflow:global_step/sec: 757.395
INFO:tensorflow:loss = 0.9281113, step = 1400 (0.146 sec)
INFO:tensorflow:global_step/sec: 677.107
INFO:tensorflow:loss = 0.9123041, step = 1500 (0.146 sec)
INFO:tensorflow:global_step/sec: 684.452
INFO:tensorflow:loss = 0.90185827, step = 1600 (0.132 sec)
INFO:tensorflow:global_step/sec: 688.624
INFO:tensorflow:loss = 0.89590585, step = 1700 (0.145 sec)
INFO:tensorflow:global_step/sec: 633.714
INFO:tensorflow:loss = 0.8893813, step = 1800 (0.160 sec)
INFO:tensorflow:global_step/sec: 736.433
INFO:tensorflow:loss = 0.8834985, step = 1900 (0.133 sec)
INFO:tensorflow:global_step/sec: 681.711
INFO:tensorflow:loss = 0.8767073, step = 2000 (0.159 sec)
INFO:tensorflow:global_step/sec: 671.442
INFO:tensorflow:loss = 0.86436415, step = 2100 (0.151 sec)
INFO:tensorflow:global_step/sec: 687.153
INFO:tensorflow:loss = 0.86384547, step = 2200 (0.146 sec)
INFO:tensorflow:global_step/sec: 671.386
INFO:tensorflow:loss = 0.859602, step = 2300 (0.146 sec)
INFO:tensorflow:global_step/sec: 694.817
INFO:tensorflow:loss = 0.843158, step = 2400 (0.146 sec)
INFO:tensorflow:global_step/sec: 689.812
INFO:tensorflow:loss = 0.8570354, step = 2500 (0.147 sec)
INFO:tensorflow:global_step/sec: 666.513
INFO:tensorflow:loss = 0.82521486, step = 2600 (0.147 sec)
INFO:tensorflow:global_step/sec: 634.746
INFO:tensorflow:loss = 0.82707626, step = 2700 (0.148 sec)
INFO:tensorflow:global_step/sec: 668.07
INFO:tensorflow:loss = 0.8173684, step = 2800 (0.149 sec)
INFO:tensorflow:global_step/sec: 736.048
INFO:tensorflow:loss = 0.8228967, step = 2900 (0.133 sec)
INFO:tensorflow:global_step/sec: 671.978
INFO:tensorflow:loss = 0.8122652, step = 3000 (0.161 sec)
INFO:tensorflow:global_step/sec: 676.142
INFO:tensorflow:loss = 0.8174802, step = 3100 (0.148 sec)
INFO:tensorflow:global_step/sec: 626.573
INFO:tensorflow:loss = 0.8014016, step = 3200 (0.149 sec)
INFO:tensorflow:global_step/sec: 671.279
INFO:tensorflow:loss = 0.8030711, step = 3300 (0.150 sec)
INFO:tensorflow:global_step/sec: 659.774
INFO:tensorflow:loss = 0.8061998, step = 3400 (0.151 sec)
INFO:tensorflow:global_step/sec: 659.655
INFO:tensorflow:loss = 0.782421, step = 3500 (0.152 sec)
INFO:tensorflow:global_step/sec: 661.854
INFO:tensorflow:loss = 0.79174757, step = 3600 (0.151 sec)
INFO:tensorflow:global_step/sec: 660.913
INFO:tensorflow:loss = 0.7846266, step = 3700 (0.153 sec)
INFO:tensorflow:global_step/sec: 723.83
INFO:tensorflow:loss = 0.78182733, step = 3800 (0.152 sec)
INFO:tensorflow:global_step/sec: 603.637
INFO:tensorflow:loss = 0.77771455, step = 3900 (0.147 sec)
INFO:tensorflow:global_step/sec: 636.475
INFO:tensorflow:loss = 0.7850309, step = 4000 (0.159 sec)
INFO:tensorflow:global_step/sec: 659.655
INFO:tensorflow:loss = 0.7761879, step = 4100 (0.153 sec)
INFO:tensorflow:global_step/sec: 710.711
INFO:tensorflow:loss = 0.7695364, step = 4200 (0.151 sec)
INFO:tensorflow:global_step/sec: 679.563
INFO:tensorflow:loss = 0.76040304, step = 4300 (0.150 sec)
INFO:tensorflow:global_step/sec: 610.901
INFO:tensorflow:loss = 0.75900936, step = 4400 (0.150 sec)
INFO:tensorflow:global_step/sec: 665.128
INFO:tensorflow:loss = 0.76111466, step = 4500 (0.151 sec)
INFO:tensorflow:global_step/sec: 667.254
INFO:tensorflow:loss = 0.76714766, step = 4600 (0.151 sec)
INFO:tensorflow:global_step/sec: 661.249
INFO:tensorflow:loss = 0.7604923, step = 4700 (0.147 sec)
INFO:tensorflow:global_step/sec: 626.097
INFO:tensorflow:loss = 0.74273324, step = 4800 (0.166 sec)
INFO:tensorflow:global_step/sec: 648.068
INFO:tensorflow:loss = 0.74195194, step = 4900 (0.148 sec)
INFO:tensorflow:Saving checkpoints for 5000 into C:\Users\Smile\AppData\Local\Temp\tmphgc259yl\model.ckpt.
INFO:tensorflow:Loss for final step: 0.7463323.
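One way to read the loss values in the log above is to compare them with the loss of a classifier that gives all three classes equal probability. The sketch below assumes the logged loss is the mean softmax cross-entropy per example, which the source does not state explicitly. The numbers in `logged` and `final_loss` are copied from the training log.

```python
import math

n_classes = 3
# Cross-entropy of a uniform guess: -ln(1/3).
uniform_loss = -math.log(1.0 / n_classes)

# (step, loss) pairs copied from the training log above.
logged = [(400, 1.1447594), (500, 1.1203682), (600, 1.073328)]
first_below = next(step for step, loss in logged if loss < uniform_loss)

# exp(-loss) is the geometric mean probability assigned to the true class
# in the batch that produced the final logged loss.
final_loss = 0.7463323
typical_probability = math.exp(-final_loss)

print(round(uniform_loss, 4))
print(first_below)
print(round(typical_probability, 3))
```

Under that assumption, the model only starts to beat a uniform guess between steps 500 and 600, and by the end of training it still gives the correct class a little under half of the probability mass on average.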
eval_result = classifier.evaluate(
    input_fn=lambda: input_fn(test, test_y, training=False))
print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:Layer dnn is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because it's dtype defaults to floatx.
If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.
To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2020-02-27T11:20:14Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\Smile\AppData\Local\Temp\tmphgc259yl\model.ckpt-5000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 0.25205s
INFO:tensorflow:Finished evaluation at 2020-02-27-11:20:14
INFO:tensorflow:Saving dict for global step 5000: accuracy = 0.7, average_loss = 0.7753303, global_step = 5000, loss = 0.7753303
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 5000: C:\Users\Smile\AppData\Local\Temp\tmphgc259yl\model.ckpt-5000
Test set accuracy: 0.700
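The accuracy reported above is the fraction of test examples whose predicted class matches the label. A minimal sketch of that calculation follows, using hypothetical label lists rather than the real test set (the `accuracy` helper is illustrative, not part of the Estimator API):

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted class equals the true class."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Hypothetical class ids for ten examples; three predictions are wrong.
actual = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]
predicted = [0, 0, 1, 1, 1, 1, 1, 1, 2, 0]

score = accuracy(predicted, actual)
print('Accuracy: {:0.3f}'.format(score))
```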
expected = ['Setosa', 'Versicolor', 'Virginica']
predict_x = {
    'SepalLength': [5.1, 5.9, 6.9],
    'SepalWidth': [3.3, 3.0, 3.1],
    'PetalLength': [1.7, 4.2, 5.4],
    'PetalWidth': [0.5, 1.5, 2.1],
}
# Named predict_input_fn so it does not shadow the training input_fn defined earlier.
def predict_input_fn(features, batch_size=256):
    """An input function for prediction."""
    # Convert the inputs to a Dataset without labels.
    return tf.data.Dataset.from_tensor_slices(dict(features)).batch(batch_size)

predictions = classifier.predict(
    input_fn=lambda: predict_input_fn(predict_x))
for pred_dict, expec in zip(predictions, expected):
    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]
    print('Prediction is "{}" ({:.1f}%), expected "{}"'.format(
        SPECIES[class_id], 100 * probability, expec))
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\Smile\AppData\Local\Temp\tmphgc259yl\model.ckpt-5000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
Prediction is "Setosa" (70.9%), expected "Setosa"
Prediction is "Versicolor" (40.8%), expected "Versicolor"
Prediction is "Versicolor" (39.3%), expected "Virginica"
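The percentages printed above come from the per-class probabilities, and the predicted class is the one with the highest probability. The sketch below reproduces that relationship in plain Python. The logits are hypothetical values chosen for illustration, and it assumes the classifier turns logits into probabilities with a softmax.

```python
import math

SPECIES = ['Setosa', 'Versicolor', 'Virginica']


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [0.2, 0.9, 0.8]  # hypothetical scores for one flower
probabilities = softmax(logits)
# The predicted class id is the index of the largest probability.
class_id = max(range(len(probabilities)), key=probabilities.__getitem__)
print('Prediction is "{}" ({:.1f}%)'.format(
    SPECIES[class_id], 100 * probabilities[class_id]))
```

When two scores are close, as in this made-up case, the winning probability stays well below 50%, which is the pattern seen in the second and third predictions above.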