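We use the Auto MPG dataset, which records real-world fuel efficiency figures for a variety of cars together with other factors such as cylinder count, weight, and horsepower. The first 5 rows of the dataset are shown in Table 6.1, and the meaning of each field is listed in Table 6.2. All fields are numeric, except that the numeric Origin field encodes a category: 1 denotes the USA, 2 denotes Europe, and 3 denotes Japan.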
import tensorflow as tf
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers, losses
import matplotlib.pyplot as mp
The Auto MPG dataset contains 398 records in total. We download it from the UCI server and read it into a DataFrame object with the following code:
# download the Auto MPG dataset
dataset_path = keras.utils.get_file("auto-mpg.data",
    "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data")
# fuel efficiency (miles per gallon), cylinders, displacement, horsepower, weight,
# acceleration, model year, origin
column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight',
                'Acceleration', 'Model Year', 'Origin']
raw_dataset = pd.read_csv(dataset_path, names=column_names,
                          na_values="?", comment='\t',
                          sep=" ", skipinitialspace=True)
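First, note that the dataset has a column named Origin, in which 1 denotes the USA, 2 denotes Europe, and 3 denotes Japan. Because these numbers are category labels rather than quantities, we replace the column with three one-hot columns:
# remove the Origin column and keep its values
origin = raw_dataset.pop('Origin')
dataset = raw_dataset.copy()
# add one 0/1 indicator column per region in place of Origin
dataset['USA'] = (origin == 1) * 1.0
dataset['Europe'] = (origin == 2) * 1.0
dataset['Japan'] = (origin == 3) * 1.0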
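To make the encoding concrete, here is a minimal, dependency-free sketch of the same one-hot idea applied to a handful of made-up Origin codes. The helper name `one_hot_origin` and the sample values are assumptions for illustration only; they are not part of the dataset code above.

```python
# Hypothetical helper: mirrors (origin == k) * 1.0 for k = 1, 2, 3.
REGIONS = {1: "USA", 2: "Europe", 3: "Japan"}


def one_hot_origin(codes):
    """Return a dict mapping each region name to a list of 0.0/1.0 indicators."""
    return {name: [float(code == key) for code in codes]
            for key, name in REGIONS.items()}


sample_codes = [1, 3, 2, 1]  # made-up values, not real rows of the dataset
encoded = one_hot_origin(sample_codes)
for name, column in encoded.items():
    print(name, column)
```

Each row ends up with exactly one indicator set to 1.0, so the network no longer sees an artificial ordering among the three regions.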
Next, because the dataset contains some missing entries, we drop the affected rows:
# drop rows with missing values
dataset = dataset.dropna()
We then split the data into a training set and a test set at an 8:2 ratio:
# split the data into a training set and a test set (8:2)
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)
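The split above samples 80% of the rows at random and treats whatever is left as the test set. The following dependency-free sketch shows the same idea on row indices; the function name `split_indices` and the row count of 10 are assumptions chosen for illustration.

```python
import random


def split_indices(n_rows, frac=0.8, seed=0):
    """Return (train_idx, test_idx): a random fraction of rows and the remainder."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    n_train = int(round(n_rows * frac))
    train_idx = sorted(rng.sample(range(n_rows), n_train))
    train_set = set(train_idx)
    test_idx = [i for i in range(n_rows) if i not in train_set]
    return train_idx, test_idx


train_idx, test_idx = split_indices(10)  # 10 is a made-up row count
print(len(train_idx), len(test_idx))
```

Because the test indices are defined as the complement of the training indices, no row can appear in both sets.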
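We extract the labels and remove them from the training and test sets, and we use the pandas describe() method to obtain statistics of the training set such as the mean and standard deviation: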
# pop the MPG column to use as labels, which also removes it from the features
train_labels = train_dataset.pop('MPG')
test_labels = test_dataset.pop('MPG')
# per-feature statistics of the training set, one row per feature after transposing
train_stats = train_dataset.describe()
train_stats = train_stats.transpose()
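Why standardize? In brief, it helps the network converge well. Before we know the relative importance of each input dimension, standardization gives all dimensions a similar distribution, which lets us treat them equally during training (that is, use the same learning rate, regularization coefficient, weight initialization, and activation function for all of them). Conversely, if we use one global learning rate, weight initialization, and activation function on unstandardized inputs, the dimensions with larger variance receive more attention.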
def norm(dataset):
    # standardize every column with the training-set mean and standard deviation
    return (dataset - train_stats['mean']) / train_stats['std']
normed_train_data = norm(train_dataset)
normed_test_data = norm(test_dataset)
# build a tf.data.Dataset, then shuffle and batch it
train_db = tf.data.Dataset.from_tensor_slices((normed_train_data.values, train_labels.values))
train_db = train_db.shuffle(1000).batch(32)
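The norm function above applies the training-set mean and standard deviation to both the training and the test data. A small worked example, using only the standard library and made-up weights, may make the arithmetic easier to follow. The names `fit_stats` and `standardize` are hypothetical; the sample standard deviation is used because that is what pandas describe() reports.

```python
import statistics


def fit_stats(column):
    """Mean and sample standard deviation (ddof=1) of one feature column."""
    return statistics.mean(column), statistics.stdev(column)


def standardize(column, mean, std):
    """Shift by the mean and scale by the standard deviation."""
    return [(value - mean) / std for value in column]


train_weight = [2000.0, 3000.0, 4000.0]  # made-up training values
test_weight = [2500.0, 3500.0]           # made-up test values

# Statistics come from the training data only and are reused for the test data.
mean, std = fit_stats(train_weight)
normed_train = standardize(train_weight, mean, std)
normed_test = standardize(test_weight, mean, std)
print(normed_train)  # [-1.0, 0.0, 1.0]
print(normed_test)   # [-0.5, 0.5]
```

Reusing the training statistics for the test set keeps information about the test data out of preprocessing and puts both sets on the same scale.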
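Since the Auto MPG dataset is small, we build only a 3-layer fully connected network to predict the MPG value. There are 9 input features, so the first layer has 9 input nodes. The first and second layers each have 64 output nodes, and because there is a single predicted value, the output layer has 1 output node. Since MPG ∈ R⁺, the output layer can either omit the activation function or use a ReLU activation.
We implement the network as a custom network class: we only need to create the sub-layers in the initializer and implement the computation logic in the forward function call. The custom class inherits from the keras.Model base class, which is the standard way to write such a class and gives convenient access to features of keras.Model such as trainable_variables and save_weights. The network class is implemented as follows: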
class Network(keras.Model):
    def __init__(self):
        super(Network, self).__init__()
        # create three fully connected layers
        self.fc1 = layers.Dense(64, activation='relu')
        self.fc2 = layers.Dense(64, activation='relu')
        self.fc3 = layers.Dense(1, activation='relu')

    def call(self, inputs, training=None, mask=None):
        # pass the input through the three fully connected layers in turn
        result = self.fc1(inputs)
        result = self.fc2(result)
        result = self.fc3(result)
        return result
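As a quick sanity check on the layer sizes described above (9 inputs, then 64, 64, and 1 output nodes), the number of trainable parameters can be counted by hand. The sketch below assumes that each fully connected layer has a weight matrix plus one bias per output node; the helper name `dense_params` is hypothetical.

```python
def dense_params(n_in, n_out):
    """A fully connected layer has n_in * n_out weights plus n_out biases."""
    return n_in * n_out + n_out


layer_sizes = [9, 64, 64, 1]  # input width followed by each layer's output width
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(per_layer)       # [640, 4160, 65]
print(sum(per_layer))  # 4865
```

With a few thousand parameters and only a few hundred training rows, the network is already large relative to the data, which is consistent with keeping it to three layers.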
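With the network class defined, we instantiate the network and create the optimizer, after which training can begin. The training part is implemented as a two-level loop over epochs and steps, running for 200 epochs in total. During training we record the MAE loss, and afterwards we plot the training and test curves for comparison: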
model = Network()
model.build(input_shape=(4, 9))
# create the optimizer
optimizer = tf.keras.optimizers.RMSprop(0.001)
# lists that record the MAE once per epoch
test_loss = []
train_loss = []
# train
for epoch in range(200):
    for step, (x, y) in enumerate(train_db):
        with tf.GradientTape() as tape:
            # squeeze [b, 1] to [b] so the predictions line up with the labels
            out = tf.squeeze(model(x), axis=1)
            loss = tf.reduce_mean(losses.MSE(y, out))
            mae_loss = tf.reduce_mean(losses.MAE(y, out))
        if step % 10 == 0:
            print(epoch, step, float(loss))
        grads = tape.gradient(loss, model.trainable_variables)
        # update the parameters
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
    # record the MAE of the last training batch and of the whole test set
    train_loss.append(float(mae_loss))
    out = tf.squeeze(model(tf.constant(normed_test_data.values)), axis=1)
    test_loss.append(float(tf.reduce_mean(losses.MAE(test_labels.values, out))))
# plot the curves
mp.figure()
mp.xlabel('Epoch')
mp.ylabel('MAE')
mp.plot(train_loss, label='Train')
mp.plot(test_loss, label='Test')
mp.legend()
mp.savefig('auto.svg')
mp.show()
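The training loop minimizes the mean squared error (MSE) but records the mean absolute error (MAE) for plotting. The two metrics can be compared on a tiny, made-up example; the function names and the numbers below are assumptions for illustration and use no TensorFlow at all.

```python
def mse(y_true, y_pred):
    """Mean of squared differences; penalizes large errors more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean of absolute differences; expressed in the same unit as the label."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [20.0, 30.0, 25.0]  # made-up MPG labels
y_pred = [22.0, 27.0, 25.0]  # made-up predictions

print(mse(y_true, y_pred))  # (4 + 9 + 0) / 3
print(mae(y_true, y_pred))  # (2 + 3 + 0) / 3
```

MAE reads directly as "miles per gallon off on average", which is probably why it is the more convenient quantity to plot, while MSE gives smoother gradients for optimization.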
import tensorflow as tf
import pandas as pd
import seaborn as sns
from tensorflow import keras
from tensorflow.keras import layers, losses
# fuel efficiency (MPG), cylinders, displacement, horsepower, weight, acceleration, model year, origin
column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight', 'Acceleration', 'Model Year', 'Origin']
raw_dataset = pd.read_csv("auto-mpg.data",
                          names=column_names,
                          na_values="?",
                          comment='\t',
                          sep=" ",
                          skipinitialspace=True)
dataset = raw_dataset.copy()
dataset = dataset.dropna()  # drop rows with missing values
origin = dataset.pop('Origin')
dataset['USA'] = (origin == 1) * 1.0
dataset['Europe'] = (origin == 2) * 1.0
dataset['Japan'] = (origin == 3) * 1.0
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)
# %% statistics
sns.pairplot(train_dataset[["Cylinders", "Displacement", "Weight", "MPG"]],
             diag_kind="kde")
train_stats = train_dataset.describe()
train_stats.pop('MPG')
train_stats = train_stats.transpose()
train_labels = train_dataset.pop('MPG')
test_labels = test_dataset.pop('MPG')
def norm(dataset):
    return (dataset - train_stats['mean']) / train_stats['std']
normed_train_data = norm(train_dataset)
normed_test_data = norm(test_dataset)
print(normed_train_data.shape, train_labels.shape)
print(normed_test_data.shape, test_labels.shape)
train_db = tf.data.Dataset.from_tensor_slices((normed_train_data.values, train_labels.values))
train_db = train_db.shuffle(100).batch(32)
class Network(keras.Model):
    def __init__(self):
        super(Network, self).__init__()
        self.fc1 = layers.Dense(64, activation='relu')
        self.fc2 = layers.Dense(64, activation='relu')
        self.fc3 = layers.Dense(1, activation='relu')

    def call(self, inputs, training=None, mask=None):
        x = self.fc1(inputs)
        x = self.fc2(x)
        x = self.fc3(x)
        return x
model = Network()
model.build(input_shape=(4, 9))
model.summary()
optimizer = tf.keras.optimizers.RMSprop(0.001)
for epoch in range(200):
    for step, (x, y) in enumerate(train_db):
        with tf.GradientTape() as tape:
            # squeeze [b, 1] to [b] so the predictions line up with the labels
            out = tf.squeeze(model(x), axis=1)
            loss = tf.reduce_mean(losses.MSE(y, out))
            mae_loss = tf.reduce_mean(losses.MAE(y, out))
        if step % 10 == 0:
            print(epoch, step, float(loss))
        grads = tape.gradient(loss, model.trainable_variables)
        # call apply_gradients; assigning to it would silently skip the update
        optimizer.apply_gradients(zip(grads, model.trainable_variables))