Keras: classification model
- Building an NN classifier for the Fashion-MNIST dataset with Keras
- Dataset normalization
- Using callbacks (TensorBoard)
- Batch normalization
- The SELU activation function
- Dropout
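1. Building an NN classifier for the Fashion-MNIST dataset with Keras
· The Fashion-MNIST dataset contains 70,000 grayscale images of 28×28 pixels in 10 classes, such as shoes, bags, and T-shirts.
· Import the required packages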
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # tensorflow-gpu is installed, but this machine's GPU is not up to it, so hide the GPUs and run on the CPU
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import pandas as pd
import sklearn
import sys
import time
import tensorflow as tf
from tensorflow import keras
print(tf.__version__)
print(sys.version_info)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)
· Load the dataset and split a validation set off the training data
fashion_mnist = keras.datasets.fashion_mnist
(x_train_all,y_train_all),(x_test,y_test) = fashion_mnist.load_data()
x_valid,x_train = x_train_all[:5000],x_train_all[5000:]
y_valid,y_train = y_train_all[:5000],y_train_all[5000:]
print(x_valid.shape,y_valid.shape)
print(x_train.shape,y_train.shape)
print(x_test.shape,y_test.shape)
out:
(5000, 28, 28) (5000,)
(55000, 28, 28) (55000,)
(10000, 28, 28) (10000,)
· Display samples from the dataset
def show_single_image(img_arr):
    plt.imshow(img_arr, cmap='binary')
    plt.show()
show_single_image(x_train[0])
def show_imgs(n_rows, n_cols, x_data, y_data, class_name):
    assert len(x_data) == len(y_data)
    assert n_rows * n_cols < len(x_data)
    plt.figure(figsize=(n_cols * 1.4, n_rows * 1.6))
    for row in range(n_rows):
        for col in range(n_cols):
            index = n_cols * row + col
            plt.subplot(n_rows, n_cols, index + 1)
            plt.imshow(x_data[index], cmap='binary', interpolation='nearest')
            plt.axis('off')
            plt.title(class_name[y_data[index]])
    plt.show()
class_name = ['T-shirt', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
show_imgs(4, 6, x_train, y_train, class_name)
· Build the NN model
#tf.keras.models.Sequential()
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28,28]))
model.add(keras.layers.Dense(300,activation='relu'))
model.add(keras.layers.Dense(100,activation='relu'))
model.add(keras.layers.Dense(10,activation='softmax'))
# relu: y = max(0, x)
# softmax: turns a vector into a probability distribution; for x = [x1, x2, x3],
#          y = [e^x1/sum, e^x2/sum, e^x3/sum], where sum = e^x1 + e^x2 + e^x3
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])
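The softmax formula in the comments above, together with the loss named in compile, can be checked by hand on a tiny example. The following is a minimal NumPy sketch, not what Keras runs internally; the function names and the sample logits are made up for illustration, and the loss is assumed to be the mean negative log-probability of the true class with integer labels.

```python
import numpy as np

def softmax(logits):
    """Turn each row of logits into a probability distribution."""
    # Subtracting the row maximum leaves the result unchanged but avoids overflow in exp.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability of the true class; labels are integer class ids."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

# Hypothetical logits for two samples and three classes.
logits = np.array([[1.0, 2.0, 3.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([2, 0])

probs = softmax(logits)
loss = sparse_categorical_crossentropy(probs, labels)
print(probs)  # each row sums to 1
print(loss)
```

"Sparse" here only means that the labels are class indices (0 to 9 for Fashion-MNIST) rather than one-hot vectors, which matches the shape of y_train printed earlier.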
· Loss and accuracy during training
his = model.fit(x_train,y_train,epochs=10,validation_data=(x_valid,y_valid))
his.history.keys()
· Plot loss and accuracy against the epoch
def plot_learning_curves(history):
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 1)
    plt.show()
plot_learning_curves(his)
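· Evaluate on the test set
# The model above was trained on the raw pixels, so it is evaluated on the raw x_test;
# use x_test_scaled only for a model trained on the scaled data from section 2.
model.evaluate(x_test, y_test)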
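2. Dataset normalization
· The data has already been split into training, validation, and test sets. If each set were standardized with its own statistics, the three sets would be transformed differently, because their means, variances, minimum and maximum values differ, and the scaled data would no longer follow one common distribution. The standardization parameters are therefore fitted on the training set and reused for the other two sets. sklearn's StandardScaler is used for this.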
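The "fit on the training set, reuse everywhere" idea can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions, not the author's code: the arrays are random stand-ins for image data, and a single mean and standard deviation are computed over all pixels.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins for image arrays: pixel values in [0, 255], shape (N, 28, 28).
train = rng.integers(0, 256, size=(100, 28, 28)).astype(np.float32)
valid = rng.integers(0, 256, size=(20, 28, 28)).astype(np.float32)

# "fit": the statistics come from the training set only.
mean = train.mean()
std = train.std()

# "transform": the same parameters are applied to every split.
train_scaled = (train - mean) / std
valid_scaled = (valid - mean) / std

print(train_scaled.mean(), train_scaled.std())  # close to 0 and 1
print(valid_scaled.mean(), valid_scaled.std())  # near 0 and 1, but not exactly
```

The validation set ends up only approximately centered, which is expected: it is measured against the training statistics rather than its own.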
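# Standardization: x = (x - u) / std
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# x_train: [None, 28, 28] -> [None * 784, 1] -> [None, 28, 28]
x_train_scaled = scaler.fit_transform(
    x_train.astype(np.float32).reshape(-1, 1)).reshape(-1, 28, 28)
x_valid_scaled = scaler.transform(
    x_valid.astype(np.float32).reshape(-1, 1)).reshape(-1, 28, 28)
x_test_scaled = scaler.transform(
    x_test.astype(np.float32).reshape(-1, 1)).reshape(-1, 28, 28)
"""
For standardization, each 3-D array is first flattened into a single column
(so one mean and one std are computed over all pixels) and then reshaped back to (None, 28, 28).
Only the training set calls fit_transform; the other two sets reuse its parameters through transform.
"""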
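· Why normalize?
(1) Numerical issues: large raw input values (pixels range from 0 to 255) can make the computation numerically unstable;
(2) The partial derivatives along different dimensions have different orders of magnitude, which makes gradient descent zigzag and converge slowly.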
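Point (2) can be illustrated on a toy problem. The sketch below assumes a simple quadratic loss in which differing feature scales show up as differing curvatures along each axis; the function name, curvatures, and learning rates are chosen for illustration only.

```python
def steps_to_converge(curvatures, lr, tol=1e-3, max_steps=100000):
    """Gradient descent on f(w) = 0.5 * sum(c_i * w_i**2), starting from w = (1, ..., 1).

    Returns the number of steps until every |w_i| is below tol.
    """
    w = [1.0] * len(curvatures)
    for step in range(1, max_steps + 1):
        # The partial derivative along axis i is c_i * w_i.
        w = [wi - lr * c * wi for wi, c in zip(w, curvatures)]
        if all(abs(wi) < tol for wi in w):
            return step
    return max_steps

# Unscaled case: curvatures differ by 100x, so the learning rate is capped by the
# steep axis (it must stay below 2 / 100) and progress along the flat axis is slow.
unscaled_steps = steps_to_converge([1.0, 100.0], lr=0.019)
# Scaled case: equal curvatures allow a much larger learning rate.
scaled_steps = steps_to_converge([1.0, 1.0], lr=0.9)
print(unscaled_steps, scaled_steps)
```

In the unscaled case the iterate also flips sign along the steep axis on every step, which is the zigzag described above.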
3. Using callbacks
· TensorBoard, EarlyStopping, ModelCheckpoint
· TensorBoard: a data visualization tool
# TensorBoard, EarlyStopping, ModelCheckpoint
# Writing logdir = './callbacks' runs into a bug and raises an error on Windows (not on Linux)
logdir = os.path.join("callbacks")
if not os.path.exists(logdir):
    os.mkdir(logdir)
output_model_file = os.path.join(logdir,
                                 "fashion_mnist_model.h5")
callbacks = [
    keras.callbacks.TensorBoard(logdir),
    keras.callbacks.ModelCheckpoint(output_model_file,
                                    save_best_only=True),
    keras.callbacks.EarlyStopping(patience=5, min_delta=1e-3),
]
his = model.fit(x_train_scaled, y_train, epochs=10,
                validation_data=(x_valid_scaled, y_valid),
                callbacks=callbacks)
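The EarlyStopping arguments used above, patience=5 and min_delta=1e-3, can be read as the rule sketched below. This is a simplified pure-Python illustration, not Keras's implementation; the function name and the loss values are hypothetical, and the monitored quantity is assumed to be the validation loss.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=1e-3):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0  # epochs since the last real improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Only a drop larger than min_delta counts as an improvement.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses: real progress for three epochs, then a plateau.
losses = [0.5, 0.4, 0.39, 0.3895, 0.3899, 0.39, 0.391, 0.392, 0.393]
print(early_stopping_epoch(losses))  # -> 7
```

With epochs=10 and patience=5 as in the fit call above, stopping only triggers if the validation loss stalls within the first few epochs.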
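Opening TensorBoard on Windows 10
(1) In a cmd window, change to the folder that contains tensorboard.exe
(2) Run the command tensorboard --logdir=D:\.....\callbacks\train, where the path is the directory containing files such as events.out.tfevents.1581393610.DESKTOP-3PND2RT.536.262.v2
(3) Open the address it prints in a browser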
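4. Batch Normalization
· Normalizing the data only before it enters the NN is not enough. The distribution of an intermediate layer's output may have shifted, and after many layers of computation the shift can become large, which can hurt training. To train better, the outputs of the intermediate layers are normalized as well; this is Batch Normalization.
Introduction to Batch Normalization: https://blog.csdn.net/weixin_43199584/article/details/97898176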
#tf.keras.models.Sequential()
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28,28]))
for _ in range(20):
    model.add(keras.layers.Dense(300, activation='relu'))
    model.add(keras.layers.BatchNormalization())  # batch normalization
    """
    Alternative: put the normalization before the activation function
    model.add(keras.layers.Dense(100))
    model.add(keras.layers.BatchNormalization())
    model.add(keras.layers.Activation('relu'))
    """
model.add(keras.layers.Dense(10, activation='softmax'))
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])
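What a batch normalization layer computes during training can be sketched in a few lines of NumPy. This is a simplified illustration that assumes a (batch, features) input and ignores the moving averages a real layer keeps for inference; the function name and the sample activations are hypothetical.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    """Training-mode batch normalization for a (batch, features) array."""
    mean = x.mean(axis=0)                     # per-feature mean over the batch
    var = x.var(axis=0)                       # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)   # eps guards against division by zero
    return gamma * x_hat + beta               # learnable scale and shift

rng = np.random.default_rng(0)
# Hypothetical activations of one hidden layer: badly centered and scaled.
x = rng.normal(loc=5.0, scale=3.0, size=(64, 4))
out = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0))  # close to 0 for every feature
print(out.std(axis=0))   # close to 1 for every feature
```

With gamma fixed at 1 and beta at 0 the output is simply standardized; in a trained network both are learned, so a layer can undo the normalization where that helps.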
5. The SELU activation function
· Reference on activation functions: https://www.jianshu.com/p/d216645251ce
#tf.keras.models.Sequential()
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28,28]))
for _ in range(20):
    model.add(keras.layers.Dense(300, activation='selu'))  # selu is self-normalizing
model.add(keras.layers.Dense(10, activation='softmax'))
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])
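The "self-normalizing" remark can be checked numerically. The sketch below assumes the standard SELU definition, scale * x for x > 0 and scale * alpha * (exp(x) - 1) otherwise, with the usual published constants; the sample size and seed are arbitrary.

```python
import numpy as np

# Constants of the standard SELU definition.
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    # np.minimum keeps exp from overflowing on large positive inputs,
    # whose exp branch is discarded by np.where anyway.
    negative_branch = ALPHA * (np.exp(np.minimum(x, 0.0)) - 1.0)
    return SCALE * np.where(x > 0, x, negative_branch)

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)  # inputs with mean 0 and variance 1
y = selu(x)
print(y.mean(), y.var())  # both stay close to 0 and 1
```

The constants are chosen so that standard-normal inputs keep mean 0 and variance 1 after the activation, which is why the model above contains no separate normalization layer.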
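6. Dropout
Dropout randomly drops a fraction of the neurons while a DNN is being trained. This weakens the co-adaptation between features and reduces how much the training of one neuron depends on another, which mitigates overfitting. Another way to see it: each random dropout trains a DNN with a different structure while the parameters are shared, and the predictions of these many DNNs are averaged.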
#tf.keras.models.Sequential()
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28,28]))
for _ in range(20):
    model.add(keras.layers.Dense(300, activation='selu'))
model.add(keras.layers.AlphaDropout(rate=0.5))
# model.add(keras.layers.Dropout(rate=0.5))  # plain Dropout breaks the self-normalizing property
# AlphaDropout: 1. keeps the mean and variance unchanged  2. keeps the self-normalizing property
model.add(keras.layers.Dense(10, activation='softmax'))
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])
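The random dropping described at the start of this section can be sketched with NumPy. Note that this shows plain (inverted) dropout, not AlphaDropout: units are zeroed with probability rate and the survivors are rescaled by 1 / (1 - rate) so the expected activation is unchanged. The function name and input are hypothetical.

```python
import numpy as np

def dropout_train(x, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
x = np.ones((1000, 300))  # hypothetical activations of a 300-unit layer
out = dropout_train(x, rate=0.5, rng=rng)
print((out == 0).mean())  # roughly half of the units are dropped
print(out.mean())         # the mean stays close to 1
```

Plain dropout preserves the mean but not the variance of the activations, which is the motivation for using AlphaDropout after selu layers in the model above. At inference time no units are dropped.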