Deep Learning from Getting Started to Giving Up: A 100-Day Challenge

Foreword

For the justice and glory of humankind: I recently had the sudden impulse to post a bit of what I learn about deep learning on this blog every day, and to keep it up for a hundred days.
The frankly utilitarian goal is to push myself to keep studying without interruption, so, let's get started!

【1】-July 15, 2022-Building a CNN

import keras
from keras.datasets import mnist # the dataset
from keras.models import Sequential # used to instantiate a model
from keras.layers import Activation,Dense,Flatten,Conv2D,MaxPooling2D,Dropout # the common Keras layers
from sklearn.utils import shuffle # shuffles the data ordering
from sklearn.preprocessing import StandardScaler # standardization (zero mean, unit variance)
from keras.layers import SpatialDropout2D # a Dropout variant that drops whole 2D feature maps
import time,os  # timing and OS utilities
from keras_flops import get_flops # FLOPs (computational-complexity) estimation

The structure of this first network is very simple. (Despite the CNN-oriented imports above, the snippet below is actually a plain dense network; a convolutional sketch follows after it.)

n_hidden_1 = 64 # size of the hidden layer
n_classes = 1 # size of the output layer
training_epochs = 5 # number of training epochs
batch_size = 100 # number of images per batch


model = Sequential()
model.add(Dense(n_hidden_1,activation='relu',input_shape=(42,)))
model.add(Dense(32,activation='relu'))
model.add(Dense(n_classes,activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam',
             metrics=[ 'accuracy'])

print(model.summary()) # inspect the network structure and parameter counts
flops = get_flops(model, batch_size=batch_size) # estimate the model's FLOPs (computational complexity)
# Train the model (X_train/y_train and X_test/y_test are assumed to be prepared elsewhere)
history = model.fit(X_train,y_train,batch_size=batch_size,
                    epochs=training_epochs,validation_data=(X_test,y_test))
model.evaluate(X_test,y_test)
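
Since the imports above already pull in Conv2D and MaxPooling2D, here is a minimal sketch of what a small actual CNN on MNIST could look like (my own illustration; the layer sizes are arbitrary choices, not from any particular source):

import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.utils.np_utils import to_categorical

# Load and prepare MNIST: add a channel axis and scale pixels into [0,1]
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
X_test = X_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Two conv/pool stages, then a dense softmax classifier
cnn = Sequential()
cnn.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Conv2D(64, (3, 3), activation='relu'))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Flatten())
cnn.add(Dropout(0.3))
cnn.add(Dense(10, activation='softmax'))
cnn.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
cnn.fit(X_train, y_train, batch_size=100, epochs=5, validation_data=(X_test, y_test))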

【2】-July 16, 2022-Explaining some parameters

On the parameters:

EPOCHS = 200 : the number of complete passes over the training set
BATCH_SIZE = 128 : the number of samples fed into the network at a time
VERBOSE = 1 : progress-bar display
N_HIDDEN = 128 : the number of neurons in the hidden layer
NB_CLASSES = 10 : the number of output classes
VALIDATION_SPLIT = 0.2 : the fraction of data held out to check/validate training effectiveness (the split ratio)

verbose: logging display
verbose = 0: no logging to standard output
verbose = 1: progress-bar output
verbose = 2: one log line per epoch

On binary_crossentropy (binary cross-entropy) and categorical_crossentropy (categorical cross-entropy)

binary crossentropy:

Commonly used for binary classification, usually paired with a sigmoid on the network's final layer

categorical crossentropy:

Suitable for multi-class problems, paired with softmax as the output-layer activation

On choosing among the losses binary_crossentropy, categorical_crossentropy and sparse_categorical_crossentropy

Reference post: https://blog.csdn.net/qq_35599937/article/details/105608354

Binary classification:

    For a binary problem, where the final result must be one of exactly two classes, use binary_crossentropy as the loss.

Multi-class classification:

    For multi-class problems, the choice of loss mainly depends on how the labels are encoded:

      1. If they are categorically encoded (one-hot), use categorical_crossentropy.

        My understanding of one-hot encoding: when vectorizing the labels, each label becomes an N-dimensional vector (N chosen by you) in which exactly one element is 1 and the rest are 0. In other words, integer index i is converted into a length-N binary vector whose i-th element is 1.

        Keras has a built-in method for vectorizing labels:

from keras.utils.np_utils import to_categorical
 
one_hot_train_labels = to_categorical(train_labels)
one_hot_test_labels = to_categorical(test_labels)
      2. If they are integer-encoded, use sparse_categorical_crossentropy.

        My understanding of integer encoding: all the labels go into a single vector, one value per label. (A small example contrasting the two encodings follows after the quoted attribution below.)
————————————————
Copyright notice: the material above is quoted from the CSDN blogger 傅华涛Fu under the CC 4.0 BY-SA license; reposts must include the original source link and this notice.
Original link: https://blog.csdn.net/fu_jian_ping/article/details/107707780
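
A tiny example of my own contrasting the two label encodings (three hypothetical classes):

import numpy as np
from keras.utils.np_utils import to_categorical

labels = np.array([0, 2, 1])           # integer encoding -> use sparse_categorical_crossentropy
one_hot = to_categorical(labels, 3)    # one-hot encoding -> use categorical_crossentropy
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]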

【3】-July 17, 2022-Getting clear on the two model styles in Keras: Sequential and Model

Source: https://blog.csdn.net/weixin_39916966/article/details/88049179

  • Sequential makes it easier to define a network structure, but that simplicity also makes it harder to build more complex networks
  • Model (the functional API) can build more complex networks, at the cost of being somewhat harder to use
    Code examples:

Sequential

from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(32, input_shape = (784,)))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))

Model

from keras.layers import Input, Dense
from keras.models import Model

# Define the input layer, fixing the input dimension
inputs = Input(shape = (784, ))
# Two hidden layers of 64 neurons each, with relu activation, each fed the previous layer's output
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
# Output layer
y = Dense(10, activation='softmax')(x)
# Define the model by specifying its inputs and outputs
model = Model(inputs=inputs, outputs=y)
# Compile the model: choose the optimizer, loss and metrics
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# Fit, i.e. train, the model (data and labels are assumed to be prepared)
model.fit(data, labels)
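
To see why the functional API is the more general tool, here is a sketch of my own of a two-input network that Sequential simply cannot express (all shapes and sizes here are arbitrary):

from keras.layers import Input, Dense, concatenate
from keras.models import Model

input_a = Input(shape=(32,))
input_b = Input(shape=(16,))
# Each input gets its own dense branch; the two branches are then merged
merged = concatenate([Dense(64, activation='relu')(input_a),
                      Dense(64, activation='relu')(input_b)])
output = Dense(1, activation='sigmoid')(merged)
model = Model(inputs=[input_a, input_b], outputs=output)
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])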

【4】-July 18, 2022-Defining a simple neural network [based on TensorFlow 2.0]

import tensorflow as tf
import numpy as np
from tensorflow import keras

## 1. Set the hyperparameters
EPOCHS = 50
VERBOSE = 1
BATCH_SIZE = 128
NB_CLASSES = 10
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2  # Train_set num = 60000, Test_set num = 10000

## 2. Data preparation
# 2.1 Load the data
mnist = keras.datasets.mnist
(X_train,y_train),(X_test,y_test) = mnist.load_data()
RESHAPED = 784 # train set: size (60000,28,28) -> reshape to (60000,784)
X_train = X_train.reshape(60000,RESHAPED)
X_test = X_test.reshape(10000,RESHAPED)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

# 2.2 Normalize the data into [0,1]
X_train /= 255
X_test /= 255
print(X_train.shape[0],'train samples')
print(X_test.shape[0],'test samples')

# 2.3 One-hot
y_train = tf.keras.utils.to_categorical(y_train,NB_CLASSES)
y_test = tf.keras.utils.to_categorical(y_test,NB_CLASSES)


## 3. Build the model
model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(NB_CLASSES,
                             input_shape=(RESHAPED,),
                             name='Dense_layer',
                             activation='softmax'))
## 3.1 Compile the model
model.compile(optimizer='Adam',loss='categorical_crossentropy',metrics=['accuracy'])

## 4. Train the model
model.fit(
    X_train,y_train,batch_size = BATCH_SIZE,epochs = EPOCHS, verbose = VERBOSE , validation_split = VALIDATION_SPLIT
)

## 5. Evaluate the model
test_loss,test_acc = model.evaluate(X_test,y_test)
print('Test accuracy:',test_acc)

The network structure:

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Dense_layer (Dense)          (None, 10)                7850      
=================================================================
Total params: 7,850
Trainable params: 7,850
Non-trainable params: 0
_________________________________________________________________
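
The 7,850 parameters check out: the single dense layer has 784 × 10 = 7,840 weights plus 10 biases.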


Result:

Epoch 50/50
48000/48000 [==============================] - 1s 17us/sample - loss: 0.2318 - accuracy: 0.9364 - val_loss: 0.2627 - val_accuracy: 0.9310

10000/10000 [==============================] - 0s 48us/sample - loss: 0.2673 - accuracy: 0.9281
Test accuracy: 0.9281

【5】-July 19, 2022-Strategies for improving yesterday's network's accuracy baseline

Today's experimental takeaway: after changing validation_split from 0.3 to 0.2 (and, probably more importantly, adding the two 128-unit ReLU hidden layers shown below, versus yesterday's single softmax layer),
the model's loss dropped from about 1.6 to 0.3 and accuracy rose from 90.5% to 95.9%.

The improved model structure:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Dense_layer (Dense)          (None, 128)               100480    
_________________________________________________________________
Dense_layer_2 (Dense)        (None, 128)               16512     
_________________________________________________________________
Dense_layer_3 (Dense)        (None, 10)                1290      
=================================================================
Total params: 118,282
Trainable params: 118,282
Non-trainable params: 0
_________________________________________________________________
Train on 48000 samples, validate on 12000 samples

The full script:
import tensorflow as tf
from tensorflow import keras

# Network and training hyperparameters
EPOCHS = 50
BATCH_SIZE = 64
VERBOSE = 1
NB_CLASSES = 10
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2

# LOADING MNIST dataset
mnist = keras.datasets.mnist
(X_train,y_train),(X_test,y_test) = mnist.load_data()

# X_train is 60000 rows of 28*28 values; we reshape it to 60000*784
RESHAPED = 784
X_train = X_train.reshape(60000,RESHAPED)
X_test = X_test.reshape(10000,RESHAPED)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

# Normalize inputs to be within [0,1]
X_train,X_test = X_train / 255.0 , X_test / 255.0
print('Number of train samples:',X_train.shape[0])
print('Number of test samples:',X_test.shape[0])

# Labels get a one-hot representation
y_train = tf.keras.utils.to_categorical(y_train,NB_CLASSES)
y_test = tf.keras.utils.to_categorical(y_test,NB_CLASSES)

# Build the model
model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(N_HIDDEN,input_shape=(RESHAPED,),
                             name='Dense_layer',activation='relu'))
model.add(keras.layers.Dense(N_HIDDEN,
                             name='Dense_layer_2',activation='relu'))
model.add(keras.layers.Dense(NB_CLASSES,
                             name='Dense_layer_3',activation='relu')) # NB: 'softmax' is the conventional output activation with categorical_crossentropy; relu is kept here to match the logs below

# Summary of the model
model.summary()

# Compiling the model
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])

# Training the model
model.fit(X_train,y_train,batch_size=BATCH_SIZE,epochs=EPOCHS,verbose=VERBOSE,validation_split=VALIDATION_SPLIT)

# Evaluating the model
test_loss,test_accuracy = model.evaluate(X_test,y_test)
print('Test loss:{:.4f}\nTest accuracy:{:.4f}'.format(test_loss,test_accuracy))
Epoch 50/50
48000/48000 [==============================] - 2s 37us/sample - loss: 0.1741 - accuracy: 0.9656 - val_loss: 0.2927 - val_accuracy: 0.9597

10000/10000 [==============================] - 0s 46us/sample - loss: 0.2852 - accuracy: 0.9585
Test loss:0.2852
Test accuracy:0.9585


【6】-July 20, 2022-Further improvement with dropout (random deactivation)

The book offers a second improvement: use DROPOUT to randomly drop units in the densely connected hidden layers during forward propagation. This adds regularization, i.e. a penalty, which improves the model's generalization and yields better performance on the test set.
The intuition rests on two points:

  • Each neuron's neighbours are forced to become more capable when their peers drop out, much as your own workload grows when a teammate slacks off
  • Each neuron is forced to store redundant information
    Also learned: normally training accuracy should be higher than test accuracy; if it is not, increase the number of epochs
Test loss:0.2132
Test accuracy:0.9688

Lower loss and higher accuracy than improvement No. 1.

The code changes are marked in the comments below!

  • 1. DROPOUT = 0.3 (the drop probability)
  • 2. Insert keras.layers.Dropout(DROPOUT) after each hidden Dense layer (see the full script below)
import tensorflow as tf
from tensorflow import keras
import numpy as np


# Network and training hyperparameters
EPOCHS = 50
BATCH_SIZE = 64
VERBOSE = 1
NB_CLASSES = 10
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
DROPOUT = 0.3 # the drop probability


# LOADING MNIST dataset
mnist = keras.datasets.mnist
(X_train,y_train),(X_test,y_test) = mnist.load_data()

# X_train is 60000 rows of 28*28 values; we reshape it to 60000*784
RESHAPED = 784
X_train = X_train.reshape(60000,RESHAPED)
X_test = X_test.reshape(10000,RESHAPED)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

# Normalize inputs to be within [0,1]
X_train,X_test = X_train / 255.0 , X_test / 255.0
print('Number of train samples:',X_train.shape[0])
print('Number of test samples:',X_test.shape[0])

# Labels get a one-hot representation
y_train = tf.keras.utils.to_categorical(y_train,NB_CLASSES)
y_test = tf.keras.utils.to_categorical(y_test,NB_CLASSES)

# Build the model
model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(N_HIDDEN,input_shape=(RESHAPED,),
                             name='Dense_layer',activation='relu'))
model.add(keras.layers.Dropout(DROPOUT)) # dropout inserted into the network
model.add(keras.layers.Dense(N_HIDDEN,
                             name='Dense_layer_2',activation='relu'))
model.add(keras.layers.Dropout(DROPOUT)) # dropout inserted into the network
model.add(keras.layers.Dense(NB_CLASSES,
                             name='Dense_layer_3',activation='relu'))

# Summary of the model
model.summary()

# Compiling the model
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])

# Training the model
model.fit(X_train,y_train,batch_size=BATCH_SIZE,epochs=EPOCHS,verbose=VERBOSE,validation_split=VALIDATION_SPLIT)

# Evaluating the model
test_loss,test_accuracy = model.evaluate(X_test,y_test)
print('Test loss:{:.4f}\nTest accuracy:{:.4f}'.format(test_loss,test_accuracy))
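
One note worth adding: Keras's Dropout layer is only active during training; at evaluation and prediction time it is disabled automatically, so the test-set numbers above already reflect the full, undropped network.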


【7】-July 21, 2022-Benchmarking the performance of different optimizers with TensorFlow

https://keras.io/zh/optimizers/ [the optimizer API in the Keras docs]

Of course, the best choice can differ depending on whether the task is classification or regression.

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

# Network and training hyperparameters
EPOCHS = 30 # reduced to save time
BATCH_SIZE = 64
VERBOSE = 0 # set to 0 to silence per-batch logs
NB_CLASSES = 10
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
DROPOUT = 0.3 # the drop probability
optimizer = ['RMSprop','SGD','Adam','Adamax','Nadam','Adagrad','Adadelta']

def test_optimizer(optimizer=optimizer,EPOCHS = 30,BATCH_SIZE = 64,VERBOSE = 0,NB_CLASSES = 10,N_HIDDEN = 128,VALIDATION_SPLIT = 0.2):

    # LOADING MNIST dataset
    mnist = keras.datasets.mnist
    (X_train,y_train),(X_test,y_test) = mnist.load_data()

    # X_train is 60000 rows of 28*28 values; we reshape it to 60000*784
    RESHAPED = 784
    X_train = X_train.reshape(60000,RESHAPED)
    X_test = X_test.reshape(10000,RESHAPED)
    X_train = X_train.astype('float32')
    X_test = X_test.astype('float32')

    # Normalize inputs to be within [0,1]
    X_train,X_test = X_train / 255.0 , X_test / 255.0
    print('Number of train samples:',X_train.shape[0])
    print('Number of test samples:',X_test.shape[0])

    # Labels get a one-hot representation
    y_train = tf.keras.utils.to_categorical(y_train,NB_CLASSES)
    y_test = tf.keras.utils.to_categorical(y_test,NB_CLASSES)

    # Build the model
    model = tf.keras.models.Sequential()
    model.add(keras.layers.Dense(N_HIDDEN,input_shape=(RESHAPED,),
                                 name='Dense_layer',activation='relu'))
    model.add(keras.layers.Dropout(DROPOUT)) # dropout inserted into the network
    model.add(keras.layers.Dense(N_HIDDEN,
                                 name='Dense_layer_2',activation='relu'))
    model.add(keras.layers.Dropout(DROPOUT)) # dropout inserted into the network
    model.add(keras.layers.Dense(NB_CLASSES,
                                 name='Dense_layer_3',activation='relu'))

    # Summary of the model
    # model.summary()

    # Compiling the model
    model.compile(optimizer=optimizer,loss='categorical_crossentropy',metrics=['accuracy'])

    # Training the model
    model.fit(X_train,y_train,batch_size=BATCH_SIZE,epochs=EPOCHS,verbose=VERBOSE,validation_split=VALIDATION_SPLIT)

    # Evaluating the model
    test_loss,test_accuracy = model.evaluate(X_test,y_test)
    print('Test loss:{:.4f}\nTest accuracy:{:.4f}'.format(test_loss,test_accuracy))
    return test_loss,test_accuracy
loss_results = []
acc_results = []
for opt in optimizer:
    print('Optimizer:', opt)
    loss,acc = test_optimizer(optimizer=opt)
    loss_results.append(loss)
    acc_results.append(acc)
x = range(len(optimizer))
plt.plot(x,loss_results,'r',label='loss')
plt.plot(x,acc_results,'g',label = 'acc')
plt.title('The comparison results')
plt.xlabel('Optimizer') # the x axis indexes optimizers, not epochs
plt.ylabel('Value')
optimize_name = ['RMSprop','SGD','Adam','Adamax','Nadam','Adagrad','Adadelta']
font1 = {'family': 'SimSun', 'weight': 'normal', 'size': 10, }
plt.xticks(x, optimize_name, fontproperties=font1)
plt.legend()
plt.show()

Results

[Figure 1: loss and accuracy for each optimizer]
Of course this is just one experiment on one dataset for one classification problem, so it is far from rigorous.
For the choice of optimizer, the observed ranking was:

  • 1.ADAM
  • 2.NADAM
  • 3.ADAMAX
  • 4.ADAGRAD
  • 5.SGD
  • 6.RMSPROP
  • 7.ADADELTA
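
A caveat of my own on this ranking: the string names above use each optimizer's default learning rate, so the comparison partly reflects those defaults rather than the algorithms themselves. Passing a configured optimizer instance lets you control this; a minimal sketch (the values are illustrative, not tuned):

import tensorflow as tf
from tensorflow.keras.optimizers import SGD

# A trivial model just to demonstrate compiling with a configured optimizer instance
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_shape=(784,), activation='softmax')])
# SGD with momentum often closes much of the gap to the adaptive methods
model.compile(optimizer=SGD(learning_rate=0.01, momentum=0.9),
              loss='categorical_crossentropy', metrics=['accuracy'])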

【8】-July 22, 2022-Benchmarking the performance of different loss functions with TensorFlow

https://keras.io/zh/losses/ [the loss-function API in the Keras docs]

The loss function must likewise be chosen according to whether the task is classification or regression, and even whether the classification is binary or multi-class.

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

# Network and training hyperparameters
EPOCHS = 3 # reduced to save time
BATCH_SIZE = 64
VERBOSE = 0 # set to 0 to silence per-batch logs
NB_CLASSES = 10
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
DROPOUT = 0.3 # the drop probability
optimizer = ['RMSprop','SGD','Adam','Adamax','Nadam','Adagrad','Adadelta']
loss_fun = ['categorical_crossentropy','sparse_categorical_crossentropy','binary_crossentropy',
            'kullback_leibler_divergence','poisson','cosine_proximity']

def test_optimizer(loss_fun=loss_fun, optimizer='Adam',EPOCHS = 3,BATCH_SIZE = 64,VERBOSE = 0,NB_CLASSES = 10,N_HIDDEN = 128,VALIDATION_SPLIT = 0.2):

    # LOADING MNIST dataset
    mnist = keras.datasets.mnist
    (X_train,y_train),(X_test,y_test) = mnist.load_data()

    # X_train is 60000 rows of 28*28 values; we reshape it to 60000*784
    RESHAPED = 784
    X_train = X_train.reshape(60000,RESHAPED)
    X_test = X_test.reshape(10000,RESHAPED)
    X_train = X_train.astype('float32')
    X_test = X_test.astype('float32')

    # Normalize inputs to be within [0,1]
    X_train,X_test = X_train / 255.0 , X_test / 255.0
    print('Number of train samples:',X_train.shape[0])
    print('Number of test samples:',X_test.shape[0])

    # Labels get a one-hot representation
    # (NB: sparse_categorical_crossentropy expects integer labels, so for that loss these
    #  two to_categorical lines would have to be skipped)
    y_train = tf.keras.utils.to_categorical(y_train,NB_CLASSES)
    y_test = tf.keras.utils.to_categorical(y_test,NB_CLASSES)

    # Build the model
    model = tf.keras.models.Sequential()
    model.add(keras.layers.Dense(N_HIDDEN,input_shape=(RESHAPED,),
                                 name='Dense_layer',activation='relu'))
    model.add(keras.layers.Dropout(DROPOUT)) # dropout inserted into the network
    model.add(keras.layers.Dense(N_HIDDEN,
                                 name='Dense_layer_2',activation='relu'))
    model.add(keras.layers.Dropout(DROPOUT)) # dropout inserted into the network
    model.add(keras.layers.Dense(NB_CLASSES,
                                 name='Dense_layer_3',activation='relu'))

    # Summary of the model
    model.summary()

    # Compiling the model
    model.compile(optimizer=optimizer,loss=loss_fun,metrics=['accuracy']) # bug fix: use the loss under test instead of a hard-coded 'categorical_crossentropy'

    # Training the model
    model.fit(X_train,y_train,batch_size=BATCH_SIZE,epochs=EPOCHS,verbose=VERBOSE,validation_split=VALIDATION_SPLIT)

    # Evaluating the model
    test_loss,test_accuracy = model.evaluate(X_test,y_test)
    print('Test loss:{:.4f}\nTest accuracy:{:.4f}'.format(test_loss,test_accuracy))
    return test_loss,test_accuracy
loss_results = []
acc_results = []
for fn in loss_fun:
    print('Loss function:', fn)
    loss,acc = test_optimizer(loss_fun=fn)
    loss_results.append(loss)
    acc_results.append(acc)
x = range(len(loss_fun))
plt.plot(x,loss_results,'r',label='loss')
plt.plot(x,acc_results,'g',label = 'acc')
plt.title('The comparison results')
plt.xlabel('Loss function') # the x axis indexes loss functions, not epochs
plt.ylabel('Value')
optimize_name = loss_fun
font1 = {'family': 'SimSun', 'weight': 'normal', 'size': 8}
plt.xticks(x, optimize_name, fontproperties=font1,rotation=45)
plt.legend()
plt.show()

[Figure 2: loss and accuracy for each loss function]

From these loss-function test results, the observed ranking was:

Loss function: kullback_leibler_divergence
Test loss:0.2417
Test accuracy:0.9664

Loss function: cosine_proximity
Test loss:0.2472
Test accuracy:0.9647

Loss function: poisson
Test loss:0.2307
Test accuracy:0.9645

Loss function: sparse_categorical_crossentropy
Test loss:0.2363
Test accuracy:0.9631

Loss function: binary_crossentropy
Test loss:0.2304
Test accuracy:0.9624

Loss function: categorical_crossentropy
Test loss:0.2551
Test accuracy:0.9557

One thing I overlooked here:
different loss functions have different computational costs, so some runs take much longer than others.
On top of that, this experiment shows the accuracy differences between loss functions are small, so there is a trade-off to weigh.
Worth thinking about: is it worth spending extra time for a tiny accuracy gain, or should wall-clock time be the criterion and the faster-training loss function win? (A timing sketch follows below.)
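
A sketch of my own for making that trade-off measurable: wrap each run in a wall-clock timer (this reuses the test_optimizer function above and assumes the sparse_categorical_crossentropy caveat noted in the code is handled):

import time

for fn in loss_fun:
    start = time.time()
    loss, acc = test_optimizer(loss_fun=fn)
    print('{}: loss={:.4f} acc={:.4f} time={:.1f}s'.format(fn, loss, acc, time.time() - start))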

【9】-July 23, 2022-Feature-importance extraction based on a random forest

import pandas as pd

from sklearn.model_selection import train_test_split
import numpy as np


# Data processing: load the data
df = pd.read_excel(r'C:\Users\12810\Desktop\成都.xlsx')
X = df.iloc[:,2:]
y = df.iloc[:,1:2]

print(X)
print(y)
np.random.seed(seed = 42)

X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size = 0.1, random_state = 42)

# Fit a random forest regressor
from sklearn.ensemble import RandomForestRegressor
rf = RandomForestRegressor(n_estimators = 100,
                           n_jobs = -1,
                           oob_score = True,
                           bootstrap = True,
                           random_state = 42)
rf.fit(X_train, y_train)

print('R^2 Training Score: {:.2f} OOB Score: {:.2f} R^2 Validation Score: {:.2f}'.format(rf.score(X_train, y_train), rf.oob_score_,
                                                                                             rf.score(X_valid, y_valid)))

from sklearn.metrics import r2_score
from rfpimp import permutation_importances

def r2(rf, X_train, y_train):
    return r2_score(y_train, rf.predict(X_train))

perm_imp_rfpimp = permutation_importances(rf, X_train, y_train, r2)
print(perm_imp_rfpimp)
results = pd.DataFrame(perm_imp_rfpimp)
results.to_excel(r'C:\Users\12810\Desktop\成都特征提取.xlsx')
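
As a quick cross-check of the permutation importances, the fitted forest also carries built-in impurity-based importances (appended to the script above; these are known to be biased toward high-cardinality features, which is exactly why rfpimp's permutation approach exists):

# Impurity-based importances from the already-fitted forest
imp = pd.Series(rf.feature_importances_, index=X_train.columns).sort_values(ascending=False)
print(imp.head(10))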

【10】-July 24, 2022-When accuracy and loss refuse to move during training!

When using softmax on the output layer:
Train on 10520 samples, validate on 7014 samples
Epoch 1/5
10520/10520 [==============================] - 3s 298us/step - loss: 4.2370 - accuracy: 0.7237 - val_loss: 4.1026 - val_accuracy: 0.7310
Epoch 2/5
10520/10520 [==============================] - 3s 253us/step - loss: 4.2370 - accuracy: 0.7237 - val_loss: 4.1026 - val_accuracy: 0.7310
Epoch 3/5
10520/10520 [==============================] - 3s 260us/step - loss: 4.2370 - accuracy: 0.7237 - val_loss: 4.1026 - val_accuracy: 0.7310
Epoch 4/5
10520/10520 [==============================] - 3s 257us/step - loss: 4.2370 - accuracy: 0.7237 - val_loss: 4.1026 - val_accuracy: 0.7310
Epoch 5/5
10520/10520 [==============================] - 3s 258us/step - loss: 4.2370 - accuracy: 0.7237 - val_loss: 4.1026 - val_accuracy: 0.7310
Model training time: 14.969579000000522
8233/8233 [==============================] - 1s 77us/step
Loss = 6.8495
Accuracy = 55.0832%
When using sigmoid on the output layer:
Train on 10520 samples, validate on 7014 samples
Epoch 1/5
10520/10520 [==============================] - 3s 294us/step - loss: 0.3000 - accuracy: 0.8751 - val_loss: 0.0938 - val_accuracy: 0.9711
Epoch 2/5
10520/10520 [==============================] - 3s 254us/step - loss: 0.0793 - accuracy: 0.9757 - val_loss: 0.0723 - val_accuracy: 0.9763
Epoch 3/5
10520/10520 [==============================] - 3s 245us/step - loss: 0.0640 - accuracy: 0.9816 - val_loss: 0.0728 - val_accuracy: 0.9759
Epoch 4/5
10520/10520 [==============================] - 3s 247us/step - loss: 0.0564 - accuracy: 0.9826 - val_loss: 0.0498 - val_accuracy: 0.9849
Epoch 5/5
10520/10520 [==============================] - 3s 268us/step - loss: 0.0434 - accuracy: 0.9848 - val_loss: 0.0439 - val_accuracy: 0.9836
Model training time: 14.504233399999976
8233/8233 [==============================] - 1s 94us/step
Loss = 0.3933
Accuracy = 87.6230%

So, my suggestion: when setting up the output stage, do not reach for softmax first in this kind of task. (The likely cause, assuming the output layer had a single unit as in the Day 1 network: softmax over a single logit always outputs exactly 1, so the prediction can never change and loss/accuracy stay frozen; a small demonstration follows below.)
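
A small demonstration of my own of why a single-unit softmax output is degenerate (assuming the stuck model ended in Dense(1, activation='softmax')):

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # numerically stable softmax
    return e / e.sum()

print(softmax(np.array([-3.0])))  # [1.] -- a one-element softmax is always exactly 1
print(softmax(np.array([5.0])))   # [1.] -- so the prediction can never change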
