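Event: CSDN 21-Day Learning Challenge
After the first two lessons, plus some private cramming on the fundamentals, I have more or less picked up the basic way to build a convolutional neural network (CNN) model, mostly by imitating existing examples.
So far I have only used ready-made datasets. Any real application would almost certainly need a model trained on a custom dataset, and teacher K同学啊 happened to schedule exactly such a lesson.
- This post is a study log from the 365-Day Deep Learning Training Camp
- Reference article: 100 Deep Learning Examples: Convolutional Neural Network (CNN) Weather Recognition | Day 5
My notes are summarized below (the complete code is attached at the end):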
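The data downloaded from the teacher sits in the weather_photos folder, which contains four classes (four directories) holding 300, 215, 253, and 357 images.
All images are in jpg format.
A look at the data shows that at least three things need to be done before the dataset can be used: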
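import pathlib
data_dir = "./weather_photos/"                     # path to the dataset
data_dir = pathlib.Path(data_dir)                  # wrap it in a pathlib.Path object
image_count = len(list(data_dir.glob('*/*.jpg')))  # glob every jpg one directory level down
print("Total number of images:", image_count)
The path here is "./weather_photos/"; you can also set it to wherever your data actually lives, for example: data_dir = "D:/datasets/weather_photos/"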
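To double-check the per-class counts, the same glob idea can be applied one directory at a time. The sketch below is only an illustration: count_images_per_class is a hypothetical helper name, and the demo builds a throwaway directory with placeholder class names and empty files instead of touching the real weather_photos folder.

```python
import pathlib
import tempfile


def count_images_per_class(data_dir, pattern="*.jpg"):
    """Return {subdirectory name: number of files matching pattern}."""
    data_dir = pathlib.Path(data_dir)
    counts = {}
    for class_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = len(list(class_dir.glob(pattern)))
    return counts


# Demo on a temporary directory; class names and file counts are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    for name, n in {"class_a": 3, "class_b": 2}.items():
        (root / name).mkdir()
        for i in range(n):
            (root / name / f"{i}.jpg").touch()
    (root / "class_a" / "notes.txt").touch()  # non-jpg files are not counted

    demo_counts = count_images_per_class(root)
    demo_total = len(list(root.glob("*/*.jpg")))  # same pattern as in the article

print(demo_counts)
print("total:", demo_total)
```

On the real data the call would be count_images_per_class("./weather_photos/"), and the values should add up to image_count.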
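Display an image:
roses = list(data_dir.glob('sunrise/*.jpg'))  # glob every jpg in the sunrise directory
PIL.Image.open(str(roses[6]))                 # open one image (shown inline when it is the last expression in a notebook cell)
First, define a few important variables: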
batch_size = 32
img_height = 180
img_width = 180
Use tf.keras.preprocessing.image_dataset_from_directory
to load the images in the folder into a tf.data.Dataset; the data is shuffled as it is loaded
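train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,                            # the Path defined above
    validation_split=0.2,                # hold out 20% of the images for validation
    subset="training",                   # this call returns the training part
    seed=123,                            # the same seed in both calls keeps the split consistent
    image_size=(img_height, img_width),  # resize to the size defined above
    batch_size=batch_size)               # batch size defined above
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",                 # the held-out 20%, used as val_ds further down
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)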
Parameters:
- subset: "training" or "validation". Only used when validation_split is set.
class_names = train_ds.class_names
print(class_names)
plt.figure(figsize=(20, 10))
for images, labels in train_ds.take(1):
    for i in range(20):
        ax = plt.subplot(5, 10, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
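The loop above prints the shape of one batch of images and one batch of labels. As a rough check of the sizes involved, the sketch below takes the 1,125 images listed earlier (300 + 215 + 253 + 357) and assumes that Keras computes the validation size as int(validation_split * total). That rule is my understanding of image_dataset_from_directory, not something stated in the lesson. split_sizes and num_batches are hypothetical helper names.

```python
import math


def split_sizes(total, validation_split):
    """Assumed Keras behaviour: the validation part is int(split * total)."""
    num_val = int(validation_split * total)
    return total - num_val, num_val


def num_batches(num_samples, batch_size):
    """Number of batches when the last, smaller batch is kept."""
    return math.ceil(num_samples / batch_size)


total_images = 300 + 215 + 253 + 357  # per-class counts from the article
num_train, num_val = split_sizes(total_images, 0.2)
train_batches = num_batches(num_train, 32)
val_batches = num_batches(num_val, 32)
last_train_batch = num_train - (train_batches - 1) * 32

print(total_images, num_train, num_val)          # 1125 900 225
print(train_batches, val_batches)                # 29 8
print("last training batch:", last_train_batch)  # 4
```

With batch_size = 32 and 180*180 RGB images, a full batch has shape (32, 180, 180, 3) for the images and (32,) for the labels.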
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
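In the pipeline above, cache() keeps the loaded data around after the first pass, shuffle(1000) reorders elements through a buffer, and prefetch() prepares upcoming batches while the current one is being used. Because the dataset is already batched when image_dataset_from_directory returns it, shuffle(1000) reorders whole batches rather than single images. The sketch below is a simplified pure-Python model of a shuffle buffer as I understand it, not the actual tf.data implementation; buffered_shuffle is a hypothetical name.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in the order produced by a fixed-size shuffle buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)        # still filling the buffer
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]              # emit a random buffered element...
        buffer[idx] = item             # ...and put the new item in its slot
    rng.shuffle(buffer)                # input exhausted: drain the rest
    yield from buffer


batch_ids = list(range(10))  # ten stand-in batches
full = list(buffered_shuffle(batch_ids, buffer_size=1000, seed=0))
tiny = list(buffered_shuffle(batch_ids, buffer_size=1, seed=0))

print(full)  # some permutation of 0..9
print(tiny)  # a buffer of size 1 cannot reorder anything
```

A buffer at least as large as the number of elements, as in the first call, gives a full shuffle; a smaller buffer only mixes elements that are close together in the input.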
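The datasets we used in the earlier lessons all had the input shape (28, 28, 1),
that is, 28*28 images with a single color channel (grayscale).
Today's data is clearly different: the images are 180*180 and have three RGB channels.
So the data shape has to be declared on the first layer through the input_shape parameter.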
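num_classes = 4
model = models.Sequential([
    layers.experimental.preprocessing.Rescaling(1./255, input_shape=(img_height, img_width, 3)),  # scale pixel values from [0, 255] to [0, 1]
    layers.Conv2D(16, (3, 3), activation='relu', input_shape=(img_height, img_width, 3)),  # conv layer 1, 3*3 kernel
    layers.AveragePooling2D((2, 2)),               # pooling layer 1, 2*2 downsampling
    layers.Conv2D(32, (3, 3), activation='relu'),  # conv layer 2, 3*3 kernel
    layers.AveragePooling2D((2, 2)),               # pooling layer 2, 2*2 downsampling
    layers.Conv2D(64, (3, 3), activation='relu'),  # conv layer 3, 3*3 kernel
    layers.Dropout(0.3),                           # randomly drop 30% of the activations during training
    layers.Flatten(),                              # Flatten layer, bridges the conv layers and the dense layers
    layers.Dense(128, activation='relu'),          # fully connected layer, further feature extraction
    layers.Dense(num_classes)                      # output layer, one raw score (logit) per class
])
model.summary()  # print the network structure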
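The output shapes and parameter counts reported by model.summary() can be worked out by hand. The sketch below assumes the Keras defaults for this model (no padding and stride 1 for Conv2D, pooling stride equal to the pool size); the helper names are hypothetical.

```python
def conv_out(size, kernel):
    """Output size of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output size of non-overlapping pooling without padding."""
    return size // pool


def conv_params(kernel, in_ch, out_ch):
    return kernel * kernel * in_ch * out_ch + out_ch  # weights + biases


def dense_params(in_units, out_units):
    return in_units * out_units + out_units           # weights + biases


size = 180
size = conv_out(size, 3)   # Conv2D(16)        -> 178
size = pool_out(size, 2)   # AveragePooling2D  -> 89
size = conv_out(size, 3)   # Conv2D(32)        -> 87
size = pool_out(size, 2)   # AveragePooling2D  -> 43
size = conv_out(size, 3)   # Conv2D(64)        -> 41
flat = size * size * 64    # Flatten           -> 107584

# Rescaling, pooling, Dropout and Flatten have no trainable parameters.
total_params = (conv_params(3, 3, 16) + conv_params(3, 16, 32)
                + conv_params(3, 32, 64) + dense_params(flat, 128)
                + dense_params(128, 4))
print(size, flat, total_params)  # 41 107584 13794980
```

Almost all of the parameters sit in the Dense(128) layer, because no pooling follows the third convolution and the flattened vector is therefore large.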
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
Nothing much to add here: the loss function and optimizer are the same ones used in the earlier lessons.
learning_rate=0.001
sets the learning rate.
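One detail worth spelling out is from_logits=True: the last Dense layer has no activation, so the model outputs raw scores and the loss applies the softmax itself. The pure-Python sketch below shows that computation for a single sample. It illustrates the math only, not how Keras implements it, and the logits are made-up numbers.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_ce_from_logits(logits, label):
    """Cross-entropy for one sample, given an integer class label."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


logits = [2.0, 0.5, -1.0, 0.0]  # made-up scores for the four classes
probs = softmax(logits)
loss = sparse_ce_from_logits(logits, label=0)
predicted = max(range(len(probs)), key=probs.__getitem__)

print(probs, loss, predicted)
```

"Sparse" here means that the labels are plain integers (0 to 3) rather than one-hot vectors, which matches the integer labels that come out of the dataset.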
epochs = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)
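validation_data: the validation set, evaluated at the end of each epoch
loss: loss on the training set
accuracy: accuracy on the training set
val_loss: loss on the validation set
val_accuracy: accuracy on the validation set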
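These four names are also the keys of history.history, a dict that maps each metric to a list with one value per epoch. As a small illustration, the sketch below picks the epoch with the highest validation accuracy. best_epoch is a hypothetical helper, and the numbers are placeholders, not results from this model.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (1-based epoch number, value) where the metric is highest."""
    values = history_dict[metric]
    idx = max(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]


# Placeholder numbers for illustration only, not results from this model.
fake_history = {
    "loss": [1.2, 0.8, 0.6],
    "accuracy": [0.50, 0.70, 0.80],
    "val_loss": [1.0, 0.7, 0.75],
    "val_accuracy": [0.55, 0.72, 0.70],
}
epoch, value = best_epoch(fake_history)
print(epoch, value)  # 2 0.72
```

After training, the call would be best_epoch(history.history).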
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
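Complete code:
import matplotlib.pyplot as plt
import os
import PIL.Image  # import the submodule explicitly so PIL.Image.open is always available
# set the random seeds so that the results are as reproducible as possible
import numpy as np
np.random.seed(1)
import tensorflow as tf
tf.random.set_seed(1)
from tensorflow import keras
from tensorflow.keras import layers, models
import pathlib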
data_dir = "./weather_photos/"
data_dir = pathlib.Path(data_dir)
image_count = len(list(data_dir.glob('*/*.jpg')))
print("Total number of images:", image_count)
roses = list(data_dir.glob('sunrise/*.jpg'))
PIL.Image.open(str(roses[0]))
batch_size = 32
img_height = 180
img_width = 180
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
class_names = train_ds.class_names
print(class_names)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
plt.figure(figsize=(20, 10))
for images, labels in train_ds.take(1):
    for i in range(20):
        ax = plt.subplot(5, 10, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
# AUTOTUNE = tf.data.AUTOTUNE
AUTOTUNE = tf.data.experimental.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
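num_classes = 4
model = models.Sequential([
    layers.experimental.preprocessing.Rescaling(1./255, input_shape=(img_height, img_width, 3)),  # scale pixel values from [0, 255] to [0, 1]
    layers.Conv2D(16, (3, 3), activation='relu', input_shape=(img_height, img_width, 3)),  # conv layer 1, 3*3 kernel
    layers.AveragePooling2D((2, 2)),               # pooling layer 1, 2*2 downsampling
    layers.Conv2D(32, (3, 3), activation='relu'),  # conv layer 2, 3*3 kernel
    layers.AveragePooling2D((2, 2)),               # pooling layer 2, 2*2 downsampling
    layers.Conv2D(64, (3, 3), activation='relu'),  # conv layer 3, 3*3 kernel
    layers.Dropout(0.3),                           # randomly drop 30% of the activations during training
    layers.Flatten(),                              # Flatten layer, bridges the conv layers and the dense layers
    layers.Dense(128, activation='relu'),          # fully connected layer, further feature extraction
    layers.Dense(num_classes)                      # output layer, one raw score (logit) per class
])
model.summary()  # print the network structure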
# set up the optimizer
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
epochs = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
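1. What I learned
a. The basic workflow for using a custom dataset
b. Basic use of the pathlib module
c. Basic use of tf.keras.preprocessing.image_dataset_from_directory
2. Problems encountered
Keep working through the Watermelon Book (西瓜书) and filling in the fundamentals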