Build LeNet to learn the FashionMNIST dataset, with a batch size of 64, a learning rate of 0.05, and 10 epochs of training/testing:
import torchvision
from torch import nn
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor
from Experiment import Experiment as E
class LeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
            nn.AvgPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),
            nn.AvgPool2d(kernel_size=2, stride=2),
            nn.Flatten(),
            nn.Linear(400, 120), nn.Sigmoid(),
            nn.Linear(120, 84), nn.Sigmoid(),
            nn.Linear(84, 10),
        )

    def forward(self, x):
        return self.net(x)
train_data = torchvision.datasets.FashionMNIST('/mnt/mydataset', train=True, transform=ToTensor(), download=True)
test_data = torchvision.datasets.FashionMNIST('/mnt/mydataset', train=False, transform=ToTensor(), download=True)
train_loader = DataLoader(train_data, batch_size=64, shuffle=True, num_workers=4)
test_loader = DataLoader(test_data, batch_size=64, num_workers=4)
lenet = LeNet()
e = E(train_loader, test_loader, lenet, 10, 0.05)
e.main()
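The `Experiment` class is imported from a local module whose source is not shown here. The following is only a hypothetical minimal sketch, reconstructed from the constructor call above and the log format below; plain SGD at the given learning rate is an assumption, not the author's actual code:

```python
import time
import torch
from torch import nn

# Hypothetical sketch of the unshown Experiment helper, inferred from the
# call Experiment(train_loader, test_loader, model, epochs, lr) and the
# per-epoch log format; plain SGD is an assumption.
class Experiment:
    def __init__(self, train_loader, test_loader, model, epochs, lr):
        self.train_loader = train_loader
        self.test_loader = test_loader
        self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
        self.model = model.to(self.device)
        self.epochs = epochs
        self.optimizer = torch.optim.SGD(self.model.parameters(), lr=lr)
        self.criterion = nn.CrossEntropyLoss()

    def _run(self, loader, train):
        # One pass over a loader; updates weights only when train=True.
        self.model.train(train)
        total_loss, correct, n = 0.0, 0, 0
        with torch.set_grad_enabled(train):
            for x, y in loader:
                x, y = x.to(self.device), y.to(self.device)
                out = self.model(x)
                loss = self.criterion(out, y)
                if train:
                    self.optimizer.zero_grad()
                    loss.backward()
                    self.optimizer.step()
                total_loss += loss.item() * y.size(0)
                correct += (out.argmax(1) == y).sum().item()
                n += y.size(0)
        return total_loss / n, correct / n

    def main(self):
        start = time.time()
        for epoch in range(1, self.epochs + 1):
            print(f'Epoch {epoch}')
            print('-' * 50)
            tr_loss, tr_acc = self._run(self.train_loader, True)
            te_loss, te_acc = self._run(self.test_loader, False)
            print(f'Train Avg Loss: {tr_loss:f}, Train Accuracy: {tr_acc:f}')
            print(f'Test Avg Loss: {te_loss:f}, Test Accuracy: {te_acc:f}')
        print('-' * 50)
        seen = self.epochs * len(self.train_loader.dataset)
        print(f'{seen / (time.time() - start):.1f} samples/sec')
        print('-' * 50)
        print('Done!')
```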
Training/testing results on an NVIDIA GeForce RTX 3090:
Epoch 1
--------------------------------------------------
Train Avg Loss: 2.307199, Train Accuracy: 0.099383
Test Avg Loss: 2.311069, Test Accuracy: 0.100000
Epoch 2
--------------------------------------------------
Train Avg Loss: 2.306746, Train Accuracy: 0.101433
Test Avg Loss: 2.304611, Test Accuracy: 0.100000
Epoch 3
--------------------------------------------------
Train Avg Loss: 2.306037, Train Accuracy: 0.099633
Test Avg Loss: 2.304882, Test Accuracy: 0.100000
Epoch 4
--------------------------------------------------
Train Avg Loss: 2.306285, Train Accuracy: 0.098717
Test Avg Loss: 2.304786, Test Accuracy: 0.100000
Epoch 5
--------------------------------------------------
Train Avg Loss: 2.305623, Train Accuracy: 0.100350
Test Avg Loss: 2.312366, Test Accuracy: 0.100000
Epoch 6
--------------------------------------------------
Train Avg Loss: 2.305421, Train Accuracy: 0.100033
Test Avg Loss: 2.304870, Test Accuracy: 0.100000
Epoch 7
--------------------------------------------------
Train Avg Loss: 2.304797, Train Accuracy: 0.102300
Test Avg Loss: 2.307974, Test Accuracy: 0.100000
Epoch 8
--------------------------------------------------
Train Avg Loss: 2.304658, Train Accuracy: 0.102000
Test Avg Loss: 2.302825, Test Accuracy: 0.100000
Epoch 9
--------------------------------------------------
Train Avg Loss: 2.303870, Train Accuracy: 0.102867
Test Avg Loss: 2.304149, Test Accuracy: 0.100000
Epoch 10
--------------------------------------------------
Train Avg Loss: 2.302295, Train Accuracy: 0.106083
Test Avg Loss: 2.301472, Test Accuracy: 0.100000
--------------------------------------------------
29408.7 samples/sec
--------------------------------------------------
Done!
As the logs show, LeNet's test accuracy stays at 0.1 throughout, and the loss hovers around 2.303 ≈ ln 10, the cross-entropy of a uniform guess over 10 classes; the network is predicting essentially at random. A likely culprit is the default initialization scheme, so we may need to switch to a different one.
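One way to probe this hypothesis (a hypothetical diagnostic snippet, not part of the original experiment) is to run a single forward/backward pass with the default initialization and print the per-layer gradient norms; with this many stacked sigmoids, the gradients reaching the early layers tend to be very small:

```python
import torch
from torch import nn

# Hypothetical diagnostic: one forward/backward pass on a dummy batch with
# PyTorch's default initialization, printing each weight's gradient norm.
net = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(400, 120), nn.Sigmoid(),
    nn.Linear(120, 84), nn.Sigmoid(),
    nn.Linear(84, 10),
)
x = torch.randn(64, 1, 28, 28)          # dummy FashionMNIST-shaped batch
y = torch.randint(0, 10, (64,))
nn.CrossEntropyLoss()(net(x), y).backward()
for name, p in net.named_parameters():
    if name.endswith('weight'):
        print(f'{name}: grad norm = {p.grad.norm().item():.6e}')
```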
Keeping all other hyperparameters unchanged, apply Xavier initialization:
def init_net(m):
    if type(m) == nn.Linear or type(m) == nn.Conv2d:
        nn.init.xavier_uniform_(m.weight)
lenet = LeNet()
lenet.apply(init_net)
e = E(train_loader, test_loader, lenet, 10, 0.05)
e.main()
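As a quick sanity check (a hypothetical snippet, not part of the original experiment): `nn.init.xavier_uniform_` samples weights from U(-a, a) with a = gain · sqrt(6 / (fan_in + fan_out)), which keeps activation variance roughly constant across layers. For the first fully connected layer this bound can be verified directly:

```python
import math
import torch
from torch import nn

# Hypothetical sanity check: xavier_uniform_ samples from U(-a, a) with
# a = gain * sqrt(6 / (fan_in + fan_out)); for Linear(400, 120) and the
# default gain of 1 this gives a = sqrt(6 / 520) ≈ 0.1074.
layer = nn.Linear(400, 120)
nn.init.xavier_uniform_(layer.weight)
bound = math.sqrt(6.0 / (400 + 120))
print(f'bound = {bound:.4f}, max |w| = {layer.weight.abs().max().item():.4f}')
```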
Training/testing again on the NVIDIA GeForce RTX 3090:
Epoch 1
--------------------------------------------------
Train Avg Loss: 2.307741, Train Accuracy: 0.097667
Test Avg Loss: 2.302877, Test Accuracy: 0.100000
Epoch 2
--------------------------------------------------
Train Avg Loss: 2.305074, Train Accuracy: 0.104150
Test Avg Loss: 2.306036, Test Accuracy: 0.100000
Epoch 3
--------------------------------------------------
Train Avg Loss: 2.302001, Train Accuracy: 0.107317
Test Avg Loss: 2.296702, Test Accuracy: 0.100000
Epoch 4
--------------------------------------------------
Train Avg Loss: 2.274006, Train Accuracy: 0.164833
Test Avg Loss: 2.182108, Test Accuracy: 0.285000
Epoch 5
--------------------------------------------------
Train Avg Loss: 1.605953, Train Accuracy: 0.461283
Test Avg Loss: 1.216395, Test Accuracy: 0.561700
Epoch 6
--------------------------------------------------
Train Avg Loss: 1.080627, Train Accuracy: 0.593417
Test Avg Loss: 0.990674, Test Accuracy: 0.624100
Epoch 7
--------------------------------------------------
Train Avg Loss: 0.922300, Train Accuracy: 0.654517
Test Avg Loss: 0.889618, Test Accuracy: 0.677200
Epoch 8
--------------------------------------------------
Train Avg Loss: 0.847294, Train Accuracy: 0.685733
Test Avg Loss: 0.831707, Test Accuracy: 0.690800
Epoch 9
--------------------------------------------------
Train Avg Loss: 0.799123, Train Accuracy: 0.702950
Test Avg Loss: 0.792512, Test Accuracy: 0.702100
Epoch 10
--------------------------------------------------
Train Avg Loss: 0.752247, Train Accuracy: 0.719433
Test Avg Loss: 0.752238, Test Accuracy: 0.719900
--------------------------------------------------
29523.7 samples/sec
--------------------------------------------------
Done!
By the 10th epoch the test accuracy has already risen to about 0.72.
I am not entirely certain that the default initialization scheme is the root cause; in fact, switching to the Adam optimizer also fixes the problem. For some deeper networks such as NiN, however, changing the optimizer does not help, while switching to Xavier initialization does improve things.
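Since the `Experiment` class's optimizer is not shown, here is a hypothetical standalone sketch of the Adam alternative, using a stand-in linear model and a dummy batch rather than the actual FashionMNIST loaders:

```python
import torch
from torch import nn

# Hypothetical sketch: the same training step with torch.optim.Adam instead
# of SGD, on a stand-in model and one fixed dummy batch.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

x = torch.randn(64, 1, 28, 28)
y = torch.randint(0, 10, (64,))
losses = []
for _ in range(20):                     # a few steps on the same batch
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    losses.append(loss.item())
    loss.backward()
    optimizer.step()
```

Adam's adaptive per-parameter step sizes let training make progress even when the raw gradients are tiny, which is presumably why it can also escape the stalled regime that plain SGD gets stuck in here.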
At this point in my studies, tuning deep learning models feels a lot like alchemy; if anyone has deeper insight into this problem, please leave a comment.