python3.6 + pytorch1.8.0
import torch
print(torch.__version__)
1.8.0
MNIST数字数据集是一组手写数字图像的数据集,用于机器学习中的图像分类任务。
该数据集包含60,000张训练图像和10,000张测试图像,每张图像都是28x28像素大小的灰度图像。每张图像都被标记为0到9中的一个数字。
该数据集是由美国国家标准与技术研究所(NIST)收集和创建,因此得名为MNIST(Modified National Institute of Standards and Technology)。它已成为机器学习领域中广泛使用的基准数据集之一。
网络层 参数 输出尺寸
Input N×1×28×28
Conv1 ksize=5, C_out=4, pad=0, stride=2 N×4×12×12
ReLU N×4×12×12
Conv2 ksize=3, C_out=8, pad=0, stride=2 N×8×5×5
ReLU N×8×5×5
Flatten N×200
Linear num_out=10 N×10
输出图像的维度计算方法:(图像大小 - 卷积核大小)÷步长+1
import torch
from torch.nn import ReLU
from torch.nn.functional import conv2d, cross_entropy
from torchvision import datasets, transforms
定义ReLU激活函数
def relu(x):
return torch.clamp(x, min=0)
torch.clamp函数是一个张量操作函数,在PyTorch中实现。该函数的作用是将输入的张量每个元素都限制在指定的区间内,返回一个新的张量。
可以通过指定min或max参数的值来实现对张量元素的限制,也可以同时指定同时限制上下限。若min和max均不指定,则默认为min=0,max=1。
定义线性函数
def linear(x, weight, bias):
out = torch.matmul(x, weight) + bias.view(1, -1)
return out
定义神经网络模型
conv2d的五大参数:x, w, b, stride, pad
def model(x, params):
# 卷积层1
# w=(4,1,5,5),输出通道为4,输入通道为1,卷积核为(5,5)
x = conv2d(x, params[0], params[1], 2, 0)
# (N×4×12×12)
x = relu(x)
# 卷积层2
# w=(8,4,3,3),输出通道为8,输入通道为4,卷积核为(3,3)
x = conv2d(x, params[2], params[3], 2, 0)
x = relu(x)
# (N×8×5×5)
x = x.view(-1, 200)
# 全连接层
x = linear(x, params[4], params[5])
return x
该程序定义了一个卷积神经网络模型,输入参数包括待处理的数据x和模型参数params。程序通过使用卷积层、ReLU激活函数和全连接层构建了一个简单的卷积神经网络模型。其中,程序使用了两个卷积层和一个全连接层。具体的模型参数包括两个卷积层的卷积核权重、偏置、以及一个全连接层的权重和偏置。程序返回模型的预测结果x。
初始化模型
init_std = 0.1
params = [
torch.randn(4, 1, 5, 5) * init_std,
torch.zeros(4),
torch.randn(8, 4, 3, 3) * init_std,
torch.zeros(8),
torch.randn(200, 10) * init_std,
torch.zeros(10)
]
for p in params:
p.requires_grad=True #自动微分
params
[tensor([[[[ 0.1062, -0.0596, -0.0730, 0.0613, 0.1273],
[ 0.0751, -0.0852, 0.0648, -0.0774, -0.0355],
[ 0.1565, -0.0221, 0.0574, -0.1055, 0.0350],
[-0.1299, 0.0169, 0.0297, 0.1494, -0.0993],
[ 0.0464, 0.0628, -0.0133, -0.0545, 0.1265]]],
[[[ 0.0291, 0.1538, -0.0692, -0.0637, 0.0829],
[ 0.0735, -0.0594, -0.1185, -0.0026, -0.0351],
[ 0.0697, 0.1032, -0.1001, -0.0212, -0.0946],
[ 0.0311, 0.1461, 0.0641, -0.0407, 0.1615],
[-0.0517, 0.0298, -0.0482, 0.0984, -0.0602]]],
[[[ 0.0102, -0.1541, -0.1040, 0.0335, 0.0115],
[-0.1167, 0.1155, 0.0832, 0.0561, 0.0435],
[ 0.0429, -0.1574, 0.0323, -0.1353, -0.1211],
[-0.2472, -0.2379, -0.0963, 0.0105, -0.0845],
[ 0.0059, 0.0433, 0.0111, 0.0422, -0.0131]]],
[[[-0.0741, 0.1411, 0.0006, 0.1485, 0.1257],
[ 0.0446, 0.0822, -0.0458, 0.1525, 0.0695],
[ 0.0616, 0.1892, 0.1525, -0.0594, 0.1515],
[-0.0490, 0.1179, -0.1175, 0.1448, 0.0811],
[-0.0641, -0.0494, -0.0980, -0.1119, 0.0599]]]], requires_grad=True),
tensor([0., 0., 0., 0.], requires_grad=True),
tensor([[[[-0.1564, 0.1481, 0.0995],
[ 0.0157, 0.0025, -0.0513],
[ 0.1011, -0.0417, -0.1049]],
[[-0.2052, -0.0514, -0.0995],
[ 0.1915, -0.0586, -0.0985],
[-0.1371, -0.2874, -0.0977]],
[[-0.0367, -0.2326, 0.0306],
[ 0.0193, 0.0762, 0.0243],
[-0.1507, -0.1265, -0.1493]],
[[ 0.0769, -0.1014, 0.0888],
[-0.0632, -0.0782, -0.1765],
[ 0.0521, 0.2349, -0.0833]]],
[[[ 0.0041, -0.0487, 0.1597],
[-0.0210, -0.1051, 0.0374],
[ 0.1981, 0.1395, 0.0108]],
[[ 0.0418, 0.1592, 0.0219],
[ 0.1168, 0.0305, -0.0702],
[ 0.2217, -0.0670, -0.0037]],
[[ 0.0501, -0.2496, 0.0381],
[-0.1487, -0.0202, 0.0236],
[-0.0738, 0.0733, -0.1244]],
[[-0.0825, 0.0158, -0.0877],
[ 0.0337, 0.2011, 0.1339],
[-0.1452, -0.1665, 0.0141]]],
[[[-0.0779, -0.1749, 0.0731],
[-0.0936, 0.0519, 0.1093],
[ 0.1049, 0.0406, -0.0594]],
[[ 0.1012, 0.0804, -0.0153],
[ 0.0899, -0.0954, -0.0520],
[ 0.0724, -0.0487, -0.0048]],
[[-0.0814, -0.0918, 0.0481],
[-0.0482, 0.1069, -0.1442],
[-0.0863, 0.0290, 0.0701]],
[[ 0.1068, 0.1104, -0.0105],
[ 0.1059, 0.0294, 0.2377],
[ 0.0855, -0.0029, 0.0322]]],
[[[ 0.0467, -0.1335, -0.0698],
[-0.0683, 0.0323, -0.0197],
[ 0.1748, 0.1601, 0.0385]],
[[ 0.0100, -0.0644, 0.0374],
[ 0.2065, -0.0637, -0.1515],
[-0.1963, 0.0413, 0.0476]],
[[ 0.0600, -0.0431, 0.0280],
[ 0.0428, 0.0220, -0.0793],
[-0.0876, 0.0013, -0.0618]],
[[ 0.3182, 0.0358, 0.1933],
[-0.0536, 0.1208, -0.0318],
[ 0.0144, 0.0707, -0.0400]]],
[[[ 0.1354, 0.0365, 0.0468],
[-0.0399, 0.0050, 0.0589],
[ 0.0335, 0.0415, -0.0896]],
[[-0.2733, -0.0715, -0.0500],
[-0.0501, -0.0715, -0.0644],
[ 0.0130, 0.0041, -0.0051]],
[[-0.0304, -0.0240, -0.0137],
[ 0.0931, 0.0814, 0.0466],
[ 0.1731, 0.0708, 0.1007]],
[[-0.2261, 0.0397, -0.1092],
[-0.0359, -0.1310, -0.0156],
[ 0.1561, -0.0511, 0.0229]]],
[[[ 0.1716, 0.1097, 0.0117],
[-0.1617, 0.2321, 0.1619],
[ 0.0721, 0.0619, 0.0418]],
[[ 0.0714, -0.0578, -0.0358],
[-0.0079, 0.0858, 0.1151],
[ 0.0559, 0.1615, -0.1431]],
[[ 0.0542, 0.0115, -0.0027],
[-0.0479, -0.0977, -0.0463],
[-0.0527, 0.1211, -0.3093]],
[[-0.0261, -0.0288, -0.0048],
[ 0.1089, 0.1711, 0.1310],
[ 0.0552, -0.0664, -0.0463]]],
[[[-0.0420, 0.1513, -0.1023],
[-0.1159, -0.1849, -0.0109],
[-0.1040, -0.0087, 0.0621]],
[[ 0.0287, 0.1017, 0.0153],
[-0.0875, -0.0534, -0.0378],
[ 0.0076, -0.0756, -0.3164]],
[[ 0.1255, 0.0647, -0.0439],
[ 0.0973, -0.1135, -0.1946],
[ 0.0965, 0.0432, 0.1767]],
[[ 0.0021, 0.0215, 0.0423],
[-0.0698, -0.0198, -0.0033],
[-0.0363, -0.0935, 0.0433]]],
[[[-0.0605, -0.0037, 0.0548],
[ 0.0856, -0.0815, 0.0877],
[ 0.0714, 0.1070, -0.1506]],
[[ 0.1159, 0.0648, -0.0505],
[-0.2376, 0.0337, 0.1003],
[ 0.1495, 0.0696, 0.0500]],
[[ 0.1888, 0.1441, 0.0078],
[ 0.1022, -0.0609, -0.2317],
[-0.0881, -0.0983, 0.1033]],
[[ 0.0118, 0.0604, 0.0070],
[ 0.0215, 0.0990, 0.1829],
[-0.0226, 0.1505, 0.0927]]]], requires_grad=True),
tensor([0., 0., 0., 0., 0., 0., 0., 0.], requires_grad=True),
tensor([[ 0.1092, -0.0588, -0.0678, ..., -0.0369, 0.1425, -0.1710],
[-0.2353, -0.1057, 0.0156, ..., 0.0120, 0.0512, 0.1080],
[-0.0902, 0.1036, 0.1558, ..., -0.1726, 0.1594, -0.0046],
...,
[-0.1067, 0.1090, 0.0657, ..., 0.1041, -0.1314, 0.0274],
[ 0.0735, -0.0332, 0.0949, ..., 0.0044, -0.1386, 0.0113],
[ 0.0212, -0.0620, 0.1167, ..., 0.0424, 0.0393, -0.0940]],
requires_grad=True),
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], requires_grad=True)]
train_batch_size = 100 #一共60000个数据,100个数据一组,一共600组(N=100)
test_batch_size = 100 #一共10000个数据,100个数据一组,一共100组(N=100)
# 训练集
train_loader = torch.utils.data.DataLoader(
datasets.MNIST(
'./data',train=True,download=True,
transform=transforms.Compose([
transforms.ToTensor(), #转为tensor
transforms.Normalize((0.5),(0.5)) #正则化
])
),
batch_size = train_batch_size, shuffle=True #打乱位置
)
# 测试集
test_loader = torch.utils.data.DataLoader(
datasets.MNIST(
'./data',train=False,download=True,
transform=transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.5),(0.5))
])
),
batch_size = test_batch_size, shuffle=False
)
这段代码主要是用来读取MNIST数据集并进行数据预处理,将其转换为可以在神经网络中使用的格式。其中train_batch_size和test_batch_size分别为训练集和测试集每批处理的数据大小。通过torch.utils.data.DataLoader函数,将MNIST数据集读入,并通过transforms.Compose函数对数据进行预处理,包括将其转换为tensor以及进行正则化处理。在训练集和测试集中,还通过shuffle参数打乱数据集的顺序,以增加数据集的多样性。最终将处理好的数据通过train_loader和test_loader返回。
alpha = 0.1 #学习率
epochs = 100 #训练次数
interval = 100 #打印间隔
for epoch in range(epochs):
for i, (data, label) in enumerate(train_loader):
output = model(data, params)
loss = cross_entropy(output, label) #交叉熵函数
for p in params:
if p.grad is not None: #如果梯度不为零
p.grad.zero_() #梯度置零
loss.backward() #反向求导
for p in params: #更新参数
p.data = p.data - alpha * p.grad.data
if i % interval == 0:
print("Epoch %03d [%03d/%03d]\tLoss:%.4f"%(epoch, i, len(train_loader), loss.item()))
correct_num = 0
total_num = 0
with torch.no_grad():
for data, label in test_loader:
output = model(data, params)
pred = output.max(1)[1]
correct_num += (pred == label).sum().item()
total_num += len(data)
acc = correct_num / total_num
print('...Testing @ Epoch %03d\tAcc:%.4f'%(epoch, acc))
Epoch 000 [000/600] Loss:2.3276
Epoch 000 [100/600] Loss:0.4883
Epoch 000 [200/600] Loss:0.2200
Epoch 000 [300/600] Loss:0.1809
Epoch 000 [400/600] Loss:0.1448
Epoch 000 [500/600] Loss:0.2617
...Testing @ Epoch 000 Acc:0.9469
Epoch 001 [000/600] Loss:0.1120
Epoch 001 [100/600] Loss:0.1738
Epoch 001 [200/600] Loss:0.1212
Epoch 001 [300/600] Loss:0.1924
Epoch 001 [400/600] Loss:0.0743
Epoch 001 [500/600] Loss:0.3233
...Testing @ Epoch 001 Acc:0.9613
Epoch 002 [000/600] Loss:0.0553
Epoch 002 [100/600] Loss:0.2300
Epoch 002 [200/600] Loss:0.0489
Epoch 002 [300/600] Loss:0.0816
Epoch 002 [400/600] Loss:0.1188
Epoch 002 [500/600] Loss:0.1968
...Testing @ Epoch 002 Acc:0.9670
Epoch 003 [000/600] Loss:0.2476
Epoch 003 [100/600] Loss:0.1316
Epoch 003 [200/600] Loss:0.2912
Epoch 003 [300/600] Loss:0.0378
Epoch 003 [400/600] Loss:0.1028
Epoch 003 [500/600] Loss:0.1199
...Testing @ Epoch 003 Acc:0.9693
Epoch 004 [000/600] Loss:0.0853
Epoch 004 [100/600] Loss:0.1661
Epoch 004 [200/600] Loss:0.1067
Epoch 004 [300/600] Loss:0.0579
Epoch 004 [400/600] Loss:0.2422
Epoch 004 [500/600] Loss:0.0573
...Testing @ Epoch 004 Acc:0.9695
Epoch 005 [000/600] Loss:0.0816
Epoch 005 [100/600] Loss:0.0845
Epoch 005 [200/600] Loss:0.0607
Epoch 005 [300/600] Loss:0.0548
Epoch 005 [400/600] Loss:0.1351
Epoch 005 [500/600] Loss:0.0569
...Testing @ Epoch 005 Acc:0.9726
Epoch 006 [000/600] Loss:0.0441
Epoch 006 [100/600] Loss:0.0912
Epoch 006 [200/600] Loss:0.1213
Epoch 006 [300/600] Loss:0.0405
Epoch 006 [400/600] Loss:0.0311
Epoch 006 [500/600] Loss:0.0755
...Testing @ Epoch 006 Acc:0.9743
Epoch 007 [000/600] Loss:0.0342
Epoch 007 [100/600] Loss:0.0480
Epoch 007 [200/600] Loss:0.1276
Epoch 007 [300/600] Loss:0.1255
Epoch 007 [400/600] Loss:0.0572
Epoch 007 [500/600] Loss:0.0186
...Testing @ Epoch 007 Acc:0.9724
Epoch 008 [000/600] Loss:0.0515
Epoch 008 [100/600] Loss:0.0291
Epoch 008 [200/600] Loss:0.0362
Epoch 008 [300/600] Loss:0.0979
Epoch 008 [400/600] Loss:0.0950
Epoch 008 [500/600] Loss:0.0263
...Testing @ Epoch 008 Acc:0.9768
Epoch 009 [000/600] Loss:0.0766
Epoch 009 [100/600] Loss:0.1084
Epoch 009 [200/600] Loss:0.0495
Epoch 009 [300/600] Loss:0.0260
Epoch 009 [400/600] Loss:0.0708
Epoch 009 [500/600] Loss:0.0809
...Testing @ Epoch 009 Acc:0.9775
Epoch 010 [000/600] Loss:0.0179
Epoch 010 [100/600] Loss:0.0293
Epoch 010 [200/600] Loss:0.0534
Epoch 010 [300/600] Loss:0.0983
Epoch 010 [400/600] Loss:0.1292
Epoch 010 [500/600] Loss:0.0820
...Testing @ Epoch 010 Acc:0.9781
Epoch 011 [000/600] Loss:0.0542
Epoch 011 [100/600] Loss:0.0978
Epoch 011 [200/600] Loss:0.0596
Epoch 011 [300/600] Loss:0.0725
Epoch 011 [400/600] Loss:0.1350
Epoch 011 [500/600] Loss:0.0166
...Testing @ Epoch 011 Acc:0.9784
Epoch 012 [000/600] Loss:0.1395
Epoch 012 [100/600] Loss:0.0577
Epoch 012 [200/600] Loss:0.0437
Epoch 012 [300/600] Loss:0.1207
Epoch 012 [400/600] Loss:0.0795
Epoch 012 [500/600] Loss:0.0415
...Testing @ Epoch 012 Acc:0.9797
Epoch 013 [000/600] Loss:0.0897
Epoch 013 [100/600] Loss:0.0530
Epoch 013 [200/600] Loss:0.0421
Epoch 013 [300/600] Loss:0.0982
Epoch 013 [400/600] Loss:0.0646
Epoch 013 [500/600] Loss:0.0215
...Testing @ Epoch 013 Acc:0.9773
Epoch 014 [000/600] Loss:0.0544
Epoch 014 [100/600] Loss:0.0554
Epoch 014 [200/600] Loss:0.0236
Epoch 014 [300/600] Loss:0.0154
Epoch 014 [400/600] Loss:0.0262
Epoch 014 [500/600] Loss:0.1322
...Testing @ Epoch 014 Acc:0.9764
Epoch 015 [000/600] Loss:0.0204
Epoch 015 [100/600] Loss:0.0563
Epoch 015 [200/600] Loss:0.0309
Epoch 015 [300/600] Loss:0.0844
Epoch 015 [400/600] Loss:0.1167
Epoch 015 [500/600] Loss:0.0856
...Testing @ Epoch 015 Acc:0.9788
Epoch 016 [000/600] Loss:0.0052
Epoch 016 [100/600] Loss:0.1451
Epoch 016 [200/600] Loss:0.0279
Epoch 016 [300/600] Loss:0.0199
Epoch 016 [400/600] Loss:0.0765
Epoch 016 [500/600] Loss:0.1028
...Testing @ Epoch 016 Acc:0.9807
Epoch 017 [000/600] Loss:0.0149
Epoch 017 [100/600] Loss:0.0160
Epoch 017 [200/600] Loss:0.0454
Epoch 017 [300/600] Loss:0.1048
Epoch 017 [400/600] Loss:0.1013
Epoch 017 [500/600] Loss:0.0834
...Testing @ Epoch 017 Acc:0.9801
Epoch 018 [000/600] Loss:0.0062
Epoch 018 [100/600] Loss:0.0501
Epoch 018 [200/600] Loss:0.0478
Epoch 018 [300/600] Loss:0.0331
Epoch 018 [400/600] Loss:0.0333
Epoch 018 [500/600] Loss:0.0163
...Testing @ Epoch 018 Acc:0.9795
Epoch 019 [000/600] Loss:0.0331
Epoch 019 [100/600] Loss:0.0200
Epoch 019 [200/600] Loss:0.0242
Epoch 019 [300/600] Loss:0.0566
Epoch 019 [400/600] Loss:0.0917
这是一个简单的PyTorch训练循环,主要包括以下内容:
1.设置学习率(alpha)和迭代次数(epochs)
2.通过循环迭代训练数据集(train_loader),获得模型预测输出(output),并计算损失(loss)
3.将参数的梯度清零,进行反向传播求导(loss.backward())
4.使用梯度下降法更新参数的数值,即 p.data = p.data - alpha * p.grad.data
5.按照指定的间隔(interval)打印训练信息,包括训练次数(epoch),当前训练进度(i/len(train_loader))和当前损失(loss.item())
6.在测试数据集(test_loader)上对模型进行测试,获得预测结果(output),并计算准确率(acc)
7.打印测试结果,包括训练次数(epoch)和测试准确率(acc)
需要注意的是,这里采用的是交叉熵函数(cross_entropy),用于计算loss。同时也清空了参数梯度,防止梯度累加。
cross-entropy(交叉熵)是一种度量概率分布之间的差异性的方法,常用于机器学习中的分类问题。在分类问题中,预测结果通常表示为概率分布,而交叉熵则表示这个预测结果和实际类别分布之间的差异。交叉熵越小,则模型预测的概率分布越接近实际类别分布,模型的性能也越好。交叉熵公式如下:
H ( p , q ) = − ∑ i p ( i ) log q ( i ) H(p,q)=-\sum_{i}p(i)\log q(i) H(p,q)=−∑ip(i)logq(i)
其中, p p p 表示实际的类别分布, q q q 表示模型预测的概率分布。
序号 | 文章目录 | 直达链接 |
---|---|---|
1 | PyTorch应用实战一:实现卷积操作 | https://want595.blog.csdn.net/article/details/132575530 |
2 | PyTorch应用实战二:实现卷积神经网络进行图像分类 | https://want595.blog.csdn.net/article/details/132575702 |
3 | PyTorch应用实战三:构建神经网络 | https://want595.blog.csdn.net/article/details/132575758 |
4 | PyTorch应用实战四:基于PyTorch构建复杂应用 | https://want595.blog.csdn.net/article/details/132625270 |
5 | PyTorch应用实战五:实现二值化神经网络 | https://want595.blog.csdn.net/article/details/132625348 |
6 | PyTorch应用实战六:利用LSTM实现文本情感分类 | https://want595.blog.csdn.net/article/details/132625382 |