Logistic Regression on the MNIST Handwritten Digit Dataset with PyTorch 1.0

The MNIST handwritten digit dataset:
It is available at http://yann.lecun.com/exdb/mnist/ and consists of four parts:

  • Training set images: train-images-idx3-ubyte.gz (9.9 MB, 47 MB uncompressed, 60,000 samples)
  • Training set labels: train-labels-idx1-ubyte.gz (29 KB, 60 KB uncompressed, 60,000 labels)
  • Test set images: t10k-images-idx3-ubyte.gz (1.6 MB, 7.8 MB uncompressed, 10,000 samples)
  • Test set labels: t10k-labels-idx1-ubyte.gz (5 KB, 10 KB uncompressed, 10,000 labels)

Training/testing set image file format (in practice the torchvision package already handles loading, so you don't need to study the format closely; a minimal header-parsing sketch follows the two tables below):

  • offset: byte offset into the file
  • type: type of the field
  • value: value of the field
  • description: what the field means
[offset] [type]          [value]                       [description]
0000     32 bit integer  0x00000803 (2051)             magic number (MSB first), used for validation
0004     32 bit integer  60000 (train) / 10000 (test)  number of images
0008     32 bit integer  28                            number of rows
0012     32 bit integer  28                            number of columns
0016     unsigned byte   ??                            pixel
0017     unsigned byte   ??                            pixel
xxxx     unsigned byte   ??                            pixel

Training/testing set label file format:

[offset] [type]          [value]                       [description]
0000     32 bit integer  0x00000801 (2049)             magic number (MSB first), used for validation
0004     32 bit integer  60000 (train) / 10000 (test)  number of items
0008     unsigned byte   ??                            label (0 to 9)
0009     unsigned byte   ??                            label (0 to 9)
xxxx     unsigned byte   ??                            label (0 to 9)
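
As promised above, here is a minimal sketch of parsing the image file header by hand. It assumes the .gz files listed earlier have been downloaded and decompressed into the working directory (the bare file name below is an assumption):

# Read the IDX image header manually (illustrative sketch)
import struct

with open('train-images-idx3-ubyte', 'rb') as f:
    # four big-endian ("MSB first") 32-bit unsigned integers, per the table above
    magic, num_images, rows, cols = struct.unpack('>IIII', f.read(16))
    assert magic == 2051               # 0x00000803
    first_image = f.read(rows * cols)  # 28*28 = 784 unsigned bytes, one per pixel
print(num_images, rows, cols)          # 60000 28 28 for the training file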
# Imports
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import torch.nn.functional as F

1. Hyper-parameter settings

# Hyper-parameters
input_size = 784       # each 28x28 image is flattened into a 784-dimensional vector
num_classes = 10       # digits 0-9
num_epochs = 5
batch_size = 100
learning_rate = 0.001

2. Loading the MNIST data

2.1 Downloading the dataset

# train (bool, optional): if True, creates the dataset from ``training.pt``, otherwise from ``test.pt``
train_dataset = torchvision.datasets.MNIST(root='../../../data/minist', train=True,
                                           transform=transforms.ToTensor(), download=True)

test_dataset = torchvision.datasets.MNIST(root='../../../data/minist', train=False,
                                          transform=transforms.ToTensor())
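
Each element of the dataset is an (image, label) pair; with the ToTensor transform, the image is a 1x28x28 float tensor scaled to [0, 1]. A quick inspection sketch (the printed values are what I'd expect, not guaranteed output):

# Peek at one sample (illustrative)
image, label = train_dataset[0]
print(image.shape, image.min().item(), image.max().item(), label)
# e.g. torch.Size([1, 28, 28]) 0.0 1.0 5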

2.2 Introduction to the torch.utils.data.DataLoader data loader:

torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, batch_sampler=None, num_workers=0, collate_fn=<function default_collate>, pin_memory=False, drop_last=False, timeout=0, worker_init_fn=None)

Parameters (a toy illustration of batch_size, shuffle, and drop_last follows the list):

  • dataset (Dataset) – dataset from which to load the data.
  • batch_size (int, optional) – how many samples per batch to load (default: 1).
  • shuffle (bool, optional) – set to True to have the data reshuffled at every epoch (default: False).
  • sampler (Sampler, optional) – defines the strategy to draw samples from the dataset. If specified, shuffle must be False.
  • batch_sampler (Sampler, optional) – like sampler, but returns a batch of indices at a time. Mutually exclusive with batch_size, shuffle, sampler, and drop_last.
  • num_workers (int, optional) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
  • collate_fn (callable, optional) – merges a list of samples to form a mini-batch.
  • pin_memory (bool, optional) – If True, the data loader will copy tensors into CUDA pinned memory before returning them.
  • drop_last (bool, optional) – set to True to drop the last incomplete batch, if the dataset size is not divisible by the batch size. If False and the size of dataset is not divisible by the batch size, then the last batch will be smaller. (default: False)
  • timeout (numeric, optional) – if positive, the timeout value for collecting a batch from workers. Should always be non-negative. (default: 0)
  • worker_init_fn (callable, optional) – If not None, this will be called on each worker subprocess with the worker id (an int in [0, num_workers - 1]) as input, after seeding and before data loading. (default: None)
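
As a toy illustration (made-up data) of how batch_size, shuffle, and drop_last interact:

# Build a DataLoader over a tiny synthetic dataset (illustrative sketch)
import torch
from torch.utils.data import TensorDataset, DataLoader

toy = TensorDataset(torch.arange(10).float().unsqueeze(1), torch.arange(10))
loader = DataLoader(toy, batch_size=4, shuffle=False, drop_last=True)
for xb, yb in loader:
    print(xb.shape, yb.tolist())
# torch.Size([4, 1]) [0, 1, 2, 3]
# torch.Size([4, 1]) [4, 5, 6, 7]   <- the last 2 samples are dropped by drop_last=True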

2.3 Creating the data loaders

# Data loaders
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)


# Optional sanity check: print the shape of every 100th batch
#for i, (images, labels) in enumerate(train_loader):
#    if i % 100 == 0:
#        print(images.size(), labels)

3. PyTorch's CrossEntropyLoss (cross-entropy loss) function

torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')

  • This criterion combines nn.LogSoftmax() and nn.NLLLoss() in one single class.
  • It is useful when training a classification problem with C classes.
  • The input is expected to contain scores for each class.
  • This criterion expects a class index (0 to C-1) as the target for each value of a 1D tensor of size minibatch

Formula (a numerical check follows the list):

$$\operatorname{loss}(x, \text{class}) = -\log\left(\frac{\exp(x[\text{class}])}{\sum_{j} \exp(x[j])}\right) = -x[\text{class}] + \log\left(\sum_{j} \exp(x[j])\right)$$

  • $x[\text{class}]$: the score assigned to the true class
  • shape of x: (N, C), where C = number of classes
  • shape of class: (N), where each value satisfies $0 \leq \text{targets}[i] \leq C-1$
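
A quick numerical check of the formula against F.cross_entropy, using made-up scores:

# Verify the cross-entropy formula by hand (illustrative values)
import torch
import torch.nn.functional as F

x = torch.tensor([[2.0, 0.5, -1.0]])   # N = 1 sample, C = 3 class scores
target = torch.tensor([0])             # the true class index
manual = -x[0, 0] + torch.log(torch.exp(x[0]).sum())
print(manual.item())                        # -x[class] + log(sum_j exp(x[j])) = 0.2414...
print(F.cross_entropy(x, target).item())    # same value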

4. Defining and training the logistic regression model

# Define the logistic regression model: a single linear layer mapping 784 inputs to 10 class scores
class LR(nn.Module):
    def __init__(self, input_dims, output_dims):
        super().__init__()
        self.linear = nn.Linear(input_dims, output_dims, bias=True)

    def forward(self, x):
        return self.linear(x)

LR_model = LR(input_size, num_classes)

# Loss function: nn.CrossEntropyLoss() applies softmax internally,
# so the model outputs raw scores (logits) rather than probabilities
criterion = nn.CrossEntropyLoss(reduction='mean')

# Optimizer
optimizer = torch.optim.SGD(LR_model.parameters(), lr=learning_rate)
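
Before training, a quick shape check (sketch with random inputs): a batch of 100 flattened images should produce a 100 x 10 matrix of class scores:

# Sanity-check the model's output shape (illustrative)
dummy = torch.randn(batch_size, input_size)
print(LR_model(dummy).shape)   # torch.Size([100, 10])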

# Train the model
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Flatten each batch of images to (batch_size, input_size), i.e. (100, 784)
        images = images.reshape(-1, 28*28)

        # Forward pass
        y_pred = LR_model(images)
        loss = criterion(y_pred, labels)

        # Backward pass and parameter update
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if i % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'.format(epoch+1, num_epochs, i+1, total_step, loss.item()))
            

Epoch [1/5], Step [1/600], Loss: 2.3466
Epoch [1/5], Step [101/600], Loss: 2.2670
Epoch [1/5], Step [201/600], Loss: 2.1388
Epoch [1/5], Step [301/600], Loss: 2.0264
Epoch [1/5], Step [401/600], Loss: 1.9344
Epoch [1/5], Step [501/600], Loss: 1.8658
Epoch [2/5], Step [1/600], Loss: 1.8553
Epoch [2/5], Step [101/600], Loss: 1.7081
Epoch [2/5], Step [201/600], Loss: 1.7209
Epoch [2/5], Step [301/600], Loss: 1.6017
Epoch [2/5], Step [401/600], Loss: 1.5479
Epoch [2/5], Step [501/600], Loss: 1.5100
Epoch [3/5], Step [1/600], Loss: 1.5170
Epoch [3/5], Step [101/600], Loss: 1.4937
Epoch [3/5], Step [201/600], Loss: 1.4455
Epoch [3/5], Step [301/600], Loss: 1.4116
Epoch [3/5], Step [401/600], Loss: 1.4038
Epoch [3/5], Step [501/600], Loss: 1.3281
Epoch [4/5], Step [1/600], Loss: 1.2757
Epoch [4/5], Step [101/600], Loss: 1.2151
Epoch [4/5], Step [201/600], Loss: 1.1528
Epoch [4/5], Step [301/600], Loss: 1.1582
Epoch [4/5], Step [401/600], Loss: 1.1666
Epoch [4/5], Step [501/600], Loss: 1.0810
Epoch [5/5], Step [1/600], Loss: 1.2833
Epoch [5/5], Step [101/600], Loss: 1.0939
Epoch [5/5], Step [201/600], Loss: 1.1018
Epoch [5/5], Step [301/600], Loss: 0.9825
Epoch [5/5], Step [401/600], Loss: 1.0706
Epoch [5/5], Step [501/600], Loss: 1.0025

5. Testing the model

# In the test phase, gradients are not needed, which saves memory;
# by default PyTorch tracks operations for autograd during every forward pass
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, 28*28)
        outputs = LR_model(images)
        # torch.max returns a (max, max_indices) tuple; the indices along dim 1 are the predicted classes
        max_scores, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum()

    # note: correct is a LongTensor, so this division truncates to an integer percentage
    print('Accuracy of the model on the 10000 test images: {} %'.format(100 * correct / total))

Accuracy of the model on the 10000 test images: 82 %
# Save the trained model parameters
torch.save(LR_model.state_dict(), 'model.ckpt')
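
To restore the trained weights later, a minimal sketch (it assumes the same LR class definition is available):

# Load the saved parameters back into a fresh model instance
model = LR(input_size, num_classes)
model.load_state_dict(torch.load('model.ckpt'))
model.eval()   # switch to evaluation mode before inference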
