PyTorch Logistic Regression Demo

About Logistic Regression

In simple terms, logistic regression starts from a set of multi-dimensional data points, each associated with a class label, denoted here by y. This data is used for training, and the goal is to predict the class y of a new multi-dimensional input.

Implementation

The steps are roughly the same as for Linear Regression, so the underlying theory is not repeated here.

Note that classification uses the Cross-Entropy Loss Function. Why? Because a Quadratic Loss Function would make the fitting criterion too strict: for a sample to be assigned to a class, its output does not have to match the target exactly; it is enough that the predicted probability of that class is larger than the probability of any other class.

A cross-entropy loss function is therefore used to measure this kind of similarity.
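As a quick numerical illustration (the values below are made up for the example), the binary cross-entropy for a predicted probability p and a 0/1 target y is -(y * log(p) + (1 - y) * log(1 - p)), and nn.BCELoss averages it over the batch:

import torch
import torch.nn as nn

p = torch.tensor([0.9, 0.2, 0.7])  # predicted probabilities (already passed through sigmoid)
t = torch.tensor([1.0, 0.0, 1.0])  # 0/1 targets
manual = -(t * torch.log(p) + (1 - t) * torch.log(1 - p)).mean()
print(manual.item())               # matches nn.BCELoss()(p, t).item()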

As for mapping the linear layer's output into the interval (0, 1), this is done with the Sigmoid function, so forward performs two computations: first the linear transform, then the sigmoid.
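For reference, the sigmoid is sigma(z) = 1 / (1 + e^(-z)), which squashes any real number into (0, 1); a quick check with torch.sigmoid (example values only):

import torch

z = torch.tensor([-2.0, 0.0, 2.0])
print(torch.sigmoid(z))          # tensor([0.1192, 0.5000, 0.8808])
print(1 / (1 + torch.exp(-z)))   # same values, computed from the formula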

The code is as follows:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Mon Mar 22 23:33:58 2021

Logistic Regression
@author: haoyuhuang
"""
import torch
import torch.nn as nn
import matplotlib.pyplot as plt
import numpy as np

# create the data

n_data = torch.ones(100, 2)  # 100 x 2 tensor of ones
x0 = torch.normal(2 * n_data, 1)  # class 0: normal samples around (2, 2)
y0 = torch.zeros(100)
x1 = torch.normal(-2 * n_data, 1)  # class 1: normal samples around (-2, -2)
y1 = torch.ones(100)

x = torch.cat((x0, x1), 0).type(torch.FloatTensor)
y = torch.cat((y0, y1), 0).type(torch.FloatTensor)

# color the points by their label y (0 or 1) to visualize the data x
plt.scatter(x.data.numpy()[:, 0], x.data.numpy()[:, 1], c=y.data.numpy(), s=50, lw=0, cmap='RdYlGn')
plt.show()


class LogisticRegression(nn.Module):
    def __init__(self):
        super(LogisticRegression, self).__init__()
        self.lr = nn.Linear(2, 1)  # input 2, output 1
        self.sm = nn.Sigmoid()

    def forward(self, xx):
        xx = self.lr(xx)
        xx = self.sm(xx)
        return xx


logistic_model = LogisticRegression()
if torch.cuda.is_available():
    logistic_model.cuda()

# loss function and optimizer
criterion = nn.BCELoss()  # binary cross-entropy loss for 0/1 targets

# momentum keeps a fraction of the previous update direction.
# Intuitively it is like a ball rolling downhill: with enough momentum
# it will not get stuck in a shallow valley (local minimum).
# lr is the learning rate.
optimizer = torch.optim.SGD(logistic_model.parameters(), lr=1e-3, momentum=0.9)

# train
for epoch in range(10000):
    # Variable is no longer needed in modern PyTorch; tensors can be used directly
    if torch.cuda.is_available():
        x_data = x.cuda()
        y_data = y.cuda()
    else:
        x_data = x
        y_data = y

    out = logistic_model(x_data)
    # BCELoss needs matching shapes: out is [200, 1] while y_data is [200],
    # so squeeze out down to [200] before computing the loss
    out1 = out.squeeze(-1)
    loss = criterion(out1, y_data)  # predicted probabilities vs. 0/1 targets

    print_loss = loss.item()
    mask = out.ge(0.5).float()  # threshold at 0.5 (ge: element-wise >=)
    correct = (mask[:, 0] == y_data).sum()  # number of correctly classified samples
    acc = correct.item() / x_data.size(0)  # the accuracy

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 20 == 0:
        print('*' * 10)
        print('epoch {}'.format(epoch + 1))
        print('loss is: {:.4f}'.format(print_loss))
        print('accuracy is: {:.4f}'.format(acc))
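
After training, the learned weights can be inspected and the model used for prediction. A minimal sketch, assuming logistic_model was trained on the CPU as above (the test points below are made up for illustration):

# predict two hand-picked points (move them to the GPU first if the model was trained there)
with torch.no_grad():
    test_points = torch.tensor([[2.0, 2.0], [-2.0, -2.0]])
    probs = logistic_model(test_points).squeeze(-1)
    print(probs)  # probability of class 1: close to 0 for the first point, close to 1 for the second

# the decision boundary is the line w1*x1 + w2*x2 + b = 0
w1, w2 = logistic_model.lr.weight[0]
b = logistic_model.lr.bias[0]
print(w1.item(), w2.item(), b.item())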
