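Installing PyTorch under Anaconda from the official source is extremely slow and often fails.
The USTC (University of Science and Technology of China) mirror is far faster.
The steps are: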
conda config --add channels https://mirrors.ustc.edu.cn/anaconda/pkgs/free/
conda config --set show_channel_urls yes
Then look up the install command that matches your CUDA version on the official PyTorch site, for example:
pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html -i https://pypi.mirrors.ustc.edu.cn/simple
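If you run into errors of one kind or another, I recommend installing offline instead; the procedure is covered in this blog post.
After installation, create a test file (.py) and check that the torch module works: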
import torch
print(torch.__version__)
print(torch.version.cuda)
print(torch.cuda.is_available())
ngpu = 1
device = torch.device("cuda:0" if(torch.cuda.is_available() and ngpu>0) else "cpu")
print(device)
print(torch.cuda.get_device_name(0))
print(torch.rand(3,3).cuda())
If PyTorch is installed correctly with the matching CUDA version, the script prints something like:
1.5.1+cu101
10.1
True
cuda:0
GeForce GTX 1050 Ti
tensor([[0.4271, 0.1155, 0.2633],
[0.9090, 0.1135, 0.8025],
[0.4208, 0.9740, 0.2207]], device='cuda:0')
NumPy is one of the staple data-processing libraries, and its ndarray represents n-dimensional data in much the same way as a tensor in TensorFlow or PyTorch. PyTorch therefore provides conversion interfaces for ndarray so that torch and NumPy work well together.
Some commonly used conversion APIs:
- tensor <- torch.from_numpy(ndarray): converts an ndarray to a tensor, used when building models
- ndarray <- tensor.numpy(): converts a tensor to an ndarray, often used for visualization
- Tensor <- torch.as_tensor(data, dtype=None, device=None): converts any array_like data (a list, tuple, NumPy ndarray, scalar, and other types) to a tensor
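The three conversions listed above can be sketched as follows, assuming a CPU tensor. The sketch also shows a detail that is easy to miss: torch.from_numpy and tensor.numpy() share memory with their source, so an in-place change on one side is visible on the other. The variable names are only illustrative.

```python
import numpy as np
import torch

arr = np.array([[1, 2], [3, 4]], dtype=np.float32)

t = torch.from_numpy(arr)      # ndarray -> tensor (shares memory with arr)
back = t.numpy()               # tensor -> ndarray (shares memory with t)
copied = torch.as_tensor([1, 2, 3], dtype=torch.float32)  # list -> tensor

arr[0, 0] = 100.0              # in-place change on the NumPy side
print(t[0, 0].item())          # the tensor sees the change
print(back.dtype, copied.dtype)
```

If an independent copy is needed, calling .clone() on the tensor or .copy() on the ndarray breaks the sharing.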
# Math operations in Torch
import numpy as np
import torch
data = [-1, -2, 1, 2]
tensor = torch.FloatTensor(data)  # 32-bit floating point tensor
print(
'\nabs',
'\nnumpy: ', np.abs(data), # numpy: [1 2 1 2]
'\ntorch: ', torch.abs(tensor) # torch: tensor([1., 2., 1., 2.])
)
print(
'\nsin',
'\nnumpy: ', np.sin(data), # [-0.84147098 -0.90929743 0.84147098 0.90929743]
'\ntorch: ', torch.sin(tensor) # tensor([-0.8415, -0.9093, 0.8415, 0.9093])
)
print(
'\nmean',
'\nnumpy: ', np.mean(data), # 0.0
'\ntorch: ', torch.mean(tensor) # tensor(0.)
)
## Matrix operations
np_data = np.array([[1, 2], [3, 4]])
tensor = torch.FloatTensor(np_data)
print(
'\nmatrix multiplication(matmul)',
'\nnumpy matmul: ', np.matmul(np_data, np_data), # [[ 7 10][15 22]]
'\ntorch matmul: ', torch.matmul(tensor, tensor), # tensor([[ 7., 10.], [15., 22.]])
'\ntorch mm: ', torch.mm(tensor, tensor), # tensor([[ 7., 10.], [15., 22.]])
'\nnumpy a.dot(b): ', np_data.dot(np_data), # [[ 7 10][15 22]]
# '\ntorch a.dot(b): ', tensor.dot(tensor), # RuntimeError: 1D tensors expected, got 2D,
)
A Tensor is similar to NumPy's ndarray, but it can also be used on a GPU to accelerate computation.
A torch.tensor is created in much the same way as an ndarray. Some commonly used creation operations follow:
torch.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor
- data (array_like): initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types. Any array_like data works: a Python list or tuple, a NumPy ndarray, or a scalar.
- dtype (torch.dtype, optional): the desired data type of the returned tensor. Default: if None, infers the data type from data. This is analogous to np.int32, np.float64, and so on.
- device (torch.device, optional): the desired device of the returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types. In other words, it specifies at creation time whether the tensor lives on the CPU or the GPU.
- requires_grad (bool, optional): if autograd should record operations on the returned tensor. Default: False. That is, whether gradient information needs to be recorded.
- pin_memory (bool, optional): if set, the returned tensor is allocated in pinned memory. Works only for CPU tensors. Default: False. Pinned (page-locked) host memory mainly speeds up later copies to the GPU.
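A short sketch of how these parameters behave in practice, run on the CPU only. The values and variable names are illustrative and not taken from the original post.

```python
import torch

# dtype is inferred from the data when it is not given
a = torch.tensor([1, 2, 3])                 # Python ints   -> torch.int64
b = torch.tensor([1.0, 2.0, 3.0])           # Python floats -> the default float dtype
c = torch.tensor([1, 2, 3], dtype=torch.float64, requires_grad=True)
d = torch.tensor(3.14, device=torch.device("cpu"))  # a 0-dim (scalar) tensor on the CPU

print(a.dtype, b.dtype, c.dtype)
print(c.requires_grad, d.device, d.dim())
```

Note that requires_grad=True is only allowed for floating point (and complex) dtypes, which is why c sets dtype explicitly.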
Random sampling creation ops are listed under Random sampling in the PyTorch documentation and include:
- torch.rand()
- torch.rand_like()
- torch.randn()
- torch.randn_like()
- torch.randint()
- torch.randint_like()
- torch.randperm()
You may also use torch.empty() with the in-place random sampling methods to create torch.Tensors with values sampled from a broader range of distributions.
Taking torch.randn() as an example, its signature is:
torch.randn(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
It returns a tensor filled with random numbers from a normal distribution with mean 0 and variance 1 (also called the standard normal distribution).
Parameters:
- size (int...): a sequence of integers defining the shape of the output tensor.
- out (Tensor, optional): the output tensor.
- dtype (torch.dtype, optional): the desired data type of the returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
- layout (torch.layout, optional): the desired layout of the returned Tensor. Default: torch.strided.
- device (torch.device, optional): the desired device of the returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
- requires_grad (bool, optional): if autograd should record operations on the returned tensor. Default: False.
torch.zeros(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: creates a zero-filled tensor of the given size.
torch.zeros_like(input, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor: creates a zero-filled tensor with the same size as input (a tensor).
>>> torch.zeros(2, 3)
tensor([[ 0., 0., 0.],
[ 0., 0., 0.]])
>>> input = torch.empty(2, 3)
>>> torch.zeros_like(input)
tensor([[ 0., 0., 0.],
[ 0., 0., 0.]])
torch.ones(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: creates a tensor of the given size filled with ones.
torch.ones_like(input, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor: creates a ones-filled tensor with the same size as input (a tensor).
>>> torch.ones(2, 3)
tensor([[ 1., 1., 1.],
[ 1., 1., 1.]])
>>> input = torch.empty(2, 3)
>>> torch.ones_like(input)
tensor([[ 1., 1., 1.],
[ 1., 1., 1.]])
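torch.arange(start=0, end, step=1, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: creates a 1-D tensor, similar to numpy.arange.
torch.linspace(start, end, steps=100, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: creates a 1-D tensor of evenly spaced values over an interval. start and end are required; in the PyTorch version used here, steps defaults to 100. Newer releases deprecate that default, so passing steps explicitly is safer.
print(torch.arange(5,11))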
print(torch.arange(5,11).shape)
print(torch.linspace(5,11,100).shape)  # steps passed explicitly; newer PyTorch releases no longer default it
print(torch.linspace(5,11,6))
'''
tensor([ 5, 6, 7, 8, 9, 10])
torch.Size([6])
torch.Size([100])
tensor([ 5.0000, 6.2000, 7.4000, 8.6000, 9.8000, 11.0000])
'''
torch.eye(n, m=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: returns a 2-D tensor with ones on the diagonal and zeros elsewhere.
torch.empty(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False, pin_memory=False) → Tensor: returns a tensor filled with uninitialized data. The shape of the tensor is defined by the variable argument size.
torch.full(size, fill_value, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor: returns a tensor of size size filled with fill_value.
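The creation functions above can be exercised side by side. This is a small sketch that only prints each tensor's shape and dtype; the variable names are made up for illustration.

```python
import torch

zeros = torch.zeros(2, 3)
ones_like = torch.ones_like(zeros)        # same shape and dtype as `zeros`
steps = torch.arange(5, 11)               # 5, 6, ..., 10 (end is exclusive)
grid = torch.linspace(5, 11, steps=6)     # 6 evenly spaced points, both ends included
identity = torch.eye(3)                   # 3x3 identity matrix
sevens = torch.full((2, 2), 7.0)          # 2x2 tensor filled with 7.0
noise = torch.randn(4, 4)                 # samples from the standard normal distribution

for name, t in [("zeros", zeros), ("ones_like", ones_like), ("steps", steps),
                ("grid", grid), ("identity", identity), ("sevens", sevens),
                ("noise", noise)]:
    print(name, tuple(t.shape), t.dtype)
```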
torch also provides many APIs for inspecting and modifying the properties of a tensor:
>>> torch.tensor([1.2, 3]).dtype # initial default for floating point is torch.float32
torch.float32
>>> torch.set_default_tensor_type(torch.DoubleTensor)
>>> torch.tensor([1.2, 3]).dtype # a new floating point tensor
torch.float64
>>> a = torch.randn(1, 2, 3, 4, 5)
>>> torch.numel(a)
120
>>> a = torch.zeros(4,4)
>>> torch.numel(a)
16
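Working with a tensor in torch feels much like working with an ndarray in NumPy, except that a tensor adds capabilities on top of ndarray. For example, a tensor can be moved to any device with the .to method: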
# let us run this cell only if CUDA is available
# We use `torch.device` to move tensors on and off the GPU
x = torch.randn(1)  # assumed definition: the original snippet uses x without defining it
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # create a tensor directly on the GPU
    x = x.to(device)                       # or just use `.to("cuda")`
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # `.to` can also change dtype while moving
Output:
tensor([1.0445], device='cuda:0')
tensor([1.0445], dtype=torch.float64)
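The snippet above only does anything when a GPU is present. A common variant, sketched here as one possible pattern rather than the post's own code, picks the device once and falls back to the CPU, so the same script runs on either kind of machine.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(3)
y = torch.ones_like(x, device=device)  # created directly on the chosen device
x = x.to(device)                       # moved to the chosen device
z = x + y                              # both operands must live on the same device

z_cpu = z.to("cpu", torch.double)      # move back and change dtype in one call
print(z.device, z_cpu.device, z_cpu.dtype)
```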
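To understand automatic differentiation, that is, why and when to set a tensor's requires_grad attribute to True, we first need to look at how neural networks operate.
In PyTorch, the autograd package is at the core of all neural networks. We briefly introduce the package first and then train our first neural network.
The autograd package provides automatic differentiation for all operations on tensors. PyTorch is a define-by-run framework, which means backpropagation is determined by how the code runs, and every iteration can be different.
Almost every tensor creation function shown earlier has the argument requires_grad=False. If a tensor's .requires_grad attribute is set to True, autograd tracks all operations on that tensor. Once the computation is finished, calling .backward() computes all gradients automatically, and the gradient for this tensor is accumulated into its .grad attribute.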
x = torch.ones(2,2,requires_grad=True)
print("x:",x)
y = x + 2
print("y.grad_fn:",y.grad_fn) # y is the result of an operation, so it has a grad_fn
z = y * y * 3
out = z.mean()
print(out) # scalar
out.backward() # backpropagation: compute the gradients
print("after backward x's grad become to:",x.grad)
Output:
x: tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
y.grad_fn: <AddBackward0 object at 0x...>
tensor(27., grad_fn=<MeanBackward0>)
after backward x's grad become to: tensor([[4.5000, 4.5000],
[4.5000, 4.5000]])
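The value 4.5 can be checked by hand. With out = (1/4) Σ 3 (x_i + 2)^2, the derivative with respect to each x_i is (3/2)(x_i + 2), which equals 4.5 at x_i = 1. The sketch below repeats the computation and compares autograd's result with that hand-derived formula.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
out = ((x + 2) ** 2 * 3).mean()
out.backward()

# Hand-derived gradient: d(out)/d(x_i) = 1.5 * (x_i + 2)
expected = 1.5 * (x.detach() + 2)
print(x.grad)
print(torch.allclose(x.grad, expected))
```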
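Conversely, to stop a tensor from tracking history, call .detach() to separate it from the computation history and prevent its future computations from being tracked.
To avoid tracking history (and using memory), you can also wrap a block of code in with torch.no_grad():. This is especially useful when evaluating a model: the model may have trainable parameters with requires_grad=True, but no gradients are needed for them during evaluation.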
x = torch.ones(2,2,requires_grad=True)
print("x.requires_grad:",x.requires_grad)
print("(x ** 2).requires_grad:",(x ** 2).requires_grad)
with torch.no_grad():
    print("with torch.no_grad, (x ** 2).requires_grad:",(x ** 2).requires_grad)
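The code above demonstrates torch.no_grad() only. A comparable sketch for .detach(), under the assumption that the detached tensor is then used as a constant in a later computation, looks like this:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x * 2                 # tracked: y has a grad_fn
y_detached = y.detach()   # same values, but cut off from the graph

print(y.requires_grad, y.grad_fn is not None)
print(y_detached.requires_grad, y_detached.grad_fn)

# Only the tracked path contributes to x.grad.
loss = (y * y_detached).sum()  # y_detached is treated as a constant c = 2
loss.backward()
print(x.grad)                  # d/dx (2x * c) = 2c
```

Had y_detached been tracked as well, the gradient would have been that of (2x)^2, which is 8x rather than 2c.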
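One more class is essential to the autograd implementation: Function.
Tensor and Function are interconnected and build up an acyclic graph that encodes the complete history of computation. Every tensor has a .grad_fn attribute that references the Function that created it (except for tensors created by the user, whose grad_fn is None).
To compute derivatives, call .backward() on a Tensor. If the Tensor is a scalar (it holds a single element), backward() needs no arguments. If it has more elements, you must pass a gradient argument, which is a tensor of matching shape.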
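The non-scalar case described above can be sketched as follows. The vector v is an arbitrary choice for illustration; passing it as gradient makes backward() compute the vector-Jacobian product instead of a plain derivative.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2                       # non-scalar output, shape (3,)

# y.backward() alone would raise an error because y is not a scalar.
# Passing `gradient` computes the vector-Jacobian product v^T J.
v = torch.tensor([1.0, 0.1, 0.01])
y.backward(gradient=v)

print(x.grad)                    # elementwise: v * 2x
```

Because each y_i depends only on x_i here, the Jacobian is diagonal and the result reduces to v * 2x.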
# Build a two-variable function z = f(x, y) = x^2 + y^2; x requires gradients, y does not
x = torch.tensor(3.0, requires_grad=True)
y = torch.tensor(4.0, requires_grad=False)
# y = torch.tensor(4.0, requires_grad=True)
z = torch.pow(x, 2) + torch.pow(y, 2)
# Check whether x, y, and z require gradients
print(x.requires_grad)
print(y.requires_grad)
print(z.requires_grad)
# Differentiate by calling backward
z.backward()
# Inspect the derivatives, i.e. the gradients
print(x.grad)
print(y.grad)
'''Output:
True        # x requires grad
False       # y does not require grad
True        # z requires grad because one of its leaf variables (x) does
tensor(6.)  # dz/dx = 2x = 6
None        # y does not require grad, so its grad is None
'''
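Summary:
- A variable that needs gradients no longer has to be wrapped in Variable; setting requires_grad=True is enough. It defaults to False to avoid recording computations unnecessarily.
- Variables that do not need gradients can also be declared explicitly; the official recommendation is to wrap the computation in with torch.no_grad():.
When a tensor with requires_grad=True takes part in a computation, PyTorch quietly builds a structure behind the scenes called the computational graph. The graph links all computation steps (nodes) together so that, during error backpropagation, the update magnitudes (gradients) of all tensors with requires_grad=True are computed in one pass. Tensors that were not declared with requires_grad=True do not get this treatment.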
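One practical consequence of gradients being accumulated into .grad, as described earlier, is that repeated backward() calls add up unless the gradient is reset. The sketch below uses a single scalar w, which is an illustrative choice, to make the accumulation visible.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(w * 3).backward()
g1 = w.grad.item()   # first backward: d(3w)/dw

(w * 3).backward()
g2 = w.grad.item()   # second backward adds to the stored gradient

w.grad.zero_()       # reset in place, as optimizer.zero_grad() does for model parameters
(w * 3).backward()
g3 = w.grad.item()   # a fresh gradient after the reset

print(g1, g2, g3)
```

This is why the training loops in this post call optimizer.zero_grad() before every loss.backward().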
import torch
import matplotlib.pyplot as plt
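import torch.nn.functional as F # most activation functions live here
# Generate some fake data for plotting
X = torch.linspace(-5, 5, 200,requires_grad=True) # x data (tensor), shape=(200,)
X_np = X.data.numpy() # convert the tensor to an ndarray for plotting
# Several common activation functions
y_relu = F.relu(input=X).data.numpy()
y_sigmoid = torch.sigmoid(input=X).data.numpy() # F.sigmoid is deprecated in favor of torch.sigmoid
y_tanh = torch.tanh(input=X).data.numpy() # F.tanh is deprecated in favor of torch.tanh
y_softplus = F.softplus(input=X).data.numpy()
y_softmax = F.softmax(input=X, dim=0).data.numpy() # softmax yields probabilities, not a curve, so it is not plotted below
# Visualize the activation functions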
plt.figure(1,figsize=(8,6))
plt.subplot(221)
plt.plot(X_np,y_relu,c='red',label='relu')
plt.ylim((-1,5))
plt.legend(loc ="best")
plt.subplot(222)
plt.plot(X_np,y_sigmoid,c='red',label='sigmoid')
plt.ylim((-0.2,1.2))
plt.legend(loc ="best")
plt.subplot(223)
plt.plot(X_np,y_tanh,c='red',label='tanh')
plt.ylim((-1.2,1.2))
plt.legend(loc ="best")
plt.subplot(224)
plt.plot(X_np,y_softplus,c='red',label='softplus')
plt.ylim((-0.1,6))
plt.legend(loc ="best")
plt.savefig("activateFunctions")
plt.show()
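To make the plotted curves less of a black box, the sketch below writes out the textbook formula for each activation and compares it with the library call. The formulas are standard definitions rather than something stated in the original post, and the tolerance is loosened slightly because of float32 rounding.

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-5, 5, 11)

relu_manual = torch.clamp(x, min=0)              # max(0, x)
sigmoid_manual = 1 / (1 + torch.exp(-x))         # 1 / (1 + e^-x)
tanh_manual = (torch.exp(x) - torch.exp(-x)) / (torch.exp(x) + torch.exp(-x))
softplus_manual = torch.log(1 + torch.exp(x))    # log(1 + e^x), a smooth version of ReLU

print(torch.allclose(relu_manual, F.relu(x), atol=1e-6))
print(torch.allclose(sigmoid_manual, torch.sigmoid(x), atol=1e-6))
print(torch.allclose(tanh_manual, torch.tanh(x), atol=1e-6))
print(torch.allclose(softplus_manual, F.softplus(x), atol=1e-6))

# softmax maps a vector to probabilities that sum to 1
probs = F.softmax(x, dim=0)
print(probs.sum())
```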
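To build a neural network we can use the structure torch already provides: first define the attributes of every layer in __init__(), then wire the layers together one by one in forward(x). The activation functions are applied while wiring the layers.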
import torch
import matplotlib.pyplot as plt
import torch.nn.functional as F
import warnings
warnings.filterwarnings("ignore")
# Custom dataset
X = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)
# unsqueeze expands the 1-D data into 2-D along the given axis, shape=(100, 1)
y = X.pow(2) + 0.2 * torch.rand(X.size()) # noisy y data (tensor), shape=(100, 1)
# Visualize the data
# plt.scatter(X.data.numpy(), y.data.numpy())
# plt.show()
# Define the neural network
class LRNN(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        '''
        Define the network's layers and how they connect:
        n_feature-->n_hidden-->n_output
        :param n_feature: number of input features
        :param n_hidden: number of hidden-layer outputs
        :param n_output: number of predicted outputs
        '''
        super(LRNN, self).__init__()
        self.hidden = torch.nn.Linear(in_features=n_feature, out_features=n_hidden) # the hidden layer is a linear model
        self.predict = torch.nn.Linear(n_hidden, n_output)
    def forward(self, X):
        y_hidden = F.relu(self.hidden(X)) # hidden layer followed by the ReLU activation
        y_predict = self.predict(y_hidden)
        return y_predict
# Train the neural network
linerNet = LRNN(1, 10, 1)
print(linerNet)
'''
LRNN(
  (hidden): Linear(in_features=1, out_features=10, bias=True)
  (predict): Linear(in_features=10, out_features=1, bias=True)
)
'''
optimizer = torch.optim.SGD(linerNet.parameters(), lr=0.5) # optimize the model parameters with stochastic gradient descent
loss_func = torch.nn.MSELoss() # mean squared error as the loss function
### Visualization ###
plt.ion()
plt.show()
## Start training
for t in range(200):
    y_predict = linerNet(X)
    loss = loss_func(y_predict, y) # y_predict first, y_true second
    optimizer.zero_grad() # reset the gradients to 0
    loss.backward() # compute the gradient of every parameter being optimized
    optimizer.step() # update the parameters using the gradients just computed
    if t % 5 == 0:
        plt.cla()
        plt.scatter(X.data.numpy(), y.data.numpy())
        plt.plot(X.data.numpy(), y_predict.data.numpy(), 'r-', lw=5)
        plt.text(0.5, 0, 'Loss=%.4f' % loss.data.numpy(), fontdict={'size': 20, 'color': 'red'})
        plt.pause(0.1)
plt.ioff()
plt.show()
The result:
LRNN(
(hidden): Linear(in_features=1, out_features=10, bias=True)
(predict): Linear(in_features=10, out_features=1, bias=True)
)
import torch
import matplotlib.pyplot as plt
import torch.nn.functional as F
import warnings
import numpy as np
warnings.filterwarnings("ignore")
# Custom dataset
n_data = torch.ones(100, 2)
x0 = torch.normal(2 * n_data, 1) # class 0: points centered at (2, 2)
y0 = torch.zeros(100)
x1 = torch.normal(-2 * n_data, 1) # class 1: points centered at (-2, -2)
y1 = torch.ones(100)
X = torch.cat((x0, x1), 0).type(torch.FloatTensor) # torch.Size([200, 2])
y = torch.cat((y0, y1), ).type(torch.LongTensor) # torch.Size([200])
## Visualize the data
# plt.scatter(X.data.numpy()[:,0],X.data.numpy()[:,1],c=y.data.numpy(),s=100, lw=0, cmap='RdYlGn')
# plt.show()
# Define the neural network
class LCNN(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        '''
        Define the network's layers and how they connect:
        n_feature-->n_hidden-->n_output
        :param n_feature: number of input features
        :param n_hidden: number of hidden-layer outputs
        :param n_output: number of predicted outputs
        '''
        super(LCNN, self).__init__()
        self.hidden = torch.nn.Linear(in_features=n_feature, out_features=n_hidden) # the hidden layer is a linear model
        self.predict = torch.nn.Linear(n_hidden, n_output)
    def forward(self, X):
        y_hidden = F.relu(self.hidden(X)) # hidden layer followed by the ReLU activation
        y_predict = self.predict(y_hidden)
        return y_predict
net = LCNN(2, 10, 2)
'''
An output of [0,1] means class 1
An output of [1,0] means class 0
'''
plt.ion()
plt.show()
optimizer = torch.optim.SGD(net.parameters(), lr=0.02)
loss_func = torch.nn.CrossEntropyLoss() # cross-entropy
for t in range(100):
    y_predict = net(X)
    loss = loss_func(y_predict, y)
    optimizer.zero_grad() # clear the gradients left over from the previous step
    loss.backward() # backpropagate the error and compute the gradients
    optimizer.step() # apply the updates to the parameters of net
    if t % 2 == 0:
        plt.cla()
        tmp = torch.max(F.softmax(y_predict, dim=1), 1)[1] # softmax turns the scores into class probabilities; keep the most likely class
        y_predict = tmp.data.numpy().squeeze()
        y_true = y.data.numpy()
        plt.scatter(X.data.numpy()[:, 0], X.data.numpy()[:, 1], c=y.data.numpy(), s=100, lw=0, cmap='RdYlGn')
        accuracy = sum(y_predict == y_true) / 200
        X_np = X.data.numpy()
        x_min, x_max = X_np[:, 0].min() - 1, X_np[:, 0].max() + 0.1 # range of the first feature
        y_min, y_max = X_np[:, 1].min() - 1, X_np[:, 1].max() + 0.1 # range of the second feature
        plt.text(1.5, -4, 'Accuracy=%.2f' % accuracy, fontdict={'size': 20, 'color': 'red'})
        plt.pause(0.1)
plt.ioff()
plt.show()
Output: [figure: scatter plot of the two classes with the accuracy annotated]
import torch
import torch.nn.functional as F
class LCNN(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(LCNN, self).__init__()
        self.hidden = torch.nn.Linear(in_features=n_feature, out_features=n_hidden) # the hidden layer is a linear model
        self.predict = torch.nn.Linear(n_hidden, n_output)
    def forward(self, X):
        y_hidden = F.relu(self.hidden(X)) # hidden layer followed by the ReLU activation
        y_predict = self.predict(y_hidden)
        return y_predict
net1 = LCNN(2, 10, 2)
## Build the same network quickly ##
## with a Sequential model ##
net2 = torch.nn.Sequential(
    torch.nn.Linear(2, 10), # hidden layer: number of inputs and outputs
    torch.nn.ReLU(), # the hidden layer's output flows through ReLU
    torch.nn.Linear(10, 2) # output layer
)
print(net1)
print(net2)
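Beyond printing the two networks, it can help to poke at the Sequential version directly. The following sketch rebuilds the same 2 -> 10 -> 2 architecture and checks its parameter count and output shape; the random input batch is made up for illustration.

```python
import torch

# The same 2 -> 10 -> 2 architecture as above, built with Sequential.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 2),
)

# Layers in a Sequential are addressed by position.
print(net[0])                       # the first Linear layer

# Count trainable parameters: (2*10 + 10) + (10*2 + 2) = 52
n_params = sum(p.numel() for p in net.parameters())
print(n_params)

# A batch of 5 samples with 2 features each produces 5 rows of 2 scores.
out = net(torch.randn(5, 2))
print(out.shape)
```

The class-based version names its layers (hidden, predict), while Sequential numbers them (0, 1, 2); that is the main visible difference between the two printouts.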
import torch
import matplotlib.pyplot as plt
torch.manual_seed(1) # reproducible
# Fake data
x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1) # x data (tensor), shape=(100, 1)
y = x.pow(2) + 0.2*torch.rand(x.size()) # noisy y data (tensor), shape=(100, 1)
def save():
    # Build the network
    net1 = torch.nn.Sequential(
        torch.nn.Linear(1, 10),
        torch.nn.ReLU(),
        torch.nn.Linear(10, 1)
    )
    optimizer = torch.optim.SGD(net1.parameters(), lr=0.2)
    loss_func = torch.nn.MSELoss()
    # Train
    for t in range(200):
        prediction = net1(x)
        loss = loss_func(prediction, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # plot result
    prediction = net1(x)
    plt.figure(1, figsize=(10, 3))
    plt.subplot(131)
    plt.title('Net1')
    plt.scatter(x.data.numpy(), y.data.numpy())
    plt.plot(x.data.numpy(), prediction.data.numpy(), 'r-', lw=5)
    # Save option 1: the entire net
    torch.save(net1, "net.pkl")
    # Save option 2: only the net's parameters
    torch.save(net1.state_dict(), "net_params.pkl")
# Restore: entire net
def restore_net():
    # Note: recent PyTorch releases may require torch.load(..., weights_only=False) to unpickle a whole model
    net2 = torch.load("net.pkl")
    prediction = net2(x)
    # plot result
    plt.figure(1, figsize=(10, 3))
    plt.subplot(132)
    plt.title("Net 2")
    plt.scatter(x.data.numpy(), y.data.numpy())
    plt.plot(x.data.numpy(), prediction.data.numpy(), 'r-', lw=5)
# Restore: net params only
def restore_params():
    # First build a model with the same structure
    net3 = torch.nn.Sequential(
        torch.nn.Linear(1, 10),
        torch.nn.ReLU(),
        torch.nn.Linear(10, 1)
    )
    # Then load the saved parameters into it
    net3.load_state_dict(torch.load("net_params.pkl"))
    prediction = net3(x)
    # plot result
    plt.figure(1, figsize=(10, 3))
    plt.subplot(133)
    plt.title("Net 3")
    plt.scatter(x.data.numpy(), y.data.numpy())
    plt.plot(x.data.numpy(), prediction.data.numpy(), 'r-', lw=5)
    plt.show()
save()
restore_net()
restore_params()
Visualized output: [figure: fitted curves of Net1, Net 2, and Net 3 side by side]
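The parameter-only route from restore_params() can also be checked without plotting. The sketch below is a variation on the code above, not the post's own code: it round-trips a state_dict through an in-memory buffer (io.BytesIO standing in for a file on disk) and confirms that the reloaded network produces identical outputs. make_net is a hypothetical helper introduced only for this example.

```python
import io
import torch

def make_net():
    # Same architecture as in save() and restore_params() above
    return torch.nn.Sequential(
        torch.nn.Linear(1, 10),
        torch.nn.ReLU(),
        torch.nn.Linear(10, 1),
    )

src = make_net()
print(list(src.state_dict().keys()))  # parameter names keyed by layer position

buffer = io.BytesIO()                 # stands in for a file on disk
torch.save(src.state_dict(), buffer)
buffer.seek(0)

dst = make_net()                      # fresh network with different random weights
dst.load_state_dict(torch.load(buffer))

x = torch.linspace(-1, 1, 5).unsqueeze(1)
print(torch.equal(src(x), dst(x)))    # identical outputs after loading
```

Saving only the state_dict keeps the file independent of the class definition's pickled form, which is one reason it is the commonly recommended option.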
Reposted from https://github.com/yunjey/pytorch-tutorial
import torch
import torchvision
import torch.nn as nn
import numpy as np
import torchvision.transforms as transforms
# ================================================================== #
# Table of Contents #
# ================================================================== #
# 1. Basic autograd example 1
# 2. Basic autograd example 2
# 3. Loading data from numpy
# 4. Input pipeline
# 5. Input pipeline for custom dataset
# 6. Pretrained model
# 7. Save and load model
# ================================================================== #
# 1. Basic autograd example 1 #
# ================================================================== #
# Create tensors.
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)
# Build a computational graph.
y = w * x + b # y = 2 * x + 3
# Compute gradients.
y.backward()
# Print out the gradients.
print(x.grad) # x.grad = 2
print(w.grad) # w.grad = 1
print(b.grad) # b.grad = 1
# ================================================================== #
# 2. Basic autograd example 2 #
# ================================================================== #
# Create tensors of shape (10, 3) and (10, 2).
x = torch.randn(10, 3)
y = torch.randn(10, 2)
# Build a fully connected layer.
linear = nn.Linear(3, 2)
print ('w: ', linear.weight)
print ('b: ', linear.bias)
# Build loss function and optimizer.
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)
# Forward pass.
pred = linear(x)
# Compute loss.
loss = criterion(pred, y)
print('loss: ', loss.item())
# Backward pass.
loss.backward()
# Print out the gradients.
print ('dL/dw: ', linear.weight.grad)
print ('dL/db: ', linear.bias.grad)
# 1-step gradient descent.
optimizer.step()
# You can also perform gradient descent at the low level.
# linear.weight.data.sub_(0.01 * linear.weight.grad.data)
# linear.bias.data.sub_(0.01 * linear.bias.grad.data)
# Print out the loss after 1-step gradient descent.
pred = linear(x)
loss = criterion(pred, y)
print('loss after 1 step optimization: ', loss.item())
# ================================================================== #
# 3. Loading data from numpy #
# ================================================================== #
# Create a numpy array.
x = np.array([[1, 2], [3, 4]])
# Convert the numpy array to a torch tensor.
y = torch.from_numpy(x)
# Convert the torch tensor to a numpy array.
z = y.numpy()
# ================================================================== #
# 4. Input pipeline #
# ================================================================== #
# Download and construct CIFAR-10 dataset.
train_dataset = torchvision.datasets.CIFAR10(root='../../data/',
                                             train=True,
                                             transform=transforms.ToTensor(),
                                             download=True)
# Fetch one data pair (read data from disk).
image, label = train_dataset[0]
print (image.size())
print (label)
# Data loader (this provides queues and threads in a very simple way).
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=64,
                                           shuffle=True)
# When iteration starts, queue and thread start to load data from files.
data_iter = iter(train_loader)
# Mini-batch images and labels.
images, labels = next(data_iter)  # the built-in next() works across versions; data_iter.next() fails on newer PyTorch
# Actual usage of the data loader is as below.
for images, labels in train_loader:
    # Training code should be written here.
    pass
# ================================================================== #
# 5. Input pipeline for custom dataset #
# ================================================================== #
# You should build your custom dataset as below.
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self):
        # TODO
        # 1. Initialize file paths or a list of file names.
        pass
    def __getitem__(self, index):
        # TODO
        # 1. Read one data from file (e.g. using numpy.fromfile, PIL.Image.open).
        # 2. Preprocess the data (e.g. torchvision.Transform).
        # 3. Return a data pair (e.g. image and label).
        pass
    def __len__(self):
        # You should change 0 to the total size of your dataset.
        return 0
# You can then use the prebuilt data loader.
custom_dataset = CustomDataset()
train_loader = torch.utils.data.DataLoader(dataset=custom_dataset,
                                           batch_size=64,
                                           shuffle=True)
# ================================================================== #
# 6. Pretrained model #
# ================================================================== #
# Download and load the pretrained ResNet-18.
resnet = torchvision.models.resnet18(pretrained=True)
# If you want to finetune only the top layer of the model, set as below.
for param in resnet.parameters():
    param.requires_grad = False
# Replace the top layer for finetuning.
resnet.fc = nn.Linear(resnet.fc.in_features, 100) # 100 is an example.
# Forward pass.
images = torch.randn(64, 3, 224, 224)
outputs = resnet(images)
print (outputs.size()) # (64, 100)
# ================================================================== #
# 7. Save and load the model #
# ================================================================== #
# Save and load the entire model.
torch.save(resnet, 'model.ckpt')
model = torch.load('model.ckpt')
# Save and load only the model parameters (recommended).
torch.save(resnet.state_dict(), 'params.ckpt')
resnet.load_state_dict(torch.load('params.ckpt'))
Morvan Python:https://morvanzhou.github.io/tutorials/machine-learning/torch/3-04-save-reload/
github:https://github.com/yunjey/pytorch-tutorial
pytorch doc:https://pytorch.org/docs/stable/torch.html?highlight=unsqueeze#torch.unsqueeze