**PyTorch installation:**
Like NumPy, Tensors in PyTorch have their own ways of defining data types; commonly used ones include:
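A brief sketch of a few common dtype definitions (the examples here are illustrative, not the original list):

```python
import torch

# Legacy constructor style: the class name fixes the dtype
a = torch.FloatTensor([1, 2, 3])  # 32-bit floating point
b = torch.LongTensor([1, 2, 3])   # 64-bit signed integer

# Modern style: pass dtype explicitly to torch.tensor
c = torch.tensor([1, 2, 3], dtype=torch.float64)  # 64-bit floating point
```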
Commonly used Tensor operations in PyTorch include:
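As a small illustrative sample of such operations (a sketch, not the original list):

```python
import torch

x = torch.ones(2, 3)
y = torch.zeros(2, 3)

s = x + y               # element-wise addition, same as torch.add(x, y)
m = torch.mm(x, x.t())  # matrix product: (2, 3) x (3, 2) -> (2, 2)
u = x.view(3, 2)        # reshape to (3, 2) without copying data
a = torch.abs(torch.tensor([-1.0, 2.0]))  # element-wise absolute value
```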
torch and torchvision are the two core packages of PyTorch.
torch is the basic package of PyTorch; most classes for data operations live in it. Classes contained in torch include:
The main role of the torchvision package is data processing, loading, and previewing. Classes contained in torchvision include:
The torch.nn package provides many classes for concrete neural-network functionality: methods implementing convolutional layers, pooling layers, and fully connected layers; parameter-normalization and Dropout methods for preventing overfitting; and activation functions, both linear and non-linear.
Code for building a simple model with torch.nn:
import torch
from torch.autograd import Variable  # legacy wrapper; plain tensors suffice in modern PyTorch

batch_n = 100       # batch size
hidden_layer = 100  # hidden-layer width
input_data = 1000   # input feature dimension
output_data = 10    # output dimension

X = Variable(torch.randn(batch_n, input_data), requires_grad=False)
Y = Variable(torch.randn(batch_n, output_data), requires_grad=False)

models = torch.nn.Sequential(
    torch.nn.Linear(input_data, hidden_layer),
    torch.nn.ReLU(),
    torch.nn.Linear(hidden_layer, output_data))
The code above uses the following classes: torch.nn.Sequential, a sequential container that chains modules in order; torch.nn.Linear, a fully connected (affine) layer; and torch.nn.ReLU, the rectified-linear activation.
Commonly used loss functions in torch.nn include torch.nn.MSELoss (mean squared error), torch.nn.L1Loss (mean absolute error), and torch.nn.CrossEntropyLoss (for classification).
MSELoss is called as: loss_function = torch.nn.MSELoss(), then loss = loss_function(x, y). The other loss functions are called similarly: first instantiate the loss function, then call it to compute the loss.
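A minimal sketch of this two-step pattern, with illustrative inputs:

```python
import torch

loss_function = torch.nn.MSELoss()  # step 1: instantiate the loss
x = torch.zeros(4)                  # hypothetical prediction
y = torch.ones(4)                   # hypothetical target
loss = loss_function(x, y)          # step 2: call it on (prediction, target)
print(loss.item())                  # mean of (0 - 1)^2 over 4 elements -> 1.0
```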
torch.optim contains many optimization algorithms, such as SGD, AdaGrad, RMSProp, and Adam.
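The optimizers all follow the same zero_grad / backward / step cycle; a sketch using Adam on a toy linear model (the model and data here are illustrative):

```python
import torch

model = torch.nn.Linear(3, 1)
# Any of SGD / Adagrad / RMSprop / Adam can be swapped in here
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

x = torch.randn(8, 3)
y = torch.randn(8, 1)
loss_fn = torch.nn.MSELoss()

loss = loss_fn(model(x), y)
optimizer.zero_grad()  # clear old gradients
loss.backward()        # compute new gradients
optimizer.step()       # update the parameters
```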
torchvision.transforms contains a large number of data-transformation classes, many of which can be used for data augmentation.
The torchvision.transforms.Compose class can be seen as a container that combines multiple transforms. It takes a list whose elements are the transform operations to apply to the loaded data. Commonly used operations with transforms.Compose include:
Chinese documentation: https://pytorch-cn.readthedocs.io/zh/latest/package_references/data/
torch.utils.data is used to split data into batches. The most commonly used class is the data loader torch.utils.data.DataLoader, defined as:
class torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, num_workers=0, collate_fn=None, pin_memory=False, drop_last=False)
Parameters: dataset is the dataset to load from; batch_size is the number of samples per batch (default 1); shuffle reshuffles the data every epoch; sampler defines the strategy for drawing samples; num_workers is the number of subprocesses used for loading; collate_fn merges a list of samples into a batch; pin_memory copies tensors into pinned (page-locked) memory; drop_last drops the final incomplete batch.
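A small sketch of DataLoader in use, with an illustrative TensorDataset:

```python
import torch
import torch.utils.data as data

x = torch.randn(20, 3)
y = torch.randint(0, 2, (20,))
dataset = data.TensorDataset(x, y)

# 20 samples with batch_size=4 gives 5 batches per epoch
loader = data.DataLoader(dataset, batch_size=4, shuffle=True)

for batch_x, batch_y in loader:
    break  # inspect the first batch only: shapes (4, 3) and (4,)
```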
The first (recommended) approach saves and loads only the model parameters:
torch.save(the_model.state_dict(), PATH)
Then:
the_model = TheModelClass(*args, **kwargs)
the_model.load_state_dict(torch.load(PATH))
The second approach saves and loads the entire model. Save with torch.save(the_model, PATH), then:
the_model = torch.load(PATH)
import torch
import torch.nn as nn
import torch.utils.data as data
import matplotlib.pyplot as plt
%matplotlib inline
n_data = torch.ones(100,2)
x0 = torch.normal(2*n_data, 1)
y0 = torch.zeros(100)
x1 = torch.normal(-2*n_data,1)
y1 = torch.ones(100)
x = torch.cat((x0,x1),dim=0).type(torch.FloatTensor) # torch.float32
y = torch.cat((y0,y1),).type(torch.LongTensor) # torch.int64
class Classification(nn.Module):
    def __init__(self, n_input, n_hidden, n_output):
        super(Classification, self).__init__()
        self.fc1 = nn.Linear(n_input, n_hidden)
        self.fc2 = nn.Linear(n_hidden, n_hidden)
        self.fc3 = nn.Linear(n_hidden, n_output)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        return x
cls_net = Classification(2, 10, 2)
optimizer = torch.optim.SGD(cls_net.parameters(), lr=0.02)
loss = torch.nn.CrossEntropyLoss()
dataset = data.TensorDataset(x, y)
loader = data.DataLoader(dataset=dataset, batch_size=10, shuffle=True, num_workers=2)
plt.ion()
for i in range(10):
for D,L in loader:
pred = cls_net(x)
loss_value = loss(pred,y)
optimizer.zero_grad()
loss_value.backward()
optimizer.step()
if i % 5 == 0:
plt.cla()
prediction = torch.max(pred,1)[1]
predy = prediction.data.numpy()
targety = y.data.numpy()
plt.scatter(x.data.numpy()[:,0],x.data.numpy()[:,1],
c=predy, s=100, lw=0, cmap='RdYlGn')
accuracy = float((predy == targety).astype(int).sum() / float(targety.size))
plt.text(1.5,-4,'Accuarcy={:2}'.format(accuracy),fontdict={'size':20, 'color':'red'})
plt.pause(0.1)
plt.ioff()
plt.show()
import torch
import torch.nn as nn
import torch.nn.functional as fn
import matplotlib.pyplot as plt
data = torch.linspace(-1,1,100)
x = torch.unsqueeze(data,dim=1)
y = -x.pow(2) + 0.5 * torch.rand(x.size())
class regression(nn.Module):
    def __init__(self, n_input, n_hidden, n_output):
        super(regression, self).__init__()
        self.fc1 = nn.Linear(n_input, n_hidden)
        self.fc2 = nn.Linear(n_hidden, n_hidden)
        self.fc3 = nn.Linear(n_hidden, n_output)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        return x
reg_net = regression(1, 20, 1)
optimizer = torch.optim.SGD(reg_net.parameters(), lr=0.2)
loss = torch.nn.MSELoss()
plt.ion()
for i in range(200):
    pred = reg_net(x)
    loss_value = loss(pred, y)
    optimizer.zero_grad()
    loss_value.backward()
    optimizer.step()
    if i % 5 == 0:
        plt.cla()
        plt.scatter(x.data.numpy(), y.data.numpy())
        plt.plot(x.data.numpy(), pred.data.numpy())
        plt.text(0.5, 0, 'Loss={:.4}'.format(loss_value.data.numpy()),
                 fontdict={'size': 20, 'color': 'red'})
        plt.pause(0.1)
plt.ioff()
plt.show()
train_test_split comes up extremely often in deep learning.
Official train_test_split documentation:
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html#sklearn.model_selection.train_test_split
train_test_split() lives in sklearn.model_selection (old versions placed it in the now-removed sklearn.cross_validation module) and is used to randomly split samples into training and test sets by a given proportion. It is commonly used alongside cross-validation and is called as:
**X_train, X_test, y_train, y_test = train_test_split(train_data, train_target, test_size=0.4, random_state=0)**
Parameter meanings: train_data is the sample features to split; train_target is the corresponding labels; test_size is the proportion (or, if an integer, the absolute number) of samples placed in the test set; random_state seeds the random shuffle, so a fixed value makes the split reproducible.
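A small sketch of the call on toy data:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)  # 10 samples, 2 features each
y = np.arange(10)                 # 10 labels

# test_size=0.4: 40% of the samples go to the test set;
# a fixed random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0)

print(X_train.shape, X_test.shape)  # (6, 2) (4, 2)
```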
class simple_range(object):
    def __init__(self, num):
        self.num = num

    def __iter__(self):
        return self

    def __next__(self):
        if self.num <= 0:
            raise StopIteration
        tmp = self.num
        self.num -= 1
        return tmp

a = simple_range(5)
next(a)  # 5
A generator is an object with a __next__() method; a sequence type, by contrast, stores all of its items and is accessed by index.
def infinite():
    n = 1
    while 1:
        yield n  # without yield, this would just be an infinite loop
        n += 1

ge = infinite()
next(ge)  # 1
def getyear(start, end):
    # yield the leap years between start and end (inclusive)
    for i in range(start, end + 1):
        if i % 4 == 0 and i % 100 != 0:
            yield i
        elif i % 400 == 0:
            yield i

year_gen = getyear(1900, 2000)
next(year_gen)  # 1904 (1900 is divisible by 100 but not 400, so it is skipped)
The system() method of the os module executes a shell command and returns 0 on success. Usage: os.system("bash command"), e.g. os.system('ping www.baidu.com').
Note: when not running from a console (e.g. in a notebook), system() only invokes the command; its output is not captured. The output can instead be obtained by reading the file object returned by popen().
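A sketch contrasting the two calls (echo is used as a harmless example command):

```python
import os

# os.system runs the command and returns its exit status (0 on success);
# the command's output goes to the terminal, not back to Python
status = os.system('echo hello')

# os.popen returns a file-like object, so the output can be read back
output = os.popen('echo hello').read()
```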
os.sep is the platform-specific path separator ('/' on Linux and macOS, '\\' on Windows).
help(dict) prints the built-in documentation for the dict type; help() works the same way for any object.