Much of this content comes from the web; if anything infringes, message me and I will remove it.
Min-max scaling maps each feature into a specified range (usually 0 to 1); this can be done with the preprocessing.MinMaxScaler class. The common min-max normalization formula is (x - min(x)) / (max(x) - min(x)).
from sklearn import preprocessing
import numpy as np
min_max_scaler = preprocessing.MinMaxScaler()
X_train = np.array([[ 1., -1., 2.],[ 2., 0., 0.],[ 0., 1., -1.]])
X_train_minmax = min_max_scaler.fit_transform(X_train)
>>> X_train_minmax
array([[ 0.5 , 0. , 1. ],
[ 1. , 0.5 , 0.33333333],
[ 0. , 1. , 0. ]])
Standardization rescales the data so it falls within a small interval; standardized data can be positive or negative, and the absolute values are generally not large. It is computed per feature (per column): subtract the column's mean and divide by its standard deviation. The result is that, for each feature/column, the data is centered around 0 with unit variance. This is the z-score method, (x - mean(x)) / std(x), for which MATLAB has a dedicated function. In sklearn, the sklearn.preprocessing.scale() function standardizes the given data directly:
from sklearn import preprocessing
import numpy as np
X = np.array([[ 1., -1., 2.],
[ 2., 0., 0.],
[ 0., 1., -1.]])
X_scaled = preprocessing.scale(X)
>>> X_scaled
array([[ 0. ..., -1.22..., 1.33...],
[ 1.22..., 0. ..., -0.26...],
[-1.22..., 1.22..., -1.06...]])
>>> # mean and std of the processed data
>>> X_scaled.mean(axis=0)
array([ 0., 0., 0.])
>>> X_scaled.std(axis=0)
array([ 1., 1., 1.])
Normalization scales each sample to unit norm (every sample's norm becomes 1). This is useful if you later compute the similarity between two samples with a quadratic form such as the dot product, or with another kernel method. The main idea of Normalization is to compute each sample's p-norm and then divide every element of the sample by that norm, so that each processed sample's p-norm (l1-norm or l2-norm) equals 1.
The p-norm is defined as $\|X\|_p = (|x_1|^p + |x_2|^p + \cdots + |x_n|^p)^{1/p}$.
This method is mainly used in text classification and clustering. For example, the dot product of two l2-normalized TF-IDF vectors gives the cosine similarity of the two vectors (see the sketch after the examples below).
>>> X = [[ 1., -1., 2.],
... [ 2., 0., 0.],
... [ 0., 1., -1.]]
>>> X_normalized = preprocessing.normalize(X, norm='l2')
>>> X_normalized
array([[ 0.40..., -0.40..., 0.81...],
[ 1. ..., 0. ..., 0. ...],
[ 0. ..., 0.70..., -0.70...]])
>>> normalizer = preprocessing.Normalizer().fit(X) # fit does nothing
>>> normalizer
Normalizer(copy=True, norm='l2')
>>>
>>> normalizer.transform(X)
array([[ 0.40..., -0.40..., 0.81...],
[ 1. ..., 0. ..., 0. ...],
[ 0. ..., 0.70..., -0.70...]])
>>> normalizer.transform([[-1., 1., 0.]])
array([[-0.70..., 0.70..., 0. ...]])
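As a quick check of the cosine-similarity claim above, here is a minimal sketch (with two made-up vectors standing in for TF-IDF rows; the numbers are arbitrary) showing that the dot product of l2-normalized vectors equals their cosine similarity:
import numpy as np
from sklearn import preprocessing

X = np.array([[1., 2., 0.], [2., 1., 1.]])  # two made-up rows standing in for TF-IDF vectors
Xn = preprocessing.normalize(X, norm='l2')
dot = Xn[0] @ Xn[1]  # dot product of the l2-normalized rows
cos = X[0] @ X[1] / (np.linalg.norm(X[0]) * np.linalg.norm(X[1]))  # cosine similarity computed directly
print(np.isclose(dot, cos))  # True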
torch.cat concatenates tensors along a given dimension; the total number of dimensions is unchanged after cat. Note that when concatenating two tensors, the sizes in every other dimension must match, or an error is raised!
import torch
x = torch.randn(2,3)
print(x)
print('*'*80)
y = torch.randn(1,3)
print(y)
print('*'*80)
t = torch.cat((x, y), 0)  # shape (3, 3)
print(t)
z = torch.randn(1, 4)
torch.cat((x, z), 0)  # error: sizes in dimension 1 differ
Output:
tensor([[-1.3758, -0.3441, -1.4608],
[ 1.2006, -0.7091, 0.1233]])
********************************************************************************
tensor([[-0.8673, -0.8082, -2.3864]])
********************************************************************************
tensor([[-1.3758, -0.3441, -1.4608],
[ 1.2006, -0.7091, 0.1233],
[-0.8673, -0.8082, -2.3864]])
Unlike cat, stack adds a new dimension and stacks the two tensors along that new dimension; the two tensors are generally required to have the same shape!
import torch
x = torch.randn(1,2)
print(x)
print('*'*80)
y = torch.randn(1,2)
print(y)
print('*'*80)
print(torch.stack((x, y), 0))  # stack along dim 0; shape (2, 1, 2)
print('*'*80)
print(torch.stack((x, y), 1))  # shape (1, 2, 2)
Output:
tensor([[-0.9762, -1.1769]])
********************************************************************************
tensor([[-0.6522, 0.0318]])
********************************************************************************
tensor([[[-0.9762, -1.1769]],
[[-0.6522, 0.0318]]])
********************************************************************************
tensor([[[-0.9762, -1.1769],
[-0.6522, 0.0318]]])
********************************************************************************
import torch
x = 2.55555
y = torch.tensor(2.55555, dtype=torch.float32)
# method 1
print('Result 1:', round(x, 3))  # round is a built-in Python function; 3 is the number of decimal places to keep
# method 2
print('Result 2:', torch.round(y))  # torch.round cannot limit the number of decimal places
print('Result 3:', torch.round(y).item())  # item() extracts the number from the tensor
Output:
Result 1: 2.556
Result 2: tensor(3.)
Result 3: 3.0
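torch.round has no decimal-place argument, but if N decimal places are needed on a tensor, a common workaround (a sketch reusing y from the snippet above, not an official API) is to scale, round, and scale back; recent PyTorch versions also accept a decimals= keyword in torch.round:
n = 3  # number of decimal places to keep
rounded = torch.round(y * 10**n) / 10**n  # scale up, round, scale back down
print(rounded.item())  # 2.556 (up to floating-point precision)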
In PyTorch, both Tensor and tensor can be used to create new tensors:
>>> import torch
>>> a=torch.Tensor([1,2])
>>> a
tensor([1., 2.])
>>> a=torch.tensor([1,2])
>>> a
tensor([1, 2])
First, let's be clear: torch.Tensor() is a Python class, more precisely an alias for the default tensor type torch.FloatTensor(); torch.Tensor([1,2]) calls the Tensor class constructor __init__ and produces a single-precision floating-point tensor.
>>> a=torch.Tensor([1,2])
>>> a.type()
'torch.FloatTensor'
torch.tensor(), by contrast, is just a Python function: https://pytorch.org/docs/stable/torch.html#torch.tensor . Its signature is:
torch.tensor(data, dtype=None, device=None, requires_grad=False)
where data can be a list, tuple, NumPy ndarray, scalar, or other types. torch.tensor copies the data in data (rather than referencing it directly) and, based on the original data type, produces the corresponding torch.LongTensor, torch.FloatTensor, or torch.DoubleTensor.
>>> a=torch.tensor([1,2])
>>> a.type()
'torch.LongTensor'
>>> a=torch.tensor([1.,2.])
>>> a.type()
'torch.FloatTensor'
>>> import numpy as np
>>> a=np.zeros(2,dtype=np.float64)
>>> a=torch.tensor(a)
>>> a.type()
'torch.DoubleTensor'
A word about torch.empty(): per https://pytorch.org/docs/stable/torch.html?highlight=empty#torch.empty , it creates a tensor with a specified dtype, device, and other parameters. Since torch.Tensor() can only produce the torch.float data type, torch.Tensor() can be viewed as a special case of torch.empty().
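A minimal sketch of torch.empty with explicit dtype (and optionally device) arguments; the contents are uninitialized memory, so the printed values are arbitrary:
import torch
a = torch.empty(2, 3, dtype=torch.float64)  # uninitialized 2x3 double tensor
print(a.type())  # 'torch.DoubleTensor'
b = torch.empty(2, 3, dtype=torch.int32, device='cpu')  # device='cuda:0' would also work if a GPU is available
print(b.type())  # 'torch.IntTensor'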
with torch.no_grad():
    testY = model(testX)
print(testY)
Output:
tensor([[ 7.4433, -1.4233, -1.6965, -4.9028],
[ 11.1287, -5.7861, -2.3523, -1.3352],
[ 1.6368, 4.0758, 1.5106, -6.8918],
[ 11.1269, -6.2055, -0.2486, -4.0074],
[ 4.2791, -7.5071, 8.0243, -5.0912],
[ 3.9377, 0.1002, -3.0278, 0.7973],
[ 10.4937, -5.5156, 0.3815, -4.5885],
[ 10.2765, -2.4278, -0.0422, -7.3499],
[ 0.8234, 9.4561, -2.2854, -7.8151],
[ 3.6753, -2.6943, 6.2879, -5.9786],
[ 9.7963, -1.1426, 0.2660, -8.2053],
[ 5.3171, 3.5008, -3.4102, -5.2817],
[ 9.0295, -2.3807, -5.0728, -2.1787],
[ 12.7925, -6.8981, -3.3715, -1.3687],
[ 2.9363, -4.1924, -3.8692, 5.4553],
[ 7.0463, -1.8211, -2.3471, -1.9651],
[ 6.1256, -1.4506, -0.0740, -4.6081],
[ 4.4470, 0.8657, 1.6806, -5.3237],
[ 7.1012, 1.6752, 1.1116, -9.0371],
[ 1.7235, -5.7148, 6.2477, -1.1781],
[ 0.8945, 4.2796, -1.5190, -3.4724],
[ 9.7305, -2.1866, -2.9471, -2.3112],
[ 7.8209, -2.1488, 0.8533, -5.8382],
[ 0.6063, 7.9243, -2.4863, -5.3481],
[ 3.1649, -0.0549, 3.5648, -6.4298],
[ 8.4594, -0.2936, -0.4718, -6.2386],
[ 2.6753, 2.1676, 0.6504, -4.7133],
[ 11.4688, -4.3625, -5.2973, -1.6718],
[ 12.7178, -6.6919, -4.8123, -1.9376],
[ -0.9076, -0.9274, -4.6698, 7.8568],
[ 8.5488, -3.4524, -1.4708, -3.4786],
[ 9.8643, -6.3564, -2.3896, 0.1812],
[ -0.3086, 6.6137, -1.6922, -4.2936],
[ 5.6480, -0.3888, -1.8955, -0.7594],
[ 2.4999, -2.9834, 7.2879, -5.2193],
[ 4.2896, 0.3526, -4.0778, 0.2920],
[ 9.1389, -5.9225, -0.3296, -3.1200],
[ 6.9025, -3.9361, -2.1047, 1.1030],
[ 1.7949, 2.7270, -1.1831, -1.9257],
[ 4.2454, -4.7726, 5.9915, -4.7709],
[ 10.3149, -2.4509, -0.5917, -6.6981],
[ 0.3288, 8.1812, -5.5801, -0.7519],
[ 10.9215, -3.3665, -3.9858, -2.0602],
[ 9.2952, -3.1185, -5.7481, -0.3535],
[ 2.7448, -6.3724, -4.5297, 7.7019],
[ 8.7598, -4.8083, -2.2426, -0.4326],
[ 9.3423, -5.7544, 0.3519, -2.5967],
[ 2.0215, 2.5876, -0.7334, -1.8973],
[ 8.3974, -1.2813, -0.1331, -5.7042],
[ 1.4222, -2.6100, 6.5302, -2.1887],
[ 7.4289, 2.8581, 0.6636, -8.8257],
[ 7.4660, -3.3966, -3.2598, 0.7070],
[ 7.7047, -3.7917, -0.8066, -2.5238],
[ 3.9101, 3.1239, -2.9358, -1.0799],
[ 2.7316, -3.2821, 8.4985, -6.1583],
[ 9.0011, -2.5707, -1.6200, -3.3008],
[ -0.5210, 4.3287, -2.8837, 0.1590],
[ 9.4240, -1.8600, -4.6306, -0.2257],
[ 10.5553, -4.5794, -2.8072, -1.3519],
[ -2.0982, -1.5021, -5.8774, 10.1451],
[ 8.1251, -5.1918, -3.6729, 0.5811],
[ 8.6910, -2.0897, -4.6669, 0.5333],
[ -0.7934, 5.4703, -0.1302, -3.1170],
[ 6.9602, -1.3405, -0.1571, -4.3973],
[ 0.1805, -0.8911, 6.1601, -5.5365],
[ 2.1057, 2.5338, -5.6351, 2.3221],
[ 7.3220, 0.2707, -4.7512, -2.4399],
[ 8.4964, -1.4643, 4.8854, -10.9043],
[ 3.1047, 5.5968, 0.9471, -8.8787],
[ 4.9688, -5.2696, 6.1680, -4.2479],
[ 9.7998, -3.5701, 1.4597, -6.7401],
[ 1.8569, 6.1164, -3.1263, -4.2748],
[ 6.1492, 2.9876, -7.2567, -2.3775],
[ 9.4298, -2.8283, -7.4377, 1.5422],
[ -0.6555, -0.2519, -5.8323, 6.8694],
[ 7.3518, 3.0800, -0.9119, -9.0124],
[ 6.9438, 1.7972, -2.6768, -6.1078],
[ -0.0528, 7.3127, -1.9607, -3.8322],
[ 5.6991, 2.8540, -3.7784, -4.0820],
[ -1.1966, -1.2128, 6.1327, -2.6217],
[ -0.3849, 7.3386, -2.5669, -4.6670],
[ 6.3555, 1.5932, -5.1967, -1.0024],
[ 2.5816, 4.1530, -0.7747, -5.6864],
[ -0.7420, 9.3222, 0.5745, -7.3484],
[ 0.1243, -2.8342, 10.8683, -7.7141],
[ 6.9208, 1.0358, -0.1274, -5.5745],
[ 0.7077, 4.9082, 2.1944, -6.8996],
[ 6.7253, -0.3559, -4.0509, -1.9693],
[ 8.4796, -3.4290, -4.4795, -1.4465],
[ -0.5281, -0.7838, -5.0702, 7.0901],
[ 5.6690, 0.0732, -3.9329, -2.0248],
[ 12.2119, -2.2533, -1.3228, -7.9734],
[ 3.9205, 2.3429, 0.3645, -6.9054],
[ 7.0275, 1.2768, -2.3088, -4.5443],
[ 0.8335, -3.6880, 8.6731, -5.6231],
[ 0.8692, 6.5459, -5.7003, -0.1224],
[ 8.7197, -1.6967, -3.0582, -3.4979],
[ 7.0834, -1.9839, -3.9747, 0.7744],
[ 3.1499, 4.4433, -3.7725, -2.7284],
[ 8.1010, -2.4316, 4.0292, -8.6694]])
Code:
print(testY.max(1))  # returns two tensors: the first holds each row's maximum value, the second the index of that maximum within the row
Output:
torch.return_types.max(
values=tensor([ 7.4433, 11.1287, 4.0758, 11.1269, 8.0243, 3.9377, 10.4937, 10.2765,
9.4561, 6.2879, 9.7963, 5.3171, 9.0295, 12.7925, 5.4553, 7.0463,
6.1256, 4.4470, 7.1012, 6.2477, 4.2796, 9.7305, 7.8209, 7.9243,
3.5648, 8.4594, 2.6753, 11.4688, 12.7178, 7.8568, 8.5488, 9.8643,
6.6137, 5.6480, 7.2879, 4.2896, 9.1389, 6.9025, 2.7270, 5.9915,
10.3149, 8.1812, 10.9215, 9.2952, 7.7019, 8.7598, 9.3423, 2.5876,
8.3974, 6.5302, 7.4289, 7.4660, 7.7047, 3.9101, 8.4985, 9.0011,
4.3287, 9.4240, 10.5553, 10.1451, 8.1251, 8.6910, 5.4703, 6.9602,
6.1601, 2.5338, 7.3220, 8.4964, 5.5968, 6.1680, 9.7998, 6.1164,
6.1492, 9.4298, 6.8694, 7.3518, 6.9438, 7.3127, 5.6991, 6.1327,
7.3386, 6.3555, 4.1530, 9.3222, 10.8683, 6.9208, 4.9082, 6.7253,
8.4796, 7.0901, 5.6690, 12.2119, 3.9205, 7.0275, 8.6731, 6.5459,
8.7197, 7.0834, 4.4433, 8.1010]),
indices=tensor([0, 0, 1, 0, 2, 0, 0, 0, 1, 2, 0, 0, 0, 0, 3, 0, 0, 0, 0, 2, 1, 0, 0, 1,
2, 0, 0, 0, 0, 3, 0, 0, 1, 0, 2, 0, 0, 0, 1, 2, 0, 1, 0, 0, 3, 0, 0, 1,
0, 2, 0, 0, 0, 0, 2, 0, 1, 0, 0, 3, 0, 0, 1, 0, 2, 1, 0, 0, 1, 2, 0, 1,
0, 0, 3, 0, 0, 1, 0, 2, 1, 0, 1, 1, 2, 0, 1, 0, 0, 3, 0, 0, 0, 0, 2, 1,
0, 0, 1, 0]))
Code:
print(testY.max(1)[1])
Output:
tensor([0, 0, 1, 0, 2, 0, 0, 0, 1, 2, 0, 0, 0, 0, 3, 0, 0, 0, 0, 2, 1, 0, 0, 1,
2, 0, 0, 0, 0, 3, 0, 0, 1, 0, 2, 0, 0, 0, 1, 2, 0, 1, 0, 0, 3, 0, 0, 1,
0, 2, 0, 0, 0, 0, 2, 0, 1, 0, 0, 3, 0, 0, 1, 0, 2, 1, 0, 0, 1, 2, 0, 1,
0, 0, 3, 0, 0, 1, 0, 2, 1, 0, 1, 1, 2, 0, 1, 0, 0, 3, 0, 0, 0, 0, 2, 1,
0, 0, 1, 0])
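Since only the indices are needed here, note as a side remark that argmax returns the same thing as max(1)[1]; a one-line sketch reusing testY from above:
print(torch.equal(testY.max(1)[1], testY.argmax(dim=1)))  # True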
word_to_idx = {word: i for i, word in enumerate(idx_to_word)}
# print(type(word_to_idx))  # dict: 'the': 0, 'of': 1, ...
# print(word_to_idx[:100])  # error
# a dict cannot be sliced; convert it to a list first
print(list(word_to_idx.items())[:100])
print('*'*80)
print(list(word_to_idx)[:100])
Output:
[('the', 0), ('of', 1), ('and', 2), ('one', 3), ('in', 4), ('a', 5), ('to', 6), ('zero', 7), ('nine', 8), ('two', 9), ('is', 10), ('as', 11), ('eight', 12), ('for', 13), ('s', 14), ('five', 15), ('three', 16), ('was', 17), ('by', 18), ('that', 19), ('four', 20), ('six', 21), ('seven', 22), ('with', 23), ('on', 24), ('are', 25), ('it', 26), ('from', 27), ('or', 28), ('his', 29), ('an', 30), ('be', 31), ('this', 32), ('he', 33), ('at', 34), ('which', 35), ('not', 36), ('also', 37), ('have', 38), ('were', 39), ('has', 40), ('but', 41), ('other', 42), ('their', 43), ('its', 44), ('first', 45), ('they', 46), ('had', 47), ('some', 48), ('more', 49), ('all', 50), ('can', 51), ('most', 52), ('been', 53), ('such', 54), ('who', 55), ('many', 56), ('new', 57), ('there', 58), ('used', 59), ('after', 60), ('american', 61), ('when', 62), ('time', 63), ('into', 64), ('these', 65), ('only', 66), ('see', 67), ('may', 68), ('than', 69), ('i', 70), ('world', 71), ('b', 72), ('d', 73), ('would', 74), ('no', 75), ('however', 76), ('between', 77), ('about', 78), ('over', 79), ('states', 80), ('years', 81), ('war', 82), ('people', 83), ('united', 84), ('during', 85), ('known', 86), ('if', 87), ('called', 88), ('use', 89), ('th', 90), ('often', 91), ('system', 92), ('so', 93), ('history', 94), ('state', 95), ('will', 96), ('up', 97), ('while', 98), ('where', 99)]
********************************************************************************
['the', 'of', 'and', 'one', 'in', 'a', 'to', 'zero', 'nine', 'two', 'is', 'as', 'eight', 'for', 's', 'five', 'three', 'was', 'by', 'that', 'four', 'six', 'seven', 'with', 'on', 'are', 'it', 'from', 'or', 'his', 'an', 'be', 'this', 'he', 'at', 'which', 'not', 'also', 'have', 'were', 'has', 'but', 'other', 'their', 'its', 'first', 'they', 'had', 'some', 'more', 'all', 'can', 'most', 'been', 'such', 'who', 'many', 'new', 'there', 'used', 'after', 'american', 'when', 'time', 'into', 'these', 'only', 'see', 'may', 'than', 'i', 'world', 'b', 'd', 'would', 'no', 'however', 'between', 'about', 'over', 'states', 'years', 'war', 'people', 'united', 'during', 'known', 'if', 'called', 'use', 'th', 'often', 'system', 'so', 'history', 'state', 'will', 'up', 'while', 'where']
import torch
import numpy as np
a_numpy = np.array([1,2,3])
a_tensor = torch.from_numpy(a_numpy)  # NumPy array -> Tensor
print(a_tensor)
a_numpy = a_tensor.numpy()  # Tensor -> NumPy array
print(a_numpy)
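One caveat worth a sketch: torch.from_numpy shares memory with the source array, while torch.tensor copies the data, so in-place changes to the array show up in the from_numpy tensor only:
import numpy as np
import torch

arr = np.array([1, 2, 3])
shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # copies the data
arr[0] = 99
print(shared)  # tensor([99,  2,  3])  -> reflects the change
print(copied)  # tensor([1, 2, 3])     -> unaffected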
# Tensor to list
>>>a=torch.ones([1,5])
>>>a
tensor([[1., 1., 1., 1., 1.]])
>>>b=a.tolist()
>>>b
[[1.0, 1.0, 1.0, 1.0, 1.0]]
# list to Tensor
>>>a=list(range(1,6))
>>>a
[1, 2, 3, 4, 5]
>>>b=torch.tensor(a)
>>>b
tensor([1, 2, 3, 4, 5])
tensor = torch.Tensor(3, 5)
# tensor.long() casts the tensor to long
newtensor = tensor.long()
# tensor.half() casts the tensor to half-precision float
newtensor = tensor.half()
# tensor.int() casts the tensor to int
newtensor = tensor.int()
# tensor.double() casts the tensor to double
newtensor = tensor.double()
# tensor.float() casts the tensor to float
newtensor = tensor.float()
# tensor.char() casts the tensor to char
newtensor = tensor.char()
# tensor.byte() casts the tensor to byte
newtensor = tensor.byte()
# tensor.short() casts the tensor to short
newtensor = tensor.short()
>>> a=torch.Tensor(2,5)
>>> a
tensor([[1.9431e-19, 4.8613e+30, 1.4603e-19, 2.0704e-19, 4.7429e+30],
[1.6530e+19, 1.8254e+31, 1.4607e-19, 6.8801e+16, 1.8370e+25]])
>>> b=torch.IntTensor(1,2)
>>> b
tensor([[16843009, 1]], dtype=torch.int32)
>>> a.type_as(b)
tensor([[ 0, -2147483648, 0, 0, -2147483648],
[-2147483648, -2147483648, 0, -2147483648, -2147483648]],
dtype=torch.int32)
>>> a
tensor([[1.9431e-19, 4.8613e+30, 1.4603e-19, 2.0704e-19, 4.7429e+30],
[1.6530e+19, 1.8254e+31, 1.4607e-19, 6.8801e+16, 1.8370e+25]])
type(new_type=None, async=False): if new_type is not given, this returns the tensor's type; otherwise it casts the object to the specified type. If it is already the correct type, nothing is executed and the original object is returned. (In newer PyTorch versions the async parameter has been renamed to non_blocking.) Usage:
>>>t1 = torch.LongTensor(3, 5)
>>>print(t1.type())
torch.LongTensor
# cast to another type
>>>t2=t1.type(torch.FloatTensor)
>>>print(t2.type())
torch.FloatTensor
The available CPU tensor types are:
torch.FloatTensor
torch.LongTensor
torch.HalfTensor
torch.IntTensor
torch.DoubleTensor
torch.CharTensor
torch.ByteTensor
torch.ShortTensor
The isinstance() function checks whether an object is of a known type, similar to type().
The syntax of isinstance() is:
isinstance(object, classinfo)
>>>a = 2
>>>isinstance(a,int)
True
>>>isinstance(a,str)
False
>>>isinstance(a,(str,int,list))
# returns True if the object is any of the types in the tuple
True
The difference between isinstance() and type()
class A:
    pass

class B(A):
    pass

isinstance(A(), A)  # returns True
type(A()) == A      # returns True
isinstance(B(), A)  # returns True
type(B()) == A      # returns False
Here we create a class A and a class B that inherits from A. When comparing A() with A, both isinstance() and type() return True, since the types match exactly. But when comparing B() with A, isinstance() returns True because it takes inheritance into account, while type() does not consider what B() inherits from and returns False. To check whether an object has a given type, isinstance() is the recommended choice.
if isinstance(h, torch.Tensor):
    pass
else:
    pass
class WordEmbeddingDataset(torch.utils.data.Dataset):
    def __init__(self, text, word_to_idx, idx_to_word, word_freqs, word_counts):
        super(WordEmbeddingDataset, self).__init__()
        self.text_encoded = [word_to_idx.get(t, VOCAB_SIZE-1) for t in text]
        self.text_encoded = torch.LongTensor(self.text_encoded)
        self.word_to_idx = word_to_idx
        self.idx_to_word = idx_to_word
        self.word_freqs = torch.Tensor(word_freqs)
        self.word_counts = torch.Tensor(word_counts)

    def __len__(self):
        # total number of items in this dataset
        return len(self.text_encoded)

    def __getitem__(self, idx):  # return the data (tensors) for a given idx
        center_word = self.text_encoded[idx]
        pos_indices = list(range(idx-C, idx)) + list(range(idx+1, idx+1+C))  # indices of the surrounding words
        # keep idx+1+C from exceeding len(self.text_encoded):
        # i % len(self.text_encoded) wraps out-of-range indices back into range
        pos_indices = [i % len(self.text_encoded) for i in pos_indices]
        pos_words = self.text_encoded[pos_indices]  # the surrounding (positive) words we want the model to predict
        # negative sampling with torch.multinomial(); pos_words.shape[0] is the number of positive words
        neg_words = torch.multinomial(self.word_freqs, K*pos_words.shape[0], True)
        return center_word, pos_words, neg_words
dataset = WordEmbeddingDataset(text, word_to_idx, idx_to_word, word_freqs, word_counts)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=0)
# inspect the dataset inside the dataloader
# method 1
next(iter(dataloader))
# method 2
for i, (center_word, pos_words, neg_words) in enumerate(dataloader):
    print(center_word, pos_words, neg_words)
    if i > 5:
        break
input_embedding = self.in_embed(input_labels)  # Batch_size * embed_size
pos_embedding = self.out_embed(pos_labels)     # Batch_size * (2*C) * embed_size
neg_embedding = self.out_embed(neg_labels)     # Batch_size * (2*C*K) * embed_size
input_embedding = input_embedding.unsqueeze(2) # Batch_size * embed_size * 1; unsqueeze(2) adds a third dimension
# torch.bmm keeps the first (batch) dimension and matrix-multiplies the remaining dimensions
pos_dot = torch.bmm(pos_embedding, input_embedding).squeeze()  # originally Batch_size * (2*C) * 1; squeeze() gives B * (2*C)
neg_dot = torch.bmm(neg_embedding, -input_embedding).squeeze() # Batch_size * (2*C*K)
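A standalone sketch of the torch.bmm shape rule used above (the batch dimension is kept, the remaining two dimensions are matrix-multiplied); B, C2, and E are made-up sizes:
import torch

B, C2, E = 4, 6, 8                    # hypothetical Batch_size, 2*C, embed_size
pos = torch.randn(B, C2, E)           # Batch_size * (2*C) * embed_size
inp = torch.randn(B, E).unsqueeze(2)  # Batch_size * embed_size * 1
out = torch.bmm(pos, inp)             # Batch_size * (2*C) * 1
print(out.shape)            # torch.Size([4, 6, 1])
print(out.squeeze().shape)  # torch.Size([4, 6])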
import torch.nn.functional as F
# log-sigmoid: use F.logsigmoid directly; composing it as F.log(F.sigmoid(...)) can cause memory blow-ups and other problems
log_pos = F.logsigmoid(pos_dot).sum(1)
log_neg = F.logsigmoid(neg_dot).sum(1)
import torch.nn.functional as F
import torch.nn as nn
# plain function calls: names start with a lowercase letter
F.tanh()
F.sigmoid()
# adding an activation layer to a network: names start with an uppercase letter
nn.Tanh()
nn.Sigmoid()
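A sketch of where each form typically lives: the nn classes are registered as layers in a container or Module, while the functional forms are called on tensors inside forward() (note that newer PyTorch prefers torch.tanh/torch.sigmoid over F.tanh/F.sigmoid):
import torch
import torch.nn as nn

# class form: an activation layer inside a container
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 2))

# functional form: called directly on a tensor inside forward()
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = nn.Linear(4, 8)
    def forward(self, x):
        return torch.sigmoid(torch.tanh(self.fc(x)))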
Method 1:
# input_datas.xlsx is a spreadsheet of complex numbers
data_input = pd.read_excel(r"E://Datas/input_datas.xlsx")
# print(data_1)
data_input = np.array(data_input)
# data_1 = data_1.reshape(1024,2)
data_input = data_input.tolist()
data_input = np.array(data_input)
data_input = data_input.astype(np.complex128).tolist()  # convert the data type to complex (np.complex is removed in newer NumPy)
data_input = np.array(data_input)
print(data_input.shape)
data_input_r = torch.tensor(np.real(data_input), dtype=torch.float32)  # real part
data_input_i = torch.tensor(np.imag(data_input), dtype=torch.float32)  # imaginary part
# take all rows of data_input_r except the first
new_data_input_r = torch.zeros((46,1024), dtype=torch.float32)
new_data_input_r = data_input_r[1:,:]
# print(data_input_r[46])
# print(new_data_input_r[45])
Method 2:
# training-set input data
data_input = pd.read_excel('/content/drive/My Drive/Colab Notebooks/工作簿6.xlsx')
data_input = np.array(data_input)
data_input = data_input.tolist()
new = list()
for i in range(347):
    for j in range(1024):
        new.append(complex(data_input[i][j]))
data_input = np.array(new).reshape(347,1024)
# two ways to enable gradients
# method 1
x = torch.ones(2, 2, requires_grad=True)
# method 2
x.requires_grad_(True)
# model = torch.nn.Sequential(……)
for params in model.parameters():
    params.requires_grad_(True)
# computing without gradients
with torch.no_grad():
    for param in model.parameters():  # note the parentheses
        param -= learning_rate*param.grad
# method 1
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
………………………………
optimizer.zero_grad()
# method 2
model = torch.nn.Sequential(……)
model.zero_grad()
# method 1
hidden_Layers = 100
NUM_DIGITS = 10
model = torch.nn.Sequential(
    torch.nn.Linear(NUM_DIGITS, hidden_Layers),  # don't omit the comma
    torch.nn.ReLU(),
    torch.nn.Linear(hidden_Layers, 4)
)
loss_fn = torch.nn.CrossEntropyLoss()  # mostly used for classification; Softmax is built in
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
……………………
y_pred = model(input_data)
loss = loss_fn(y_pred, y_label)
optimizer.zero_grad()  # don't forget to zero the gradients
loss.backward()
optimizer.step()
# method 2
class TwoLayerNet(torch.nn.Module):
    def __init__(self, n_features, n_hidden, n_out):  # define the model architecture
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(n_features, n_hidden)  # an extra trailing comma at the end of this line would cause an error
        self.linear2 = torch.nn.Linear(n_hidden, n_out)

    def forward(self, x):
        y_before = F.relu(self.linear1(x))
        y_pred = self.linear2(y_before)
        # y_pred = self.linear2(self.linear1(x).clamp(min=0))
        return y_pred

net = TwoLayerNet(2, 10, 4)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.05)  # net, not model, is the module defined above
……………………
y_pred = net(input_data)
loss = loss_fn(y_pred, y_label)
optimizer.zero_grad()  # don't forget to zero the gradients
loss.backward()
optimizer.step()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
………………
loss_fn = torch.nn.CrossEntropyLoss()  # mostly used for classification; Softmax is built in
loss_fn = torch.nn.MSELoss()
………………
loss = loss_fn(y_pred, y_label)
optimizer.zero_grad()  # don't forget to zero the gradients
loss.backward()
# using the Sequential form as an example
model = torch.nn.Sequential(
    torch.nn.Linear(NUM_DIGITS, hidden_Layers),  # don't omit the comma
    torch.nn.ReLU(),
    torch.nn.Linear(hidden_Layers, 4)
)
print(model)
print(model[0].weight)
# to change the model's default initialization:
torch.nn.init.normal_(model[0].weight)
torch.nn.init.normal_(model[2].weight)
Output:
Sequential(
(0): Linear(in_features=1000, out_features=100, bias=True)
(1): ReLU()
(2): Linear(in_features=100, out_features=10, bias=True)
)
Parameter containing:
tensor([[ 0.6446, 0.6133, -1.2414, ..., 0.7190, 0.1795, -0.1246],
[ 1.5737, -1.2386, -0.7058, ..., 0.8870, 0.0807, 0.4245],
[-0.8080, -2.5309, -0.9246, ..., -0.1821, -0.0434, -0.2618],
...,
[-0.6270, -1.0656, 1.3784, ..., 0.3057, -1.4967, -0.3401],
[ 0.9599, -0.0353, -1.1812, ..., 1.1073, 0.9129, 0.0291],
[-1.3919, -0.1804, 0.0903, ..., 0.5543, 0.3251, 1.8142]],
requires_grad=True)
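For reference, a sketch of other common initializers from torch.nn.init applied to the same Sequential model (the mean/std values here are arbitrary choices):
torch.nn.init.normal_(model[0].weight, mean=0.0, std=0.01)  # normal_ also takes explicit mean and std
torch.nn.init.xavier_uniform_(model[2].weight)  # Xavier/Glorot uniform initialization
torch.nn.init.zeros_(model[0].bias)  # biases are commonly zeroed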
These two statements each have a fixed use case:
model.train()
model.eval()
You will also find that the program runs even without them. The two methods exist for layers that behave differently during training and evaluation, such as Batch Normalization and Dropout. A closer look at Batch Normalization and Dropout:
BN's main role is to normalize each intermediate layer of the network while using a learned transform (the Batch Normalization transform) to ensure the feature distribution each layer extracts is not destroyed.
Training operates on each mini-batch, but testing is often done on a single image, where there is no mini-batch. Since the network's parameters are fixed after training, the per-batch mean and variance no longer change, so the statistics accumulated over all batches (the running mean and variance) are used directly. Hence Batch Normalization behaves differently at training and test time.
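So the usual pattern (a sketch; train_loader, optimizer, loss_fn, and testX are assumed from elsewhere) is to toggle the mode around training and evaluation:
model.train()  # BN uses per-batch statistics; Dropout is active
for x, y in train_loader:
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

model.eval()   # BN uses its running statistics; Dropout is disabled
with torch.no_grad():
    preds = model(testX)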
import torch
torch.save(model.state_dict(), path):
Purpose: save the trained network's per-layer parameters (i.e., weights and biases).
Here model.state_dict() retrieves the per-layer parameters, and path is the file path (the file is usually saved with a .pt or .pth extension).
import torch
# rebuild the model first, e.g. either of:
model2 = Sequential(…………)
model2 = TheModelClass(*args, **kwargs)
model2.load_state_dict(torch.load(PATH))
model2.eval()
# model.eval() must be called after loading the model to put dropout and batch-normalization layers into inference mode; otherwise the results will be wrong.
Purpose: load the per-layer parameters saved at path into the neural network.
Note: you cannot simply call torch.load_state_dict(path); this function does not accept a string argument directly.
torch.save(net, path):
Purpose: save the entire trained network model (not just the weights and biases).
net2 = torch.load(path):
Purpose: load the entire neural network saved at path.
Note: the official docs recommend the first approach (saving the state_dict), naturally because less content is saved and it is faster.
# gamma=0.5: the learning rate is halved every time the scheduler is stepped.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, 0.5)
loss_list = []
for epoch in range(20000):
    for start in range(0, 346, batch_size):
        end = start + batch_size
        batch_input_datas = new_data_input_r[start:end]
        batch_label_datas = new_data_label_r[start:end]
        acc_sum, err_sum = 0.0, 0.0
        new_y_pred_r = model(batch_input_datas)
        loss = loss_fn(new_y_pred_r, batch_label_datas)
        # training accuracy:
    if epoch % 50 == 0:
        #############################################################################
        current_loss = loss.item()
        if len(loss_list) == 0 or current_loss < min(loss_list):  # the loss went down: save the best model
            torch.save(model.state_dict(), 'lm.pth')
            print("best model saved to lm.pth")
        else:  # the model's loss did not go down:
            # learning-rate decay: lower the learning rate.
            # alternatively, call this only after the loss has failed to drop three times
            scheduler.step()  # must be placed after optimizer.step()
        loss_list.append(current_loss)  # record the loss in the list
To be continued…