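Tutorial video: https://www.bilibili.com/video/av93365242
This tutorial video covers the backpropagation process.
For the linear model y=w*x, backpropagation can be implemented in PyTorch as follows: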
import numpy as np
import matplotlib.pyplot as plt
import torch

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = torch.Tensor([1.0])  # initial weight
w.requires_grad = True   # track gradients for w (off by default)

def forward(x):
    return x * w

def loss(x, y):  # builds the computational graph
    y_pred = forward(x)
    return (y_pred - y) ** 2

print('Predict (befortraining)', 4, forward(4))

for epoch in range(100):
    l = loss(1, 2)  # only defines l before the inner loop so it can be printed later; the value is not used
    for x, y in zip(x_data, y_data):
        l = loss(x, y)
        l.backward()  # backpropagate: fills w.grad with d(loss)/dw
        print('\tgrad:', x, y, w.grad.item())
        w.data = w.data - 0.01 * w.grad.data  # w.grad is itself a tensor, so update through .data to stay out of the graph
        w.grad.data.zero_()  # reset the gradient; otherwise backward() accumulates into it
    print('Epoch:', epoch, l.item())

print('Predict(after training)', 4, forward(4).item())
Output:
Predict (befortraining) 4 tensor([4.], grad_fn=<MulBackward0>)
grad: 1.0 2.0 -2.0
grad: 2.0 4.0 -7.840000152587891
grad: 3.0 6.0 -16.228801727294922
Epoch: 0 7.315943717956543
grad: 1.0 2.0 -1.478623867034912
grad: 2.0 4.0 -5.796205520629883
grad: 3.0 6.0 -11.998146057128906
Epoch: 1 3.9987640380859375
grad: 1.0 2.0 -1.0931644439697266
grad: 2.0 4.0 -4.285204887390137
grad: 3.0 6.0 -8.870372772216797
Epoch: 2 2.1856532096862793
grad: 1.0 2.0 -0.8081896305084229
grad: 2.0 4.0 -3.1681032180786133
grad: 3.0 6.0 -6.557973861694336
Epoch: 3 1.1946394443511963
grad: 1.0 2.0 -0.5975041389465332
grad: 2.0 4.0 -2.3422164916992188
grad: 3.0 6.0 -4.848389625549316
Epoch: 4 0.6529689431190491
grad: 1.0 2.0 -0.4417421817779541
grad: 2.0 4.0 -1.7316293716430664
grad: 3.0 6.0 -3.58447265625
Epoch: 5 0.35690122842788696
grad: 1.0 2.0 -0.3265852928161621
grad: 2.0 4.0 -1.2802143096923828
grad: 3.0 6.0 -2.650045394897461
Epoch: 6 0.195076122879982
grad: 1.0 2.0 -0.24144840240478516
grad: 2.0 4.0 -0.9464778900146484
grad: 3.0 6.0 -1.9592113494873047
Epoch: 7 0.10662525147199631
grad: 1.0 2.0 -0.17850565910339355
grad: 2.0 4.0 -0.699742317199707
grad: 3.0 6.0 -1.4484672546386719
Epoch: 8 0.0582793727517128
grad: 1.0 2.0 -0.1319713592529297
grad: 2.0 4.0 -0.5173273086547852
grad: 3.0 6.0 -1.070866584777832
Epoch: 9 0.03185431286692619
grad: 1.0 2.0 -0.09756779670715332
grad: 2.0 4.0 -0.3824653625488281
grad: 3.0 6.0 -0.7917022705078125
Epoch: 10 0.017410902306437492
grad: 1.0 2.0 -0.07213282585144043
grad: 2.0 4.0 -0.2827606201171875
grad: 3.0 6.0 -0.5853137969970703
……
Epoch: 90 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
Epoch: 91 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
Epoch: 92 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
Epoch: 93 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
Epoch: 94 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
Epoch: 95 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
Epoch: 96 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
Epoch: 97 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
Epoch: 98 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
Epoch: 99 9.094947017729282e-13
Predict(after training) 4 7.999998569488525
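The first three grad lines of this log can be checked by hand. The sketch below does not use PyTorch: it applies the analytic gradient d(loss)/dw = 2x(wx - y) with the same per-sample update (learning rate 0.01, initial w = 1.0) as the loop above. The function name first_epoch_grads is a hypothetical helper, and the arithmetic runs in Python floats rather than float32 tensors, so the printed values should agree with the log only up to rounding.

```python
def first_epoch_grads(x_data, y_data, w=1.0, lr=0.01):
    """Per-sample gradients d(loss)/dw for one SGD pass over y = w*x."""
    grads = []
    for x, y in zip(x_data, y_data):
        grad = 2 * x * (w * x - y)  # analytic gradient of (w*x - y)**2 w.r.t. w
        grads.append(grad)
        w -= lr * grad              # same update rule as the PyTorch loop
    return grads, w

grads, w_after = first_epoch_grads([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
for x, g in zip([1.0, 2.0, 3.0], grads):
    print('grad:', x, g)
```

Matching these values against the log confirms that w is updated after every sample, not once per epoch.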
1. Manually derive the backpropagation process for the linear model y=w*x with loss function loss=(ŷ-y)², for the data point x=2, y=4.
Answer:
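One possible derivation is sketched below. The exercise does not state a value for w, so w = 1 is an assumption here (it is the initial weight used in the code above); r is a helper symbol introduced for the residual ŷ - y.

```latex
\begin{aligned}
\text{Forward:}\quad & \hat{y} = w x, \qquad r = \hat{y} - y, \qquad \mathrm{loss} = r^2 \\
\text{Backward:}\quad & \frac{\partial\,\mathrm{loss}}{\partial r} = 2r, \qquad
  \frac{\partial r}{\partial \hat{y}} = 1, \qquad
  \frac{\partial \hat{y}}{\partial w} = x \\
& \frac{\partial\,\mathrm{loss}}{\partial w}
  = \frac{\partial\,\mathrm{loss}}{\partial r}\cdot\frac{\partial r}{\partial \hat{y}}\cdot\frac{\partial \hat{y}}{\partial w}
  = 2x\,(wx - y) \\
\text{With } x = 2,\ y = 4,\ w = 1 \text{ (assumed):}\quad
  & \hat{y} = 2, \quad r = -2, \quad \mathrm{loss} = 4, \quad
  \frac{\partial\,\mathrm{loss}}{\partial w} = 2\cdot 2\cdot(-2) = -8
\end{aligned}
```

With the learning rate 0.01 from the code, one update step would then give w = 1 - 0.01·(-8) = 1.08.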
2. Manually derive the backpropagation process for the linear model y=w*x+b with loss function loss=(ŷ-y)², for the data point x=1, y=2.
Answer:
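A sketch of the derivation follows. The exercise gives no values for w and b, so the numbers w = 1, b = 2 in the last line are purely illustrative assumptions; r again denotes the residual ŷ - y.

```latex
\begin{aligned}
\text{Forward:}\quad & \hat{y} = w x + b, \qquad r = \hat{y} - y, \qquad \mathrm{loss} = r^2 \\
\text{Backward:}\quad & \frac{\partial\,\mathrm{loss}}{\partial r} = 2r, \qquad
  \frac{\partial r}{\partial \hat{y}} = 1, \qquad
  \frac{\partial \hat{y}}{\partial w} = x, \qquad
  \frac{\partial \hat{y}}{\partial b} = 1 \\
& \frac{\partial\,\mathrm{loss}}{\partial w} = 2x\,(wx + b - y), \qquad
  \frac{\partial\,\mathrm{loss}}{\partial b} = 2\,(wx + b - y) \\
\text{With } x = 1,\ y = 2:\quad
  & \frac{\partial\,\mathrm{loss}}{\partial w} = \frac{\partial\,\mathrm{loss}}{\partial b} = 2\,(w + b - 2) \\
\text{E.g. } w = 1,\ b = 2 \text{ (assumed):}\quad
  & \hat{y} = 3, \quad r = 1, \quad \mathrm{loss} = 1, \quad
  \frac{\partial\,\mathrm{loss}}{\partial w} = \frac{\partial\,\mathrm{loss}}{\partial b} = 2
\end{aligned}
```

Because x = 1 for this data point, the two gradients coincide; for any other x they differ by the factor x.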
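3. Draw the computational graph for the quadratic model y=w1x²+w2x+b with loss function loss=(ŷ-y)², manually derive the backpropagation process, and finally implement it in PyTorch.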
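The computational graph can be written as a chain of intermediate nodes, which is sketched below together with the resulting gradients. The node names u and r are helper symbols introduced here, not part of the exercise. The numeric check uses the initial values w1 = w2 = b = 1 from the code that follows, at the first data point x = 1, y = 2.

```latex
\begin{aligned}
\text{Graph (forward):}\quad & u = x^2, \qquad \hat{y} = w_1 u + w_2 x + b, \qquad
  r = \hat{y} - y, \qquad \mathrm{loss} = r^2 \\
\text{Backward:}\quad & \frac{\partial\,\mathrm{loss}}{\partial r} = 2r, \qquad
  \frac{\partial r}{\partial \hat{y}} = 1, \qquad
  \frac{\partial \hat{y}}{\partial w_1} = x^2, \qquad
  \frac{\partial \hat{y}}{\partial w_2} = x, \qquad
  \frac{\partial \hat{y}}{\partial b} = 1 \\
& \frac{\partial\,\mathrm{loss}}{\partial w_1} = 2\,(\hat{y} - y)\,x^2, \qquad
  \frac{\partial\,\mathrm{loss}}{\partial w_2} = 2\,(\hat{y} - y)\,x, \qquad
  \frac{\partial\,\mathrm{loss}}{\partial b} = 2\,(\hat{y} - y) \\
\text{With } x = 1,\ y = 2,\ w_1 = w_2 = b = 1:\quad
  & \hat{y} = 3, \quad r = 1, \quad
  \frac{\partial\,\mathrm{loss}}{\partial w_1} = \frac{\partial\,\mathrm{loss}}{\partial w_2} = \frac{\partial\,\mathrm{loss}}{\partial b} = 2
\end{aligned}
```

These formulas imply that for every sample the three gradients stand in the ratio x² : x : 1, which is a quick way to sanity-check the grad lines printed by the implementation.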
import numpy as np
import matplotlib.pyplot as plt
import torch

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w1 = torch.Tensor([1.0])  # initial weight
w1.requires_grad = True   # track gradients (off by default)
w2 = torch.Tensor([1.0])
w2.requires_grad = True
b = torch.Tensor([1.0])
b.requires_grad = True

def forward(x):
    return w1 * x**2 + w2 * x + b

def loss(x, y):  # builds the computational graph
    y_pred = forward(x)
    return (y_pred - y) ** 2

print('Predict (befortraining)', 4, forward(4))

for epoch in range(100):
    l = loss(1, 2)  # only defines l before the inner loop so it can be printed later; the value is not used
    for x, y in zip(x_data, y_data):
        l = loss(x, y)
        l.backward()  # fills w1.grad, w2.grad and b.grad
        print('\tgrad:', x, y, w1.grad.item(), w2.grad.item(), b.grad.item())
        w1.data = w1.data - 0.01 * w1.grad.data  # each grad is itself a tensor, so update through .data
        w2.data = w2.data - 0.01 * w2.grad.data
        b.data = b.data - 0.01 * b.grad.data
        w1.grad.data.zero_()  # reset the gradients; otherwise backward() accumulates into them
        w2.grad.data.zero_()
        b.grad.data.zero_()
    print('Epoch:', epoch, l.item())

print('Predict(after training)', 4, forward(4).item())
Output:
Predict (befortraining) 4 tensor([21.], grad_fn=<AddBackward0>)
grad: 1.0 2.0 2.0 2.0 2.0
grad: 2.0 4.0 22.880001068115234 11.440000534057617 5.720000267028809
grad: 3.0 6.0 77.04720306396484 25.682401657104492 8.560800552368164
Epoch: 0 18.321826934814453
grad: 1.0 2.0 -1.1466078758239746 -1.1466078758239746 -1.1466078758239746
grad: 2.0 4.0 -15.536651611328125 -7.7683258056640625 -3.8841629028320312
grad: 3.0 6.0 -30.432214736938477 -10.144071578979492 -3.381357192993164
Epoch: 1 2.858394145965576
grad: 1.0 2.0 0.3451242446899414 0.3451242446899414 0.3451242446899414
grad: 2.0 4.0 2.4273414611816406 1.2136707305908203 0.6068353652954102
grad: 3.0 6.0 19.449920654296875 6.483306884765625 2.161102294921875
Epoch: 2 1.1675907373428345
grad: 1.0 2.0 -0.32242679595947266 -0.32242679595947266 -0.32242679595947266
grad: 2.0 4.0 -5.845773696899414 -2.922886848449707 -1.4614434242248535
grad: 3.0 6.0 -3.8828859329223633 -1.294295310974121 -0.43143177032470703
Epoch: 3 0.04653334245085716
grad: 1.0 2.0 0.01369333267211914 0.01369333267211914 0.01369333267211914
grad: 2.0 4.0 -1.9140911102294922 -0.9570455551147461 -0.47852277755737305
grad: 3.0 6.0 6.855700492858887 2.285233497619629 0.761744499206543
Epoch: 4 0.14506366848945618
grad: 1.0 2.0 -0.11818885803222656 -0.11818885803222656 -0.11818885803222656
grad: 2.0 4.0 -3.664388656616211 -1.8321943283081055 -0.9160971641540527
grad: 3.0 6.0 1.7454700469970703 0.5818233489990234 0.1939411163330078
Epoch: 5 0.009403289295732975
grad: 1.0 2.0 -0.03326845169067383 -0.03326845169067383 -0.03326845169067383
grad: 2.0 4.0 -2.7738723754882812 -1.3869361877441406 -0.6934680938720703
grad: 3.0 6.0 4.014009475708008 1.338003158569336 0.4460010528564453
Epoch: 6 0.04972923547029495
grad: 1.0 2.0 -0.050147056579589844 -0.050147056579589844 -0.050147056579589844
grad: 2.0 4.0 -3.1150074005126953 -1.5575037002563477 -0.7787518501281738
grad: 3.0 6.0 2.8533897399902344 0.9511299133300781 0.3170433044433594
Epoch: 7 0.025129113346338272
grad: 1.0 2.0 -0.020544052124023438 -0.020544052124023438 -0.020544052124023438
grad: 2.0 4.0 -2.8858280181884766 -1.4429140090942383 -0.7214570045471191
grad: 3.0 6.0 3.292379379272461 1.0974597930908203 0.36581993103027344
Epoch: 8 0.03345605731010437
grad: 1.0 2.0 -0.013420581817626953 -0.013420581817626953 -0.013420581817626953
grad: 2.0 4.0 -2.9246826171875 -1.46234130859375 -0.731170654296875
grad: 3.0 6.0 2.990907669067383 0.9969692230224609 0.3323230743408203
Epoch: 9 0.027609655633568764
grad: 1.0 2.0 0.0033445358276367188 0.0033445358276367188 0.0033445358276367188
grad: 2.0 4.0 -2.841381072998047 -1.4206905364990234 -0.7103452682495117
grad: 3.0 6.0 3.0377025604248047 1.0125675201416016 0.3375225067138672
Epoch: 10 0.02848036028444767
grad: 1.0 2.0 0.014836311340332031 0.014836311340332031 0.014836311340332031
grad: 2.0 4.0 -2.8173885345458984 -1.4086942672729492 -0.7043471336364746
grad: 3.0 6.0 2.9260196685791016 0.9753398895263672 0.32511329650878906
……
Epoch: 90 0.0065572685562074184
grad: 1.0 2.0 0.314302921295166 0.314302921295166 0.314302921295166
grad: 2.0 4.0 -1.7493114471435547 -0.8746557235717773 -0.43732786178588867
grad: 3.0 6.0 1.4542293548583984 0.4847431182861328 0.16158103942871094
Epoch: 91 0.0065271081402897835
grad: 1.0 2.0 0.31465959548950195 0.31465959548950195 0.31465959548950195
grad: 2.0 4.0 -1.7466468811035156 -0.8733234405517578 -0.4366617202758789
grad: 3.0 6.0 1.4509849548339844 0.4836616516113281 0.16122055053710938
Epoch: 92 0.00649801641702652
grad: 1.0 2.0 0.31499528884887695 0.31499528884887695 0.31499528884887695
grad: 2.0 4.0 -1.7440509796142578 -0.8720254898071289 -0.43601274490356445
grad: 3.0 6.0 1.4478435516357422 0.48261451721191406 0.1608715057373047
Epoch: 93 0.0064699104987084866
grad: 1.0 2.0 0.3153109550476074 0.3153109550476074 0.3153109550476074
grad: 2.0 4.0 -1.7415199279785156 -0.8707599639892578 -0.4353799819946289
grad: 3.0 6.0 1.4447879791259766 0.4815959930419922 0.16053199768066406
Epoch: 94 0.006442630663514137
grad: 1.0 2.0 0.31560707092285156 0.31560707092285156 0.31560707092285156
grad: 2.0 4.0 -1.7390518188476562 -0.8695259094238281 -0.43476295471191406
grad: 3.0 6.0 1.4418182373046875 0.4806060791015625 0.1602020263671875
Epoch: 95 0.006416172254830599
grad: 1.0 2.0 0.3158855438232422 0.3158855438232422 0.3158855438232422
grad: 2.0 4.0 -1.7366409301757812 -0.8683204650878906 -0.4341602325439453
grad: 3.0 6.0 1.4389429092407227 0.4796476364135742 0.1598825454711914
Epoch: 96 0.006390606984496117
grad: 1.0 2.0 0.3161449432373047 0.3161449432373047 0.3161449432373047
grad: 2.0 4.0 -1.7342891693115234 -0.8671445846557617 -0.43357229232788086
grad: 3.0 6.0 1.436136245727539 0.4787120819091797 0.15957069396972656
Epoch: 97 0.0063657015562057495
grad: 1.0 2.0 0.3163881301879883 0.3163881301879883 0.3163881301879883
grad: 2.0 4.0 -1.7319889068603516 -0.8659944534301758 -0.4329972267150879
grad: 3.0 6.0 1.4334239959716797 0.47780799865722656 0.1592693328857422
Epoch: 98 0.0063416799530386925
grad: 1.0 2.0 0.31661415100097656 0.31661415100097656 0.31661415100097656
grad: 2.0 4.0 -1.7297439575195312 -0.8648719787597656 -0.4324359893798828
grad: 3.0 6.0 1.4307546615600586 0.47691822052001953 0.15897274017333984
Epoch: 99 0.00631808303296566
Predict(after training) 4 8.544171333312988
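Summary and takeaways:
After 100 epochs of training with the model y=w1x²+w2x+b, the prediction at x=4 is about 8.54, which is noticeably off from the correct value 8. A likely reason is that the data set comes from a linear function while the model is quadratic, so the model is a poor match for this data set, and that mismatch shows up as the gap between the prediction and the correct value. Note that the quadratic model can still represent the data exactly (w1 = 0, w2 = 2, b = 0), so the gap may also mean that 100 epochs at learning rate 0.01 are not enough for the extra parameters to settle; the logged loss is still decreasing slowly at epoch 99.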
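One way to probe this explanation is to rerun the same training for more epochs and watch the prediction at x=4. The sketch below is a plain-Python re-implementation, not the PyTorch code above: it uses the hand-derived gradients 2(ŷ-y)x², 2(ŷ-y)x and 2(ŷ-y) with the same per-sample update, learning rate and initial values. The names train_quadratic and predict are hypothetical helpers, the epoch counts are arbitrary choices, and the arithmetic is in Python floats rather than float32 tensors, so results should match the log only up to rounding.

```python
def train_quadratic(epochs, lr=0.01):
    """Per-sample SGD on y = w1*x**2 + w2*x + b with squared-error loss."""
    x_data = [1.0, 2.0, 3.0]
    y_data = [2.0, 4.0, 6.0]
    w1 = w2 = b = 1.0  # same initial values as the PyTorch version
    for _ in range(epochs):
        for x, y in zip(x_data, y_data):
            r = w1 * x**2 + w2 * x + b - y  # residual y_pred - y
            g1, g2, gb = 2 * r * x**2, 2 * r * x, 2 * r  # hand-derived gradients
            w1 -= lr * g1
            w2 -= lr * g2
            b -= lr * gb
    return w1, w2, b

def predict(params, x):
    w1, w2, b = params
    return w1 * x**2 + w2 * x + b

# Compare the prediction at x=4 for increasing amounts of training.
for epochs in (100, 1000, 10000):
    params = train_quadratic(epochs)
    print(epochs, params, predict(params, 4.0))
```

If the prediction moves toward 8 and w1 toward 0 as the epoch count grows, the gap is mostly a matter of training time; if it stalls, the model mismatch explanation carries more weight.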