Gradient Descent in PyTorch

Contents

Review:

Linear Model:

Divide and Conquer:

The Optimization Problem:

The Difficulty with Gradient Descent:

The Advantage of Gradient Descent:

Deriving the Loss Function:

Code and Results:

Code:

Output:

Stochastic Gradient Descent:

Why Use Stochastic Gradient Descent:

Code:

Output:

Note:


Gradient descent is one of the most commonly used algorithms in deep learning. Let's work through it together below.

Review:

First, let's recall the learning goal from the previous post:

  • Use PyTorch to build a learning system. (What should be the best model for the data?)

The very first step is to choose the best model: which model is best for our data?

Linear Model:

The simplest model is the linear model we all mastered back in middle school.

Its most reduced form is \widehat{y}=x*w

[Figure 1]

In the given input-output pairs, both the input and the output are one-dimensional, so we can plot the data in a two-dimensional coordinate system, as in the figure above. If our model can pass through every data point, we consider it to have no error with respect to the data.

[Figure 2]

In reality, though, we do not know w at the start, so we begin with a random guess: w might be 1, 2, or even 0.5. A wrong guess for w (also called the weight) creates a gap between the predicted and true values: in the figure, the blue and green lines are predictions, and they clearly deviate from the red line in the middle. We want the randomly chosen weight to approach the true weight, i.e. the predicted line should pass through, or come close to, the data points.

We aim to minimize cost=\frac{1}{N}\sum_{n=1}^{N}\left ( \widehat{y}_{n} -y_{n}\right)^{2}, i.e. to get as close as possible to the true line.

Different weights give different costs. As the plot shows, the cost is smallest when the weight w = 2.

[Figure 3]
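The cost curve can be reproduced with a small sweep over candidate weights (a minimal sketch; the data points (1, 2), (2, 4), (3, 6) are taken from the code later in this post):

```python
# Sweep candidate weights and compute the mean squared error for each.
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

def mse(w):
    # cost(w) = (1/N) * sum((x*w - y)^2)
    return sum((x * w - y) ** 2 for x, y in zip(x_data, y_data)) / len(x_data)

ws = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
costs = [mse(w) for w in ws]
best_w = ws[costs.index(min(costs))]
print(best_w)  # 2.0 -- the cost is exactly zero at w = 2
```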

In reality, in most cases the cost is not the perfect quadratic curve we would like. When the dimension is low, a linear search is feasible, but the number of points to search grows exponentially with the dimension: with two weights w1 and w2 we already have to search a plane, and with, say, ten weights the amount of search becomes staggering; the program can hardly run at all.

[Figure 4]

Divide and Conquer:

Suppose we need to search a two-dimensional plane, i.e. two weights w1 and w2, taking 16 values along each direction. Searching point by point would require evaluating 16 × 16 = 256 points. Instead, we can take 16 evenly spaced points out of those 256, 4 evenly spaced values in each direction, and evaluate them one by one; around the point with the smallest cost we then expand another 4 × 4 grid and search again, taking its lowest-cost point as the optimum, with the corresponding w1 and w2 as the best weights. In the figure, the red region is our first search and the green region (only the one on the left) is our second.

[Figure 5]

This way we only need 16 + 16 = 32 evaluations instead of 256.

But this approach fails too, because the cost surface is not necessarily so well-behaved. It might look like this:
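The two-stage search above can be sketched as follows (a toy illustration; the convex cost function with its minimum at (2, 3) is an assumption for demonstration, not from the post):

```python
# Two-stage (coarse-to-fine) grid search over two weights.
def cost(w1, w2):
    # Hypothetical convex cost with its minimum at (2, 3).
    return (w1 - 2) ** 2 + (w2 - 3) ** 2

def grid(lo1, hi1, lo2, hi2, n=4):
    # n evenly spaced samples per axis -> n*n candidate points.
    s1 = [lo1 + i * (hi1 - lo1) / (n - 1) for i in range(n)]
    s2 = [lo2 + i * (hi2 - lo2) / (n - 1) for i in range(n)]
    return [(a, b) for a in s1 for b in s2]

# Stage 1: coarse 4x4 grid over [0,4] x [0,4] (16 evaluations).
coarse = min(grid(0, 4, 0, 4), key=lambda p: cost(*p))
# Stage 2: fine 4x4 grid in a small box around the coarse winner (16 more).
fine = min(grid(coarse[0] - 0.7, coarse[0] + 0.7,
                coarse[1] - 0.7, coarse[1] + 0.7), key=lambda p: cost(*p))
print(fine)  # much closer to (2, 3) than the coarse winner
```

With 32 evaluations total, the refined point lands far closer to the minimum than any point of the coarse grid alone.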

[Figure 6]

With a surface like this, the search for the optimum may fall into a local optimum.

[Figure 7]

Suppose we sample four points, and the third one is the lowest of the four. If we then only search around the third point, we may never find the true minimum, i.e. the optimum: divide and conquer can miss very good target points entirely. Moreover, when the dimension is large enough, i.e. there are many weights, divide and conquer becomes hard to apply at all, and the resulting error grows.

The Optimization Problem:

We call the problem of finding the w that minimizes the cost function an optimization problem.

[Figure 8]

[Figure 9]

Suppose we have a cost function like this one, with the starting point shown in red. What we want is the w value at the lowest point of the curve.

We can make w approach the minimum point through repeated iterations. At each iteration we first have to decide whether to move w to the left or to the right. High-school calculus answers this with the derivative, here the derivative of cost with respect to w. If the derivative at the current point is greater than 0, the curve is rising there, so we should look to the left; if it is less than 0, the curve descends to the right, so we should look to the right.

[Figure 10]

So we subtract the (scaled) derivative from the current weight: a derivative greater than 0 means the current weight lies to the right of the target weight (it is larger), so subtracting moves it left (decreases it); a derivative less than 0 works the same way in reverse. We also need to pay attention to the learning rate \alpha, which must be chosen appropriately. If the learning rate is small, each update changes the weight only a little and more rounds are needed to approach the target weight, but after enough iterations the error between the final and target weights is small. If the learning rate is too large, each iteration changes the weight by a large amount: starting from the right and moving left, the step may be so big that we overshoot past the target weight, and the updated weight may even have a larger loss than the old one, making convergence difficult.

As we can see, each iteration updates the weight in the direction of steepest descent. This is a classic greedy algorithm: we only take the choice that looks best right now. Anyone who has studied algorithms knows that greedy methods are not guaranteed to find the global optimum, and the same holds for gradient descent: we cannot be sure of reaching the global optimum, but we do obtain the best result within a local region.
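The effect of the learning rate can be seen on a toy quadratic cost (a minimal sketch, assuming the hypothetical cost(w) = (w - 2)^2, whose gradient is 2*(w - 2)):

```python
def step(w, lr):
    # One gradient-descent update on cost(w) = (w - 2)^2.
    grad = 2 * (w - 2)
    return w - lr * grad

# Small learning rate: slow but steady approach to the optimum w = 2.
w = 4.0
for _ in range(50):
    w = step(w, 0.01)
print(w)  # creeps toward 2, but is not there yet after 50 steps

# Too-large learning rate: every step overshoots and the error grows.
w = 4.0
for _ in range(10):
    w = step(w, 1.1)
print(w)  # moves ever farther from 2 -- divergence
```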

[Figure 11]

Suppose our cost function looks like this, with the initial weight at the circle. Using gradient descent, we will not reach the lowest point of this curve, only a local optimum.

[Figure 12]

The filled dot is the point we finally reach, which is clearly not our target point.

In optimization, a function like this one (a function with more than one local minimum) is called non-convex.

The Difficulty with Gradient Descent:

It can only find a local optimum; it is not guaranteed to find the global optimum.

The Advantage of Gradient Descent:

If the global optimum is so hard to find, why is gradient descent used so often in deep learning?

For a long time people believed that deep neural networks contain many local minima, but continued research has shown otherwise: local minima are actually rare, and it is hard to get stuck in one. There is, however, a special kind of point called a saddle point, where the gradient equals 0 (for a one-dimensional function, simply a point where the gradient is 0).

[Figure 13]

In the figure, the circled point is a saddle point: its gradient is 0, so by the update formula the weight obviously does not change after an iteration; we are stuck at this point and cannot move on. In a higher-dimensional space, a saddle point is the lowest point of the surface along some directions and the highest along others, rather like the middle of a horse's saddle.

In deep learning, the biggest problem we have to solve is actually not local minima, which are fairly easy to handle, but saddle points: once stuck at a saddle point, it is very hard for the iteration to make progress.
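A one-dimensional analogue makes this concrete: for the toy cost w³ (an assumption for illustration), the gradient 3w² is zero at w = 0 even though w = 0 is neither a minimum nor a maximum, so the update rule never moves:

```python
def grad(w):
    # Gradient of the toy cost w**3; it vanishes at the flat point w = 0.
    return 3 * w ** 2

w = 0.0
for _ in range(100):
    w -= 0.01 * grad(w)  # w - lr * 0 = w: the update is a no-op
print(w)  # 0.0 -- gradient descent is stuck at the zero-gradient point
```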

Deriving the Loss Function:

[Figure 14]

Code and Results:

Code:

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0  # initial guess for the weight

def forward(x):
    return x * w

def cost(xs, ys):
    # Mean squared error over the whole data set.
    cost = 0
    for x, y in zip(xs, ys):
        y_pred = forward(x)
        cost += (y_pred - y) ** 2
    return cost / len(xs)

def gradient(xs, ys):
    # d(cost)/dw averaged over the whole data set.
    grad = 0
    for x, y in zip(xs, ys):
        grad += 2 * x * (x * w - y)
    return grad / len(xs)

print('Predict (before training)', 4, forward(4))
for epoch in range(100):
    cost_val = cost(x_data, y_data)
    grad_val = gradient(x_data, y_data)
    w -= 0.01 * grad_val  # update with learning rate 0.01
    print('Epoch:', epoch, 'w=', w, 'loss=', cost_val)
print('Predict (after training)', 4, forward(4))

Output:

Predict (before training) 4 4.0
Epoch: 0 w= 1.0933333333333333 loss= 4.666666666666667
Epoch: 1 w= 1.1779555555555554 loss= 3.8362074074074086
Epoch: 2 w= 1.2546797037037036 loss= 3.1535329869958857
Epoch: 3 w= 1.3242429313580246 loss= 2.592344272332262
Epoch: 4 w= 1.3873135910979424 loss= 2.1310222071581117
Epoch: 5 w= 1.4444976559288012 loss= 1.7517949663820642
Epoch: 6 w= 1.4963445413754464 loss= 1.440053319920117
Epoch: 7 w= 1.5433523841804047 loss= 1.1837878313441108
Epoch: 8 w= 1.5859728283235668 loss= 0.9731262101573632
Epoch: 9 w= 1.6246153643467005 loss= 0.7999529948031382
Epoch: 10 w= 1.659651263674342 loss= 0.6575969151946154
Epoch: 11 w= 1.6914171457314033 loss= 0.5405738908195378
Epoch: 12 w= 1.7202182121298057 loss= 0.44437576375991855
Epoch: 13 w= 1.7463311789976905 loss= 0.365296627844598
Epoch: 14 w= 1.7700069356245727 loss= 0.3002900634939416
Epoch: 15 w= 1.7914729549662791 loss= 0.2468517784170642
Epoch: 16 w= 1.8109354791694263 loss= 0.2029231330489788
Epoch: 17 w= 1.8285815011136133 loss= 0.16681183417217407
Epoch: 18 w= 1.8445805610096762 loss= 0.1371267415488235
Epoch: 19 w= 1.8590863753154396 loss= 0.11272427607497944
Epoch: 20 w= 1.872238313619332 loss= 0.09266436490145864
Epoch: 21 w= 1.8841627376815275 loss= 0.07617422636521683
Epoch: 22 w= 1.8949742154979183 loss= 0.06261859959338009
Epoch: 23 w= 1.904776622051446 loss= 0.051475271914629306
Epoch: 24 w= 1.9136641373266443 loss= 0.04231496130368814
Epoch: 25 w= 1.9217221511761575 loss= 0.03478477885657844
Epoch: 26 w= 1.9290280837330496 loss= 0.02859463421027894
Epoch: 27 w= 1.9356521292512983 loss= 0.023506060193480772
Epoch: 28 w= 1.9416579305211772 loss= 0.01932302619282764
Epoch: 29 w= 1.9471031903392007 loss= 0.015884386331668398
Epoch: 30 w= 1.952040225907542 loss= 0.01305767153735723
Epoch: 31 w= 1.9565164714895047 loss= 0.010733986344664803
Epoch: 32 w= 1.9605749341504843 loss= 0.008823813841374291
Epoch: 33 w= 1.9642546069631057 loss= 0.007253567147113681
Epoch: 34 w= 1.9675908436465492 loss= 0.005962754575689583
Epoch: 35 w= 1.970615698239538 loss= 0.004901649272531298
Epoch: 36 w= 1.9733582330705144 loss= 0.004029373553099482
Epoch: 37 w= 1.975844797983933 loss= 0.0033123241439168096
Epoch: 38 w= 1.9780992835054327 loss= 0.0027228776607060357
Epoch: 39 w= 1.980143350378259 loss= 0.002238326453885249
Epoch: 40 w= 1.9819966376762883 loss= 0.001840003826269386
Epoch: 41 w= 1.983676951493168 loss= 0.0015125649231412608
Epoch: 42 w= 1.9852004360204722 loss= 0.0012433955919298103
Epoch: 43 w= 1.9865817286585614 loss= 0.0010221264385926248
Epoch: 44 w= 1.987834100650429 loss= 0.0008402333603648631
Epoch: 45 w= 1.9889695845897222 loss= 0.0006907091659248264
Epoch: 46 w= 1.9899990900280147 loss= 0.0005677936325753796
Epoch: 47 w= 1.9909325082920666 loss= 0.0004667516012495216
Epoch: 48 w= 1.9917788075181404 loss= 0.000383690560742734
Epoch: 49 w= 1.9925461188164473 loss= 0.00031541069384432885
Epoch: 50 w= 1.9932418143935788 loss= 0.0002592816085930997
Epoch: 51 w= 1.9938725783835114 loss= 0.0002131410058905752
Epoch: 52 w= 1.994444471067717 loss= 0.00017521137977565514
Epoch: 53 w= 1.9949629871013967 loss= 0.0001440315413480261
Epoch: 54 w= 1.9954331083052663 loss= 0.0001184003283899171
Epoch: 55 w= 1.9958593515301082 loss= 9.733033217332803e-05
Epoch: 56 w= 1.9962458120539648 loss= 8.000985883901657e-05
Epoch: 57 w= 1.9965962029289281 loss= 6.57716599593935e-05
Epoch: 58 w= 1.9969138906555615 loss= 5.406722767150764e-05
Epoch: 59 w= 1.997201927527709 loss= 4.444566413387458e-05
Epoch: 60 w= 1.9974630809584561 loss= 3.65363112808981e-05
Epoch: 61 w= 1.9976998600690001 loss= 3.0034471708953996e-05
Epoch: 62 w= 1.9979145397958935 loss= 2.4689670610172655e-05
Epoch: 63 w= 1.9981091827482769 loss= 2.0296006560253656e-05
Epoch: 64 w= 1.9982856590251044 loss= 1.6684219437262796e-05
Epoch: 65 w= 1.9984456641827613 loss= 1.3715169898293847e-05
Epoch: 66 w= 1.9985907355257035 loss= 1.1274479219506377e-05
Epoch: 67 w= 1.9987222668766378 loss= 9.268123006398985e-06
Epoch: 68 w= 1.9988415219681517 loss= 7.61880902783969e-06
Epoch: 69 w= 1.9989496465844576 loss= 6.262999634617916e-06
Epoch: 70 w= 1.9990476795699081 loss= 5.1484640551938914e-06
Epoch: 71 w= 1.9991365628100501 loss= 4.232266273994499e-06
Epoch: 72 w= 1.999217150281112 loss= 3.479110977946351e-06
Epoch: 73 w= 1.999290216254875 loss= 2.859983851026929e-06
Epoch: 74 w= 1.9993564627377531 loss= 2.3510338359374262e-06
Epoch: 75 w= 1.9994165262155628 loss= 1.932654303533636e-06
Epoch: 76 w= 1.999470983768777 loss= 1.5887277332523938e-06
Epoch: 77 w= 1.9995203586170245 loss= 1.3060048068548734e-06
Epoch: 78 w= 1.9995651251461022 loss= 1.0735939958924364e-06
Epoch: 79 w= 1.9996057134657994 loss= 8.825419799121559e-07
Epoch: 80 w= 1.9996425135423248 loss= 7.254887315754342e-07
Epoch: 81 w= 1.999675878945041 loss= 5.963839812987369e-07
Epoch: 82 w= 1.999706130243504 loss= 4.902541385825727e-07
Epoch: 83 w= 1.9997335580874436 loss= 4.0301069098738336e-07
Epoch: 84 w= 1.9997584259992822 loss= 3.312926995781724e-07
Epoch: 85 w= 1.9997809729060159 loss= 2.723373231729343e-07
Epoch: 86 w= 1.9998014154347876 loss= 2.2387338352920307e-07
Epoch: 87 w= 1.9998199499942075 loss= 1.8403387118941732e-07
Epoch: 88 w= 1.9998367546614149 loss= 1.5128402140063082e-07
Epoch: 89 w= 1.9998519908930161 loss= 1.2436218932547864e-07
Epoch: 90 w= 1.9998658050763347 loss= 1.0223124683409346e-07
Epoch: 91 w= 1.9998783299358769 loss= 8.403862850836479e-08
Epoch: 92 w= 1.9998896858085284 loss= 6.908348768398496e-08
Epoch: 93 w= 1.9998999817997325 loss= 5.678969725349543e-08
Epoch: 94 w= 1.9999093168317574 loss= 4.66836551287917e-08
Epoch: 95 w= 1.9999177805941268 loss= 3.8376039345125727e-08
Epoch: 96 w= 1.9999254544053418 loss= 3.154680994333735e-08
Epoch: 97 w= 1.9999324119941766 loss= 2.593287985380858e-08
Epoch: 98 w= 1.9999387202080534 loss= 2.131797981222471e-08
Epoch: 99 w= 1.9999444396553017 loss= 1.752432687141379e-08
Predict (after training) 4 7.999777758621207

Process finished with exit code 0

In practice, however, the loss curve will not be perfectly smooth; during training it may look like the figure below:

[Figure 15]

Clearly, despite some local fluctuations, the curve converges overall.

In that case we can smooth it using an exponentially weighted average, where C0, C1, C2, ... denote the loss at each round. The figure illustrates the smoothing process.
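One common form of the exponentially weighted average is C'_i = β·C'_{i−1} + (1 − β)·C_i; the following sketch applies it to a noisy loss curve (β = 0.9 and the sample values are assumptions for illustration):

```python
def smooth(losses, beta=0.9):
    # Exponentially weighted moving average of a loss curve.
    smoothed, avg = [], losses[0]
    for c in losses:
        avg = beta * avg + (1 - beta) * c
        smoothed.append(avg)
    return smoothed

noisy = [3.0, 2.5, 2.9, 2.0, 2.4, 1.7, 1.9, 1.3]
print(smooth(noisy))  # same downward trend, smaller jumps between points
```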

[Figure 16]

A curve like this one does not converge: training has clearly failed. There are many possible causes; the most common is a learning rate that is too large. Lower the learning rate and train again.

Stochastic Gradient Descent:

In practice, plain gradient descent is used rather rarely in deep learning; we usually use an extended version of it called stochastic gradient descent (SGD).

[Figure 17]

Gradient descent differentiates the total cost over all n samples with respect to the weight w. Stochastic gradient descent instead randomly picks one of the n samples and differentiates that sample's loss with respect to w to obtain the gradient. Here loss denotes the loss of a single sample and cost the total loss.
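The difference can be sketched directly (a minimal illustration using the same toy data as the code in this post; in expectation, a single-sample gradient equals the full-batch gradient):

```python
import random

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
w = 1.0

def sample_grad(x, y):
    # d(loss)/dw for one sample, where loss = (x*w - y)**2.
    return 2 * x * (x * w - y)

# Batch gradient descent: update from the gradient averaged over all samples.
batch_grad = sum(sample_grad(x, y) for x, y in zip(x_data, y_data)) / len(x_data)

# Stochastic gradient descent: update from one randomly chosen sample.
x, y = random.choice(list(zip(x_data, y_data)))
stochastic_grad = sample_grad(x, y)

print(batch_grad)       # the exact average gradient
print(stochastic_grad)  # a noisy single-sample estimate of it
```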

Why Use Stochastic Gradient Descent:

Stochastic gradient descent is one strategy for solving the saddle-point problem. Our cost function might look like this:

[Figure 18]

If we are sitting at the saddle point, plain gradient descent clearly has a hard time moving. If we use stochastic gradient descent instead, the data carry noise, and the deviations introduced by that noise may help us move; picking individual samples amplifies the effect of the noise, giving us a chance to walk out of the saddle point.

[Figure 19]

Moreover, computing the gradient becomes much simpler with stochastic gradient descent: we no longer need the mean loss over all samples, only the loss of a single sample.

Code:

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0  # initial guess for the weight

def forward(x):
    return x * w

def loss(x, y):
    # Squared error of a single sample.
    y_pred = forward(x)
    return (y_pred - y) ** 2

def gradient(x, y):
    # d(loss)/dw for a single sample.
    return 2 * x * (x * w - y)

print('Predict (before training)', 4, forward(4))
for epoch in range(100):
    for x, y in zip(x_data, y_data):
        grad = gradient(x, y)
        w = w - 0.01 * grad  # update after every single sample
        print('\tgrad:', x, y, grad)
        l = loss(x, y)
        print('Epoch:', epoch, 'w=', w, 'loss=', l)
print('Predict (after training)', 4, forward(4))

Output:

Predict (before training) 4 4.0
	grad: 1.0 2.0 -2.0
Epoch: 0 w= 1.02 loss= 0.9603999999999999
	grad: 2.0 4.0 -7.84
Epoch: 0 w= 1.0984 loss= 3.2515302399999997
	grad: 3.0 6.0 -16.2288
Epoch: 0 w= 1.260688 loss= 4.919240100095999
	grad: 1.0 2.0 -1.478624
Epoch: 1 w= 1.27547424 loss= 0.5249375769035775
	grad: 2.0 4.0 -5.796206079999999
Epoch: 1 w= 1.3334363008 loss= 1.7772286603647518
	grad: 3.0 6.0 -11.998146585599997
Epoch: 1 w= 1.453417766656 loss= 2.688769240265834
...
	grad: 2.0 4.0 -3.4018121652934497e-10
Epoch: 79 w= 1.999999999960879 loss= 6.121788255259634e-21
	gard: 3.0 6.0 -7.041780492045291e-10
Epoch: 79 w= 1.9999999999679208 loss= 9.2616919156479e-21
	gard: 1.0 2.0 -6.415845632545825e-11
Epoch: 80 w= 1.9999999999685623 loss= 9.883315779891823e-22
	gard: 2.0 4.0 -2.5150193039280566e-10
Epoch: 80 w= 1.9999999999710774 loss= 3.3460768947187237e-21
	gard: 3.0 6.0 -5.206075570640678e-10
Epoch: 80 w= 1.9999999999762834 loss= 5.062350511130293e-21
	gard: 1.0 2.0 -4.743316850408519e-11
Epoch: 81 w= 1.9999999999767577 loss= 5.4020436871698675e-22
	gard: 2.0 4.0 -1.8593837580738182e-10
Epoch: 81 w= 1.999999999978617 loss= 1.8289128720347612e-21
	gard: 3.0 6.0 -3.8489211817704927e-10
Epoch: 81 w= 1.999999999982466 loss= 2.7669155644059242e-21
	gard: 1.0 2.0 -3.5067948545020045e-11
Epoch: 82 w= 1.9999999999828166 loss= 2.9526806163710343e-22
	gard: 2.0 4.0 -1.3746692673066718e-10
Epoch: 82 w= 1.9999999999841913 loss= 9.996584262034417e-22
	gard: 3.0 6.0 -2.845563784603655e-10
Epoch: 82 w= 1.9999999999870368 loss= 1.5124150106147723e-21
	gard: 1.0 2.0 -2.5926372160256506e-11
Epoch: 83 w= 1.9999999999872962 loss= 1.613874994621283e-22
	gard: 2.0 4.0 -1.0163070385260653e-10
Epoch: 83 w= 1.9999999999883125 loss= 5.463943486283182e-22
	gard: 3.0 6.0 -2.1037571684701106e-10
Epoch: 83 w= 1.999999999990416 loss= 8.26683933105326e-22
	gard: 1.0 2.0 -1.9167778475548403e-11
Epoch: 84 w= 1.9999999999906077 loss= 8.821463701619896e-23
	gard: 2.0 4.0 -7.51381179497912e-11
Epoch: 84 w= 1.9999999999913591 loss= 2.9865824713989597e-22
	gard: 3.0 6.0 -1.5553425214420713e-10
Epoch: 84 w= 1.9999999999929146 loss= 4.518126871054872e-22
	gard: 1.0 2.0 -1.4170886686315498e-11
Epoch: 85 w= 1.9999999999930562 loss= 4.821606520676571e-23
	gard: 2.0 4.0 -5.555023108172463e-11
Epoch: 85 w= 1.9999999999936118 loss= 1.632375868892772e-22
	gard: 3.0 6.0 -1.1499068364173581e-10
Epoch: 85 w= 1.9999999999947617 loss= 2.469467919185614e-22
	gard: 1.0 2.0 -1.0476508549572827e-11
Epoch: 86 w= 1.9999999999948666 loss= 2.635230090727337e-23
	gard: 2.0 4.0 -4.106759377009439e-11
Epoch: 86 w= 1.9999999999952773 loss= 8.921432311840397e-23
	gard: 3.0 6.0 -8.500933290633839e-11
Epoch: 86 w= 1.9999999999961273 loss= 1.349840097651456e-22
	gard: 1.0 2.0 -7.745359908994942e-12
Epoch: 87 w= 1.9999999999962048 loss= 1.4403439714944095e-23
	gard: 2.0 4.0 -3.036149109902908e-11
Epoch: 87 w= 1.9999999999965083 loss= 4.8766518344147864e-23
	gard: 3.0 6.0 -6.285105769165966e-11
Epoch: 87 w= 1.999999999997137 loss= 7.376551550022107e-23
	gard: 1.0 2.0 -5.726086271806707e-12
Epoch: 88 w= 1.9999999999971942 loss= 7.872264643114844e-24
	gard: 2.0 4.0 -2.2446045022661565e-11
Epoch: 88 w= 1.9999999999974187 loss= 2.6651788942408325e-23
	gard: 3.0 6.0 -4.646416584819235e-11
Epoch: 88 w= 1.9999999999978835 loss= 4.031726170507742e-23
	gard: 1.0 2.0 -4.233058348290797e-12
Epoch: 89 w= 1.9999999999979259 loss= 4.301968193379283e-24
	gard: 2.0 4.0 -1.659294923683774e-11
Epoch: 89 w= 1.9999999999980917 loss= 1.4565692625929953e-23
	gard: 3.0 6.0 -3.4351188560322043e-11
Epoch: 89 w= 1.9999999999984353 loss= 2.2033851437431755e-23
	gard: 1.0 2.0 -3.1294966618133913e-12
Epoch: 90 w= 1.9999999999984666 loss= 2.3514383612198287e-24
	gard: 2.0 4.0 -1.226752033289813e-11
Epoch: 90 w= 1.9999999999985891 loss= 7.96223265163349e-24
	gard: 3.0 6.0 -2.539835008974478e-11
Epoch: 90 w= 1.9999999999988431 loss= 1.2047849775995315e-23
	gard: 1.0 2.0 -2.3137047833188262e-12
Epoch: 91 w= 1.9999999999988662 loss= 1.2854111769494144e-24
	gard: 2.0 4.0 -9.070078021977679e-12
Epoch: 91 w= 1.9999999999989568 loss= 4.352777491689404e-24
	gard: 3.0 6.0 -1.8779644506139448e-11
Epoch: 91 w= 1.9999999999991447 loss= 6.5840863393251405e-24
	gard: 1.0 2.0 -1.7106316363424412e-12
Epoch: 92 w= 1.9999999999991618 loss= 7.026100585915738e-25
	gard: 2.0 4.0 -6.7057470687359455e-12
Epoch: 92 w= 1.9999999999992288 loss= 2.3787566143676324e-24
	gard: 3.0 6.0 -1.3882228699912957e-11
Epoch: 92 w= 1.9999999999993676 loss= 3.5991747246272455e-24
	gard: 1.0 2.0 -1.2647660696529783e-12
Epoch: 93 w= 1.9999999999993803 loss= 3.840609253151823e-25
	gard: 2.0 4.0 -4.957811938766099e-12
Epoch: 93 w= 1.9999999999994298 loss= 1.3005602645580524e-24
	gard: 3.0 6.0 -1.0263789818054647e-11
Epoch: 93 w= 1.9999999999995324 loss= 1.969312363793734e-24
	gard: 1.0 2.0 -9.352518759442319e-13
Epoch: 94 w= 1.9999999999995417 loss= 2.100389491805257e-25
	gard: 2.0 4.0 -3.666400516522117e-12
Epoch: 94 w= 1.9999999999995783 loss= 7.111977463172295e-25
	gard: 3.0 6.0 -7.58859641791787e-12
Epoch: 94 w= 1.9999999999996543 loss= 1.0761829795642296e-24
	gard: 1.0 2.0 -6.914468997365475e-13
Epoch: 95 w= 1.9999999999996612 loss= 1.148125910829028e-25
	gard: 2.0 4.0 -2.7107205369247822e-12
Epoch: 95 w= 1.9999999999996882 loss= 3.887538095365355e-25
	gard: 3.0 6.0 -5.611511255665391e-12
Epoch: 95 w= 1.9999999999997444 loss= 5.875191475205477e-25
	gard: 1.0 2.0 -5.111466805374221e-13
Epoch: 96 w= 1.9999999999997495 loss= 6.273337462679574e-26
	gard: 2.0 4.0 -2.0037305148434825e-12
Epoch: 96 w= 1.9999999999997695 loss= 2.1248836229123696e-25
	gard: 3.0 6.0 -4.1460168631601846e-12
Epoch: 96 w= 1.999999999999811 loss= 3.2110109830478153e-25
	gard: 1.0 2.0 -3.779199175824033e-13
Epoch: 97 w= 1.9999999999998148 loss= 3.429355848699413e-26
	gard: 2.0 4.0 -1.4814816040598089e-12
Epoch: 97 w= 1.9999999999998297 loss= 1.1601954826789095e-25
	gard: 3.0 6.0 -3.064215547965432e-12
Epoch: 97 w= 1.9999999999998603 loss= 1.757455879087579e-25
	gard: 1.0 2.0 -2.793321129956894e-13
Epoch: 98 w= 1.9999999999998632 loss= 1.8708625228221516e-26
	gard: 2.0 4.0 -1.0942358130705543e-12
Epoch: 98 w= 1.999999999999874 loss= 6.340252588964947e-26
	gard: 3.0 6.0 -2.2648549702353193e-12
Epoch: 98 w= 1.9999999999998967 loss= 9.608404711682446e-26
	gard: 1.0 2.0 -2.0650148258027912e-13
Epoch: 99 w= 1.9999999999998987 loss= 1.025203632425227e-26
	gard: 2.0 4.0 -8.100187187665142e-13
Epoch: 99 w= 1.9999999999999067 loss= 3.478876592024662e-26
	gard: 3.0 6.0 -1.6786572132332367e-12
Epoch: 99 w= 1.9999999999999236 loss= 5.250973729513143e-26
Predict(after training) 4 7.9999999999996945

Process finished with exit code 0

As the output shows, the overall loss does not decrease monotonically from step to step: training converges, but with noticeable fluctuation. For each individual sample (x, y), however, the loss does shrink steadily. In other problems, even when the loss is not strictly decreasing, the fluctuation tends to be small, and stochastic gradient descent is often more efficient.
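The per-sample updates in the log above come from a stochastic gradient descent loop of the following shape (a minimal sketch: the toy data y = 2x, the initial weight w = 1.0, and the learning rate 0.01 are assumptions for illustration, not read off the log):

```python
# Stochastic gradient descent sketch for the linear model y_hat = x * w.
# The weight is updated once per sample, not once per epoch, which is
# why the log prints a gradient and a new w for every (x, y) pair.
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
w = 1.0    # initial guess (assumed)
lr = 0.01  # learning rate alpha (assumed)

def loss(x, y):
    # squared error for a single sample
    return (x * w - y) ** 2

def gradient(x, y):
    # d/dw (x*w - y)^2 = 2 * x * (x*w - y)
    return 2 * x * (x * w - y)

for epoch in range(100):
    for x, y in zip(x_data, y_data):
        grad = gradient(x, y)
        w -= lr * grad  # update immediately after seeing each sample
        print('\tgrad:', x, y, grad)
        print('Epoch:', epoch, 'w=', w, 'loss=', loss(x, y))

print('Predict (after training) 4', 4 * w)
```

Because each update uses only one sample's gradient, consecutive updates can pull w in slightly different directions, which is exactly the fluctuation visible in the logged losses.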

Note:

Pytorch之梯度下降算法_第20张图片

Parallel computation works well for (batch) gradient descent, but not for stochastic gradient descent: in SGD, each weight update depends on the update produced by the previous sample, so the per-sample computations cannot be parallelized. As a result, SGD has higher time complexity and runs less efficiently than batch gradient descent. In practice, we therefore choose a compromise between the two.

This compromise is called a batch, i.e., batched stochastic gradient descent: throwing all samples into a single update gives poor model performance, while updating on every sample separately gives poor time complexity.

Pytorch之梯度下降算法_第21张图片

We group a fixed number of samples together and train on one group at a time. The more formal name for this is mini-batch, i.e., training on small batches. Mini-batch training is the mainstream approach today, so when we say "batch" now, we usually mean mini-batch.
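The grouping described above can be sketched as follows (a minimal sketch, assuming the same y = 2x toy data, a batch size of 2, and a learning rate of 0.01; these values are illustrative, not taken from the original code):

```python
# Mini-batch gradient descent sketch: each update averages the gradient
# over one small group of samples -- a compromise between full-batch
# gradient descent (all samples per update) and SGD (one sample per update).
x_data = [1.0, 2.0, 3.0, 4.0]
y_data = [2.0, 4.0, 6.0, 8.0]
w = 1.0          # initial guess (assumed)
lr = 0.01        # learning rate (assumed)
batch_size = 2   # samples per mini-batch (assumed)

for epoch in range(200):
    for i in range(0, len(x_data), batch_size):
        xb = x_data[i:i + batch_size]
        yb = y_data[i:i + batch_size]
        # average gradient of (x*w - y)^2 over the mini-batch
        grad = sum(2 * x * (x * w - y) for x, y in zip(xb, yb)) / len(xb)
        w -= lr * grad  # one update per mini-batch

print('w =', w)  # should approach the true weight 2
```

Within a mini-batch the per-sample gradients are independent and can be computed in parallel, while the sequential updates between batches keep some of SGD's noise and its ability to escape shallow local minima.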
