A study of torch.optim.lr_scheduler.MultiStepLR(): step / staircase learning rate decay

torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma=0.1, last_epoch=-1, verbose=False)

I investigated the last_epoch parameter of MultiStepLR() with my own code and at first concluded it was junk. See the revised conclusions below.

I. Conclusions:

  1.  At first last_epoch looked like a useless knob, but after pointers from readers in the comments I now understand it: last_epoch records how many epochs have already run, so the next milestone minus last_epoch is roughly the number of epochs left before the next decay. A concrete sketch follows this list.
    (Quoting the comment: last_epoch is useful. In short, every learning-rate change is moved earlier by last_epoch epochs. For example, with initial lr=0.1, milestones=[5, 15], gamma=0.5, and last_epoch=0, the lr becomes 0.05 at epoch 5 and 0.025 at epoch 15. With last_epoch=3, the lr becomes 0.05 at epoch 2 and 0.025 at epoch 12. The comment says the default is last_epoch=0; note that the actual default in the signature is -1.)
  2.  Calling scheduler.get_lr() by hand reports the lr multiplied by an extra factor of gamma at each milestone, i.e. gamma squared overall.
  3.  As another reader pointed out, newer PyTorch versions have changed this:
    the new PyTorch no longer exposes get_lr(); get_last_lr() should be used instead, and get_last_lr() does not have the "times gamma squared" problem.
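To make conclusion 1 concrete, here is a minimal sketch of my own (arbitrary model, optimizer, and milestones; assumes a reasonably recent PyTorch) that passes last_epoch to the constructor instead of assigning it afterwards. Note that MultiStepLR requires an 'initial_lr' entry in every param group when last_epoch != -1; load_state_dict would normally provide this when resuming from a checkpoint, so here it is set by hand:

import torch
import torchvision

model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# When last_epoch != -1, the scheduler expects 'initial_lr' in each param
# group (normally restored via load_state_dict when resuming a checkpoint).
for group in optimizer.param_groups:
    group.setdefault('initial_lr', 0.1)

# Pretend 3 epochs have already run: the decays tied to milestones 5 and 15
# now fire correspondingly earlier in the loop below.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[5, 15], gamma=0.5, last_epoch=3)

for epoch in range(13):
    optimizer.step()
    scheduler.step()
    print(epoch, scheduler.get_last_lr())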

II. Experiment code:

1. First, the default configuration:

import torch
import torchvision

learning_rate = 0.1
model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate,
                            momentum=0.9,
                            weight_decay=5e-5)
# decay the lr by gamma=0.1 when the internal epoch counter hits 3 and 6
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6], gamma=0.1)

for epoch in range(9):
    optimizer.step()
    scheduler.step()
    print(epoch, scheduler.get_lr())  # old-style get_lr(); note the quirk below

Output:
0 [0.1]
1 [0.1]
2 [0.0010000000000000002]   # gamma squared shows up here
3 [0.010000000000000002]
4 [0.010000000000000002]
5 [0.00010000000000000003]   # gamma squared shows up here
6 [0.0010000000000000002]
7 [0.0010000000000000002]
8 [0.0010000000000000002]
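Why gamma squared? Inside step(), the scheduler multiplies each param group's lr by gamma at a milestone; MultiStepLR.get_lr() then multiplies group['lr'] by gamma once more whenever last_epoch sits on a milestone, so calling it by hand right after step() doubles the factor. A small check of my own (on recent versions get_lr() also emits a warning when called outside step()):

import torch
import torchvision

model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6], gamma=0.1)

for epoch in range(4):
    optimizer.step()
    scheduler.step()
    # group['lr'] is the value the optimizer actually uses;
    # get_lr() re-applies gamma at milestones, hence the squared factor.
    print(epoch, optimizer.param_groups[0]['lr'], scheduler.get_lr())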

2. Setting last_epoch=-1.

Almost the same as case 1, since last_epoch=-1 is the default. Note, though, that the output below is shifted one epoch later than in case 1: the constructor already performs one internal step() that moves last_epoch from -1 to 0, so assigning scheduler.last_epoch = -1 afterwards rewinds the counter by one.

import torch
import torchvision

learning_rate = 0.1
model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate,
                            momentum=0.9,
                            weight_decay=5e-5)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6], gamma=0.1)
scheduler.last_epoch = -1  # rewind the counter; the constructor's implicit step() had already moved it to 0

for epoch in range(9):
    optimizer.step()
    scheduler.step()
    print(epoch, scheduler.get_lr())

Output:
0 [0.1]
1 [0.1]
2 [0.1]
3 [0.0010000000000000002]
4 [0.010000000000000002]
5 [0.010000000000000002]
6 [0.00010000000000000003]
7 [0.0010000000000000002]
8 [0.0010000000000000002]

3. Setting last_epoch=4.

The learning rate still starts at 0.1, but execution proceeds as if the model had already been updated through epoch 4.

After some thought, my understanding is: assigning last_epoch keeps the learning rate at its initial value while the schedule resumes counting from last_epoch; so when resuming a run this way, you have to manually set learning_rate to the value the model had when it stopped, i.e. the lr corresponding to last_epoch. (See the state_dict sketch after this experiment's output, which avoids the manual bookkeeping.)

import torch
import torchvision

learning_rate = 0.1
model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate,
                            momentum=0.9,
                            weight_decay=5e-5)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6], gamma=0.1)
scheduler.last_epoch = 4  # pretend the model has already trained through epoch 4

for epoch in range(9):
    optimizer.step()
    scheduler.step()
    print(epoch, scheduler.get_lr())

Output:
0 [0.1]  # step() moves the internal counter from 4 to 5; milestone 3 is already behind us
1 [0.0010000000000000002]  # the counter hits milestone 6, so the manual get_lr() shows gamma squared
2 [0.010000000000000002]
3 [0.010000000000000002]
4 [0.010000000000000002]
5 [0.010000000000000002]
6 [0.010000000000000002]
7 [0.010000000000000002]
8 [0.010000000000000002]  # last_epoch=4 already passed milestone 3, leaving only milestone 6, so the lr decayed just once
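As an aside before the next experiment: the less error-prone way to resume is to save and restore state_dicts, so that last_epoch and the current learning rate travel together. A sketch of my own (the checkpoint layout and file name are just examples):

import torch
import torchvision

model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6], gamma=0.1)

# ... train for a while, then save everything:
torch.save({'model': model.state_dict(),
            'optimizer': optimizer.state_dict(),
            'scheduler': scheduler.state_dict()}, 'ckpt.pth')

# ... later, rebuild the same objects and restore; last_epoch and the
# current lr come back together, with no manual bookkeeping:
ckpt = torch.load('ckpt.pth')
model.load_state_dict(ckpt['model'])
optimizer.load_state_dict(ckpt['optimizer'])
scheduler.load_state_dict(ckpt['scheduler'])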

4. Setting last_epoch=4 and calling scheduler.step(epoch) instead of scheduler.step(). Also wrong.

import torch
import torchvision

learning_rate = 0.1
model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate,
                            momentum=0.9,
                            weight_decay=5e-5)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6], gamma=0.1)
scheduler.last_epoch = 4

for epoch in range(9):
    optimizer.step()
    scheduler.step(epoch)  # passing epoch overrides last_epoch (and is deprecated)
    print(epoch, scheduler.get_lr())

Output:
0 [0.0010000000000000002]
1 [0.0010000000000000002]
2 [0.0010000000000000002]
3 [0.00010000000000000003]
4 [0.0010000000000000002]
5 [0.0010000000000000002]
6 [0.00010000000000000003]
7 [0.0010000000000000002]
8 [0.0010000000000000002]

I tried several values of last_epoch and the output above never changes. The reason, as far as I can tell from the scheduler source, is that step(epoch) assigns last_epoch = epoch directly, so the manual setting is simply overwritten; passing epoch to step() is also deprecated in newer PyTorch. In short, never pass epoch to scheduler.step().
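A minimal sketch of what happens (based on my reading of the scheduler source; behavior as on recent PyTorch):

import torch
import torchvision

model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6], gamma=0.1)

scheduler.last_epoch = 4     # manual assignment...
scheduler.step(0)            # ...is discarded: step(epoch) sets last_epoch = epoch
print(scheduler.last_epoch)  # -> 0 (plus a deprecation warning on newer PyTorch)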

5. Following case 3: when last_epoch is already past some milestone values, those milestones are skipped automatically.

import torch
import torchvision

learning_rate = 0.1
model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate,
                            momentum=0.9,
                            weight_decay=5e-5)
# both milestones (5 and 8) now lie ahead of last_epoch=4
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[5, 8], gamma=0.1)
scheduler.last_epoch = 4

for epoch in range(9):
    optimizer.step()
    scheduler.step()
    print(epoch, scheduler.get_lr())

Output:
0 [0.0010000000000000002]
1 [0.010000000000000002]
2 [0.010000000000000002]
3 [0.00010000000000000003]
4 [0.0010000000000000002]
5 [0.0010000000000000002]
6 [0.0010000000000000002]
7 [0.0010000000000000002]
8 [0.0010000000000000002]

Comparing cases 3 and 5: in case 3, last_epoch=4 is already past milestone 3, so milestone 3 is skipped entirely and the learning rate is only adjusted at milestone 6. In case 5, both milestones (5 and 8) lie ahead of last_epoch=4, so both fire.

III. Newer PyTorch versions no longer expose get_lr(); use get_last_lr() instead

1. From the official docs:

get_last_lr():

Return the last computed learning rate by current scheduler.

print_lr(is_verbose, group, lr, epoch=None):  # haven't had time to work out how to use this one yet

Display the current learning rate.

So it seems get_lr() is no longer part of the public API.

2. Example code:

import torch
import torchvision

learning_rate = 0.1
model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate,
                            momentum=0.9,
                            weight_decay=5e-5)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6], gamma=0.1)

for epoch in range(9):
    optimizer.step()
    scheduler.step()
    print(epoch, scheduler.get_last_lr())  # new API: reports the lr actually in effect



Output:
0 [0.1]
1 [0.1]
2 [0.010000000000000002]
3 [0.010000000000000002]
4 [0.010000000000000002]
5 [0.0010000000000000002]
6 [0.0010000000000000002]
7 [0.0010000000000000002]
8 [0.0010000000000000002]

As you can see, the "multiplied by gamma squared" problem is gone.
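Incidentally, a version-proof way to read the current learning rate is straight from the optimizer's param groups, which is exactly what the scheduler mutates. A tiny sketch of my own:

import torch
import torchvision

model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6], gamma=0.1)

scheduler.step()  # one scheduled step...
# ...and the current lr can always be read off the optimizer, since the
# scheduler writes into param_groups; this works on old and new PyTorch.
print(optimizer.param_groups[0]['lr'])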

PS: this was spotted by angleboy8 in the comments.
