I won't bother with a screenshot of the error. After some googling I found that it is a known bug in the AdamW optimizer, tracked on PyTorch's GitHub as #52944.
Original issue:
[optim] bugfix when all parameters have no grad
It has already been fixed by someone far more capable upstream; I don't really know how to use git, so I could only edit the source files by hand.
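For context, here is a minimal sketch of how the failure can be triggered — my own assumption based on the PR title ("all parameters have no grad"), not the exact training setup I was running. On affected builds (around PyTorch 1.8.0), step() dies with an UnboundLocalError:

```python
import torch

# Minimal sketch of the failure mode (an assumption based on the PR title):
# every parameter is frozen, so p.grad stays None for all of them, and on an
# affected PyTorch build optimizer.step() raises something like
#   UnboundLocalError: local variable 'beta1' referenced before assignment
model = torch.nn.Linear(4, 2)
for p in model.parameters():
    p.requires_grad_(False)            # nothing will ever receive a gradient

opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
opt.step()                             # crashes before the fix, no-op after it
```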
First, find the AdamW optimizer source code that the error comes from. Only three files need to be changed, and in each one you just move a single line (you can locate the files with the snippet below; the three edits follow):
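Judging by the variable names, the three blocks below look like torch/optim/adadelta.py, torch/optim/adam.py, and torch/optim/adamw.py — that is my reading, so double-check against your own traceback. If you are not sure where those files live on disk, printing the module paths works:

```python
# Print where the optimizer modules sit in this Python environment, so you
# know exactly which files to open (paths in the comments are placeholders).
import torch.optim
print(torch.optim.adadelta.__file__)   # e.g. .../site-packages/torch/optim/adadelta.py
print(torch.optim.adam.__file__)       # e.g. .../site-packages/torch/optim/adam.py
print(torch.optim.adamw.__file__)      # e.g. .../site-packages/torch/optim/adamw.py
```

First file: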
52  for group in self.param_groups:
53      params_with_grad = []
54      grads = []
55      square_avgs = []
56      acc_deltas = []
57      lr, rho, eps, weight_decay = group['lr'], group['rho'], group['eps'], group['weight_decay']  # changed bug 52944 ++ add this line
...
78          #lr, rho, eps, weight_decay = group['lr'], group['rho'], group['eps'], group['weight_decay']  # changed bug 52944 -- comment out this line
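The reason moving the line works: in the old layout the hyperparameters are only bound inside the per-parameter loop, so when every parameter is skipped (no grads at all) they stay undefined, and the functional call after the loop blows up. A stripped-down sketch of the pattern — not the real torch/optim code, just the shape of the edit:

```python
group = {'lr': 1.0, 'rho': 0.9, 'eps': 1e-6, 'weight_decay': 0.0}
grads = [None, None]   # stand-in for "no parameter has a gradient"

def step_buggy():
    for g in grads:
        if g is None:
            continue
        # only bound when at least one gradient exists
        lr, rho, eps, weight_decay = group['lr'], group['rho'], group['eps'], group['weight_decay']
    return lr          # UnboundLocalError if the loop body never ran

def step_fixed():
    # hoisted above the loop, exactly like the "++ add this line" edit
    lr, rho, eps, weight_decay = group['lr'], group['rho'], group['eps'], group['weight_decay']
    for g in grads:
        if g is None:
            continue
    return lr          # always defined, even with no gradients at all

print(step_fixed())    # 1.0
try:
    step_buggy()
except UnboundLocalError as e:
    print(e)           # local variable 'lr' referenced before assignment
```

The second file gets the exact same treatment: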
68  for group in self.param_groups:
69      params_with_grad = []
70      grads = []
71      exp_avgs = []
72      exp_avg_sqs = []
73      state_sums = []
74      max_exp_avg_sqs = []
75      state_steps = []
76      beta1, beta2 = group['betas']  # changed bug 52944 ++ add this line
...
108             #beta1, beta2 = group['betas']  # changed bug 52944 -- comment out this line
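Same story in the third file; this one already reads amsgrad before the loop, so only the betas line has to move: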
67  for group in self.param_groups:
68      params_with_grad = []
69      grads = []
70      exp_avgs = []
71      exp_avg_sqs = []
72      state_sums = []
73      max_exp_avg_sqs = []
74      state_steps = []
75      amsgrad = group['amsgrad']
76      beta1, beta2 = group['betas']  # changed bug 52944 ++ add this line
...
105             #beta1, beta2 = group['betas']  # changed bug 52944 -- comment out this line
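After saving the three files (and restarting the Python process so the patched modules are re-imported), a quick sanity check that the no-grad case no longer crashes:

```python
import torch

# A parameter that never receives a gradient: before the patch this step()
# raised UnboundLocalError, after it the step is just a silent no-op.
w = torch.zeros(3, requires_grad=True)
opt = torch.optim.AdamW([w], lr=1e-3)
opt.step()
print("step() ran fine with no gradients")
```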