1. RuntimeError: "exp" not implemented for 'torch.LongTensor'
class PositionalEncoding(nn.Module)
div_term = torch.exp(torch.arange(0., d_model, 2) * -(math.log(10000.0) / d_model))
Change the "0" to "0." so that torch.arange produces a FloatTensor.
Otherwise it raises: RuntimeError: "exp" not implemented for 'torch.LongTensor'
(A sketch of PositionalEncoding.__init__ with both float-literal fixes appears after item 2.)
2. RuntimeError: expected type torch.FloatTensor but got torch.LongTensor
class PositionalEncoding(nn.Module)
position = torch.arange(0., max_len).unsqueeze(1)
Change the "0" to "0." here as well.
Otherwise it fails at:
pe[:, 0::2] = torch.sin(position * div_term)
RuntimeError: expected type torch.FloatTensor but got torch.LongTensor
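Both float-literal fixes (items 1 and 2) land in PositionalEncoding.__init__. Below is a minimal sketch of that constructor with the two corrected lines in context; it follows the layout of the original notebook but is an illustration, not a verbatim copy:

import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    "Sinusoidal positional encoding (sketch with the two float-literal fixes)."
    def __init__(self, d_model, dropout, max_len=5000):
        super(PositionalEncoding, self).__init__()
        self.dropout = nn.Dropout(p=dropout)
        pe = torch.zeros(max_len, d_model)
        # Fix 2: "0." makes position a FloatTensor instead of a LongTensor
        position = torch.arange(0., max_len).unsqueeze(1)
        # Fix 1: "0." keeps the argument to torch.exp floating point
        div_term = torch.exp(torch.arange(0., d_model, 2) *
                             -(math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer('pe', pe.unsqueeze(0))

    def forward(self, x):
        # Add the (non-trainable) positional encodings to the embeddings
        x = x + self.pe[:, :x.size(1)]
        return self.dropout(x)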
3. UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
def make_model
nn.init.xavier_uniform_(p)
Change "nn.init.xavier_uniform(p)" to "nn.init.xavier_uniform_(p)" (the in-place variant).
Otherwise it warns: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
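For reference, the parameter-initialization loop at the end of make_model looks roughly like the sketch below; init_parameters is a hypothetical helper used here only to isolate that loop:

import torch.nn as nn

def init_parameters(model):
    # Glorot/Xavier init for every parameter with more than one dimension;
    # note the trailing underscore: xavier_uniform_ is the in-place version.
    for p in model.parameters():
        if p.dim() > 1:
            nn.init.xavier_uniform_(p)
    return model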
4. UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
class LabelSmoothing
self.criterion = nn.KLDivLoss(reduction='sum')
Change "self.criterion = nn.KLDivLoss(size_average=False)" to "self.criterion = nn.KLDivLoss(reduction='sum')".
Otherwise it warns: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
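A minimal sketch of the LabelSmoothing constructor with the updated criterion; only the criterion line changes, and the remaining fields are assumed to follow the original notebook:

import torch.nn as nn

class LabelSmoothing(nn.Module):
    "Label smoothing (sketch): only the criterion line differs from the original."
    def __init__(self, size, padding_idx, smoothing=0.0):
        super(LabelSmoothing, self).__init__()
        # reduction='sum' replaces the deprecated size_average=False
        self.criterion = nn.KLDivLoss(reduction='sum')
        self.padding_idx = padding_idx
        self.confidence = 1.0 - smoothing
        self.smoothing = smoothing
        self.size = size
        self.true_dist = None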
5. IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
class SimpleLossCompute
return loss.item() * norm
Change "loss.data[0]" to "loss.item()".
Otherwise it raises: IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
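The reason is that recent PyTorch losses return 0-dimensional tensors, which can no longer be indexed. A tiny standalone demonstration (not from the notebook):

import torch

loss = torch.tensor(3.14)   # a 0-dim tensor, like the output of a reduced loss
print(loss.item())          # 3.14 -- the supported way to get a Python number
# loss.data[0]              # would raise: IndexError: invalid index of a 0-dim tensor

The full __call__ method, with this fix and the one from item 7, is sketched under item 7.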
6. floating point exception (core dumped)
Running "A First Example" directly crashes with: floating point exception (core dumped)
Fix: https://github.com/harvardnlp/annotated-transformer/issues/26
Modify the run_epoch function so that the loss and token counters are converted to numpy values, either via .detach().numpy() or plain .numpy().
The following code has been tested and works:
def run_epoch(data_iter, model, loss_compute):
    "Standard Training and Logging Function"
    start = time.time()
    total_tokens = 0
    total_loss = 0
    tokens = 0
    for i, batch in enumerate(data_iter):
        out = model.forward(batch.src, batch.trg,
                            batch.src_mask, batch.trg_mask)
        loss = loss_compute(out, batch.trg_y, batch.ntokens)
        total_loss += loss.detach().numpy()
        total_tokens += batch.ntokens.numpy()
        tokens += batch.ntokens.numpy()
        if i % 50 == 1:
            elapsed = time.time() - start
            print("Epoch Step: %d Loss: %f Tokens per Sec: %f" %
                  (i, loss.detach().numpy() / batch.ntokens.numpy(), tokens / elapsed))
            start = time.time()
            tokens = 0
    return total_loss / total_tokens
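Note: this assumes CPU tensors. If the model and batches live on the GPU, .numpy() cannot be called on a CUDA tensor directly, so use .detach().cpu().numpy() (or accumulate with .item()) instead.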
7. All reported loss values are integers
class SimpleLossCompute
When running "A First Example", all printed loss values come out as integers, which looks odd. On inspection, the problem is the return value of class SimpleLossCompute: norm is an integer tensor, and although loss.item() is a Python float, the result of return loss.item() * norm is still an integer tensor.
Fix: cast norm to float before the multiplication:
return loss.item() * norm.float()
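Putting fixes 5 and 7 together, the __call__ method of SimpleLossCompute ends up roughly as sketched below; the loss/backward/optimizer part is assumed to follow the original notebook:

class SimpleLossCompute:
    "A simple loss compute and train function (sketch with fixes 5 and 7)."
    def __init__(self, generator, criterion, opt=None):
        self.generator = generator
        self.criterion = criterion
        self.opt = opt

    def __call__(self, x, y, norm):
        x = self.generator(x)
        loss = self.criterion(x.contiguous().view(-1, x.size(-1)),
                              y.contiguous().view(-1)) / norm
        loss.backward()
        if self.opt is not None:
            self.opt.step()
            self.opt.optimizer.zero_grad()
        # Fix 5: loss.item() instead of loss.data[0]
        # Fix 7: norm.float() so the returned value is not truncated to an integer
        return loss.item() * norm.float()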