RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [512]] is at version 4; expected version 3 instead.

  • Error description
    • Attempted fixes
    • Analysis

Error description

The following error is raised when calling loss.backward():
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [512]] is at version 4; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
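As a rough illustration of what the message means, here is a minimal CPU-only sketch that triggers the same class of error with anomaly detection turned on. It is not the original training code: the tensors `w` and `y` are made up for the example, and the version numbers it produces differ from the 4 and 3 reported above. It relies on `Tensor._version`, an internal counter that autograd increments on every in-place modification.

```python
import torch

# Record forward-pass stack traces so the op whose backward fails can be located.
torch.autograd.set_detect_anomaly(True)

w = torch.ones(3, requires_grad=True)  # hypothetical parameter
y = w.exp()                            # exp() saves its output y for the backward pass

version_before = y._version
y.add_(1)                              # in-place edit bumps y's version counter
version_after = y._version

try:
    # The backward of exp() needs the saved y, but its version no longer matches.
    y.sum().backward()
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

torch.autograd.set_detect_anomaly(False)

print(version_before, version_after)
print(error_message)
```

With anomaly detection enabled, PyTorch also emits a warning containing the traceback of the forward call whose backward failed, which is usually the quickest way to find the offending in-place operation.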

Attempted fixes

  1. Changed nn.ReLU() to nn.ReLU(inplace=False): failed.
  2. Changed nn.ReLU() to nn.ReLU6(): failed.
  3. Added the broadcast_buffers=False argument to torch.nn.parallel.DistributedDataParallel(…,broadcast_buffers=False,… ): this resolved the error.
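The third fix could look roughly like the sketch below. The post does not show the original model or launch setup, so everything except the `broadcast_buffers=False` argument is an assumption made for illustration: a single-process `gloo` group on CPU initialised through a temporary file, and a small hypothetical model with a 512-channel BatchNorm layer.

```python
import os
import tempfile

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel

# Assumption: one process, CPU, gloo backend, rendezvous through a fresh temp file.
init_file = os.path.join(tempfile.mkdtemp(), "ddp_init")
dist.init_process_group(
    backend="gloo",
    init_method=f"file://{init_file}",
    rank=0,
    world_size=1,
)

# Hypothetical model; BatchNorm1d holds buffers (running_mean, running_var).
model = nn.Sequential(nn.Linear(8, 512), nn.BatchNorm1d(512), nn.ReLU())

# broadcast_buffers=False disables DDP's buffer synchronisation in forward().
ddp_model = DistributedDataParallel(model, broadcast_buffers=False)

loss = ddp_model(torch.randn(4, 8)).sum()
loss.backward()

# Every parameter should have received a gradient.
grad_ok = all(p.grad is not None for p in model.parameters())
print(grad_ok)

dist.destroy_process_group()
```

Note that with `broadcast_buffers=False`, buffers such as BatchNorm running statistics are no longer synchronised across processes, which may matter in a real multi-GPU run.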

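Analysis

I was using only one GPU on a single node, while the code supports multiple GPUs.
The exact cause is unknown.

Reference for a similar issue: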
https://github.com/NVlabs/FUNIT/issues/23