Bug:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.LongTensor [4, 512, 512]] is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
[W python_anomaly_mode.cpp:104] Warning: Error detected in NllLoss2DBackward0. Traceback of forward call that caused the error:
File "
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/multiprocessing/spawn.py", line 129, in _main
return self._bootstrap(parent_sentinel)
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/data2/user10/code/UDA_SS/ReCo_loveDA/train_semisup_loveDA.py", line 211, in main_train
unsup_loss = compute_unsupervised_loss(pred_u_large, train_u_aug_label, train_u_aug_logits, args.strong_threshold,ignore_index=250)
File "/data2/user10/code/UDA_SS/ReCo_loveDA/module_list.py", line 71, in compute_unsupervised_loss
loss = F.cross_entropy(predict, target, reduction='none', ignore_index=ignore_index)# loss shape : torch.Size([2, 512, 512])
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/site-packages/torch/nn/functional.py", line 2846, in cross_entropy
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
(function _print_stack)
0%| | 0/145 [00:20, ?it/s]
Traceback (most recent call last):
File "train_semisup_loveDA.py", line 398, in
mp.spawn(main_train, args=(world_size, args), nprocs=world_size, join=True)
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
while not context.join():
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/data2/user10/code/UDA_SS/ReCo_loveDA/train_semisup_loveDA.py", line 231, in main_train
loss.backward()
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/site-packages/torch/_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/data2/user10/anaconda3/envs/zwk/lib/python3.8/site-packages/torch/autograd/__init__.py", line 154, in backward
Variable._execution_engine.run_backward(
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.LongTensor [4, 512, 512]] is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
Fix: call clone() on train_u_aug_label before passing it into compute_unsupervised_loss. The [4, 512, 512] LongTensor named in the error is this pseudo-label tensor: F.cross_entropy saves its target for the backward pass, and some later in-place write to the label bumps the tensor's version from 0 to 1, so autograd aborts in NllLoss2DBackward0. Passing train_u_aug_label.clone() at the call site (train_semisup_loveDA.py, line 211) gives the loss its own copy that nothing else mutates.
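For context, a minimal self-contained sketch of the failure mode and of why clone() fixes it (the shapes, reduction='none', and ignore_index=250 are taken from the traceback; the masking line is a hypothetical stand-in for whatever actually mutates the label in this codebase):

```python
import torch
import torch.nn.functional as F

pred = torch.randn(4, 6, 512, 512, requires_grad=True)  # logits for 6 classes
label = torch.randint(0, 6, (4, 512, 512))              # pseudo-label map

# The failing pattern: F.cross_entropy saves `label` for backward, and a later
# in-place write bumps its version counter, so backward() raises the error above.
#   loss = F.cross_entropy(pred, label, reduction='none', ignore_index=250)
#   label[mask] = 250        # hypothetical in-place edit between forward and backward
#   loss.mean().backward()   # -> RuntimeError: ... is at version 1; expected version 0

# The fix: hand the loss its own copy, so later in-place edits cannot touch
# the tensor saved for NllLoss2DBackward0.
loss = F.cross_entropy(pred, label.clone(), reduction='none', ignore_index=250)
label[label == 0] = 250        # harmless now: this mutates `label`, not the clone
loss.mean().backward()         # completes without the version-counter error
print(pred.grad.shape)         # torch.Size([4, 6, 512, 512])
```

An equivalent alternative is to clone inside compute_unsupervised_loss instead, i.e. F.cross_entropy(predict, target.clone(), ...) at module_list.py line 71; either way, the saved target and the later-mutated label stop aliasing the same storage.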