The error message is as follows:
Exception ignored in:
Traceback (most recent call last):
File "/home/vincent/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 926, in __del__
self._shutdown_workers()
File "/home/vincent/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 882, in _shutdown_workers
self._shutdown_workers()
File "/home/vincent/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 882, in _shutdown_workers
if not self._shutdown:
AttributeError: '_MultiProcessingDataLoaderIter' object has no attribute '_shutdown'
if not self._shutdown:
AttributeError: '_MultiProcessingDataLoaderIter' object has no attribute '_shutdown'
Traceback (most recent call last):
File "/media/vincent/0A9AD66165F33762/VincentCode/SAN-master1/model/san_lighten.py", line 336, in <module>
main()
File "/media/vincent/0A9AD66165F33762/VincentCode/SAN-master1/model/san_lighten.py", line 331, in main
trainer.fit(model)
File "/home/vincent/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 844, in fit
mp.spawn(self.ddp_train, nprocs=self.num_processes, args=(model,))
File "/home/vincent/anaconda3/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
while not spawn_context.join():
File "/home/vincent/anaconda3/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/vincent/anaconda3/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/vincent/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/distrib_data_parallel.py", line 389, in ddp_train
self.run_pretrain_routine(model)
File "/home/vincent/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1001, in run_pretrain_routine
False)
File "/home/vincent/anaconda3/lib/python3.7/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 256, in _evaluate
for batch_idx, batch in enumerate(dataloader):
File "/home/vincent/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 278, in __iter__
return _MultiProcessingDataLoaderIter(self)
File "/home/vincent/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 647, in __init__
self._worker_result_queue = multiprocessing_context.Queue()
File "/home/vincent/anaconda3/lib/python3.7/multiprocessing/context.py", line 102, in Queue
return Queue(maxsize, ctx=self.get_context())
File "/home/vincent/anaconda3/lib/python3.7/multiprocessing/queues.py", line 42, in __init__
self._rlock = ctx.Lock()
File "/home/vincent/anaconda3/lib/python3.7/multiprocessing/context.py", line 67, in Lock
return Lock(ctx=self.get_context())
File "/home/vincent/anaconda3/lib/python3.7/multiprocessing/synchronize.py", line 162, in __init__
SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
File "/home/vincent/anaconda3/lib/python3.7/multiprocessing/synchronize.py", line 80, in __init__
register(self._semlock.name)
File "/home/vincent/anaconda3/lib/python3.7/multiprocessing/semaphore_tracker.py", line 83, in register
self._send('REGISTER', name)
File "/home/vincent/anaconda3/lib/python3.7/multiprocessing/semaphore_tracker.py", line 90, in _send
self.ensure_running()
File "/home/vincent/anaconda3/lib/python3.7/multiprocessing/semaphore_tracker.py", line 71, in ensure_running
pid = util.spawnv_passfds(exe, args, fds_to_pass)
File "/home/vincent/anaconda3/lib/python3.7/multiprocessing/util.py", line 420, in spawnv_passfds
False, False, None)
OSError: [Errno 12] Cannot allocate memory
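This happens because the swap space on Ubuntu is too small, so the system cannot start a new process: the last frame in the traceback is spawnv_passfds, which fails with OSError: [Errno 12] Cannot allocate memory. If you hit this problem, try enlarging the swap space (after I added 8 GB the error no longer appeared). Tutorial: https://www.jianshu.com/p/833e81c0d854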
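
Before resizing swap, it can help to confirm how much swap and free RAM the machine actually has. Below is a minimal sketch, assuming a Linux system that exposes /proc/meminfo; the helper names parse_meminfo and swap_summary are made up for this illustration and are not part of PyTorch or PyTorch Lightning.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_summary(info):
    """Return (swap_total, swap_free, mem_available) in GiB."""
    kib_per_gib = 1024 * 1024
    return tuple(
        info.get(key, 0) / kib_per_gib
        for key in ("SwapTotal", "SwapFree", "MemAvailable")
    )


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        total, free, avail = swap_summary(parse_meminfo(f.read()))
    print(f"swap total: {total:.1f} GiB, swap free: {free:.1f} GiB, "
          f"RAM available: {avail:.1f} GiB")
```

If the reported swap total is small and free swap drops close to zero while the DataLoader workers start, that is consistent with the explanation above. Running the script again after enlarging swap should show the larger total.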