PyTorch -- OverflowError: cannot serialize a bytes object larger than 4 GiB

First, the error message:

[07.27.20|23:44:32] Training epoch: 0
Traceback (most recent call last):
  File "main.py", line 31, in 
    p.start()
  File "D:\code\st-gcn\processor\processor.py", line 114, in start
    self.train()
  File "D:\code\st-gcn\processor\recognition.py", line 85, in train
    for data, label in loader:
  File "D:\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "D:\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
    w.start()
  File "D:\Python36\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "D:\Python36\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Python36\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\Python36\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
OverflowError: cannot serialize a bytes object larger than 4 GiB
Traceback (most recent call last):
  File "", line 1, in 
  File "D:\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Python36\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

The error occurs when each epoch's data is split into batches, i.e. when the DataLoader starts its worker processes.

The typical error messages are: OverflowError: cannot serialize a bytes object larger than 4 GiB, or EOFError: Ran out of input.

Official discussion and fix: https://discuss.pytorch.org/t/pytorch-windows-eoferror-ran-out-of-input-when-num-workers-0/25918

Simply set num_workers to 0. This is a long-standing problem on Windows: worker processes are created with spawn, so the whole Dataset object (including any data it holds in memory) is pickled and sent to each worker, and that pickling fails once the payload exceeds 4 GiB. With num_workers=0 the data is loaded in the main process and nothing needs to be pickled.
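A minimal sketch of the fix, assuming a hypothetical placeholder dataset MyDataset (not part of the original code); the only point that matters is num_workers=0 in the DataLoader:

import torch
from torch.utils.data import DataLoader, Dataset

class MyDataset(Dataset):               # hypothetical placeholder dataset
    def __init__(self, data, labels):
        self.data = data
        self.labels = labels

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]

dataset = MyDataset(torch.randn(100, 3), torch.zeros(100))

# num_workers=0 means no worker process is spawned, so the dataset is never
# pickled and the 4 GiB serialization limit is never hit on Windows.
loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=0)

for data, label in loader:              # iterates in the main process only
    pass

The trade-off is that data loading no longer overlaps with training, so epochs may be slower; on Windows this is usually acceptable compared with the crash.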

In addition, if you see an error like the following:

RuntimeError: CUDA out of memory. Tried to allocate 236.00 MiB (GPU 0; 8.00 GiB total capacity; 5.76 GiB already allocated; 161.97 MiB free; 5.78 GiB reserved in total by PyTorch)

That is, RuntimeError: CUDA out of memory.

Solution: lower the batch_size in the config file until the model fits in GPU memory; if you are worried that a smaller batch hurts accuracy, switch to a GPU with more memory. A sketch of how to find a workable batch size follows.
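A hedged sketch of the idea: halve batch_size until one forward/backward pass fits in GPU memory. The tiny model and random data below are placeholders, not the original project's code; in practice you would put the resulting value into the config file. Requires a CUDA device.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Linear(1024, 10).cuda()                  # placeholder model
dataset = TensorDataset(torch.randn(512, 1024),
                        torch.zeros(512, dtype=torch.long))

batch_size = 256                                    # starting value, e.g. from the config
while batch_size >= 1:
    try:
        loader = DataLoader(dataset, batch_size=batch_size, num_workers=0)
        data, label = next(iter(loader))
        loss = nn.functional.cross_entropy(model(data.cuda()), label.cuda())
        loss.backward()                             # trial forward + backward pass
        break                                       # this batch_size fits
    except RuntimeError as e:
        if "out of memory" not in str(e):
            raise
        torch.cuda.empty_cache()                    # release cached blocks, then retry
        batch_size //= 2

print("usable batch_size:", batch_size)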
