PyTorch bugs triggered by reading lmdb data with num_workers > 0: "Cannot re-initialize CUDA" & "TypeError: can't pickle odict_keys"

If you run into bugs like the following:

1. RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

Official discussion: https://discuss.pytorch.org/t/not-using-multiprocessing-but-getting-cuda-error-re-forked-subprocess/54610

2. TypeError: can't pickle odict_keys objects, raised from C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py, in dump(obj, file, protocol):

    def dump(obj, file, protocol=None):
        '''Replacement for pickle.dump() using ForkingPickler.'''
        ForkingPickler(file, protocol).dump(obj)  # <-- TypeError: can't pickle odict_keys objects

Official discussion: https://discuss.pytorch.org/t/dataloader-issues-with-multiprocessing-when-i-do-torch-multiprocessing-set-start-method-spawn-force-true/69275
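The "can't pickle odict_keys" error usually means the Dataset holds a dict *keys view* (e.g. the result of calling .keys() on an OrderedDict of lmdb keys). With the 'spawn' start method, everything a worker needs is pickled, and keys views are not picklable; converting them to a plain list is. A minimal illustration (meta is a hypothetical stand-in for whatever metadata your Dataset stores):

```python
import pickle
from collections import OrderedDict

meta = OrderedDict(a=1, b=2)

# A keys view (odict_keys) cannot be pickled, which is exactly
# what the DataLoader's ForkingPickler trips over:
try:
    pickle.dumps(meta.keys())
except TypeError as e:
    print(e)  # cannot pickle the odict_keys view

# Converting to a plain list before storing it in the Dataset
# makes it picklable:
keys = list(meta.keys())
assert pickle.loads(pickle.dumps(keys)) == ['a', 'b']
```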

and so on. This appears to be a Windows-specific problem, but I also ran into it on Linux.

You may be able to fix it by:

1. Setting the multiprocessing start method: torch.multiprocessing.set_start_method('spawn')
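A minimal sketch of where that call belongs: it must run once, in the main process, before any DataLoader worker is created. The force=True flag is an assumption here; it overrides a previously-set context instead of raising "context has already been set".

```python
import torch.multiprocessing as mp

def main():
    # build your Dataset / DataLoader and run training here
    pass

if __name__ == '__main__':
    # Set the start method before any workers (or CUDA calls in
    # subprocesses) exist; 'spawn' starts workers fresh instead of
    # forking a process that already initialized CUDA.
    mp.set_start_method('spawn', force=True)
    main()
```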

2. Wrapping your code in a main() guard:

import torch
from torch.utils.data import DataLoader

def main():
    dataloader = DataLoader(...)  # build your dataset/loader here
    for i, data in enumerate(dataloader):
        # do something here
        ...

if __name__ == '__main__':
    main()

3. Setting num_workers to 0.

But the third option amounts to surrendering to the forces of evil: you give up parallel data loading entirely.

Instead, you can stay on Linux and keep num_workers > 0. Just make sure the data returned by the dataloader (i.e. before the for loop in the code above) lives on the CPU, and only move it to the GPU afterwards, right before running the network.
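A minimal sketch of that pattern, with a hypothetical CPUOnlyDataset standing in for an lmdb-backed dataset: __getitem__ returns CPU tensors only (no CUDA calls in workers), and .to(device) happens in the main process inside the training loop.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class CPUOnlyDataset(Dataset):
    """Returns CPU tensors only -- no CUDA calls in __getitem__,
    so worker subprocesses never try to (re)initialize CUDA."""
    def __init__(self, n=8):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        # Stand-in for decoding an lmdb record; real code would read
        # and decode the record here. Keep the result on the CPU.
        return torch.randn(3, 4)

def run(num_workers=0):
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    loader = DataLoader(CPUOnlyDataset(), batch_size=4,
                        num_workers=num_workers)
    shapes = []
    for i, data in enumerate(loader):
        data = data.to(device)  # move to GPU only in the main process
        shapes.append(tuple(data.shape))
    return shapes

# run() -> [(4, 3, 4), (4, 3, 4)]
```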
