PyTorch multi-GPU parallel training: RuntimeError: Caught RuntimeError in replica 0 on device 0.

My notes on a pitfall I ran into

Error 1:

RuntimeError: Caught RuntimeError in replica 0 on device 0.

Error 2:

RuntimeError: sizes of tensors must match except in dimension 1. expected size 1 but got size 2 for tensor number 1 in the list.

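Cause:

The number of samples has to be an exact multiple of the number of GPUs: data_number % GPU_number == 0. Otherwise the errors above show up. The reason is that DP (nn.DataParallel) splits its input along the first dimension (that is, batch_size) and hands one slice to each GPU. When a batch does not divide evenly, which in practice is usually the last, smaller batch of an epoch, the GPUs receive slices of different sizes, and tensors built from them can no longer be concatenated.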
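To make the divisibility rule concrete, here is a minimal sketch in plain Python that estimates how a batch would be divided across GPUs. It assumes that DP splits along dimension 0 the same way `torch.chunk` does (slice size is the ceiling of batch size over GPU count). The helper names `dp_chunk_sizes` and `last_batch_size` are hypothetical and made up for this illustration; they are not part of PyTorch.

```python
import math
from typing import List


def dp_chunk_sizes(batch_size: int, num_gpus: int) -> List[int]:
    """Estimate per-GPU slice sizes for one batch.

    Assumption: the split follows torch.chunk semantics along dim 0,
    i.e. each slice has ceil(batch_size / num_gpus) items and the last
    slice takes whatever is left.
    """
    if batch_size <= 0 or num_gpus <= 0:
        raise ValueError("batch_size and num_gpus must be positive")
    chunk = math.ceil(batch_size / num_gpus)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


def last_batch_size(dataset_size: int, batch_size: int) -> int:
    """Size of the final batch of an epoch when incomplete batches are kept."""
    remainder = dataset_size % batch_size
    return remainder if remainder else batch_size


if __name__ == "__main__":
    num_gpus = 2
    # Hypothetical numbers: 11 samples with batch_size 4 leave a final batch of 3.
    final = last_batch_size(dataset_size=11, batch_size=4)
    sizes = dp_chunk_sizes(final, num_gpus)
    print("last batch:", final, "per-GPU slices:", sizes)
    if len(set(sizes)) > 1 or len(sizes) < num_gpus:
        print("uneven split: this batch may trigger the size-mismatch error")
```

If the check flags an uneven final batch, one common remedy is to pass `drop_last=True` to `torch.utils.data.DataLoader` so the incomplete batch is skipped. Another is to choose a dataset size and batch size that are both multiples of the GPU count.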
