[Multi-node, multi-GPU] mmsegmentation training fails with "RuntimeError: NCCL error in: /opt/pytorch/pytorch/torch/csrc/distributed/"
Multi-node, multi-GPU training code:

Error message:
RuntimeError: NCCL error in: /opt/pytorch/pytorch/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1248, unhandled system error, NCCL version 2.12.10

First machine:
NNODES=2 NODE_RANK=0 PORT=8888 MASTER_ADDR=1
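
"unhandled system error" says little on its own, so a common first step is to rerun with NCCL's own logging turned on. The sketch below is only an illustration, not the post's actual launch code: it collects the first machine's variables (NNODES, NODE_RANK, PORT, MASTER_ADDR) into an environment dictionary and adds NCCL_DEBUG=INFO, a standard NCCL setting that prints more detail about the failure. The helper name build_launch_env is hypothetical, and 192.0.2.1 is a documentation placeholder, because the real MASTER_ADDR is cut off in the post.

```python
import os


def build_launch_env(nnodes, node_rank, port, master_addr, base_env=None):
    """Return a copy of the environment with the launch variables set.

    Hypothetical helper for illustration; it is not part of mmsegmentation.
    """
    # Copy so the caller's environment (or os.environ) is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "NNODES": str(nnodes),
        "NODE_RANK": str(node_rank),
        "PORT": str(port),
        "MASTER_ADDR": master_addr,
        # Ask NCCL to log details behind "unhandled system error".
        "NCCL_DEBUG": "INFO",
    })
    return env


# First machine from the post; the address is a placeholder (assumption).
first_node_env = build_launch_env(2, 0, 8888, "192.0.2.1", base_env={})
for name in sorted(first_node_env):
    print(f"{name}={first_node_env[name]}")
```

The resulting dictionary can be passed as the env argument of subprocess.run together with whatever launch command is already in use. The second machine would use the same values with NODE_RANK=1, assuming the usual convention that ranks run from 0 to NNODES-1.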