ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed

ERROR:torch.distributed.elastic.multiprocessing.api:failed

ERROR:torch.distributed.elastic.agent.server.local_elastic_agent:[default] Worker group failed

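This error is reported when a multi-GPU training program is run with only a single GPU. The fix is to set os.environ['CUDA_VISIBLE_DEVICES'] = '0,1' before CUDA is initialized. Note that the value must be the string '0,1': assigning the list [0,1] raises a TypeError, because environment variable values have to be strings.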
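As a minimal sketch of the fix above: the helper name `set_visible_gpus` is hypothetical (it is not part of PyTorch), and the indices 0 and 1 assume the machine really has at least two GPUs. The only point it illustrates is that the value stored in `os.environ` has to be a comma-separated string, and that it should be set before `torch` touches CUDA.

```python
import os


def set_visible_gpus(gpu_ids):
    """Hypothetical helper: expose only the given GPU indices to CUDA."""
    # os.environ only accepts strings, so join the indices with commas.
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Call this at the very top of the training script, before importing torch
# (or at least before the first CUDA call), otherwise it may have no effect.
visible = set_visible_gpus([0, 1])
print(visible)
```

Exporting the variable in the shell before launching the training command should have the same effect, and avoids any dependence on import order inside the script.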
