Fixing the PyTorch error "torch.cuda.is_available() is False"

Problem

PyTorch cannot use the GPU and reports the following error:

Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False.
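
Before looking at the driver stack, it can help to confirm what PyTorch itself sees. The following is a minimal sketch, assuming PyTorch is installed in the active environment; the helper name `cuda_report` is hypothetical and not part of PyTorch.

```python
def cuda_report():
    """Return a small dict describing what PyTorch sees, or None if torch is missing."""
    try:
        import torch  # imported lazily so the helper also works where torch is absent
    except ImportError:
        return None
    return {
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(cuda_report())
```

If `built_with_cuda` is set but `cuda_available` is False, the PyTorch build itself supports CUDA, so the cause is more likely in the driver or runtime than in the PyTorch installation.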

Analysis

First, check whether CUDA itself is working:

# Go to the CUDA Samples directory; this example uses ~/NVIDIA_CUDA-11.0_Samples
cd ~/NVIDIA_CUDA-11.0_Samples/1_Utilities/deviceQuery
make
./deviceQuery

The output shows that CUDA is not working:

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 802
-> system not yet initialized
Result = FAIL
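
When this check has to be repeated on several machines, the output above can be inspected programmatically. The following is a minimal sketch; the function name `parse_device_query` and the `SAMPLE` constant are hypothetical, and the two patterns are based only on the output lines shown above plus the `Result = PASS` line that deviceQuery prints on success. Error code 802 corresponds to `cudaErrorSystemNotReady` in the CUDA Runtime API.

```python
import re


def parse_device_query(output):
    """Extract the CUDA error code and final result from deviceQuery output."""
    code = re.search(r"cudaGetDeviceCount returned (\d+)", output)
    result = re.search(r"Result = (\w+)", output)
    return {
        # None when deviceQuery did not report an error code
        "error_code": int(code.group(1)) if code else None,
        # True only when the final line reads "Result = PASS"
        "passed": bool(result) and result.group(1) == "PASS",
    }


# The failing output from this article, used as sample input.
SAMPLE = """\
CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 802
-> system not yet initialized
Result = FAIL
"""

if __name__ == "__main__":
    print(parse_device_query(SAMPLE))  # {'error_code': 802, 'passed': False}
```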

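Some searching showed that the cause was the NVIDIA Fabric Manager service, which had not been started after the server rebooted. On NVSwitch-based multi-GPU systems, CUDA cannot initialize until this service is running, which is why deviceQuery reports error 802.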

Solution

Enable NVIDIA Fabric Manager at boot and start the service:

sudo systemctl enable nvidia-fabricmanager.service
sudo service nvidia-fabricmanager start

CUDA then runs normally.
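
To confirm from a script that the service is actually running, for example before a training job starts, one option is to ask systemd directly. The following is a minimal sketch, assuming a systemd-based host; the helper name `fabric_manager_active` and its `run` parameter (there only so the command runner can be swapped out) are hypothetical.

```python
import subprocess


def fabric_manager_active(unit="nvidia-fabricmanager.service", run=subprocess.run):
    """Return True if systemd reports the unit as active, False if not,
    and None if systemctl is not available on this host."""
    try:
        proc = run(["systemctl", "is-active", unit], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    # "systemctl is-active" prints "active" for a running unit
    return proc.stdout.strip() == "active"


if __name__ == "__main__":
    print(fabric_manager_active())
```

A result of False after a reboot suggests the unit was not enabled at boot, which is the situation described in this article.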

Reference

cuda runtime error (802) : system not yet initialized …/THCGeneral.cpp:50
