Fixing the error: RuntimeError: The detected CUDA version mismatches the version that was used to compile PyTorch.

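  • The error
  • Solution
    • Installing CUDA
    • Switching between multiple CUDA versions
      • Method 1: change the symbolic link
      • Method 2: change the CUDA path in .bashrc
  • Appendix: errors you may encounter
    • Error 1: incompatible GCC version
    • Error 2: installation path error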

The error

While compiling and installing a package from source, I ran into the following error:

File "/home/XXX/miniconda3/envs/lin/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 404, in build_extensions
        self._check_cuda_version()
File "/home/XXX/miniconda3/envs/lin/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 781, in _check_cuda_version
        raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
    The detected CUDA version (11.3) mismatches the version that was used to compile PyTorch (10.2). Please make sure to use the same CUDA versions.

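Cause: the CUDA toolkit detected on the system (11.3) does not match the CUDA version that PyTorch was compiled with (10.2). The check is made by torch/utils/cpp_extension.py when it builds an extension, as the traceback shows.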
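To see both numbers from the error message side by side, something like the following can help. It is a minimal sketch: the helper name nvcc_release is hypothetical, and it assumes that nvcc is on PATH and that python is the interpreter of the environment where PyTorch is installed.

```shell
# Hypothetical helper: extract "10.2" from the "release 10.2," line that
# `nvcc --version` prints. Reads the nvcc output from stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Only query the tools that are actually available on this machine.
if command -v nvcc >/dev/null 2>&1; then
    echo "System CUDA toolkit (nvcc): $(nvcc --version | nvcc_release)"
fi
if command -v python >/dev/null 2>&1; then
    # torch.version.cuda is the CUDA version PyTorch was compiled with.
    python -c 'import torch; print("PyTorch built with CUDA:", torch.version.cuda)' 2>/dev/null \
        || echo "PyTorch is not importable from this python"
fi
```

If the two printed versions differ, building an extension will raise the RuntimeError shown above.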

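Solution

Installing CUDA

Go to the official NVIDIA website, then download and install the older CUDA version that PyTorch expects. I chose CUDA 10.2 (the original post showed a screenshot of the download page here).

Run the following commands to download and start the installer: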

wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
sudo sh cuda_10.2.89_440.33.01_linux.run

Type accept at the license prompt.
Deselect the CUDA driver in the component list (a newer GPU driver was already installed, so there is no need to install it again).
When the installer asks:
Do you want to install a symbolic link at /usr/local/cuda? # whether to link the installation directory to /usr/local/cuda through a symbolic link
choose yes.

When the installation finishes, the terminal prints the following summary:

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-10.2/
Samples:  Installed in /root/, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-10.2/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.2/lib64, or, add /usr/local/cuda-10.2/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-10.2/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.2/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 440.00 is required for CUDA 10.2 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log

Check the CUDA version:

nvcc --version

If the output reports the release you just installed (10.2 here), the installation succeeded.

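Switching between multiple CUDA versions

If you later need to switch to another CUDA version, use either of the following two methods:

Method 1: change the symbolic link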

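  1. In ~/.bashrc, change every CUDA-related path to /usr/local/cuda/ instead of a specific version directory such as cuda-10.2 or cuda-11.3.
  2. Repoint the symbolic link:
# Remove the existing symbolic link (no trailing slash, so only the link itself is removed)
sudo rm /usr/local/cuda
# Create a new symbolic link to the version you want
sudo ln -s /usr/local/cuda-11.3 /usr/local/cuda
# Check the active CUDA version
nvcc --version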
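The link swap can also be wrapped in a small function. This is a sketch and not part of the original instructions: the name switch_cuda and its optional second argument (the install prefix, defaulting to /usr/local) are hypothetical, and it assumes each toolkit lives in a cuda-<version> directory under that prefix. Changing a link in /usr/local needs write permission there, so run it from a root shell or adapt the ln call to use sudo.

```shell
# Hypothetical helper: point <prefix>/cuda at <prefix>/cuda-<version>.
# Usage: switch_cuda 11.3 [prefix]      (prefix defaults to /usr/local)
switch_cuda() (
    version=$1
    prefix=${2:-/usr/local}
    target=$prefix/cuda-$version

    # Refuse to create a dangling link if that version is not installed.
    if [ ! -d "$target" ]; then
        echo "CUDA $version not found at $target" >&2
        exit 1
    fi
    # Refuse to touch a real directory; only an existing symlink is replaced.
    if [ -e "$prefix/cuda" ] && [ ! -L "$prefix/cuda" ]; then
        echo "$prefix/cuda is not a symbolic link; leaving it alone" >&2
        exit 1
    fi
    # -s: symbolic, -f: replace an existing link, -n: do not follow it.
    ln -sfn "$target" "$prefix/cuda"
)
```

The function body runs in a subshell, so a failed check returns a non-zero status without closing the calling terminal. After a successful switch, nvcc --version should report the newly linked release, provided PATH points at /usr/local/cuda/bin.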

Method 2: change the CUDA path in .bashrc

Open ~/.bashrc and append the following lines at the end (change the version number to the CUDA version you need):

export PATH="/usr/local/cuda-10.2/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-10.2/lib64:$LD_LIBRARY_PATH"

Then reload the file so the change takes effect: source ~/.bashrc
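For a one-off switch in the current terminal without editing ~/.bashrc, the same two exports can be put in a function. This is a sketch under assumptions: the name use_cuda is hypothetical, toolkits are assumed to be installed under /usr/local/cuda-<version>, and the change lasts only for the current shell session.

```shell
# Hypothetical helper: put one CUDA toolkit first on PATH and LD_LIBRARY_PATH
# for the current shell only. Usage: use_cuda 10.2
use_cuda() {
    cuda_home=/usr/local/cuda-$1
    export PATH="$cuda_home/bin:$PATH"
    # ${VAR:+...} avoids a trailing ":" when LD_LIBRARY_PATH was empty.
    export LD_LIBRARY_PATH="$cuda_home/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}
```

Calling use_cuda 10.2 and then nvcc --version should report release 10.2 if that toolkit is installed. Repeated calls keep prepending entries, and the most recent one wins because it comes first in the search order.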

Appendix: errors you may encounter

Error 1: incompatible GCC version

Failed to verify gcc version. See log at /var/log/cuda-installer.log for details.

Rerun the installer with the --override flag so that it skips the compiler version check:

sudo sh cuda_10.2.89_440.33.01_linux.run --override
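Before reaching for --override, it may be worth checking which GCC the system actually has. The sketch below is an illustration only: the helper name gcc_newer_than is hypothetical, and the limit of GCC 8 for CUDA 10.2 is an assumption that should be confirmed against NVIDIA's installation guide for your release.

```shell
# Hypothetical helper: succeed when a GCC version string such as "9.4.0"
# has a major version above the given limit.
gcc_newer_than() {
    major=${1%%.*}
    [ "$major" -gt "$2" ]
}

# Assumed limit: GCC 8 for CUDA 10.2 (verify in NVIDIA's installation guide).
max_gcc_major=8
if command -v gcc >/dev/null 2>&1; then
    gcc_version=$(gcc -dumpversion)
    if gcc_newer_than "$gcc_version" "$max_gcc_major"; then
        echo "gcc $gcc_version is newer than $max_gcc_major.x; the installer's check will likely fail"
    else
        echo "gcc $gcc_version is within the assumed limit"
    fi
fi
```

Skipping the installer's check does not change what nvcc accepts later, so a too-new GCC can still cause trouble when compiling CUDA code.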

Error 2: installation path error

Installation failed. See log at /var/log/cuda-installer.log for details.

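Look up the cause in the log. The -F option makes grep treat [ERROR] as a literal string rather than a bracket expression:

grep -F '[ERROR]' /var/log/cuda-installer.log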

The log shows the cause:

[ERROR]: boost::filesystem::remove: Directory not empty: "/var/log/nvidia/.uninstallManifests/"

Rerun the installer with the library path set explicitly:

sudo sh cuda_10.2.89_440.33.01_linux.run --librarypath=/usr/local/cuda-10.2
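A note on searching the log: grep reads [ERROR] as a bracket expression, meaning any one of the characters E, R or O, unless it is told to match a fixed string with -F. The demonstration below runs on a throwaway file; the [INFO] and [WARNING] lines are made-up placeholders, and only the [ERROR] line is taken from this post.

```shell
# Build a throwaway log; the first two lines are made-up placeholders.
log=$(mktemp)
printf '%s\n' \
    '[INFO]: placeholder line' \
    '[WARNING]: placeholder line' \
    '[ERROR]: boost::filesystem::remove: Directory not empty: "/var/log/nvidia/.uninstallManifests/"' \
    > "$log"

# As a regular expression, [ERROR] matches any line containing E, R or O.
regex_hits=$(grep -c '[ERROR]' "$log")
# With -F the pattern is a literal string, so only real [ERROR] lines match.
fixed_hits=$(grep -c -F '[ERROR]' "$log")

echo "regex matches: $regex_hits, fixed-string matches: $fixed_hits"
rm -f "$log"
```

Here the regular-expression form matches all three lines, while the fixed-string form matches only the one line that actually contains [ERROR].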