Installing CUDA 11.3

GPU Driver

1. Check which driver version your CUDA release requires:

https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#title-new-features
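
Once you know the required version, the comparison can be scripted. The following is a minimal sketch, not an official check: it assumes 465.19.01 as the minimum (the driver version bundled with the CUDA 11.3.1 runfile used later in this article; confirm it against the release notes), and `version_ge` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed minimum Linux driver for CUDA 11.3; verify against the release notes.
MIN_DRIVER="465.19.01"

# Succeeds when version $1 >= version $2 (relies on sort -V version ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Query the installed driver only when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)"
    if version_ge "$installed" "$MIN_DRIVER"; then
        echo "driver $installed is new enough for CUDA 11.3"
    else
        echo "driver '$installed' is older than $MIN_DRIVER (or could not be read)"
    fi
fi
```

On a machine without a driver installed yet, the script prints nothing; rerun it after the installation steps below.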

2. Install the GPU driver

  • Reference: https://phantomvk.github.io/2019/06/29/Ubuntu_install_nVidia_Driver/

  • Add the PPA; press Enter to confirm when prompted

sudo add-apt-repository ppa:graphics-drivers/ppa
  • Update the package index
sudo apt update
  • List the available drivers
ubuntu-drivers devices

In the output below, nvidia-driver-465 is the recommended driver. It also satisfies CUDA 11.3, which requires driver 465 or later.

== /sys/devices/pci0000:64/0000:64:00.0/0000:65:00.0 ==
modalias : pci:v000010DEd00002204sv00001028sd00003880bc03sc00i00
vendor : NVIDIA Corporation
driver : nvidia-driver-460 - distro non-free
driver : nvidia-driver-460-server - distro non-free
driver : nvidia-driver-465 - distro non-free recommended
driver : xserver-xorg-video-nouveau - distro free builtin
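
If you want to pick the recommended package out of this listing in a script, a small filter is enough. This is a sketch that assumes the output keeps the layout shown above; `recommended_driver` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `ubuntu-drivers devices` output on stdin and print the package
# whose line ends with the word "recommended".
recommended_driver() {
    awk '$1 == "driver" && $NF == "recommended" { print $3 }'
}

# Run against the real tool only when it is installed.
if command -v ubuntu-drivers >/dev/null 2>&1; then
    ubuntu-drivers devices 2>/dev/null | recommended_driver
fi
```

Fed the listing above, the filter prints nvidia-driver-465.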

Install the latest driver directly, then reboot once the installation finishes.

sudo apt install nvidia-driver-465

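Alternatively, run sudo ubuntu-drivers autoinstall, which installs the recommended version automatically (usually the newest one).

After rebooting, run nvidia-smi to check that the GPU is recognized. The sample output below is from a P106-100 with 6 GB of memory, driver 430.26, and CUDA 10.2. For CUDA 11.3, the driver version shown should be 465 or later.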

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  P106-100            Off  | 00000000:01:00.0 Off |                  N/A |
| 39%   40C    P0    27W / 120W |      0MiB /  6080MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
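
The two numbers that matter in this table are in the header row. As a sketch, they can be extracted with sed; this assumes the header keeps the "Driver Version: ... CUDA Version: ..." layout shown above, and `smi_versions` is a hypothetical helper name.

```shell
#!/bin/sh
# Read nvidia-smi's table on stdin and print "<driver version> <CUDA version>".
smi_versions() {
    sed -n 's/.*Driver Version: *\([0-9.]*\).*CUDA Version: *\([0-9.]*\).*/\1 \2/p' | head -n 1
}

# Run against the real tool only when it is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi 2>/dev/null | smi_versions
fi
```

For the sample table above, this prints 430.26 10.2.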

If you see "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.", disable Secure Boot manually in the BIOS.
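
Before rebooting into the BIOS, you can check from the running system whether Secure Boot is actually on. This sketch assumes the mokutil package is installed and that its --sb-state output contains "SecureBoot enabled" or "SecureBoot disabled"; `secure_boot_state` is a hypothetical helper name.

```shell
#!/bin/sh
# Classify the text printed by `mokutil --sb-state`.
secure_boot_state() {
    case "$1" in
        *"SecureBoot enabled"*)  echo "enabled" ;;
        *"SecureBoot disabled"*) echo "disabled" ;;
        *)                       echo "unknown" ;;
    esac
}

# Query the firmware state only when mokutil is available.
if command -v mokutil >/dev/null 2>&1; then
    secure_boot_state "$(mokutil --sb-state 2>&1)"
fi
```

A result of "enabled" is consistent with the error above; "unknown" usually means the system did not boot in UEFI mode or the state could not be read.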

Troubleshooting

If installing nvidia-driver-410 or later fails with a message that packages cannot be installed, follow these steps:

  • Remove the PPA added earlier
sudo apt-add-repository -r ppa:graphics-drivers/ppa

  • Update apt

sudo apt update

  • Remove the NVIDIA driver packages

sudo apt remove 'nvidia*'

  • Run the automatic cleanup

sudo apt autoremove

Then return to the first step of this article and install again.

CUDA

  • Download the installer and run it (the runfile method is recommended)

Choose the CUDA version that matches your system: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=18.04&target_type=runfile_local

wget https://developer.download.nvidia.com/compute/cuda/11.3.1/local_installers/cuda_11.3.1_465.19.01_linux.run
sudo sh cuda_11.3.1_465.19.01_linux.run
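
It is easy to download one runfile and then launch another, so it can help to read the versions back out of the file name before running it. This is a sketch that assumes the cuda_<toolkit>_<driver>_linux.run naming pattern seen in the URL above; `runfile_versions` is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<toolkit version> <bundled driver version>" for a CUDA runfile name or URL.
runfile_versions() {
    name="${1##*/}"              # drop any directory or URL prefix
    name="${name#cuda_}"         # drop the leading "cuda_"
    name="${name%_linux.run}"    # drop the trailing "_linux.run"
    echo "${name%%_*} ${name#*_}"
}

runfile_versions cuda_11.3.1_465.19.01_linux.run
```

For the file above this prints 11.3.1 465.19.01, which tells you the toolkit is 11.3.1 and the bundled driver is 465.19.01.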

cuDNN

Reference: https://blog.csdn.net/u013084111/article/details/104167056
Register and download the following three files:
cuDNN Runtime Library for Ubuntu18.04 (Deb)
cuDNN Developer Library for Ubuntu18.04 (Deb)
cuDNN Code Samples and User Guide for Ubuntu18.04 (Deb)

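Download the deb files, not the tgz file! (In my experience the tgz route is error-prone.)
Download page:
https://developer.nvidia.com/rdp/cudnn-download
Change into the directory containing the downloaded files and install them. The file names below are from a cuDNN 7.6.5 build for CUDA 10.2; substitute the packages you downloaded for your CUDA version.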

$ sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.2_amd64.deb
$ sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.2_amd64.deb
$ sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.2_amd64.deb
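
Each cuDNN deb carries the CUDA version it was built for in its file name, so a quick check before installing can catch a mismatch with the toolkit. This sketch assumes the "+cudaX.Y" tag seen in the names above; `deb_cuda_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the CUDA version embedded in a cuDNN .deb file name, or nothing if absent.
deb_cuda_version() {
    printf '%s\n' "${1##*/}" | sed -n 's/.*+cuda\([0-9][0-9.]*\)_.*/\1/p'
}

# List the CUDA tag of every cuDNN package in the current directory.
for f in libcudnn*.deb; do
    if [ -e "$f" ]; then
        echo "$f targets CUDA $(deb_cuda_version "$f")"
    fi
done
```

For libcudnn7_7.6.5.32-1+cuda10.2_amd64.deb the helper prints 10.2; compare that with the release reported by nvcc --version.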

Verify that cuDNN installed correctly by running the mnistCUDNN sample from /usr/src/cudnn_samples_v7 (the directory suffix follows the cuDNN major version):

$ cd /usr/src/cudnn_samples_v7
$ cp -r /usr/src/cudnn_samples_v7/ $HOME
$ cd $HOME/cudnn_samples_v7/mnistCUDNN
$ make clean && make
$ ./mnistCUDNN

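If the output ends with Test passed!, the installation succeeded.
If compilation fails with fatal error: FreeImage.h: No such file or directory, see https://blog.csdn.net/xhw205/article/details/116297555 (the usual fix is to install the libfreeimage-dev package).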

  • Check the cuDNN version
    The method commonly given online, cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2, no longer works, because cudnn.h has changed and no longer contains the version information.
    Run
cat /usr/local/cuda/include/cudnn.h | grep cudnn

and you will see that it includes cudnn_version.h; the version information now lives in that file.
Use find to locate that file, then read the version from it:

find / -name cudnn_version.h 2>&1 | grep -v "Permission denied"
cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
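
The grep output shows the three macros on separate lines. To get a single dotted version string instead, something like the following sketch works; it assumes the header defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL as plain integers, and `cudnn_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Print MAJOR.MINOR.PATCHLEVEL from the given cudnn_version.h.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major != "") print major "." minor "." patch }' "$1"
}

# Check the two locations mentioned in this article.
for h in /usr/include/cudnn_version.h /usr/local/cuda/include/cudnn_version.h; do
    if [ -f "$h" ]; then
        echo "$h: $(cudnn_version "$h")"
    fi
done
```

If neither path exists on your system, pass the path reported by find to cudnn_version directly.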
