The following commands can be used to identify the graphics card:
Command 1:
lspci | grep -i vga
03:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX TITAN X] (rev a1)
Command 2:
lspci | grep -i nvidia
03:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX TITAN X] (rev a1)
03:00.1 Audio device: NVIDIA Corporation GM200 High Definition Audio (rev a1)
Command 3:
nvidia-smi
Mon Oct 07 09:29:06 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01 Driver Version: 515.65.01 CUDA Version: 11.7 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:03:00.0 On | N/A |
| 22% 42C P8 19W / 250W | 378MiB / 12288MiB | 2% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1957 G /usr/lib/xorg/Xorg 224MiB |
| 0 N/A N/A 2147 G /usr/bin/gnome-shell 38MiB |
| 0 N/A N/A 44892 G ...855553422090571671,131072 82MiB |
| 0 N/A N/A 67885 G ...nlogin/bin/sunloginclient 10MiB |
| 0 N/A N/A 102221 G ...2gtk-4.0/WebKitWebProcess 13MiB |
+-----------------------------------------------------------------------------+
The graphics card model is GM200 [GeForce GTX TITAN X].
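To pull just the model name out of the lspci line instead of reading it by eye, a small filter along these lines may help. This is a minimal sketch; the function name gpu_model_from_lspci is made up for illustration, and it assumes the line has the "VGA compatible controller: ... [Model]" shape shown above.

```shell
# Hypothetical helper: print the bracketed model name from lspci VGA lines
# read on stdin. Lines that are not VGA controllers are ignored.
gpu_model_from_lspci() {
    sed -n 's/.*VGA compatible controller: .*\[\(.*\)\].*/\1/p'
}

# Typical use on a real machine (not run here):
#   lspci | gpu_model_from_lspci
```

The audio function of the same card (03:00.1) is skipped because its line does not contain "VGA compatible controller".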
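I went through many online tutorials and tried the simplest method:
Press the Win (Super) key and open the Software & Updates manager, then go to the "Additional Drivers" tab. I picked the first option in the list.
PS: If the machine has more than one graphics card, make sure to pick a driver version that can actually be installed. Keep trying the listed versions until one succeeds.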
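The same driver list can usually be inspected from a terminal with ubuntu-drivers devices. The sketch below picks out the entry flagged as recommended. It assumes the listing uses lines of the form "driver : nvidia-driver-NNN - distro non-free recommended"; the function name recommended_driver is hypothetical.

```shell
# Hypothetical helper: print the driver package marked "recommended"
# in `ubuntu-drivers devices` output read on stdin.
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Typical use on a real machine (not run here):
#   ubuntu-drivers devices | recommended_driver
#   sudo apt-get install "$(ubuntu-drivers devices | recommended_driver)"
```

Reading the listing from stdin keeps the filter usable on saved output as well as on a live system.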
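Once installation finishes, check in a terminal whether it succeeded.
If the command below prints a status table, the driver is installed correctly.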
nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:03:00.0 On | N/A |
| 22% 45C P8 22W / 250W | 262MiB / 12288MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1157 G /usr/lib/xorg/Xorg 78MiB |
| 0 N/A N/A 1833 G /usr/bin/gnome-shell 84MiB |
| 0 N/A N/A 3157 G ...644175923361352850,131072 93MiB |
+-----------------------------------------------------------------------------+
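Seeing the table above means the graphics driver is installed, so CUDA can be installed next. The toolkit version has to be compatible with the CUDA Version shown in the first row of the table, which is the highest CUDA version this driver supports. On this machine it is 11.6, so install CUDA 11.6 or an earlier version.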
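The version rule can be checked mechanically. The sketch below reads the CUDA Version field from the nvidia-smi header and compares it with the toolkit version you plan to install. Both function names are hypothetical, and the comparison relies on GNU sort -V, which is assumed to be available (it is on stock Ubuntu).

```shell
# Hypothetical helper: print the "CUDA Version" value from nvidia-smi
# header text read on stdin.
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed when toolkit version $1 is less than or
# equal to the driver-supported version $2 (version-aware comparison).
toolkit_fits_driver() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Typical use on a real machine (not run here):
#   toolkit_fits_driver 11.2 "$(nvidia-smi | driver_cuda_version)" && echo "OK to install"
```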
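Make sure to download from NVIDIA's official website (CUDA 11.2 in my first installation).
Enter the commands shown on the official page into the terminal. The example below is the CUDA 11.7 installer, Download Installer for Linux Ubuntu 22.04 x86_64: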
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda_11.7.0_515.43.04_linux.run
sudo sh cuda_11.7.0_515.43.04_linux.run
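Running the .run file first brings up the license agreement.
Error note: while installing CUDA, a warning about an existing driver may appear: Existing package manager installation of the driver found. It is strongly recommended that you remove this before continuing.
Change to the directory that contains the installer, run sudo sh cuda_11.2.0_460.27.04_linux.run, and enter your password to start the installation.
On the warning screen, select Continue, then type accept.
CUDA 11.2
During installation you are asked to type accept. Deselect the Driver option, because a driver is already installed, then choose Install. When the installer finishes it reports that the driver was not installed, which is expected.
So leave Driver unselected, and deselect everything except the second item (CUDA Toolkit). An X inside the brackets means the item is selected; press Enter to toggle it.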
Do you accept the above EULA? (accept/decline/quit):
│ accept # type accept
│─────────────────────────────────────────────────────
# Install options: Driver 418.56 is already installed here, so Driver is left unselected.
│ CUDA Installer
│ - [ ] Driver
│ [ ] 418.39
│ + [X] CUDA Toolkit 11.2
│ [ ] CUDA Samples 11.2
│ [ ] CUDA Demo Suite 10.2
│ [ ] CUDA Documentation 10.2
│ Options
│ Install # [ ] means not selected, [X] marks the parts to install; then choose Install
————————————————
──────────────────────────────────────────────────────────────────────────────┐
│ A symlink already exists at /usr/local/cuda. Update to this installation? │
│ Yes # choose Yes to install │
│ No
————————————————
CUDA 11.7
If the driver is already installed, do not select Driver; leave its brackets empty.
┌──────────────────────────────────────────────────────────────────────────────┐
│ CUDA Installer │
│ - [ ] Driver │
│ [ ] 515.43.04 │
│ + [X] CUDA Toolkit 11.7 │
│ [X] CUDA Demo Suite 11.7 │
│ [X] CUDA Documentation 11.7 │
│ Options │
│ Install │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options │
└──────────────────────────────────────────────────────────────────────────────┘
Select "Install" to continue.
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-11.7/
Please make sure that
- PATH includes /usr/local/cuda-11.7/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-11.7/lib64, or, add /usr/local/cuda-11.7/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.7/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 515.00 is required for CUDA 11.7 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run --silent --driver
Logfile is /var/log/cuda-installer.log
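The warning in the summary is harmless as long as the driver already on the system meets the stated minimum. The sketch below extracts that minimum from the installer text and compares it with an installed driver version. The function names are hypothetical, and GNU sort -V is assumed for the version comparison.

```shell
# Hypothetical helper: print the minimum driver version mentioned in the
# installer's warning text read on stdin.
required_driver_version() {
    sed -n 's/.*driver of version at least \([0-9][0-9.]*\) is required.*/\1/p' | head -n 1
}

# Hypothetical helper: succeed when installed driver version $1 is at
# least the required minimum $2.
driver_is_new_enough() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Typical use on a real machine (not run here):
#   installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
#   driver_is_new_enough "$installed" 515.00 && echo "driver is sufficient"
```

With the versions seen in this post, 515.65.01 satisfies the 515.00 minimum for CUDA 11.7, while 510.47.03 does not.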
Check the environment variables that zsh already has:
echo $PATH
/bin:/usr/bin:/usr/local/bin:
This step cannot be skipped.
In the terminal, run: vim ~/.profile
or vim ~/.bashrc
or vim ~/.zshrc (recommended)
Press i to enter insert mode and add:
export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}} #11.2
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Type :wq! to save the file and quit.
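The ${PATH:+:${PATH}} form in those export lines appends the old value only when it is non-empty, so an empty variable does not end up with a stray colon. A sketch of the same idea as a reusable function, which also avoids adding the same directory twice when the startup file is sourced repeatedly, could look like this. The name prepend_once is hypothetical.

```shell
# Hypothetical helper: prepend directory $1 to the colon-separated list $2
# unless it is already present. An empty list yields just the directory.
prepend_once() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

# Possible use in a shell startup file (not run here):
#   export PATH="$(prepend_once /usr/local/cuda-11.2/bin "$PATH")"
#   export LD_LIBRARY_PATH="$(prepend_once /usr/local/cuda-11.2/lib64 "$LD_LIBRARY_PATH")"
```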
Reload the file:
# source ~/.profile # reload the file (refresh)
source ~/.zshrc
nvcc -V # test command
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
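If a script needs the toolkit version rather than the full banner, the release number can be cut out of the nvcc -V text. This is a minimal sketch that assumes the "release X.Y," wording shown above; the function name nvcc_release is hypothetical.

```shell
# Hypothetical helper: print the toolkit release (for example 11.2) from
# `nvcc -V` output read on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Typical use on a real machine (not run here):
#   nvcc -V | nvcc_release
```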
Find the official cuDNN download page yourself
(if you do not have an account yet, register one).
tar -zxvf cudnn-11.2-linux-x64-v8.1.1.33.tgz
# The archive unpacks into a directory named cuda; run the commands below from its parent directory.
# /usr/local/cuda-11.2 is assumed to be the toolkit directory installed above.
$ sudo cp cuda/include/cudnn.h /usr/local/cuda-11.2/include/
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda-11.2/lib64/
$ sudo chmod a+r /usr/local/cuda-11.2/include/cudnn.h /usr/local/cuda-11.2/lib64/libcudnn*
$ cat /usr/local/cuda-11.2/include/cudnn.h | grep CUDNN_MAJOR -A 2
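The check above ran into a problem: cat: /usr/local/cuda/include/cudnn_version.h: No such file or directory
For checking the cuDNN version, most tutorials give this command:
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
Newer cuDNN releases can no longer be queried this way,
because their version information lives in cudnn_version.h rather than in cudnn.h.
To fix this: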
$ sudo cp cuda/include/cudnn_version.h /usr/local/cuda/include # add this extra command
$ cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2 # check the cuDNN version
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 1
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#endif /* CUDNN_VERSION_H */
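The three defines can be joined into a single dotted version string. The sketch below assumes the header uses plain "#define NAME value" lines as in the output above; the function name cudnn_version_from_header is hypothetical.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL from a
# cudnn_version.h-style header file given as $1.
cudnn_version_from_header() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major != "") print major "." minor "." patch }' "$1"
}

# Typical use on a real machine (not run here):
#   cudnn_version_from_header /usr/local/cuda/include/cudnn_version.h
```

Matching on the exact macro name keeps the CUDNN_VERSION line, which only references the other macros, from being picked up.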
Alternatively, with the Deb packages from the official cuDNN download page:
sudo gdebi libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb
sudo gdebi libcudnn8-dev_8.1.1.33-1+cuda11.2_amd64.deb
sudo gdebi libcudnn8-samples_8.1.1.33-1+cuda11.2_amd64.deb
cuDNN 8.5
Choose the build for Ubuntu 22.04 x86_64:
Local Installer for Ubuntu22.04 x86_64 (Deb)
Installation commands:
#sudo dpkg -i cudnn-local-repo-${OS}-8.x.x.x_1.0-1_amd64.deb
sudo dpkg -i cudnn-local-repo-ubuntu2204-8.5.0.96_1.0-1_amd64.deb
Selecting previously unselected package cudnn-local-repo-ubuntu2204-8.5.0.96.
(Reading database ... 199333 files and directories currently installed.)
Preparing to unpack cudnn-local-repo-ubuntu2204-8.5.0.96_1.0-1_amd64.deb ...
Unpacking cudnn-local-repo-ubuntu2204-8.5.0.96 (1.0-1) ...
Setting up cudnn-local-repo-ubuntu2204-8.5.0.96 (1.0-1) ...
The public CUDA GPG key does not appear to be installed.
To install the key, run this command:
sudo cp /var/cudnn-local-repo-ubuntu2204-8.5.0.96/cudnn-local-7ED72349-keyring.gpg /usr/share/keyrings/
#sudo cp /var/cudnn-local-repo-*/cudnn-local-*-keyring.gpg /usr/share/keyrings/
sudo cp /var/cudnn-local-repo-ubuntu2204-8.5.0.96/cudnn-local-7ED72349-keyring.gpg /usr/share/keyrings/
sudo apt-get update
#sudo apt-get install libcudnn8=8.x.x.x-1+cudaX.Y
#sudo apt-get install libcudnn8=8.5.0.96-1+cuda11.3
sudo apt-get install libcudnn8=8.5.0.96-1+cuda11.7
# https://itcn.blog/p/57211130760.html
#sudo apt-get install libcudnn8-dev=8.x.x.x-1+cudaX.Y
sudo apt-get install libcudnn8-dev=8.5.0.96-1+cuda11.7
#sudo apt-get install libcudnn8-samples=8.x.x.x-1+cudaX.Y
sudo apt-get install libcudnn8-samples=8.5.0.96-1+cuda11.7
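The three install lines share one version suffix, so the package spec can be composed from the cuDNN and CUDA versions instead of being typed three times. This is only a sketch; the function name cudnn_pkg_spec is hypothetical, and the "-1+cuda" pattern is taken from the commands above.

```shell
# Hypothetical helper: compose "<package>=<cudnn version>-1+cuda<cuda version>"
# in the form passed to apt-get install above.
cudnn_pkg_spec() {
    printf '%s=%s-1+cuda%s\n' "$1" "$2" "$3"
}

# Possible use (not run here):
#   for pkg in libcudnn8 libcudnn8-dev libcudnn8-samples; do
#       sudo apt-get install "$(cudnn_pkg_spec "$pkg" 8.5.0.96 11.7)"
#   done
```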
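1) Copy the cuDNN samples to the home directory (the libcudnn8-samples package places them in cudnn_samples_v8)
$ cp -r /usr/src/cudnn_samples_v8 $HOME
2) Change into the mnistCUDNN sample directory
$ cd $HOME/cudnn_samples_v8/mnistCUDNN/
3) Build mnistCUDNN
$ sudo make clean
$ sudo make
4) Once the build finishes, run mnistCUDNN
$ sudo ./mnistCUDNN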
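A successful run of the sample normally ends with the line "Test passed!". A scripted check might simply look for that marker in the program output. The function name mnist_passed is hypothetical, and the exact wording of the marker is an assumption based on NVIDIA's sample.

```shell
# Hypothetical helper: succeed when the text on stdin contains the
# success marker printed by the mnistCUDNN sample.
mnist_passed() {
    grep -q 'Test passed!'
}

# Typical use on a real machine (not run here):
#   ./mnistCUDNN | mnist_passed && echo "cuDNN works"
```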
The build step is prone to errors.
PS: I ran into errors about missing packages during the build and had to install the following ones:
sudo apt-get install libfreeimage3 libfreeimage-dev