References:
Detailed installation and configuration of CUDA and cuDNN on Ubuntu 20.04 (illustrated), 嵌入式技术's blog, CSDN
Installing and uninstalling cuDNN with CUDA 11.7 on Ubuntu 20.04, weixin_54470372's blog, CSDN
Official NVIDIA CUDA Toolkit Documentation: NVIDIA Documentation Center | NVIDIA Developer | NVIDIA CUDA Toolkit
Official NVIDIA cuDNN Documentation: NVIDIA Documentation Center | NVIDIA Developer | NVIDIA cuDNN
sudo update-pciids
cgm@cgm:~/opencv-4.2.0/opencv-4.2.0/build$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2560 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)
cgm@cgm:~/opencv-4.2.0/opencv-4.2.0/build$ sudo update-pciids
[sudo] password for cgm:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 283k 100 283k 0 0 27816 0 0:00:10 0:00:10 --:--:-- 68336
Done.
cgm@cgm:~/opencv-4.2.0/opencv-4.2.0/build$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation GA106M [GeForce RTX 3060 Mobile / Max-Q] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GA106 High Definition Audio Controller (rev a1)
After the update, the GPU model is identified correctly.
If the NVIDIA line ends in rev a1, the discrete GPU is running. If it ends in rev ff, the discrete GPU is powered off.
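The revision suffix can also be checked mechanically. The sketch below assumes the lspci line format shown in the session above; gpu_power_state is a hypothetical helper name, not part of any tool.

```shell
# gpu_power_state: read one lspci line on stdin and report the discrete GPU state.
# Assumption: the line ends with "(rev a1)" or "(rev ff)" as in the output above.
gpu_power_state() {
    line=$(cat)
    case "$line" in
        *"(rev ff)"*) echo "off" ;;
        *"(rev a1)"*) echo "on" ;;
        *) echo "unknown" ;;
    esac
}

# Example with a line copied from the session above:
echo '01:00.0 VGA compatible controller: NVIDIA Corporation GA106M [GeForce RTX 3060 Mobile / Max-Q] (rev a1)' | gpu_power_state
```

On a real machine the input could come from lspci | grep -i 'vga.*nvidia'.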
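Installing the NVIDIA driver on Ubuntu 20.04 is easy: open System Settings -> Software & Updates -> Additional Drivers, select an NVIDIA driver, and click Apply Changes. The dialog automatically lists the NVIDIA drivers recommended for the GPU in your machine.
Once the NVIDIA driver is installed, run nvidia-smi in a terminal.
Output like the figure below means the installation succeeded. Driver Version is the installed NVIDIA driver version (470.161.03 here), and CUDA Version is the highest CUDA version that this driver supports (11.4 here).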
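Both numbers can be pulled out of the nvidia-smi header with a small filter. This is a sketch that assumes the header contains "Driver Version: X" and "CUDA Version: Y" on one line; smi_versions is a hypothetical helper, and the sample line is reconstructed from the versions quoted above rather than copied from a real run.

```shell
# smi_versions: read nvidia-smi output on stdin, print "<driver> <cuda>".
# Assumption: one header line carries both "Driver Version:" and "CUDA Version:".
smi_versions() {
    sed -n 's/.*Driver Version: *\([0-9.]*\).*CUDA Version: *\([0-9.]*\).*/\1 \2/p' | head -n 1
}

# Sample header line using the versions quoted above:
echo '| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.4     |' | smi_versions
```

On a machine with a working driver, nvidia-smi | smi_versions should print the same pair for that system.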
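Checking which driver versions can be installed (skip this part if your driver already works)
The link below describes the problem I caused by updating to the recommended driver. If you already have a working driver, I suggest not updating it.
Ubuntu fails to boot after a driver update, 楚歌again's blog, CSDN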
ubuntu-drivers devices
To install an NVIDIA driver, pick the version marked recommended in the output above:
sudo apt install nvidia-driver-525-open
reboot
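Installing this recommended version had serious consequences for me: the system would no longer boot.
Following the post linked above, I went back to the 470 driver and skipped the driver installation step from then on.
Check the NVIDIA driver information: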
nvidia-smi
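Source: Configuring CUDA and cuDNN on Ubuntu 20.04/22.04, 心儿痒痒's blog, CSDN
Testing whether the driver is installed and checking its version
Open a terminal and run nvidia-smi, then check the output. If the driver is installed correctly, the output looks like the figure below.
Two things to note in the figure: Driver Version is the installed NVIDIA driver version (470.161.03), and CUDA Version is the highest CUDA version this driver supports (11.4).
Source: Installing CUDA 11.0 and cuDNN on Ubuntu 20.04, Jianshu (简书)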
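Official documentation on disabling Nouveau: CUDA Installation Guide for Linux
Note: before installing the NVIDIA driver, disable nouveau, the open-source graphics driver that ships with the system. First run lsmod | grep nouveau to check whether it is loaded: any output means nouveau is active, and no output means it is already disabled.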
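The same check can be scripted so that it matches the module name exactly. This is a minimal sketch; nouveau_loaded is a hypothetical helper and the sample lsmod rows use illustrative values.

```shell
# nouveau_loaded: read lsmod output on stdin; exit 0 if a module named nouveau is listed.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Sample lsmod excerpt (illustrative values, not taken from the original post):
if printf 'Module Size Used by\nnouveau 1949696 0\n' | nouveau_loaded; then
    echo "nouveau is active"
else
    echo "nouveau is disabled"
fi
```

On a real system the input would be lsmod | nouveau_loaded.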
Official documentation for the runfile installation: CUDA Installation Guide for Linux
CUDA compatibility table:
Table 2. CUDA Toolkit and Minimum Required Driver Version for CUDA Minor Version Compatibility
CUDA Toolkit | Linux x86_64 Driver Version (minimum required*) | Windows x86_64 Driver Version (minimum required*) |
---|---|---|
CUDA 12.0.x | >=525.60.13 | >=527.41 |
CUDA 11.8.x | >=450.80.02 | >=452.39 |
CUDA 11.7.x | >=450.80.02 | >=452.39 |
CUDA 11.6.x | >=450.80.02 | >=452.39 |
CUDA 11.5.x | >=450.80.02 | >=452.39 |
CUDA 11.4.x | >=450.80.02 | >=452.39 |
CUDA 11.3.x | >=450.80.02 | >=452.39 |
CUDA 11.2.x | >=450.80.02 | >=452.39 |
CUDA 11.1 (11.1.0) | >=450.80.02 | >=452.39 |
CUDA 11.0 (11.0.3) | >=450.36.06** | >=451.22** |
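Per this table, CUDA 11.4.x needs a Linux x86_64 driver >=450.80.02 under minor-version compatibility. The frequently quoted requirement of >=470.82.01 for CUDA 11.4 is not in this table; it appears to come from the release notes' separate table of toolkit and corresponding driver versions. The 470.161.03 driver installed here satisfies both.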
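Comparing dotted driver versions by eye is error-prone, so here is a small sketch that does it with GNU sort's version ordering (-V), which Ubuntu ships. version_ge is a hypothetical helper name; the thresholds are taken from the table above.

```shell
# version_ge A B: succeed if dotted version A >= B.
# Relies on GNU sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

installed=470.161.03   # driver version reported by nvidia-smi in this post
for min in 450.80.02 525.60.13; do
    if version_ge "$installed" "$min"; then
        echo "$installed satisfies >= $min"
    else
        echo "$installed is older than $min"
    fi
done
```

With the 470.161.03 driver, the CUDA 11.x threshold is met and the CUDA 12.0.x threshold is not.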
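This section uses CUDA 11.4.0 as the example for installing CUDA on Ubuntu 20.04. On the NVIDIA CUDA download archive, https://developer.nvidia.com/cuda-toolkit-archive, click CUDA Toolkit 11.4.0 to get that release.
On the page shown in the figure below, select Linux → x86_64 → Ubuntu → 20.04 in turn. Three installer types are then offered; from experience I recommend runfile (local). Installing CUDA involves many dependency libraries, and although the runfile is larger than the other two options, it bundles all of them, so it tends to install without trouble.
Before installing CUDA 11.4, install the following libraries that it depends on: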
sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
sudo apt-get install libglfw3-dev
The commands for installing CUDA 11.4.0 on Ubuntu are:
wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda_11.4.0_470.42.01_linux.run
# The name cuda_11.4.0_470.42.01_linux.run follows the pattern cuda_<CUDA version>_<bundled driver version>_<operating system>.run
sudo sh cuda_11.4.0_470.42.01_linux.run
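The naming pattern can be taken apart with sed, which is handy when several installers sit in the same download folder. This is a sketch that assumes the cuda_<version>_<driver>_linux.run shape described above; runfile_versions is a hypothetical helper.

```shell
# runfile_versions NAME: print "<cuda_version> <driver_version>" parsed from
# a file name shaped like cuda_<cuda>_<driver>_linux.run; print nothing otherwise.
runfile_versions() {
    echo "$1" | sed -n 's/^cuda_\([0-9.]*\)_\([0-9.]*\)_linux\.run$/\1 \2/p'
}

runfile_versions cuda_11.4.0_470.42.01_linux.run
```

For the file used here this prints the CUDA version 11.4.0 followed by the driver version 470.42.01.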
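After running the sudo sh command above, wait a moment and the screen shown below appears. Select Continue, then type accept.
Next, as shown in the figure below, use the Enter key to deselect Driver (470.42.01), because the driver is already installed. Then select Install and wait for it to finish.
The installer then prints a summary that is worth reading carefully: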
cgm@cgm:~$ sudo sh cuda_11.4.0_470.42.01_linux.run
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-11.4/
Samples: Installed in /home/cgm/
Please make sure that
- PATH includes /usr/local/cuda-11.4/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-11.4/lib64, or, add /usr/local/cuda-11.4/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.4/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 470.00 is required for CUDA 11.4 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run --silent --driver
Logfile is /var/log/cuda-installer.log
In short: the driver was not selected, the toolkit was installed in /usr/local/cuda-11.4/, and the samples were installed in /home/cgm/. PATH must include /usr/local/cuda-11.4/bin, and LD_LIBRARY_PATH must include /usr/local/cuda-11.4/lib64 (alternatively, add that directory to /etc/ld.so.conf and run ldconfig as root). The warning about an incomplete installation appears because the driver was deselected on purpose. The toolkit needs a driver of version 470.00 or later, which the existing 470.161.03 driver already provides.
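Whether the two directories named in the summary are already on the search paths can be tested with a short sketch. path_contains is a hypothetical helper; it matches whole entries only, so /usr/local/cuda-11.4/bin2 would not count.

```shell
# path_contains LIST DIR: succeed if the colon-separated LIST has DIR as an exact entry.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$PATH" /usr/local/cuda-11.4/bin; then
    echo "PATH includes /usr/local/cuda-11.4/bin"
else
    echo "PATH is missing /usr/local/cuda-11.4/bin"
fi

if path_contains "${LD_LIBRARY_PATH:-}" /usr/local/cuda-11.4/lib64; then
    echo "LD_LIBRARY_PATH includes /usr/local/cuda-11.4/lib64"
else
    echo "LD_LIBRARY_PATH is missing /usr/local/cuda-11.4/lib64"
fi
```

Right after the installer finishes, both checks are expected to report the directories as missing until the environment variables are configured.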
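Now look at where things were installed.
A CUDA installation has two parts: the NVIDIA CUDA GPU computing toolkit and the NVIDIA CUDA samples.
As shown in the figure below, on Ubuntu 20.04 the toolkit is installed under /usr/local/ by default, where two new entries appear: cuda and cuda-11.4.
And here is where the samples are.
Official documentation for the environment setup: CUDA Installation Guide for Linux
After CUDA is installed, environment variables must be configured before it can be used. First run gedit ~/.bashrc in a terminal to open the .bashrc file shown in the figure below. sudo is not needed for a file in your own home directory.
Then, as shown in the figure below, append the following CUDA environment variables to the end of .bashrc. Different articles add slightly different lines here, and I am not yet sure what each one means, so I simply list them:
gedit ~/.bashrc
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin
export CUDA_HOME=/usr/local/cuda
Note: there are many ways to write these variables. The version here deliberately uses /usr/local/cuda without a version number, which makes it easier to switch between several CUDA versions installed on the same machine.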
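Switching works because /usr/local/cuda is normally a symbolic link to a versioned directory such as cuda-11.4, so repointing the link changes what the unversioned paths resolve to. The sketch below demonstrates this in a scratch directory instead of /usr/local; switch_cuda is a hypothetical helper and the cuda-11.6 directory is made up for the demonstration.

```shell
# Demonstration in a scratch directory. On a real system the link is /usr/local/cuda
# and changing it needs sudo. The cuda-11.6 directory here is hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/cuda-11.4/bin" "$demo/cuda-11.6/bin"

# switch_cuda ROOT VERSION: repoint ROOT/cuda at ROOT/cuda-VERSION.
switch_cuda() {
    [ -d "$1/cuda-$2" ] || { echo "no such toolkit: $1/cuda-$2" >&2; return 1; }
    ln -sfn "cuda-$2" "$1/cuda"
}

switch_cuda "$demo" 11.4 && readlink "$demo/cuda"
switch_cuda "$demo" 11.6 && readlink "$demo/cuda"
rm -rf "$demo"
```

The two readlink calls show the link target changing from cuda-11.4 to cuda-11.6 while the path used in .bashrc stays the same.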
Some articles write it like this (note the version number):
export PATH=/usr/local/cuda-11.6/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Others write it like this (again, note the version number):
export PATH=/usr/local/cuda-11.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64:$LD_LIBRARY_PATH
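The ${PATH:+:${PATH}} form in the 11.6 variant is worth a closer look: it appends ":" plus the old value only when the old value is non-empty, which avoids a stray trailing colon. The dynamic linker treats an empty entry in LD_LIBRARY_PATH as the current directory. The sketch below isolates the idiom; prepend is a hypothetical helper.

```shell
# prepend DIR LIST: print DIR followed by LIST, adding ":" only when LIST is non-empty.
# Same ${var:+...} idiom as in the export lines above.
prepend() {
    dir=$1
    list=$2
    echo "${dir}${list:+:${list}}"
}

prepend /usr/local/cuda-11.6/lib64 ""          # empty list: no trailing colon
prepend /usr/local/cuda-11.6/lib64 /usr/lib    # non-empty list: joined with ":"
```

A plain $LD_LIBRARY_PATH:/usr/local/cuda/lib64 would instead start with a bare ":" whenever the variable was previously unset.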
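Finally, run source ~/.bashrc in a terminal, or open a new terminal, for the changes to take effect. You can then run nvcc -V to see the installed CUDA version, as shown in the figure below. At this point CUDA is installed.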
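To use the release number in a script, it can be extracted from the nvcc -V output. This sketch assumes the output contains a line of the form "Cuda compilation tools, release X.Y, VX.Y.Z"; nvcc_release is a hypothetical helper and the sample line is illustrative.

```shell
# nvcc_release: read `nvcc -V` output on stdin and print the release number.
# Assumption: one line looks like "Cuda compilation tools, release 11.4, V11.4.48".
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Illustrative sample line for a CUDA 11.4 toolkit:
echo 'Cuda compilation tools, release 11.4, V11.4.48' | nvcc_release
```

On a configured system the input would be nvcc -V | nvcc_release.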
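2.3. Testing CUDA
To verify the CUDA installation, go to the NVIDIA CUDA samples in the home directory,
/home/cgm/NVIDIA_CUDA-11.4_Samples. Open a terminal in that folder, run make, and wait. Then enter the 1_Utilities/deviceQuery folder and run ./deviceQuery. Result = PASS in the output means the installation works.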
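For an unattended check, the PASS line can be tested instead of read. This is a minimal sketch; device_query_passed is a hypothetical helper and the sample text stands in for real deviceQuery output.

```shell
# device_query_passed: read deviceQuery output on stdin; succeed if it reports "Result = PASS".
device_query_passed() {
    grep -q 'Result = PASS'
}

# Stand-in for real output, which is much longer:
if printf 'Detected 1 CUDA Capable device(s)\nResult = PASS\n' | device_query_passed; then
    echo "deviceQuery passed"
else
    echo "deviceQuery did not pass"
fi
```

On a real system the input would be ./deviceQuery | device_query_passed, run from the deviceQuery folder.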
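A small hiccup: make failed with this error
VulkanBaseApp.cpp:30:10: fatal error: GLFW/glfw3.h: No such file or directory
30 | #include <GLFW/glfw3.h>
Installing the GLFW development package fixes it: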
sudo apt-get install libglfw3-dev
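Official cuDNN installation documentation: Installation Guide :: NVIDIA Deep Learning cuDNN Documentation
Download the cuDNN build that matches the installed CUDA version from the NVIDIA cuDNN download page, https://developer.nvidia.com/rdp/cudnn-download (registration required). For Ubuntu 20.04 with CUDA 11.4.0, choose cuDNN v8.2.4, as shown in the figure below:
Extract the downloaded cudnn-11.4-linux-x64-v8.2.4.15.tgz to get the folder cudnn-11.4-linux-x64-v8.2.4.15. The command is:
tar -zxvf cudnn-11.4-linux-x64-v8.2.4.15.tgz
Then enter cudnn-11.4-linux-x64-v8.2.4.15, right-click -> Open in Terminal, and run the two commands below.
The first copies the contents of cuda/lib64 to /usr/local/cuda-11.4/lib64/.
The second copies the contents of cuda/include to /usr/local/cuda-11.4/include/.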
sudo cp cuda/lib64/* /usr/local/cuda-11.4/lib64/
sudo cp cuda/include/* /usr/local/cuda-11.4/include/
Once the copy is done, the following command shows the cuDNN version:
cat /usr/local/cuda-11.4/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
If it prints the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL definitions, the copy succeeded.
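The three macros can be joined into a single version string. The sketch below assumes the header defines them as plain "#define NAME value" lines; cudnn_header_version is a hypothetical helper, and the sample lines use the 8.2.4 version installed in this post.

```shell
# cudnn_header_version: read cudnn_version.h on stdin, print MAJOR.MINOR.PATCHLEVEL.
cudnn_header_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }'
}

# Sample lines for cuDNN 8.2.4:
printf '#define CUDNN_MAJOR 8\n#define CUDNN_MINOR 2\n#define CUDNN_PATCHLEVEL 4\n' | cudnn_header_version
```

On the installed system the input would be cudnn_header_version < /usr/local/cuda-11.4/include/cudnn_version.h.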
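Download the three .deb packages used for verification from the NVIDIA cuDNN download page, as shown in the figure below:
Install the three downloaded .deb packages from a terminal: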
sudo dpkg -i libcudnn8_8.2.4.15-1+cuda11.4_amd64.deb
sudo dpkg -i libcudnn8-dev_8.2.4.15-1+cuda11.4_amd64.deb
sudo dpkg -i libcudnn8-samples_8.2.4.15-1+cuda11.4_amd64.deb
Check what is installed:
sudo dpkg -l | grep cudnn
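A grep match alone does not say whether a package is fully installed; the first column of dpkg -l carries the state, with "ii" meaning installed. The sketch below filters on that column. installed_cudnn_packages is a hypothetical helper, and the sample rows are illustrative, with descriptions that are not copied from a real system.

```shell
# installed_cudnn_packages: read `dpkg -l` output on stdin and print "name version"
# for every cuDNN package whose status column is "ii" (installed).
installed_cudnn_packages() {
    awk '$1 == "ii" && $2 ~ /cudnn/ { print $2, $3 }'
}

# Sample rows shaped like dpkg -l output for the three packages above:
printf '%s\n' \
    'ii  libcudnn8          8.2.4.15-1+cuda11.4  amd64  cuDNN runtime libraries' \
    'ii  libcudnn8-dev      8.2.4.15-1+cuda11.4  amd64  cuDNN development libraries and headers' \
    'ii  libcudnn8-samples  8.2.4.15-1+cuda11.4  amd64  cuDNN samples' |
    installed_cudnn_packages
```

On a real system the input would be dpkg -l | installed_cudnn_packages, and three lines are expected after the installs above.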
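These three packages install the cuDNN test files under /usr/src/cudnn_samples_v8. Go into the mnistCUDNN folder there and run make clean && make. If the result matches the figure below, cuDNN is installed correctly.
Running make produced an error: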
rm -rf *o
rm -rf mnistCUDNN
CUDA_VERSION is 11040
Linking agains cublasLt = true
CUDA VERSION: 11040
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 35 50 53 60 61 62 70 72 75 80 86
/bin/sh: 1: cannot create test.c: Permission denied
/bin/sh: 1: cannot create test.c: Permission denied
g++: error: test.c: No such file or directory
g++: warning: ‘-x c’ after last input file has no effect
g++: fatal error: no input files
compilation terminated.
>>> WARNING - FreeImage is not set up correctly. Please ensure FreeImage is set up correctly. <<<
[@] /usr/local/cuda/bin/nvcc -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o fp16_dev.o -c fp16_dev.cu
[@] g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -o fp16_emu.o -c fp16_emu.cpp
[@] g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -o mnistCUDNN.o -c mnistCUDNN.cpp
[@] /usr/local/cuda/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -lcublasLt -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm
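(1) Because of the warning
WARNING - FreeImage is not set up correctly. Please ensure FreeImage is set up correctly.
install libfreeimage first: sudo apt-get install libfreeimage3 libfreeimage-dev
(2) Permission denied: /usr/src/cudnn_samples_v8 is owned by root, so make cannot create files there as a normal user. Prefixing the commands with sudo makes the build succeed: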
sudo make clean && sudo make
cgm@cgm:/usr/src/cudnn_samples_v8/mnistCUDNN$ ./mnistCUDNN
Executing: mnistCUDNN
cudnnGetVersion() : 8204 , CUDNN_VERSION from cudnn.h : 8204 (8.2.4)
Host compiler version : GCC 9.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 30 Capabilities 8.6, SmClock 1425.0 Mhz, MemSize (Mb) 5921, MemClock 7001.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
Testing single precision
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.012288 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.013184 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.049984 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.269312 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 1.657632 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 3.042144 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 128848 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.043008 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.082944 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.109344 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.315200 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.335872 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.984064 time requiring 128848 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.011264 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.012288 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.012288 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.037888 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.043008 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.052224 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 128848 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.035840 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.045056 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.075776 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.080896 time requiring 128848 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.083744 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.102400 time requiring 128000 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
Testing half precision (math in single precision)
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.012288 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.012576 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.025600 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.049440 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.054272 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.054272 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.039936 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.045056 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.047104 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.075776 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.086016 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.086016 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.010240 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.013376 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.025408 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.040128 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.044992 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.060416 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.041984 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.043808 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.046080 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.081920 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.086912 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.090080 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
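The integer versions in these logs (CUDA_VERSION is 11040 from make, cudnnGetVersion() 8204 from the test) can be decoded by hand or with the sketch below. It assumes the cuDNN 8 encoding MAJOR*1000 + MINOR*100 + PATCHLEVEL and the CUDA encoding MAJOR*1000 + MINOR*10; both function names are hypothetical.

```shell
# decode_cudnn_version N: cuDNN 8 encodes its version as MAJOR*1000 + MINOR*100 + PATCHLEVEL.
decode_cudnn_version() {
    echo "$(( $1 / 1000 )).$(( $1 % 1000 / 100 )).$(( $1 % 100 ))"
}

# decode_cuda_version N: CUDA encodes its version as MAJOR*1000 + MINOR*10.
decode_cuda_version() {
    echo "$(( $1 / 1000 )).$(( $1 % 1000 / 10 ))"
}

decode_cudnn_version 8204    # value printed by cudnnGetVersion() above
decode_cuda_version 11040    # CUDA_VERSION printed by make above
```

The results, 8.2.4 and 11.4, match the versions installed in this post.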
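Note this line in the installer summary:
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.4/bin
Go to /usr/local/cuda-11.4/bin (not the cuda directory), open a terminal there, and run sudo ./cuda-uninstaller.
The screen shown in the figure below appears. Use the Enter key to select all three options, then choose Done. When the uninstaller finishes, the CUDA files it installed are removed.
Finally, run sudo rm -rf /usr/local/cuda-11.4 in a terminal to remove what is left of CUDA 11.4 and cuDNN v8.2.4.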
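After uninstalling, it is worth checking that nothing CUDA-related remains under /usr/local, including a now-dangling cuda link. This is a sketch using GNU find; leftover_cuda is a hypothetical helper.

```shell
# leftover_cuda PREFIX: list cuda* entries that still exist directly under PREFIX
# (normally /usr/local). Symbolic links are listed even if their target is gone.
leftover_cuda() {
    find "$1" -maxdepth 1 -name 'cuda*' 2>/dev/null | sort
}

remaining=$(leftover_cuda /usr/local)
if [ -z "$remaining" ]; then
    echo "no CUDA entries left under /usr/local"
else
    echo "still present:"
    echo "$remaining"
fi
```

If other CUDA versions are installed on the same machine, their directories will be listed too, and they should of course be left alone.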