Ubuntu 16.04: Manual NVIDIA Driver Installation, CUDA, and cuDNN Installation Notes

Complete installation process

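  • I. Check whether your machine meets the requirements for installing CUDA
    • 1. Verify that your machine has a CUDA-capable GPU
    • 2. Verify that your Linux version supports CUDA (Ubuntu 16.04 is fine)
    • 3. Verify that gcc is installed
    • 4. Verify that the kernel headers and development packages are installed
      • 4.1 Check the version of the running kernel:
  • II. NVIDIA driver
    • 1. Uninstall the previous driver
    • 2. Installation
    • 3. Check that the NVIDIA driver installed successfully
  • III. CUDA
    • A. Installing from the runfile
      • 0. Installation
      • 1. Add the environment variables
      • 2. Check that the installation succeeded
        • 2.1 Enter in the terminal
        • 2.2 Verify the driver version
        • 2.3 Verify the CUDA Toolkit
        • 2.4 Finally, check the connection between the system and the CUDA-capable device
          • Enter in the terminal:
    • √ B. Installing from the deb package
      • 1. Installation
      • 2. Testing
    • C. Uninstalling CUDA
  • IV. cuDNN
    • 1-1. Installing From A Tar File
    • 1-2. Verification:
    • √ 2-1. Installing From A Debian File
    • 2-2. Verifying The cuDNN Install On Linux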

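Note: if you use Anaconda, it seems you do not need to install CUDA and cuDNN manually; they are installed automatically together with tensorflow/pytorch. Adjust the version numbers that appear in this article to match your own installation.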

CUDA Toolkit 11.0 Update1 (Aug 2020), Versioned Online Documentation
CUDA Toolkit 11.0 (May 2020), Versioned Online Documentation Download
cuDNN v8.0.2 (July 24th, 2020), for CUDA 11.0
These versions were too new; the tar-based installation kept running into problems.

The versions finally installed were:
CUDA Toolkit 10.2 (Nov 2019), Versioned Online Documentation
Download cuDNN v7.6.5 (November 18th, 2019), for CUDA 10.2

Version compatibility table for CUDA, NVIDIA Driver, Linux, and GCC

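I. Check whether your machine meets the requirements for installing CUDA

1. Verify that your machine has a CUDA-capable GPU

You can find the exact graphics card model in your computer's specifications; on a dual-boot machine, Device Manager in Windows also shows the card's details.
You can also run this command in an Ubuntu terminal:

 lspci | grep -i nvidia

It prints the model of your NVIDIA GPU, though without much detail.

On my machine (GeForce GTX 970) it shows: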

01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)

2. Verify that your Linux version supports CUDA (Ubuntu 16.04 is fine)

Run the command:

uname -m && cat /etc/*release

The output is:

x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04

......

3. Verify that gcc is installed

In the terminal, enter:

gcc --version   # or: gcc -v

The output is:

gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
......

If it is not installed, install it with:

sudo apt-get install build-essential

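4. Verify that the kernel headers and development packages are installed

4.1 Check the version of the running kernel:

In the terminal, enter:

uname -r

The output is:

4.15.0-112-generic

If they are missing, enter in the terminal:

sudo apt-get install linux-headers-$(uname -r)

This installs the kernel headers and development packages that match the running kernel version.

If all of the checks above pass, you can proceed with the installation below. If any requirement is not met, consult the official CUDA documentation, which has detailed solutions for each problem.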
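
The four checks above can be gathered into one script. The following is a minimal sketch and not part of the original write-up: the helper name `report` is hypothetical, and the header check assumes a Debian/Ubuntu system where `dpkg` is available.

```shell
#!/bin/sh
# Sketch: run the four prerequisite checks in one pass.
# `report` is a hypothetical helper introduced for this example.

report() {
    # $1 = description, $2 = "ok" or anything else
    if [ "$2" = "ok" ]; then
        echo "[ OK ] $1"
    else
        echo "[FAIL] $1"
    fi
}

# 1. CUDA-capable GPU: look for an NVIDIA device on the PCI bus
if command -v lspci >/dev/null 2>&1 && lspci | grep -qi nvidia; then
    report "NVIDIA GPU found" ok
else
    report "NVIDIA GPU found" no
fi

# 2. Architecture (this guide targets x86_64)
if [ "$(uname -m)" = "x86_64" ]; then
    report "x86_64 architecture" ok
else
    report "x86_64 architecture" no
fi

# 3. gcc
if command -v gcc >/dev/null 2>&1; then
    report "gcc installed" ok
else
    report "gcc installed" no
fi

# 4. Kernel headers for the running kernel
if command -v dpkg >/dev/null 2>&1 && dpkg -s "linux-headers-$(uname -r)" >/dev/null 2>&1; then
    report "kernel headers for $(uname -r)" ok
else
    report "kernel headers for $(uname -r)" no
fi
```

Any line marked [FAIL] points back to the corresponding numbered step above.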

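II. NVIDIA driver

NVIDIA-Linux-x86_64-450.57.run
NVIDIA China official website

For laptops, see "NVIDIA GPU Usage on Ubuntu Laptops"; on my laptop the graphics driver can be installed by following the link there. The steps below apply to installing the driver on a desktop.

Following "Installing the NVIDIA Graphics Driver on Ubuntu 16.04", I managed to install the graphics driver manually for the first time.
System state before installation: a fresh Ubuntu 16.04 install still running on the integrated graphics, before switching to the NVIDIA driver that ships with the system.

1. Uninstall the previous driver

I did not need this step because no driver had been installed before. The commands are copied from other blogs and are easy enough to follow.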

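#for case1: original driver installed by apt-get:
sudo apt-get remove --purge 'nvidia*'

#for case2: original driver installed by runfile:
sudo chmod +x *.run
sudo ./NVIDIA-Linux-x86_64-384.59.run --uninstall

If the original driver was installed with apt-get, use the first method.
If the original driver was installed from a runfile, uninstall it with the --uninstall option. In fact, the runfile installer also removes the previous driver during installation, so uninstalling by hand is optional.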

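2. Installation

Disable the nouveau driver:

lsmod | grep nouveau # check for output; if anything is printed, nouveau must be disabled
sudo gedit /etc/modprobe.d/blacklist.conf  # add the default driver to the blacklist

Append the following lines to the end of blacklist.conf:

blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist rivatv
blacklist nvidiafb

Update: sudo update-initramfs -u # this step may take a moment, which is fine.
Reboot, then run: lsmod | grep nouveau # no output at all means nouveau is disabled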
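
The blacklist entries can also be appended from the shell instead of an editor. A minimal sketch, assuming a hypothetical helper `add_blacklist`; it adds each entry only if that exact line is not already in the file, so running it twice is harmless.

```shell
# add_blacklist FILE MODULE...
# Appends "blacklist MODULE" to FILE unless that exact line already exists.
# (Hypothetical helper for illustration; not part of any NVIDIA tooling.)
add_blacklist() {
    file=$1
    shift
    touch "$file" || return 1
    for mod in "$@"; do
        if ! grep -qxF "blacklist $mod" "$file"; then
            echo "blacklist $mod" >> "$file"
        fi
    done
}
```

Run as root, the call would be `add_blacklist /etc/modprobe.d/blacklist.conf vga16fb nouveau rivafb rivatv nvidiafb`, followed by `update-initramfs -u` and a reboot as described above.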

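Switch to tty mode for the installation:
Ctrl+Alt+F1 enters text mode and Ctrl+Alt+F7 returns to the graphical interface (progress in text mode is preserved in the meantime, so you can switch back to text mode and continue).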

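# log in with your user name and password
sudo  su # enter the password to run as root
cd ~/  #    ~/ <=> /home/yourname/
sudo service lightdm stop # stop the graphical interface
# if the installation fails, bring the graphical interface back with: sudo service lightdm restart; then remove the entries just added to the blacklist and reboot to return to the original state
sudo init 3 # described on the official site: Switch to runlevel 3.
sudo sh NVIDIA-Linux-x86_64-450.57.run --no-opengl-files --no-x-check --no-nouveau-check
# --no-opengl-files installs only the driver files, not the OpenGL files. This is the most important option: only installing without OpenGL avoids the login-loop problem
# --no-x-check skips the check for a running X server during installation
# --no-nouveau-check skips the check for nouveau during installation
# the last two options may be omitted.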

If the following prompts appear during installation, answer as shown (important! I was stuck on this for a long time):
Earlier I read the error message, searched the official forum for a fix, and found advice that the Ubuntu kernel might be unsupported and need upgrading. I spent a long time on that, but it is unnecessary.

  • The distribution-provided pre-install script failed! Are you sure you
    want to continue? Choose yes to continue.
  • Would you like to register the kernel module sources with DKMS? This
    will allow DKMS to automatically build a new module, if you install a
    different kernel later? Choose No to continue.
  • Nvidia's 32-bit compatibility libraries? Choose No to continue.
  • Would you like to run the nvidia-xconfig utility to automatically
    update your x configuration so that the NVIDIA x driver will be used
    when you restart x? Any pre-existing x configuration file will be
    backed up. Choose no to continue.

At the end you will see the installation success message.

Installation of NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version:450.57) is now complete. Please update your xorg.conf file as appropriate; see the file /usr/share/doc/NVIDIA_GLX-1.0/README.txt for details.
														OK
sudo service lightdm restart # restart the graphical interface
nvidia-smi # check that the installation succeeded

3. Check that the NVIDIA driver installed successfully

Enter in the terminal:

cat /proc/driver/nvidia/version   # prints the NVIDIA driver version
lspci | grep -i nvidia            # shows the GPU model information

III. CUDA

cuda-repo-ubuntu1604-11-0-local_11.0.2-450.51.05-1_amd64.deb
CUDA Toolkit 11.0 (May 2020), Versioned Online Documentation
CUDA Toolkit Archive

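Official information: CUDA Toolkit Documentation v10.2.89

Official installation guide: NVIDIA CUDA Installation Guide for Linux.

NVIDIA driver and CUDA version compatibility table

A. Installing from the runfile

0. Installation

The environment was configured following "Ubuntu 16.04 + CUDA 9.0 Installation (the simplest and fastest installation on the web, tested successfully)".
"Detailed Tutorial for Installing CUDA 9.0 on Ubuntu 16.04"
CUDA was installed following "Installing CUDA 10.1 on Ubuntu 18.04".

Be careful not to install the driver: leave the Driver entry unchecked, as in the screen below.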

┌──────────────────────────────────────────────────────────────────────────────┐
│ CUDA Installer                                                               │
│ - [ ] Driver                                                                 │
│      [ ] 450.51.05                                                           │
│ + [X] CUDA Toolkit 11.0                                                      │
│   [X] CUDA Samples 11.0                                                      │
│   [X] CUDA Demo Suite 11.0                                                   │
│   [X] CUDA Documentation 11.0                                                │
│   Options                                                                    │
│   Install                                                                    │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│ Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options │

On my laptop, this screen appeared:

┌──────────────────────────────────────────────────────────────────────────────┐
│ Existing package manager installation of the driver found. It is strongly    │
│ recommended that you remove this before continuing.                          │
│ Abort                                                                        │
│ Continue                                                                     │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│ Up/Down: Move | 'Enter': Select                                              │

Choose Continue.

After installation the following summary is shown; the environment still needs to be configured.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-10.2/
Samples:  Installed in /home/mooc/

Please make sure that
 -   PATH includes /usr/local/cuda-10.2/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.2/lib64, or, add /usr/local/cuda-10.2/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-10.2/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.2/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 440.00 is required for CUDA 10.2 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-10.1/
Samples:  Installed in /home/lzy/

Please make sure that
 -   PATH includes /usr/local/cuda-10.1/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.1/lib64, or, add /usr/local/cuda-10.1/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-10.1/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.1/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 418.00 is required for CUDA 10.1 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log

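1. Add the environment variables

gedit ~/.bashrc

Append the lines for your installed CUDA version to the end of the file (the two pairs below are for CUDA 11.0 and CUDA 10.1 respectively; add only the pair that matches your installation):

export PATH=/usr/local/cuda-11.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export PATH=/usr/local/cuda-10.1/bin:/usr/local/cuda-10.1/NsightCompute-2019.1${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Finally, make the change take effect:

source ~/.bashrc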
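
The `${PATH:+:${PATH}}` part of the export lines expands to `:` followed by the old value only when the variable is already set and non-empty. That avoids a stray trailing colon, which would otherwise add an empty entry (treated as the current directory) to the search path. A minimal sketch of the behavior; `prepend_dir` is a hypothetical helper used only for illustration.

```shell
# prepend_dir DIR CURRENT
# Prints DIR placed in front of CURRENT, using the same ${VAR:+...} idiom
# as the export lines in ~/.bashrc.
prepend_dir() {
    dir=$1
    current=$2
    echo "${dir}${current:+:${current}}"
}

prepend_dir /usr/local/cuda-10.2/bin ""             # prints /usr/local/cuda-10.2/bin
prepend_dir /usr/local/cuda-10.2/bin /usr/bin:/bin  # prints /usr/local/cuda-10.2/bin:/usr/bin:/bin
```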

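2. Check that the installation succeeded

2.1 Enter in the terminal

a. The following verification applies to a runfile installation. With the deb installation, the NVIDIA_CUDA-10.2_Samples directory was not created.

cd /usr/local/cuda-10.2/samples/1_Utilities/deviceQuery

sudo make

./deviceQuery

The output is as follows: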

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 970"
  CUDA Driver Version / Runtime Version          11.0 / 10.2
  CUDA Capability Major/Minor version number:    5.2
  Total amount of global memory:                 4040 MBytes (4236115968 bytes)
  (13) Multiprocessors, (128) CUDA Cores/MP:     1664 CUDA Cores
  GPU Max Clock rate:                            1253 MHz (1.25 GHz)
  Memory Clock rate:                             3505 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 1835008 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.0, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS

If Result = PASS appears, the installation passed.

2.2 Verify the driver version

cat /proc/driver/nvidia/version

The output looks similar to:
NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.81 Sat Sep 2 02:43:11 PDT 2017
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5)
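
The summary printed by the CUDA installer states a minimum driver version (at least 440.00 for CUDA 10.2 in the summary earlier), so the version reported here can be compared against it. A minimal sketch; `driver_version` and `version_ge` are hypothetical helpers, and `sort -V` assumes GNU coreutils, which Ubuntu ships.

```shell
# driver_version: read /proc/driver/nvidia/version-style text on stdin
# and print the kernel module version, e.g. 384.81
driver_version() {
    sed -n 's/.*Kernel Module *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# version_ge A B: succeed if version A >= version B
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# The sample line is the one shown above, not a live reading.
sample='NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.81 Sat Sep 2 02:43:11 PDT 2017'
v=$(echo "$sample" | driver_version)
if version_ge "$v" 440.00; then
    echo "driver $v is new enough for CUDA 10.2"
else
    echo "driver $v is too old for CUDA 10.2"
fi
```

On a real system the version would come from `driver_version < /proc/driver/nvidia/version`.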

2.3 Verify the CUDA Toolkit

nvcc -V      # prints the CUDA version information
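
To use the toolkit version in a script, the release number can be cut out of the `nvcc -V` output. A minimal sketch; `nvcc_release` is a hypothetical helper, and the sample line assumes the usual `release X.Y` wording in nvcc's output rather than text captured from the original machine.

```shell
# nvcc_release: read `nvcc -V` output on stdin and print the release, e.g. 10.2
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Sample line in the assumed format
echo 'Cuda compilation tools, release 10.2, V10.2.89' | nvcc_release
```

With the toolkit on the PATH, the call would be `nvcc -V | nvcc_release`.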

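2.4 Finally, check the connection between the system and the CUDA-capable device

Open a terminal and enter:

cd NVIDIA_CUDA-10.2_Samples/

Then enter in the terminal: $ make
The build starts automatically and takes roughly ten to twenty minutes, so please be patient. If an error occurs, make reports it and stops immediately.

It did fail! Use make -k to keep going past this error.

fatal error: nvscibuf.h

The first run may also fail with an error message saying that gcc is not present on the system.

The fix is to reinstall gcc. In the terminal, enter: $ sudo apt-get install gcc. After gcc is installed, make runs normally.

If the build succeeds, it ends with Finished building CUDA samples, as shown below.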

make[1]: Leaving directory '/home/mooc/NVIDIA_CUDA-11.0_Samples/7_CUDALibraries/simpleCUBLASXT'
Finished building CUDA samples
Enter in the terminal:
 ./bandwidthTest

Output similar to the following means success:

[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce GTX 970
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   32000000			11.8

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   32000000			11.4

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   32000000			145.7

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

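√ B. Installing from the deb package

Package Manager Installation

1. Installation

Just follow the official installation guide, NVIDIA CUDA Installation Guide for Linux.

2. Testing

Test as described in the official installation guide above. There is no NVIDIA_CUDA-10.2_Samples directory, but the samples under /usr/local/cuda-11.0/samples work the same way. Copy /usr/local/cuda-11.0/samples into your home directory, then:

cd ~/samples
make
cd ~/samples/1_Utilities/deviceQuery
./deviceQuery
cd ~/samples/1_Utilities/bandwidthTest
./bandwidthTest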

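C. Uninstalling CUDA

See the official guide.

  1. Uninstall the existing CUDA (note: there is no need to uninstall the graphics driver; that only creates trouble for yourself):
sudo /usr/local/cuda-11.0/bin/cuda-uninstaller   # older releases such as CUDA 8.0 ship uninstall_cuda_X.Y.pl instead
  2. After uninstalling, files still remain under /usr/local/cuda-11.0; these are the cuDNN files, so the cuda-11.0 directory also has to be removed completely:
sudo rm -rf /usr/local/cuda-11.0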

Removing CUDA Toolkit and Driver

To remove CUDA Toolkit:

$ sudo apt-get --purge remove "*cublas*" "cuda*"

To remove NVIDIA Drivers:

$ sudo apt-get --purge remove "*nvidia*"
sudo rm -rf cuda

sudo rm -r cuda-9.0

IV. cuDNN

Download cuDNN v8.0.2 (July 24th, 2020), for CUDA 11.0

cuDNN
cudnn-archive
Downloading cuDNN For Linux
Official installation reference: This Archives document provides access to previously released cuDNN documentation versions.

1-1. Installing From A Tar File

Reference: "Installing and Configuring the TensorFlow GPU Version on Ubuntu 16.04"

Select: cuDNN Library for Linux to download

Reference: "Checking Whether cuDNN Was Installed Successfully"
https://www.jianshu.com/p/8e9090a62342
https://www.cnblogs.com/liuwenhua/p/11521668.html
https://blog.csdn.net/wanzhen4330/article/details/81699769#cudnn%E7%9A%84%E5%AE%89%E8%A3%85

1. First, extract the downloaded cuDNN archive:

tar -xzvf cudnn-11.0-linux-x64-v8.0.2.39.tgz

The extracted files are:

cuda/include/cudnn.h
cuda/include/cudnn_adv_infer.h
cuda/include/cudnn_adv_train.h
cuda/include/cudnn_backend.h
cuda/include/cudnn_cnn_infer.h
cuda/include/cudnn_cnn_train.h
cuda/include/cudnn_ops_infer.h
cuda/include/cudnn_ops_train.h
cuda/include/cudnn_version.h
cuda/NVIDIA_SLA_cuDNN_Support.txt
cuda/lib64/libcudnn.so
cuda/lib64/libcudnn.so.8
cuda/lib64/libcudnn.so.8.0.2
cuda/lib64/libcudnn_adv_infer.so
cuda/lib64/libcudnn_adv_infer.so.8
cuda/lib64/libcudnn_adv_infer.so.8.0.2
cuda/lib64/libcudnn_adv_infer_static.a
cuda/lib64/libcudnn_adv_train.so
cuda/lib64/libcudnn_adv_train.so.8
cuda/lib64/libcudnn_adv_train.so.8.0.2
cuda/lib64/libcudnn_adv_train_static.a
cuda/lib64/libcudnn_cnn_infer.so
cuda/lib64/libcudnn_cnn_infer.so.8
cuda/lib64/libcudnn_cnn_infer.so.8.0.2
cuda/lib64/libcudnn_cnn_infer_static.a
cuda/lib64/libcudnn_cnn_train.so
cuda/lib64/libcudnn_cnn_train.so.8
cuda/lib64/libcudnn_cnn_train.so.8.0.2
cuda/lib64/libcudnn_cnn_train_static.a
cuda/lib64/libcudnn_ops_infer.so
cuda/lib64/libcudnn_ops_infer.so.8
cuda/lib64/libcudnn_ops_infer.so.8.0.2
cuda/lib64/libcudnn_ops_infer_static.a
cuda/lib64/libcudnn_ops_train.so
cuda/lib64/libcudnn_ops_train.so.8
cuda/lib64/libcudnn_ops_train.so.8.0.2
cuda/lib64/libcudnn_ops_train_static.a
cuda/lib64/libcudnn_static.a
cuda/lib64/libcudnn_static.a

Procedure
Navigate to your directory containing the cuDNN Tar file.
Unzip the cuDNN package.

$ tar -xzvf cudnn-x.x-linux-x64-v8.x.x.x.tgz

Copy the following files into the CUDA Toolkit directory, and change the file permissions.

$ sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*

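1-2. Verification:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

This failed with what was reported as a linking problem. (While trying CUDA 11 with cuDNN 8, the tar-based installation consistently had linking problems.)

Re-create the symbolic links (the version numbers below come from older cuDNN releases; substitute the library files actually present in lib64):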

cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.5
sudo ln -s libcudnn.so.6.0.21 libcudnn.so.6
sudo ln -s libcudnn.so.6 libcudnn.so

sudo chmod +r libcudnn.so.7.0.5
sudo ln -sf libcudnn.so.7.0.5 libcudnn.so.7  
sudo ln -sf libcudnn.so.7 libcudnn.so     
sudo ldconfig
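
The hard-coded version numbers above can be avoided by deriving the links from whichever fully versioned library is present. A minimal sketch, assuming a hypothetical helper `relink_cudnn` and a directory that contains one `libcudnn.so.X.Y.Z` file; it handles only the main `libcudnn` library, not the split `libcudnn_*` libraries that cuDNN 8 ships.

```shell
# relink_cudnn DIR
# Recreates libcudnn.so.X -> libcudnn.so.X.Y.Z and libcudnn.so -> libcudnn.so.X
# inside DIR. Hypothetical helper for illustration.
relink_cudnn() (
    cd "$1" || exit 1
    # pick the fully versioned library, e.g. libcudnn.so.7.6.5
    real=$(ls libcudnn.so.*.*.* 2>/dev/null | head -n 1)
    if [ -z "$real" ]; then
        echo "no libcudnn.so.X.Y.Z found in $1" >&2
        exit 1
    fi
    major=${real#libcudnn.so.}   # e.g. 7.6.5
    major=${major%%.*}           # e.g. 7
    ln -sf "$real" "libcudnn.so.$major"
    ln -sf "libcudnn.so.$major" libcudnn.so
    echo "libcudnn.so -> libcudnn.so.$major -> $real"
)
```

For the real installation this would be run as root on /usr/local/cuda/lib64 and followed by `ldconfig`, as in the commands above.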

If there is no error, the output is:

#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
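
The three macros can also be combined into a single version string. A minimal sketch; `cudnn_version` is a hypothetical helper. It takes the header path as an argument because, to my knowledge, cuDNN 8 defines these macros in `cudnn_version.h` (which appears in the extracted file list earlier) rather than in `cudnn.h`, so the `grep` on `cudnn.h` may print nothing for cuDNN 8.

```shell
# cudnn_version HEADER
# Prints MAJOR.MINOR.PATCHLEVEL taken from the #define lines in HEADER.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}
```

For cuDNN 7 the call would be `cudnn_version /usr/local/cuda/include/cudnn.h`; for cuDNN 8, point it at `/usr/local/cuda/include/cudnn_version.h` instead.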

√ 2-1. Installing From A Debian File

Official installation reference
Download three files:

cuDNN Runtime Library for Ubuntu16.04 x86_64 (Deb)
cuDNN Developer Library for Ubuntu16.04 x86_64 (Deb)
cuDNN Code Samples and User Guide for Ubuntu16.04 x86_64 (Deb)

Then install them:

sudo dpkg -i libcudnn8_8.0.2.39-1+cuda11.0_amd64.deb 
sudo dpkg -i libcudnn8-dev_8.0.2.39-1+cuda11.0_amd64.deb 
sudo dpkg -i libcudnn8-doc_8.0.2.39-1+cuda11.0_amd64.deb 

2-2. Verifying The cuDNN Install On Linux

cp -r /usr/src/cudnn_samples_v8/ /home/mooc/
cd /home/mooc/cudnn_samples_v8/mnistCUDNN
make clean && make
./mnistCUDNN

The output is as follows:

Executing: mnistCUDNN
cudnnGetVersion() : 8002 , CUDNN_VERSION from cudnn.h : 8002 (8.0.2)
Host compiler version : GCC 5.4.0

There are 1 CUDA capable devices on your machine :
device 0 : sms 13  Capabilities 5.2, SmClock 1253.0 Mhz, MemSize (Mb) 4039, MemClock 3505.0 Mhz, Ecc=0, boardGroupID=0
Using device 0

Testing single precision
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.037568 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.039200 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.068480 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.071264 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.574752 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 3.837248 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 128000 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.088384 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.149120 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.164736 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.203712 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.497536 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 1.100000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.018144 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.026784 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.051264 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.057472 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.086016 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.143008 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 128000 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.081024 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.090912 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.108960 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.148992 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.206688 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.261760 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

Testing half precision (math in single precision)
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.043584 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.045472 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.081376 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.101952 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.147296 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.433760 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.089216 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.089856 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.093024 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.156448 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.163584 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.202016 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.021248 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.021568 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.026880 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.056800 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.076128 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.143872 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.090656 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.092576 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.110464 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.155904 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.185184 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.239072 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!
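
Since the output is long, it can help to check only the final verdicts. A minimal sketch; `count_passed` is a hypothetical helper that counts the `Test passed!` lines, of which the run above shows two (single precision and half precision).

```shell
# count_passed: read mnistCUDNN output on stdin and print
# the number of lines containing "Test passed!"
count_passed() {
    grep -c 'Test passed!'
}
```

From the sample directory the call would be `./mnistCUDNN | count_passed`.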

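The two cuDNN installation methods do not seem to share a common verification procedure. I went with the second method for now and will revisit this if problems come up.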
