[Tool Installation] CUDA & cuDNN

Installing CUDA & cuDNN

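  • Installing the NVIDIA graphics driver
  • Verifying the NVIDIA graphics driver installation
  • Checking which CUDA version the graphics card supports
  • Downloading the CUDA Toolkit
  • Downloading cuDNN (account registration required)
  • Installing CUDA
  • Installing cuDNN
  • Verifying the installation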

Installing the NVIDIA graphics driver

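[Figure 1]
Note: In Device Manager I right-clicked the NVIDIA graphics card and chose Update driver, which left me on version 388. The NVIDIA driver download site listed 391.35 as the latest version for this card, but its installer reported "This graphics driver could not find compatible graphics hardware", so I gave up on 391 and kept the 388 driver. Later I used Driver Genius (驱动精灵), which updated the driver to the latest version, 512.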

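If the command prompt reports that nvidia-smi is not recognized as an internal or external command, there are two ways to fix it:
1 Locate the directory that contains nvidia-smi.exe (C:\Windows\System32 on this machine), open a new command prompt there, and run nvidia-smi;
2 Add that directory to the PATH environment variable.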
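
The two fixes above both come down to knowing where nvidia-smi.exe lives. The following is a minimal Python sketch of that lookup, not part of the original procedure. The second candidate directory (the NVSMI folder used by older drivers) and the helper names `probe_dirs` and `find_nvidia_smi` are assumptions made for illustration.

```python
import os
import shutil

# ASSUMPTION: candidate locations. System32 is the directory named in this
# article; the NVSMI folder is where older drivers are known to install the tool.
CANDIDATE_DIRS = [
    r"C:\Windows\System32",
    r"C:\Program Files\NVIDIA Corporation\NVSMI",
]


def probe_dirs(candidate_dirs, exe_name="nvidia-smi.exe"):
    """Return the first existing <dir>/<exe_name>, or None if none exists."""
    for directory in candidate_dirs:
        candidate = os.path.join(directory, exe_name)
        if os.path.isfile(candidate):
            return candidate
    return None


def find_nvidia_smi(candidate_dirs=CANDIDATE_DIRS):
    """Prefer the copy already reachable through PATH, else probe known dirs."""
    on_path = shutil.which("nvidia-smi")
    if on_path:
        return on_path
    return probe_dirs(candidate_dirs)


if __name__ == "__main__":
    path = find_nvidia_smi()
    if path is None:
        print("nvidia-smi not found; check that the driver is installed")
    else:
        print("nvidia-smi found at", path)
        print("directory to add to PATH if needed:", os.path.dirname(path))
```

If the script finds the executable only through the candidate list, the printed directory is the one to add to PATH (fix 2).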

Verifying the NVIDIA graphics driver installation

Run the following command:

nvidia-smi

If it prints GPU memory, utilization, and similar information, the driver is installed correctly.

Fri Jun 17 18:23:19 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 512.59       Driver Version: 512.59       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  ERR!               WDDM  | 00000000:01:00.0 Off |                  N/A |
| N/A    0C    P8    N/A /  N/A |      0MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
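
The header line of this output carries the two values that matter for the next steps: the driver version and the highest CUDA version that driver supports (the "CUDA Version" field is not the installed toolkit version). As a hedged sketch, the values can be pulled out with a regular expression. The function name `parse_smi_header` is hypothetical, and the pattern assumes the header layout shown above, which may differ between driver releases.

```python
import re

# Matches the banner line printed at the top of nvidia-smi's table.
HEADER_RE = re.compile(
    r"NVIDIA-SMI\s+(?P<smi>[\d.]+)\s+"
    r"Driver Version:\s+(?P<driver>[\d.]+)\s+"
    r"CUDA Version:\s+(?P<cuda>[\d.]+)"
)


def parse_smi_header(text):
    """Return {'smi', 'driver', 'cuda'} version strings, or None if absent."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return match.groupdict()


# The header line copied from the output above.
SAMPLE = "| NVIDIA-SMI 512.59       Driver Version: 512.59       CUDA Version: 11.6     |"

if __name__ == "__main__":
    info = parse_smi_header(SAMPLE)
    print("driver:", info["driver"])      # driver: 512.59
    print("max CUDA:", info["cuda"])      # max CUDA: 11.6
```

In practice the text would come from running nvidia-smi and capturing its standard output.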

Checking which CUDA version the graphics card supports

NVIDIA Control Panel --> Help --> System Information --> Display
[Figure 2]

NVIDIA Control Panel --> Help --> System Information --> Components
[Figure 3]

CUDA Toolkit and driver version compatibility table
[Figure 4]
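
Reading the compatibility table amounts to comparing the installed driver version with the minimum the chosen toolkit requires. Here is a small sketch of that comparison. The minimum of 511.23 for CUDA 11.6 on Windows is an assumption recalled from NVIDIA's release notes and should be confirmed against the table itself; the names `version_tuple`, `driver_is_sufficient`, and `MIN_DRIVER_FOR_TOOLKIT` are illustrative.

```python
def version_tuple(version):
    """Turn '512.59' into (512, 59) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def driver_is_sufficient(installed, minimum):
    """True when the installed driver is at least the required minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


# ASSUMPTION: illustrative entry only; verify the value in NVIDIA's table.
MIN_DRIVER_FOR_TOOLKIT = {"11.6": "511.23"}

if __name__ == "__main__":
    minimum = MIN_DRIVER_FOR_TOOLKIT["11.6"]
    for driver in ("388", "512.59"):
        print(driver, "->", driver_is_sufficient(driver, minimum))
```

Comparing integer tuples rather than raw strings avoids the pitfall that "9.0" sorts after "11.6" as text.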

Downloading the CUDA Toolkit

The CUDA Toolkit Downloads page should offer the latest CUDA release by default.

List of all CUDA releases
[Figure 5]

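Downloading cuDNN (account registration required)

Official cuDNN download page

You need to register an NVIDIA account and log in.

Find the cuDNN build that matches your CUDA version in the cuDNN release list.
[Figure 6]

For older releases, the link shown in the figure below leads to the matching version.
[Figure 7]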

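Installing CUDA

Express installation reinstalls the driver, so choose Custom installation instead: uncheck NVIDIA driver components, leave everything else selected, and pick your own installation directory. Once the installation succeeds, the environment variables need to be configured.
Note: in my case the installer had already configured PATH by default during installation.
Run nvcc -V; if it prints version information, CUDA is installed and configured correctly.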

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:36:24_Pacific_Standard_Time_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
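
As a cross-check, the toolkit release reported by nvcc can be compared with the highest CUDA version the driver reports in nvidia-smi. The sketch below does this conservatively. The function names are hypothetical, and because CUDA's minor-version compatibility within one major release can relax the rule, a False result is better read as a warning than as a hard failure.

```python
import re


def parse_nvcc_release(text):
    """Extract (major, minor) from the 'release X.Y' part of nvcc -V output."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def toolkit_supported(nvcc_output, driver_max_cuda):
    """Conservative check: toolkit release must not exceed the driver's CUDA version."""
    release = parse_nvcc_release(nvcc_output)
    if release is None:
        return False
    max_supported = tuple(int(part) for part in driver_max_cuda.split("."))
    return release <= max_supported


# The relevant line copied from the nvcc -V output above.
SAMPLE = "Cuda compilation tools, release 11.6, V11.6.124"

if __name__ == "__main__":
    print(parse_nvcc_release(SAMPLE))          # (11, 6)
    print(toolkit_supported(SAMPLE, "11.6"))   # True
```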

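Installing cuDNN

Move the files in the cuDNN bin directory into the CUDA bin directory.
Move the files in the cuDNN include directory into the CUDA include directory.
Move the files in the cuDNN lib directory into the CUDA lib directory.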
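
The three steps above can be sketched as a short Python helper. It copies rather than moves, so the extracted cuDNN archive stays intact, and it preserves any nested folders (some cuDNN archives keep their libraries under lib\x64, others directly under lib, so it is worth checking the layout against the CUDA directory first). The function name `copy_cudnn` and the paths in the usage comment are hypothetical.

```python
import shutil
from pathlib import Path


def copy_cudnn(cudnn_root, cuda_root, subdirs=("bin", "include", "lib")):
    """Copy every file under cudnn_root/<subdir> into cuda_root/<subdir>.

    Returns the list of destination paths that were written.
    """
    copied = []
    for sub in subdirs:
        src_dir = Path(cudnn_root) / sub
        dst_dir = Path(cuda_root) / sub
        if not src_dir.is_dir():
            continue  # this archive has no such folder; skip it
        for src in src_dir.rglob("*"):
            if src.is_file():
                dst = dst_dir / src.relative_to(src_dir)
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)  # copy2 also keeps file timestamps
                copied.append(dst)
    return copied


# Example call (both paths are placeholders; adjust to your machine):
# copy_cudnn(r"D:\downloads\cudnn", r"C:\path\to\CUDA\v11.6")
```

Writing into the CUDA directory may require an elevated (administrator) prompt, depending on where CUDA was installed.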

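Verifying the installation

Go to the demo_suite folder under the CUDA installation directory, here E:\tool_develop\cuda\cuda_files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\extras\demo_suite, open a command prompt in that path, and run the commands below. If they print the information shown, the installation succeeded.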

deviceQuery.exe
deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce GTX 960M"
  CUDA Driver Version / Runtime Version          11.6 / 11.6
  CUDA Capability Major/Minor version number:    5.0
  Total amount of global memory:                 4096 MBytes (4294836224 bytes)
  ( 5) Multiprocessors, (128) CUDA Cores/MP:     640 CUDA Cores
  GPU Max Clock rate:                            1176 MHz (1.18 GHz)
  Memory Clock rate:                             2505 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               zu bytes
  Total amount of shared memory per block:       zu bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          zu bytes
  Texture alignment:                             zu bytes
  Concurrent copy and kernel execution:          Yes with 4 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.6, CUDA Runtime Version = 11.6, NumDevs = 1, Device0 = NVIDIA GeForce GTX 960M
Result = PASS

bandwidthTest.exe
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: NVIDIA GeForce GTX 960M
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     5854.3

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     5881.0

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     67689.3

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
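
Both demo programs end with a "Result = PASS" line, so the check can be automated. A minimal sketch follows, assuming the executables sit in the demo_suite folder described above. The helper names `sample_passed` and `run_sample` are hypothetical.

```python
import subprocess
from pathlib import Path


def sample_passed(output):
    """True when a demo's output contains the exact line 'Result = PASS'."""
    return any(line.strip() == "Result = PASS" for line in output.splitlines())


def run_sample(demo_dir, exe_name):
    """Run one demo executable from demo_dir and report whether it passed."""
    exe = Path(demo_dir) / exe_name
    completed = subprocess.run(
        [str(exe)], capture_output=True, text=True, cwd=str(demo_dir)
    )
    return sample_passed(completed.stdout)


# Example call (demo_dir is the demo_suite path on your own machine):
# for name in ("deviceQuery.exe", "bandwidthTest.exe"):
#     print(name, "PASS" if run_sample(demo_dir, name) else "FAIL")
```

Matching the whole line instead of searching for the substring "PASS" keeps unrelated text in the output from producing a false positive.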
