[WSL2 Notes 2] Setting up a deep learning development environment: pitfalls and fixes


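  • 1. Anaconda installation and environment setup (system level, manages all environments)
    • 1.1 Create a download directory
    • 1.2 Install Anaconda
    • 1.3 A counterproductive extra step
  • 2. NVIDIA Driver (system level, shared by all environments)
    • 2.1 Official site
    • 2.2 Install the Windows 10 NVIDIA driver
    • 2.3 Check the NVIDIA driver and CUDA version
    • Do not install any Linux display driver inside WSL
    • 2.4 Driver loss on an Ubuntu production machine: Failed to initialize NVML: Driver/library version mismatch
      • 2.4.1 nvidia-smi
      • 2.4.2 Investigation
      • 2.4.2 Hold NVIDIA updates so the production machine does not suddenly lose its driver
      • 2.4.3 Disable automatic updates for all packages
  • 3. CUDA Toolkit (system level, shared by all environments)
    • 3.1 CUDA Toolkit official site
    • 3.2 Basic installation
    • 3.3 GPG key error
    • 3.4 Check the CUDA installation
    • 3.5 Command 'nvcc' not found
    • 3.6 How the official CUDA Toolkit relates to, and differs from, the cudatoolkit package in a virtual environment
      • 3.6.1 Different installation methods
      • 3.6.2 Setting up development environments with different CUDA versions
  • 4. cuDNN: GPU-accelerated library of deep neural network primitives (system level, shared by all environments)
    • 4.1 Official site
    • 4.2 Transfer the cuDNN package to WSL over SSH
    • 4.3 Install zlib1g
    • 4.4 Install cuDNN
      • 4.4.1 Enable the local repository
      • 4.4.2 Import the CUDA GPG key
      • 4.4.3 Refresh the repository metadata
      • 4.4.4 Install the runtime library
      • 4.4.5 Install the developer library
      • 4.4.6 Install the code samples and the cuDNN library documentation
    • 4.5 Verify cuDNN
  • 5. Deep learning frameworks
    • 5.1 PyTorch (environment level, independent per environment)
      • 5.1.1 Official site
      • 5.1.2 Installing with Conda
      • 5.1.3 Installing with pip
      • 5.1.4 Errors when installing with pip
      • 5.1.5 Verify PyTorch
    • 5.2 TensorFlow (environment level, independent per environment)
      • 5.2.1 Official site
      • 5.2.2 Install TensorFlow 2.x on Python 3.7
        • 5.2.2.2 AttributeError: module 'numpy' has no attribute 'object'.
        • 5.2.2.3 WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
        • 5.2.2.4 ModuleNotFoundError: No module named 'tensorflow.contrib'
      • 5.2.3 Install the legacy TensorFlow 1.15 on Python 3.7
        • 5.2.3.1 Install the cudatoolkit libraries
        • 5.2.3.2 Explicitly install the GPU build
      • 5.2.4 Verify the installation
      • 5.2.5 Runtime errors
        • 5.2.5.1 Could not load dynamic library 'libcudart.so.10.0'
        • 5.2.5.2 Could not load dynamic library 'libcudnn.so.7'
        • 5.2.5.3 /usr/local/cuda/targets/x86_64-linux/lib/libcudnn.so.8 is not a symbolic link
        • 5.2.5.4 error code is libcuda.so: cannot open shared object file
        • 5.2.5.5 Runtime error: protobuf version mismatch
        • 5.2.5.6 This TensorFlow binary is optimized with Intel(R) MKL-DNN
          • 5.2.5.6.1 Install the MKL-DNN acceleration library (compilation failed and was not pursued; to be updated later)
          • 5.2.5.6.2 Build and install TensorFlow from source (not tested by the author; to be updated later)
        • 5.2.5.6 WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
    • 5.3 ONNX Runtime (ORT) (environment level, independent per environment)
      • 5.3.1 Official site
      • 5.3.2 Version compatibility table
      • 5.3.3 Installation
        • 5.3.3.1 Install ONNX Runtime (ORT)
        • 5.3.3.2 Install the ONNX model export module
      • 5.3.4 Test code
      • 5.3.5 Possible problems
        • 5.3.5.1 ModuleNotFoundError: No module named 'onnx'
        • 5.3.5.2 ValueError: This ORT build has ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'] enabled.
  • 6. Tested version combinations that coexist in one environment
    • 6.1 Tested reference combinations
      • 6.1.1 Python 3.7
      • 6.2.2 Python 3.8.0
    • 6.2 protobuf version conflict
      • 6.2.1 cannot import name 'builder' from 'google.protobuf.internal' (/home/my/anaconda3/envs/fbsp/lib/python3.8/site-packages/google/protobuf/internal/__init__.py)
    • 6.3 NumPy version conflict
      • 6.3.1 AttributeError: module 'numpy' has no attribute 'bool'.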

1. Anaconda installation and environment setup (system level, manages all environments)

Anaconda official installer archive
https://repo.anaconda.com/archive/

1.1 Create a download directory

cd ~
mkdir download
cd download

Download the Anaconda installer
wget https://repo.anaconda.com/archive/Anaconda3-2023.03-Linux-x86_64.sh

1.2 Install Anaconda

bash Anaconda3-2023.03-Linux-x86_64.sh

Create a Python virtual environment
conda create -n <name> python=<version>

Activate the environment
conda activate <name>

1.3 A counterproductive extra step

Setting the Anaconda path by hand (this turned out to be a mistake)

$ vim ~/.bashrc

Add the installation path

 # Anaconda3
export PATH="/home/XXXX/anaconda3/bin:$PATH"
source activate

echo 'export PATH="~/anaconda3/bin:$PATH"' >> ~/.bashrc
echo 'source activate' >> ~/.bashrc

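Reload the configuration
source ~/.bashrc
The result of this mistake: every virtual environment ran with the base environment's Python, so environments could not each use a different Python version, which defeats the purpose of virtual environments. The installer already offers to run `conda init`, which sets up the shell, so these extra lines are not needed.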
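
A quick way to spot this symptom is to compare the `python` that PATH resolves to with the interpreter inside the active environment. The sketch below assumes that `conda activate` sets `CONDA_PREFIX` to the environment directory; the function name `check_env_python` is made up for illustration.

```shell
# check_env_python: succeed only if the first `python` on PATH is the one
# inside the active conda environment ($CONDA_PREFIX is set by `conda activate`).
check_env_python() {
    if [ -z "${CONDA_PREFIX:-}" ]; then
        echo "no conda environment is active" >&2
        return 2
    fi
    resolved=$(command -v python) || { echo "python not found on PATH" >&2; return 2; }
    if [ "$resolved" = "$CONDA_PREFIX/bin/python" ]; then
        echo "OK: $resolved"
    else
        echo "MISMATCH: PATH gives $resolved, expected $CONDA_PREFIX/bin/python"
        return 1
    fi
}
```

After `conda activate <name>`, run `check_env_python`; a MISMATCH line suggests that something in `~/.bashrc` is placing another `bin` directory ahead of the environment's.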

2. NVIDIA Driver (system level, shared by all environments)

2.1 Official site

https://www.nvidia.com/download/index.aspx?lang=en-us


2.2 Install the Windows 10 NVIDIA driver


2.3 Check the NVIDIA driver and CUDA version

nvidia-smi


Do not install any Linux display driver inside WSL.

https://docs.nvidia.cn/cuda/wsl-user-guide/index.html#getting-started-with-cuda-on-wsl-2

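2.4 Driver loss on an Ubuntu production machine: Failed to initialize NVML: Driver/library version mismatch

2.4.1 nvidia-smi

Production machine: 4 x V100
OS: Ubuntu 22.04
In the early hours of the morning, watch was still showing normal usage: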

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-SXM2-16GB           Off | 00000000:00:08.0 Off |                    0 |
| N/A   47C    P0             184W / 300W |   6945MiB / 16384MiB |     75%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2-16GB           Off | 00000000:00:09.0 Off |                    0 |
| N/A   45C    P0             249W / 300W |   7863MiB / 16384MiB |     91%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2-16GB           Off | 00000000:00:0A.0 Off |                    0 |
| N/A   45C    P0             194W / 300W |   7983MiB / 16384MiB |     75%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2-16GB           Off | 00000000:00:0B.0 Off |                    0 |
| N/A   35C    P0              41W / 300W |      0MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A   1548534      C   python                                     6942MiB |
|    1   N/A  N/A   1548535      C   python                                     7860MiB |
|    2   N/A  N/A   1548536      C   python                                     7980MiB |
+---------------------------------------------------------------------------------------+

By noon it looked like this:

$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 535.104

None of nvtop, nvitop, or gpustat worked either.

2.4.2 Investigation

  • Check the hardware
$ lspci | grep -i nvidia
00:08.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev a1)
00:09.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev a1)
00:0a.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev a1)
00:0b.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev a1)
  • Check the version of the loaded kernel module
$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  535.86.10  Wed Jul 26 23:20:03 UTC 2023
GCC version:  gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04.1) 
  • Check the installed driver packages
$ dpkg -l | grep nvidia
ii  gpustat                               0.6.0-1                                     all          pretty nvidia device monitor
iU  libnvidia-cfg1-535:amd64              535.104.05-0ubuntu0.22.04.4                 amd64        NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-535                  535.86.10-0ubuntu1                          all          Shared files used by the NVIDIA libraries
iU  libnvidia-compute-535:amd64           535.104.05-0ubuntu0.22.04.4                 amd64        NVIDIA libcompute package
iU  libnvidia-decode-535:amd64            535.104.05-0ubuntu0.22.04.4                 amd64        NVIDIA Video Decoding runtime libraries
iU  libnvidia-encode-535:amd64            535.104.05-0ubuntu0.22.04.4                 amd64        NVENC Video Encoding runtime library
iU  libnvidia-extra-535:amd64             535.104.05-0ubuntu0.22.04.4                 amd64        Extra libraries for the NVIDIA driver
iU  libnvidia-fbc1-535:amd64              535.104.05-0ubuntu0.22.04.4                 amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-535:amd64                535.86.10-0ubuntu1                          amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
iU  nvidia-compute-utils-535              535.104.05-0ubuntu0.22.04.4                 amd64        NVIDIA compute utilities
iU  nvidia-dkms-535                       535.104.05-0ubuntu0.22.04.4                 amd64        NVIDIA DKMS package
iU  nvidia-driver-535                     535.104.05-0ubuntu0.22.04.4                 amd64        NVIDIA driver metapackage
iU  nvidia-firmware-535-535.104.05        535.104.05-0ubuntu0.22.04.4                 amd64        Firmware files used by the kernel module
ii  nvidia-kernel-common-535              535.86.10-0ubuntu1                          amd64        Shared files used with the kernel module
iU  nvidia-kernel-source-535              535.104.05-0ubuntu0.22.04.4                 amd64        NVIDIA kernel source package
ii  nvidia-modprobe                       535.86.10-0ubuntu1                          amd64        Load the NVIDIA kernel driver and create device files
ii  nvidia-prime                          0.8.17.1                                    all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                       535.86.10-0ubuntu1                          amd64        Tool for configuring the NVIDIA graphics driver
iU  nvidia-utils-535                      535.104.05-0ubuntu0.22.04.4                 amd64        NVIDIA driver support binaries
ii  screen-resolution-extra               0.18.2                                      all          Extension for the nvidia-settings control panel
iU  xserver-xorg-video-nvidia-535         535.104.05-0ubuntu0.22.04.4                 amd64        NVIDIA binary Xorg driver
  • Check the package upgrade log (the lines below are in /var/log/dpkg.log format)
$ grep nvidia /var/log/dpkg.log
2023-09-27 06:18:38 upgrade nvidia-driver-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:38 status half-configured nvidia-driver-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:38 status unpacked nvidia-driver-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:38 status half-installed nvidia-driver-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:38 status unpacked nvidia-driver-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:38 upgrade libnvidia-gl-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:38 status half-configured libnvidia-gl-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:38 status unpacked libnvidia-gl-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:38 status half-installed libnvidia-gl-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:38 status unpacked libnvidia-gl-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:38 status installed libnvidia-gl-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:38 upgrade nvidia-dkms-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:38 status half-configured nvidia-dkms-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:46 status unpacked nvidia-dkms-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:46 status half-installed nvidia-dkms-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:46 status unpacked nvidia-dkms-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:46 upgrade nvidia-kernel-source-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:46 status half-configured nvidia-kernel-source-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:46 status unpacked nvidia-kernel-source-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:46 status half-installed nvidia-kernel-source-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:46 status unpacked nvidia-kernel-source-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:46 install nvidia-firmware-535-535.104.05:amd64  535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:46 status half-installed nvidia-firmware-535-535.104.05:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:47 status unpacked nvidia-firmware-535-535.104.05:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:47 upgrade nvidia-kernel-common-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:47 status half-configured nvidia-kernel-common-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:47 status unpacked nvidia-kernel-common-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:47 status half-installed nvidia-kernel-common-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:47 status unpacked nvidia-kernel-common-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:47 status installed nvidia-kernel-common-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:47 upgrade libnvidia-decode-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:47 status half-configured libnvidia-decode-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:47 status unpacked libnvidia-decode-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:47 status half-installed libnvidia-decode-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:47 status unpacked libnvidia-decode-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:47 upgrade libnvidia-compute-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:47 status half-configured libnvidia-compute-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:47 status unpacked libnvidia-compute-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:47 status half-installed libnvidia-compute-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked libnvidia-compute-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 upgrade libnvidia-extra-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 status half-configured libnvidia-extra-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked libnvidia-extra-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status half-installed libnvidia-extra-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked libnvidia-extra-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 upgrade nvidia-compute-utils-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 status half-configured nvidia-compute-utils-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked nvidia-compute-utils-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status half-installed nvidia-compute-utils-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked nvidia-compute-utils-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 upgrade libnvidia-encode-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 status half-configured libnvidia-encode-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked libnvidia-encode-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status half-installed libnvidia-encode-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked libnvidia-encode-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 upgrade nvidia-utils-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 status half-configured nvidia-utils-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked nvidia-utils-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status half-installed nvidia-utils-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked nvidia-utils-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 upgrade xserver-xorg-video-nvidia-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 status half-configured xserver-xorg-video-nvidia-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked xserver-xorg-video-nvidia-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status half-installed xserver-xorg-video-nvidia-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked xserver-xorg-video-nvidia-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 upgrade libnvidia-fbc1-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 status half-configured libnvidia-fbc1-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked libnvidia-fbc1-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status half-installed libnvidia-fbc1-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked libnvidia-fbc1-535:amd64 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 upgrade libnvidia-cfg1-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
2023-09-27 06:18:48 status half-configured libnvidia-cfg1-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked libnvidia-cfg1-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status half-installed libnvidia-cfg1-535:amd64 535.86.10-0ubuntu1
2023-09-27 06:18:48 status unpacked libnvidia-cfg1-535:amd64 535.104.05-0ubuntu0.22.04.4

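2023-09-27 06:18:38 upgrade nvidia-driver-535:amd64 535.86.10-0ubuntu1 535.104.05-0ubuntu0.22.04.4
So the driver packages had been silently upgraded from 535.86.10 to 535.104.05. The kernel module still loaded in memory (535.86.10) no longer matched the user-space libraries on disk (535.104.05), which is exactly what the NVML error reports.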
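
The comparison above can be scripted. This is a minimal sketch, not a definitive check: the helper names `kernel_module_version` and `versions_match` are made up, and it assumes the package version has the `<driver version>-<distro suffix>` shape seen in the `dpkg -l` listing.

```shell
# kernel_module_version: read the text of /proc/driver/nvidia/version on
# stdin and print the driver version of the loaded kernel module.
kernel_module_version() {
    sed -n 's/^NVRM version:.*Kernel Module[[:space:]]*\([0-9][0-9.]*\).*/\1/p'
}

# versions_match MODULE_VER PACKAGE_VER: compare the module version with a
# Debian package version such as 535.104.05-0ubuntu0.22.04.4 (everything
# from the first "-" onward is stripped before comparing).
versions_match() {
    [ "$1" = "${2%%-*}" ]
}

# Usage on the affected machine (package name taken from the listing above):
#   k=$(kernel_module_version < /proc/driver/nvidia/version)
#   u=$(dpkg-query -W -f='${Version}' libnvidia-compute-535)
#   versions_match "$k" "$u" || echo "mismatch: module $k, libraries $u"
```

Running such a check from cron or a monitoring job would flag the mismatch before a training job hits it.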

2.4.2 Hold NVIDIA updates so the production machine does not suddenly lose its driver

sudo apt-mark hold nvidia-driver-<version>

$ sudo apt-mark hold  nvidia-driver-535
nvidia-driver-535 set on hold.
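
The `dpkg -l` listing shows many packages on the 535 branch besides the metapackage. Whether holding only `nvidia-driver-535` is enough has not been verified here; a more conservative option is to hold every installed package on that branch. The sketch below only filters `dpkg -l` output; the function name `nvidia_branch_packages` is made up for illustration.

```shell
# nvidia_branch_packages BRANCH: read `dpkg -l` output on stdin and print the
# names of packages that contain "nvidia" and end in -BRANCH (e.g. -535).
# Packages such as nvidia-settings or nvidia-firmware-535-<version> do not
# end in -BRANCH and are therefore not listed.
nvidia_branch_packages() {
    awk -v branch="$1" '
        $1 ~ /^[a-z][a-zA-Z]$/ && $2 ~ /nvidia/ {
            name = $2
            sub(/:.*/, "", name)      # drop the :amd64 architecture suffix
            if (name ~ ("-" branch "$")) print name
        }'
}

# Usage (requires root for apt-mark):
#   dpkg -l | nvidia_branch_packages 535 | xargs -r sudo apt-mark hold
#   apt-mark showhold
```

`apt-mark showhold` lists what is currently held, and `sudo apt-mark unhold <package>` reverses a hold when a planned upgrade is due.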

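2.4.3 Disable automatic updates for all packages

To keep the software and environment stable on a production machine, turn off automatic package updates: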
sudo dpkg-reconfigure unattended-upgrades

$ sudo dpkg-reconfigure unattended-upgrades
Replacing config file /etc/apt/apt.conf.d/20auto-upgrades with new version

Select No in the dialog to decline automatically downloading and installing stable updates.
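
To confirm that the setting took effect, you can inspect the file that the command reported replacing. This sketch assumes the file uses the usual `APT::Periodic::Unattended-Upgrade "0";` form; the function name `auto_upgrades_disabled` is made up for illustration.

```shell
# auto_upgrades_disabled FILE: succeed if FILE sets
# APT::Periodic::Unattended-Upgrade to "0".
auto_upgrades_disabled() {
    grep -Eq '^[[:space:]]*APT::Periodic::Unattended-Upgrade[[:space:]]+"0";' "$1"
}

# Usage:
#   auto_upgrades_disabled /etc/apt/apt.conf.d/20auto-upgrades \
#       && echo "unattended upgrades are off"
```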

3. CUDA Toolkit (system level, shared by all environments)

3.1 CUDA Toolkit official site

https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0&target_type=deb_local


Previous releases
https://developer.nvidia.com/cuda-toolkit-archive

CUDA on WSL User Guide
https://docs.nvidia.cn/cuda/wsl-user-guide/index.html#getting-started-with-cuda-on-wsl-2

3.2 Basic installation

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda-repo-wsl-ubuntu-12-1-local_12.1.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-12-1-local_12.1.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda

3.3 GPG key error

W: GPG error: file:/var/cuda-repo-wsl-ubuntu-12-1-local  InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY CDD5140FF7B46061
E: The repository 'file:/var/cuda-repo-wsl-ubuntu-12-1-local  InRelease' is not signed.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.


Remove the old GPG key
sudo apt-key del 7fa2af80
Install the repository's GPG key
sudo cp /var/cuda-repo-wsl-ubuntu-12-1-local/cuda-F7B46061-keyring.gpg /usr/share/keyrings/
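
In this case the keyring file name (`cuda-F7B46061-keyring.gpg`) ends with the last eight hex digits of the key ID in the error (`CDD5140FF7B46061`). The sketch below relies on that pattern, which is inferred from this single example and may not hold for other releases; the function names are made up for illustration.

```shell
# missing_pubkey: read apt output on stdin and print the first key ID that
# follows NO_PUBKEY.
missing_pubkey() {
    sed -n 's/.*NO_PUBKEY \([0-9A-Fa-f]\{8,\}\).*/\1/p' | head -n 1
}

# keyring_name KEYID: print the keyring file name implied by the last 8 hex
# digits of KEYID (naming pattern assumed from the example above).
keyring_name() {
    short=$(printf '%s' "$1" | tail -c 8)
    echo "cuda-${short}-keyring.gpg"
}

# Usage:
#   id=$(sudo apt-get update 2>&1 | missing_pubkey)
#   ls "/var/cuda-repo-wsl-ubuntu-12-1-local/$(keyring_name "$id")"
```

If the `ls` finds the file, copying it into `/usr/share/keyrings/` and running `sudo apt-get update` again should clear the error.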

3.4 Check the CUDA installation

nvcc -V

3.5 Command ‘nvcc’ not found

Edit the shell configuration
vim ~/.bashrc
Add the CUDA paths

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin
export CUDA_HOME=/usr/local/cuda

Or append the same lines from the command line:

echo 'export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"' >> ~/.bashrc
echo 'export PATH="$PATH:/usr/local/cuda/bin"' >> ~/.bashrc
echo 'export CUDA_HOME="/usr/local/cuda"' >> ~/.bashrc

Reload the configuration
source ~/.bashrc
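
Because `~/.bashrc` is sourced again in every new shell (and by hand after edits), plain appends can pile up duplicate entries. A minimal sketch of an idempotent alternative follows; `append_path` is a made-up helper, and `CUDA_HOME` is treated here as a single directory rather than a colon-separated list.

```shell
# append_path VAR DIR: append DIR to the colon-separated variable VAR unless
# it is already listed; avoids a leading ":" when VAR is empty or unset.
append_path() {
    eval "current=\${$1:-}"
    case ":$current:" in
        *":$2:"*) ;;    # already present, nothing to do
        *) eval "export $1=\"\${current:+\$current:}\$2\"" ;;
    esac
}

append_path PATH /usr/local/cuda/bin
append_path LD_LIBRARY_PATH /usr/local/cuda/lib64
export CUDA_HOME=/usr/local/cuda
```

Placed in `~/.bashrc`, these lines can be sourced repeatedly without growing `PATH` or `LD_LIBRARY_PATH`.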

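3.6 How the official CUDA Toolkit relates to, and differs from, the cudatoolkit package in a virtual environment

3.6.1 Different installation methods

  • The official CUDA Toolkit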
    wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
    sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
    wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda-repo-wsl-ubuntu-12-1-local_12.1.0-1_amd64.deb
    sudo dpkg -i cuda-repo-wsl-ubuntu-12-1-local_12.1.0-1_amd64.deb
    sudo cp /var/cuda-repo-wsl-ubuntu-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
    sudo apt-get update
    sudo apt-get -y install cuda
  • The cudatoolkit package that Conda installs into an individual environment
    conda install cudatoolkit=10.0 -c pytorch

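3.6.2 Setting up development environments with different CUDA versions

  • Install the official CUDA Toolkit, choosing the newest release that matches your GPU driver; it is backward compatible.
    It provides a complete development environment for building high-performance GPU-accelerated applications, including GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and the runtime libraries needed to deploy applications.
  • The CUDA Toolkit version installed in a virtual environment must not be higher than the official CUDA version on the host system.
    To match the versions of other software in that environment, you can install a different cudatoolkit version inside it; that package consists of runtime and other shared libraries used to call CUDA functionality.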
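
The "not higher than the system version" rule can be checked before installing. This is a sketch under assumptions: it relies on GNU `sort -V` for version ordering, and the helper names `version_le` and `nvcc_release` are made up for illustration.

```shell
# version_le A B: succeed if version A is less than or equal to version B.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# nvcc_release: read `nvcc -V` output on stdin and print the release number
# from the "release X.Y," field.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Usage: check a planned cudatoolkit version against the system toolkit.
#   system=$(nvcc -V | nvcc_release)
#   version_le 10.0 "$system" && conda install cudatoolkit=10.0 -c pytorch
```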

4. cuDNN: GPU-accelerated library of deep neural network primitives (system level, shared by all environments)

4.1 Official site

https://developer.nvidia.com/rdp/cudnn-archive
You need to register an account and log in before you can download.

4.2 Transfer the cuDNN package to WSL over SSH

For installing the SSH service in WSL2, see here.

4.3 Install zlib1g

sudo apt-get install zlib1g

(base) fb@VP01:~/download$ conda activate modelscope
(modelscope) fb@VP01:~/download$ sudo apt-get install zlib1g
[sudo] password for fb:
Readi
