Deep learning environment setup pitfalls: configuring pytorch 1.10.0 on an NVIDIA A100-PCIE-40GB

  • Checking the CUDA version the A100 supports
  • The pitfall
  • Solution
    • Attempt: look up the matching pip command on the PyTorch website

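Checking the CUDA version the A100 supports

Output of nvidia-smi. The "CUDA Version" field in the header is the highest CUDA version the installed driver supports, not the version PyTorch was built with.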

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:18:00.0 Off |                    0 |
| N/A   34C    P0    34W / 250W |  16640MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-PCI...  Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   35C    P0    36W / 250W |      2MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

The pitfall

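I needed to set up a PyTorch runtime environment for an NVIDIA A100-PCIE-40GB. The existing environment was pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=10.2, and running PyTorch code produced the following message: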

NVIDIA A100-PCIE-40GB with CUDA capability sm_80 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA A100-PCIE-40GB GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))

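Analysis: the NVIDIA A100-PCIE-40GB has CUDA compute capability 8.0 (sm_80), which the installed PyTorch build was not compiled for. According to the message above, that build only supports compute capabilities 3.7, 5.0, 6.0, and 7.0.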
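The mismatch can be checked directly before running any training code. The following is a minimal sketch, assuming PyTorch is importable; the helper names sm_majors and is_supported are hypothetical and introduced only for illustration. It treats a device as supported when the build contains an sm_ entry with the same major version, which is a simplification that ignores PTX (compute_XY) entries.

```python
import re


def sm_majors(arch_list):
    """Collect the major versions of the 'sm_XY' entries in an arch list."""
    majors = set()
    for arch in arch_list:
        match = re.match(r"sm_(\d+)", arch)
        if match:
            # 'sm_80' -> 80 -> major version 8
            majors.add(int(match.group(1)) // 10)
    return majors


def is_supported(capability, arch_list):
    """Simplified check: the device's major version must appear in the build.

    Assumption: same-major binaries are treated as compatible; PTX
    ('compute_XY') entries are ignored.
    """
    major, _minor = capability
    return major in sm_majors(arch_list)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        capability = torch.cuda.get_device_capability(0)  # (8, 0) on an A100
        arch_list = torch.cuda.get_arch_list()
        print("device capability:", capability)
        print("build arch list:", arch_list)
        print("supported:", is_supported(capability, arch_list))
    else:
        print("PyTorch with a visible CUDA device is required for the live check.")
```

With the arch list from the message above (sm_37 sm_50 sm_60 sm_70) and capability (8, 0), is_supported returns False, which matches the warning.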

Solution

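Install a PyTorch build compiled against CUDA 11.0 or later. Support for sm_80 was introduced in CUDA 11.0, so a CUDA 10.2 build cannot target the A100.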

Attempt: look up the matching pip command on the PyTorch website:

# CUDA 11.1
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

After reinstalling, I reran the PyTorch code and it succeeded.
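A quick way to confirm the reinstall took effect is to inspect the version string and run a small operation on the GPU. This is a sketch under stated assumptions, not the exact check used in this post: build_cuda_version is a hypothetical helper that reads the "+cuXYZ" suffix that pip wheels from download.pytorch.org carry. Builds installed another way may have no such suffix, so a None result does not by itself mean a CPU-only build; torch.version.cuda is printed as well for that reason.

```python
import re


def build_cuda_version(torch_version):
    """Parse a '+cuXYZ' suffix, e.g. '1.10.0+cu111' -> (11, 1).

    Returns None when the version string has no such suffix.
    """
    match = re.search(r"\+cu(\d{2,})$", torch_version)
    if match is None:
        return None
    digits = match.group(1)
    # The last digit is the minor version, the rest is the major version.
    return int(digits[:-1]), int(digits[-1])


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        print("torch version:", torch.__version__)
        print("CUDA tag in version:", build_cuda_version(torch.__version__))
        print("torch.version.cuda:", torch.version.cuda)
        # A small matrix multiply forces a real kernel launch on the GPU.
        x = torch.randn(1024, 1024, device="cuda")
        y = x @ x
        torch.cuda.synchronize()
        print("matmul ok on", torch.cuda.get_device_name(0), "result shape:", tuple(y.shape))
    else:
        print("PyTorch with a visible CUDA device is required for the live check.")
```

If the matmul completes and the sm_80 warning no longer appears, the build matches the GPU.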
