Installing CUDA on Ubuntu

1. Download the installer

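First check which CUDA version your driver supports by running nvidia-smi. The "CUDA Version" field in the header of its output is the highest version the installed driver supports; on the machine used here it was 11.4.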
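
If you want to script this check, the header line can be parsed. The following is a minimal sketch, not part of the original post: the function names max_supported_cuda and driver_supports are hypothetical, the sample line is illustrative, and the comparison deliberately ignores NVIDIA's minor-version compatibility rules.

```python
import re


def max_supported_cuda(smi_output):
    """Return the header's 'CUDA Version' as a (major, minor) tuple, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_supports(smi_output, wanted):
    """True if toolkit version `wanted` ("major.minor") is not newer than the driver's maximum."""
    supported = max_supported_cuda(smi_output)
    if supported is None:
        return False
    major, minor = (int(part) for part in wanted.split(".")[:2])
    return (major, minor) <= supported


# Illustrative header line; real input would be the text printed by nvidia-smi.
sample = "| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |"
print(max_supported_cuda(sample))       # (11, 4)
print(driver_supports(sample, "10.2"))  # True
```

On a real machine you could pass in subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout instead of the sample string.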

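Open the CUDA download page: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=18.04&target_type=runfile_local
Select the options that match your machine. If you need an older release (for example, CUDA 10.2), use the archive: https://developer.nvidia.com/cuda-toolkit-archive
Then download the installer with the command shown on the page. The rest of this guide uses CUDA 11.8 as the example: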

wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
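
The installer is several gigabytes, so it can be worth confirming that the download is intact before running it. The original post does not do this; the sketch below assumes you have an MD5 checksum for the file from NVIDIA's download page, and the helper names md5_of_file and matches_checksum are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare against an expected digest, ignoring case and surrounding whitespace."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, matches_checksum("cuda_11.8.0_520.61.05_linux.run", expected) should return True, where expected is the checksum string you copied for that file.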

2. Run the installer

Change to the directory that contains the downloaded file and run:

sudo sh cuda_11.8.0_520.61.05_linux.run

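Watch the on-screen prompts during installation. If the installer warns that a driver is already installed and strongly recommends removing it, you can choose to continue. On the next screen, use the space bar to deselect the Driver option so that the bundled driver is not installed, then select Install.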

3. Add environment variables

Open ~/.bashrc without sudo (the file belongs to your user), for example with vim ~/.bashrc, and append:

export CUDA_HOME=/usr/local/cuda
export PATH=$PATH:$CUDA_HOME/bin
export LD_LIBRARY_PATH=$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Save and exit, then reload the file: source ~/.bashrc
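
To confirm that the new shell actually picked up these variables, you can check that the CUDA directories appear in them. This is a rough sketch with a hypothetical helper named missing_entries; the directories in required are assumptions and should be changed to whatever you exported.

```python
import os


def missing_entries(env, required):
    """Return a message for each path-like variable that lacks its required directory."""
    problems = []
    for var, directory in required.items():
        # Split on ":" as the shell does, and ignore trailing slashes when comparing.
        entries = [entry.rstrip("/") for entry in env.get(var, "").split(":") if entry]
        if directory.rstrip("/") not in entries:
            problems.append(f"{var} is missing {directory}")
    return problems


# Assumed locations; adjust them to the directories you actually exported.
required = {"PATH": "/usr/local/cuda/bin", "LD_LIBRARY_PATH": "/usr/local/cuda/lib64"}
for problem in missing_entries(os.environ, required):
    print(problem)
```

No output means both directories were found in the current environment.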

4. Verify the installation

nvcc -V
# Output:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
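
If you need the installed toolkit version in a script, it can be pulled out of the "release" line above. This is a minimal sketch with a hypothetical helper named nvcc_release; it assumes the line keeps the format shown in this output.

```python
import re


def nvcc_release(nvcc_output):
    """Return (release, full_version), e.g. ("11.8", "11.8.89"), or None if not found."""
    match = re.search(r"release (\d+\.\d+), V(\d+(?:\.\d+)*)", nvcc_output)
    if match is None:
        return None
    return match.group(1), match.group(2)


# The release line copied from the nvcc -V output shown above.
sample = "Cuda compilation tools, release 11.8, V11.8.89"
print(nvcc_release(sample))  # ('11.8', '11.8.89')
```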

If you use PyTorch, the following snippet checks whether CUDA is usable from it:

import torch
print(torch.__version__)  # installed PyTorch version
print(torch.version.cuda)  # CUDA version this PyTorch build was compiled against
print(torch.cuda.is_available())  # True means this PyTorch build can use CUDA on this machine

5. Uninstall

cd /usr/local/cuda-11.8/bin/
sudo ./cuda-uninstaller   # cuda-uninstaller may not be visible when you list the directory; run it directly anyway
sudo rm -rf /usr/local/cuda-11.8

6. Other issues

[INFO]: Finished with code: 256 , [ERROR]: Install of driver component failed

https://forums.developer.nvidia.com/t/info-finished-with-code-256-error-install-of-driver-component-failed/107661/4
Check the error logs: /var/log/cuda-installer.log and /var/log/nvidia-installer.log
Based on the information from the latter, in my particular case the problem was due to installation while running the X server:

...
-> The file '/tmp/.X0-lock' exists and appears to contain the process ID '1596' of a runnning X server.
ERROR: You appear to be running an X server; please exit X before installing.  For further details, please see the section INSTALLING THE NVIDIA DRIVER in the README available on the Linux driver download page at www.nvidia.com.
ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.

I was able to successfully install CUDA 10.2 by addressing the issue above.

Suggestion: It would be nice in the future if the descriptive information about errors from nvidia-installer.log would appear at the console output in case of installation failure, instead of error codes as it is now.

You can kill the X server process directly, but doing so may cause the machine to restart.
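
Before rerunning the installer, you can check whether an X server lock file is still present and which PID it records. The sketch below uses a hypothetical helper named x_server_pid and assumes the lock file holds the PID as plain text, as the log message above suggests.

```python
from pathlib import Path


def x_server_pid(lock_path="/tmp/.X0-lock"):
    """Return the PID recorded in an X lock file, or None if it is absent or unparseable."""
    try:
        text = Path(lock_path).read_text().strip()
    except (OSError, ValueError):
        # The file is missing, unreadable, or not valid text.
        return None
    return int(text) if text.isdigit() else None


pid = x_server_pid()
print("X server PID:", pid if pid is not None else "no lock file found")
```

A gentler alternative to killing the process, commonly used on systemd-based Ubuntu systems and not taken from the original post, is to switch to a text-only session with sudo systemctl isolate multi-user.target before running the installer.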

Installing PyTorch

https://pytorch.org/get-started/previous-versions/