Installing the NVIDIA Driver, CUDA, and cuDNN on CentOS

Introduction

This guide walks through installing the NVIDIA driver, CUDA, and cuDNN on CentOS.

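NVIDIA driver

  1. Download a driver version that matches your GPU from the NVIDIA website.

  2. Install lspci. The first command below shows which package provides lspci; it turns out to be pciutils, so install that package.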

    yum whatprovides */lspci
    yum install pciutils 
    
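  3. Check that an NVIDIA GPU is present at the hardware level: lspci | grep -i nvidia

  4. Install kernel-devel and kernel-headers, which the driver installer needs to build its kernel module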

    sudo yum install kernel-devel
    sudo yum install kernel-headers
    
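  5. Make the installer executable: chmod a+x NVIDIA-Linux-x86_64-410.78.run

  6. Disable nouveau

    # Open the configuration file:
    vi /usr/lib/modprobe.d/dist-blacklist.conf
    # Add or modify these two lines:
    blacklist nouveau
    options nouveau modeset=0
    # After a reboot, check whether nouveau is disabled; no output means success
    lsmod | grep nouveau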
    
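  7. Optional: rebuild the initramfs so that it no longer loads nouveau

    # Back up the existing initramfs image, which still includes nouveau
    mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r)-nouveau.img
    # Build a new initramfs image
    dracut /boot/initramfs-$(uname -r).img  $(uname -r)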
    
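  8. Install

    # Run the installer
    sudo ./NVIDIA-Linux-x86_64-410.78.run
    # If it reports an error, point it at the kernel sources explicitly
    sudo ./NVIDIA-Linux-x86_64-410.78.run --kernel-source-path=/usr/src/kernels/<press TAB to complete>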
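
The checks from the steps above can be collected into a small pre-flight script that runs before the installer. This is only a sketch: the function names are made up for illustration, and it assumes that kernel-devel places its sources under /usr/src/kernels/ in a directory named after the kernel release, as the fallback command above suggests.

```shell
#!/usr/bin/env bash
# Pre-flight checks before running the NVIDIA driver installer.
# All function names here are illustrative, not part of any NVIDIA tool.

# Succeeds if lspci-style output on stdin lists an NVIDIA device.
has_nvidia_gpu() {
    grep -qi 'nvidia'
}

# Succeeds if lsmod-style output on stdin shows the nouveau module itself
# loaded (module names appear in the first column of lsmod output).
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Succeeds if kernel sources for release $1 exist under directory $2
# (assumed default: /usr/src/kernels).
kernel_sources_present() {
    local release="$1" base="${2:-/usr/src/kernels}"
    [ -d "$base/$release" ]
}

# Runs all three checks against the live system and reports the first failure.
preflight() {
    lspci | has_nvidia_gpu || { echo "lspci found no NVIDIA GPU"; return 1; }
    if lsmod | nouveau_loaded; then
        echo "nouveau is still loaded; reboot after blacklisting it"
        return 1
    fi
    kernel_sources_present "$(uname -r)" || {
        echo "no kernel sources for $(uname -r); install a matching kernel-devel"
        return 1
    }
    echo "ready to run the installer"
}
```

After sourcing the file, run `preflight`; if it prints "ready to run the installer", continue with step 8.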
    

CUDA

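  1. Download a suitable version of the CUDA Toolkit runfile installer.

  2. Make it executable: chmod a+x cuda_10.0.130_410.48_linux.run

  3. Run the installer: sudo ./cuda_10.0.130_410.48_linux.run

    1. The license agreement is shown first. Keep pressing D to page through it, then type accept.
    2. A series of prompts follows: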
        Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
        (y)es/(n)o/(q)uit: n
        Install the CUDA 10.0 Toolkit?
        (y)es/(n)o/(q)uit: y
        Enter Toolkit Location
        [ default is /usr/local/cuda-10.0 ]: 
        Do you want to install a symbolic link at /usr/local/cuda?
        (y)es/(n)o/(q)uit: y
        Install the CUDA 10.0 Samples?
        (y)es/(n)o/(q)uit: n
    
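        For the "install the OpenGL libraries" prompt, choose n on a dual-GPU machine (integrated plus discrete graphics); y is fine when there is only a discrete GPU. Choosing y on a dual-GPU machine leads to a black screen or a login loop. If the installer was started with the parameter mentioned above, this prompt does not appear.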
    
    3. When the installation finishes, it prints the following summary:
        ===========
        = Summary =
        ===========
        Driver:   Not Selected
        Toolkit:  Installed in /usr/local/cuda-10.0
        Samples:  Not Selected
    
        Please make sure that
        -   PATH includes /usr/local/cuda-10.0/bin
        -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root
    
        To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin
    
        Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.
    
        ***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 10.0 functionality to work.
        To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
            sudo <CudaInstaller>.run -silent -driver
    
        Logfile is /tmp/cuda_install_11482.log
    
  4. Add CUDA's bin and lib64 directories to the environment. For a different CUDA version, change cuda-10.0 in the paths accordingly.

    export PATH="/usr/local/cuda-10.0/bin:$PATH" 
    export LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH"
    # Or, to make the change persistent, add the same two lines to ~/.bashrc and reload it:
    vi ~/.bashrc
    export PATH="/usr/local/cuda-10.0/bin:$PATH" 
    export LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH"
    source ~/.bashrc
    
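  5. Test: if every sample test below ends with Result = PASS, CUDA is installed correctly.

    1. Run nvcc -V; on success it prints the compiler version information.

    2. Build and run the device query test, deviceQuery:

      cd /usr/local/cuda-10.0/samples/1_Utilities/deviceQuery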
      sudo make
      ./deviceQuery
      
    3. Build and run the bandwidth test, bandwidthTest:

      cd ../bandwidthTest
      sudo make
      ./bandwidthTest
      
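  6. Other notes
    If a program reports that a required library such as libcudart.so.8.0 cannot be found even though CUDA is installed correctly, either of the first two commands below addresses it in the same way. The examples use CUDA 8.0 paths; substitute your own version.

    1. sudo ldconfig /usr/local/cuda-8.0/lib64
    2. export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64
    3. If that still does not work, also try:
    export PATH=$PATH:/usr/local/cuda-8.0/bin
    export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda-8.0/lib64
    source /etc/profile
    
    1. ldconfig may then print "/sbin/ldconfig.real: /usr/local/cuda-8.0/lib64/libcudnn.so.6 is not a symbolic link". This can be ignored; the problem is already solved at this point.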
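
Sourcing ~/.bashrc repeatedly with the export lines from step 4 keeps adding the same directories to PATH. The sketch below shows one way to make that step idempotent and to check a sample's output for Result = PASS. The names prepend_once, sample_passed, and CUDA_HOME are assumptions introduced for this example, not something the CUDA installer defines.

```shell
#!/usr/bin/env bash
# Idempotent CUDA environment setup plus a check of sample output.
# CUDA_HOME and both helper names are illustrative assumptions.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda-10.0}"

# Print the colon-separated list $2 with directory $1 prepended,
# unless $1 is already one of its entries.
prepend_once() {
    local dir="$1" list="$2"
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *)          printf '%s\n' "$dir${list:+:$list}" ;;
    esac
}

export PATH="$(prepend_once "$CUDA_HOME/bin" "$PATH")"
export LD_LIBRARY_PATH="$(prepend_once "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")"

# Succeeds if sample output on stdin contains the line "Result = PASS".
sample_passed() {
    grep -q 'Result = PASS'
}
```

For example, `./deviceQuery | sample_passed && echo "deviceQuery OK"` reports success only when the sample passed. The `${list:+:$list}` expansion avoids a stray trailing colon when LD_LIBRARY_PATH starts out empty.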

Installing cuDNN

Reference: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html

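  1. Download the cuDNN release that matches your CUDA version.

  2. Extract the archive
    tar -xzvf cudnn-10.0-linux-x64-v7.tgz

  3. Copy the files. The archive unpacks into a cuda/ directory, and writing under /usr/local needs root:

    sudo cp cuda/include/cudnn.h /usr/local/cuda-10.0/include/
    sudo cp cuda/lib64/libcudnn* /usr/local/cuda-10.0/lib64/
    
  4. Make the files readable by all users
    sudo chmod a+r /usr/local/cuda-10.0/include/cudnn.h /usr/local/cuda-10.0/lib64/libcudnn*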
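
To confirm that the copied header is the expected release, the version macros can be read back from cudnn.h. This is a minimal sketch that assumes the cuDNN 7 layout, where CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL are defined directly in cudnn.h (other releases may keep them in a different header). The function name cudnn_version is made up for this example.

```shell
#!/usr/bin/env bash
# Print "MAJOR.MINOR.PATCH" parsed from the cudnn.h file given as $1.
# Assumes lines of the form "#define CUDNN_MAJOR 7" (cuDNN 7 layout).
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$1"
}
```

Calling `cudnn_version /usr/local/cuda-10.0/include/cudnn.h` prints the installed version, which should match the archive that was downloaded.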
