Add the following lines to /etc/modprobe.d/blacklist.conf, then run sudo update-initramfs -u:
blacklist nouveau
options nouveau modeset=0
After rebooting, run lsmod | grep nouveau. If it prints nothing, nouveau has been disabled successfully.
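The reboot check can also be scripted. The sketch below is a minimal example, assuming a bash-compatible shell; module_loaded is a hypothetical helper name introduced here, and it only inspects the first column of lsmod-style text.

```shell
# module_loaded NAME: succeed if NAME appears in the first column of
# lsmod-style text read from stdin. (Hypothetical helper, not a system tool.)
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# Only query the live system when lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | module_loaded nouveau; then
        echo "nouveau is still loaded"
    else
        echo "nouveau is not loaded"
    fi
fi
```

Matching the first column exactly avoids false positives from other modules whose lines merely mention nouveau.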
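On Ubuntu 18.04, running sudo ubuntu-drivers autoinstall directly installs the 390 driver. CUDA 10.0 needs driver 410 or later, and CUDA 10.1 (installed below) needs 418.39 or later, so a newer driver is required.
To install a newer driver, first add the PPA: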
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt update
Then run:
$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:02.0/0000:05:00.0 ==
modalias : pci:v000010DEd00001C82sv000019DAsd00002456bc03sc00i00
vendor : NVIDIA Corporation
model : GP107 [GeForce GTX 1050 Ti]
driver : nvidia-driver-410 - third-party free
driver : nvidia-driver-415 - third-party free
driver : nvidia-driver-390 - distro non-free
driver : nvidia-driver-418 - third-party free
driver : nvidia-driver-396 - third-party free
driver : nvidia-driver-430 - third-party free recommended
driver : xserver-xorg-video-nouveau - distro free builtin
Finally, install the driver:
$ sudo apt install nvidia-driver-430
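Once the driver is installed, its version can be compared against the minimum the toolkit needs. The sketch below assumes GNU sort (for -V) and nvidia-smi on the PATH; version_ge is a hypothetical helper, and the 418.39 minimum for CUDA 10.1 is taken from NVIDIA's release notes and should be checked against the toolkit version you actually install.

```shell
# version_ge A B: succeed when dotted version A is greater than or equal to B.
# Relies on GNU sort -V (version sort). Hypothetical helper name.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

min_driver="418.39"   # assumed minimum for CUDA 10.1; verify in the release notes

# Query the live driver only when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$installed" "$min_driver"; then
        echo "driver $installed meets the minimum $min_driver"
    else
        echo "driver $installed is older than $min_driver"
    fi
fi
```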
The CUDA Toolkit can be downloaded from:
https://developer.nvidia.com/cuda-toolkit-archive
The file downloaded here is cuda_10.1.105_418.39_linux.run.
Install it with the following command:
$ sudo sh cuda_10.1.105_418.39_linux.run --no-opengl-libs
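Note: do not install the first item, the graphics driver, because it was installed earlier. Install everything else.
Declare the environment variables by appending the following to the end of ~/.bashrc (in the user's home directory):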
export PATH=/usr/local/cuda-10.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH
Save and exit, then run the following command so the environment variables take effect immediately:
$ source ~/.bashrc
The same settings can also be made system-wide. On the command line, run:
$ sudo vim /etc/profile
Append the following to the end of the opened file:
export PATH=/usr/local/cuda-10.1/bin:$PATH
$ sudo vim /etc/ld.so.conf.d/cuda.conf
Add the following line to the opened file:
/usr/local/cuda-10.1/lib64
Save and exit, then run the following so that the linker cache is updated immediately:
$ sudo ldconfig
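To confirm the settings took effect, check that the CUDA directories are visible to the shell. This is a minimal sketch, assuming the cuda-10.1 install prefix used in this guide; path_contains is a hypothetical helper name.

```shell
# path_contains LIST DIR: succeed if DIR is an entry of the colon-separated LIST.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

cuda_home=/usr/local/cuda-10.1   # install prefix used in this guide

if path_contains "$PATH" "$cuda_home/bin"; then
    echo "PATH contains $cuda_home/bin"
else
    echo "PATH is missing $cuda_home/bin"
fi

if path_contains "${LD_LIBRARY_PATH:-}" "$cuda_home/lib64"; then
    echo "LD_LIBRARY_PATH contains $cuda_home/lib64"
else
    echo "LD_LIBRARY_PATH is missing $cuda_home/lib64 (fine if ldconfig is used instead)"
fi
```

With the paths in place, nvcc --version should report release 10.1.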
Change to the default installation path of the CUDA 10.1 Samples (/home/<user>/NVIDIA_CUDA-10.1_Samples) and enter the following in a terminal:
$ cd ~/NVIDIA_CUDA-10.1_Samples
$ sudo make all -j8
$ cd bin/x86_64/linux/release
$ ./deviceQuery
If CUDA is installed correctly, the output ends with Result = PASS, as shown below:
$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 1050 Ti"
CUDA Driver Version / Runtime Version 10.2 / 10.1
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 4039 MBytes (4234936320 bytes)
( 6) Multiprocessors, (128) CUDA Cores/MP: 768 CUDA Cores
GPU Max Clock rate: 1392 MHz (1.39 GHz)
Memory Clock rate: 3504 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 1048576 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 5 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.1, NumDevs = 1
Result = PASS
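In the listing above the driver reports CUDA 10.2 and the runtime 10.1; a driver-side version at least as high as the runtime version is the expected arrangement. The two numbers and the final verdict can be pulled out of the output automatically. The sketch below is one way to do it; check_device_query is a hypothetical helper, and the pattern assumes the summary line format shown above.

```shell
# check_device_query: read deviceQuery output on stdin, print
# "<driver> <runtime>" CUDA versions from the summary line, and succeed
# only if the output contains "Result = PASS". (Hypothetical helper.)
check_device_query() {
    local out
    out="$(cat)"
    printf '%s\n' "$out" |
        sed -n 's/.*CUDA Driver Version = \([0-9.]*\), CUDA Runtime Version = \([0-9.]*\),.*/\1 \2/p'
    printf '%s\n' "$out" | grep -q 'Result = PASS'
}

# Run against the real binary only if it has been built in the current directory.
if [ -x ./deviceQuery ]; then
    if ./deviceQuery | check_device_query; then
        echo "deviceQuery passed"
    else
        echo "deviceQuery did not pass"
    fi
fi
```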
Download cuDNN from https://developer.nvidia.com/rdp/cudnn-archive, then install the packages:
$ sudo dpkg -i libcudnn7_7.6.1.34-1+cuda10.1_amd64.deb
$ sudo dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.1_amd64.deb
$ sudo dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.1_amd64.deb
6.1 Copy the cuDNN samples to a directory; here they are copied to HOME
$ cp -r /usr/src/cudnn_samples_v7 $HOME
6.2 Enter the mnistCUDNN sample directory under HOME
$ cd $HOME/cudnn_samples_v7/mnistCUDNN/
6.3 Build the mnistCUDNN sample
$ make clean && make all -j8
6.4 Run the mnistCUDNN sample
$ ./mnistCUDNN
If Test passed! appears, cuDNN has been installed successfully:
$ ./mnistCUDNN
cudnnGetVersion() : 7601 , CUDNN_VERSION from cudnn.h : 7601 (7.6.1)
Host compiler version : GCC 5.5.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 6 Capabilities 6.1, SmClock 1392.0 Mhz, MemSize (Mb) 4038, MemClock 3504.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.020288 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.038912 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.044032 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.101376 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.121856 time requiring 207360 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.019360 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.032768 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.040960 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.100992 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.116416 time requiring 207360 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
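The first line of the mnistCUDNN output reports the cuDNN version as a single integer (7601). In the cuDNN 7 series, cudnn.h computes CUDNN_VERSION as major * 1000 + minor * 100 + patch, so the integer can be decoded back into dotted form. A small sketch of that arithmetic follows; decode_cudnn_version is a hypothetical helper, and the formula is assumed to apply only to the cuDNN 7 series used here.

```shell
# decode_cudnn_version N: turn a cuDNN 7 style integer version
# (major*1000 + minor*100 + patch) into "major.minor.patch".
decode_cudnn_version() {
    local v="$1"
    echo "$(( v / 1000 )).$(( v % 1000 / 100 )).$(( v % 100 ))"
}

decode_cudnn_version 7601   # prints 7.6.1
```

This matches the (7.6.1) shown next to 7601 in the output above.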
Running TensorFlow produces the following errors:
2019-07-24 13:01:26.926999: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:
2019-07-24 13:01:26.927165: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:
2019-07-24 13:01:26.927349: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:
2019-07-24 13:01:26.927516: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:
2019-07-24 13:01:26.927688: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:
2019-07-24 13:01:26.927847: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64:
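Each log line names a library ending in .so.10.0 that the loader could not find. To see which of these versioned files a directory actually provides, a quick check like the following can help. It is only a sketch: missing_cuda_libs is a hypothetical helper, and the library list is copied from the log above.

```shell
# missing_cuda_libs DIR VER: print every lib<name>.so.VER from the list
# below that does not exist in DIR. (Hypothetical helper.)
missing_cuda_libs() {
    local dir="$1" ver="$2" lib
    for lib in cudart cublas cufft curand cusolver cusparse; do
        [ -e "$dir/lib$lib.so.$ver" ] || echo "lib$lib.so.$ver"
    done
}

# Example: which CUDA 10.0 libraries are absent from the CUDA 10.1 lib64 directory?
missing_cuda_libs /usr/local/cuda-10.1/lib64 10.0
```

Any name printed is a library version that TensorFlow asked for but the directory does not supply.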
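This TensorFlow build looks for the CUDA 10.0 libraries (*.so.10.0), which the CUDA 10.1 installation does not provide. To resolve this, install the following: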
$ conda install cudatoolkit
$ conda install cudnn
Reference: https://github.com/tensorflow/tensorflow/issues/26182
Alternatively, reinstall keras-gpu:
$ conda install keras-gpu