i7-7700k + TITAN X + 16G DDR4 2400 + 256G SSD
Ubuntu 16.04 + Anaconda3 + tensorflow-gpu (0.12.1)
The machine already has CUDA 9 + cuDNN 7.1 installed; this note installs CUDA 8.0.44 + cuDNN 5.1 alongside it.
Related commands:
Show the CUDA version : nvcc -V
Show where nvcc is located : which nvcc
Watch NVIDIA GPU usage live : watch -n 1 nvidia-smi
CUDA version : cat /usr/local/cuda/version.txt
cuDNN version : cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
NVIDIA driver version : cat /proc/driver/nvidia/version
Show environment variables : env
Trace the dynamic linker (library search and symbol binding) : LD_DEBUG=all cat
Uninstall CUDA : sudo /usr/local/cuda-8.0/bin/uninstall_cuda_8.0.pl
Uninstall the NVIDIA driver : sudo /usr/bin/nvidia-uninstall
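Switching between multiple CUDA versions (remove the old symlink, then create only one of the two links):
sudo rm -rf /usr/local/cuda
sudo ln -s /usr/local/cuda-8.0 /usr/local/cuda # to use CUDA 8.0
sudo ln -s /usr/local/cuda-9.1 /usr/local/cuda # to use CUDA 9.1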
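Show a directory's attributes : ls -l <directory>
File manager with root privileges (e.g. for moving folders) : sudo nautilus
Add -R to apply the permissions recursively to subfolders : chmod -R 777 /home/mypackage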
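The grep on cudnn.h above prints three separate #define lines. As a minimal sketch, the helper below joins them into a single MAJOR.MINOR.PATCH string. The function name cudnn_version is made up for this note, and it assumes the header uses plain `#define CUDNN_MAJOR <n>` style lines, as the cuDNN 5.x and 7.x headers do.

```shell
# cudnn_version [HEADER]: print the cuDNN version as MAJOR.MINOR.PATCH,
# read from the #define lines of a cudnn.h-style header.
# (hypothetical helper; HEADER defaults to the path used in this note)
cudnn_version() (
    header=${1:-/usr/local/cuda/include/cudnn.h}
    major=$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" {print $3; exit}' "$header")
    minor=$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" {print $3; exit}' "$header")
    patch=$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" {print $3; exit}' "$header")
    [ -n "$major" ] && [ -n "$minor" ] && [ -n "$patch" ] || {
        echo "cuDNN version macros not found in $header" >&2
        exit 1
    }
    echo "$major.$minor.$patch"
)
```

Calling `cudnn_version` with no argument reads the header under /usr/local/cuda, so the result follows whichever toolkit the symlink currently points to.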
**********************************************************************************************************
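I downloaded a program from GitHub that requires tensorflow-gpu=0.12. Running it on the GPU fails because the installed CUDA/cuDNN is too new: the log below shows cuDNN 7.1.3 loaded at runtime, while this TensorFlow build was compiled against cuDNN 5.1.5.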
E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7103 (compatibility version 7100) but source was compiled with 5105 (compatibility version 5100). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
F tensorflow/core/kernels/conv_ops.cc:532] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
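The numbers 7103 and 5105 in the error are cuDNN's packed version code, MAJOR*1000 + MINOR*100 + PATCHLEVEL. A small sketch for unpacking it (decode_cudnn_version is a name invented here, not part of any tool):

```shell
# decode_cudnn_version CODE: unpack MAJOR*1000 + MINOR*100 + PATCH
# into a dotted version string. (hypothetical helper)
decode_cudnn_version() (
    v=$1
    echo "$(( v / 1000 )).$(( v % 1000 / 100 )).$(( v % 100 ))"
)

decode_cudnn_version 7103   # library loaded at runtime, prints 7.1.3
decode_cudnn_version 5105   # version TensorFlow was built against, prints 5.1.5
```

So the fix is to give this TensorFlow build a cuDNN 5.1 library at runtime, which is what the rest of this note sets up.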
Installing multiple versions of CUDA and cuDNN on Ubuntu 16.04 (AscentOf's blog, CSDN)
Installing CUDA 8.0 and CUDA 7.0 side by side on Ubuntu 16.04 (Rosun_'s blog, CSDN)
https://blog.csdn.net/maple2014/article/details/78574275 : installing multiple CUDA versions and switching between them
https://blog.csdn.net/mumoDM/article/details/79462604 : problems with multiple CUDA versions
Windows TensorFlow GPU installation (ChainingBlocks's blog, CSDN) : installing tensorflow-gpu on Windows
0. Official TensorFlow GPU installation guide:
https://www.tensorflow.org/install/install_windows
Download and install CUDA:
All CUDA versions : CUDA Toolkit Archive | NVIDIA Developer
Download cuDNN (registration required)
cuDNN download page : NVIDIA cuDNN | NVIDIA Developer
Installation Guide for Linux : cuda_8.0.44 (official installation instructions)
Installing CUDA 8.0 and cuDNN 5.1:
Three ways to install CUDA and cuDNN on Ubuntu 16.04 (all personally tested) (老王回归's blog, CSDN)
Installing CUDA 8.0 and the NVIDIA driver on Ubuntu (detailed tutorial) (autotian's blog, CSDN)
After downloading cuDNN, extract the archive on the command line, then copy the contents of its lib64 and include folders into /usr/local/cuda-8.0:
# Installing from a Tar File
tar -zxvf <archive-name>.tar.gz
sudo cp cuda/include/cudnn.h /usr/local/cuda-8.0/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-8.0/lib64
sudo chmod a+r /usr/local/cuda-8.0/include/cudnn.h
sudo chmod a+r /usr/local/cuda-8.0/lib64/libcudnn*
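The copy and chmod steps above can be wrapped in one function, which also makes them easy to try against a scratch directory before touching /usr/local. This is only a sketch: install_cudnn is a made-up name, and unlike the plain cp above it uses `cp -P` so that the libcudnn.so symlinks in the archive stay symlinks instead of being copied as full files.

```shell
# install_cudnn SRC PREFIX: copy the cuDNN header and libraries from an
# extracted archive directory SRC (containing include/ and lib64/) into the
# CUDA toolkit directory PREFIX. (hypothetical helper)
install_cudnn() (
    src=$1
    prefix=$2
    [ -f "$src/include/cudnn.h" ] || {
        echo "no cudnn.h under $src/include" >&2
        exit 1
    }
    [ -d "$prefix/include" ] && [ -d "$prefix/lib64" ] || {
        echo "$prefix does not look like a CUDA toolkit directory" >&2
        exit 1
    }
    cp "$src/include/cudnn.h" "$prefix/include/" || exit 1
    # -P copies symlinks as symlinks rather than following them
    cp -P "$src"/lib64/libcudnn* "$prefix/lib64/" || exit 1
    chmod a+r "$prefix/include/cudnn.h" "$prefix"/lib64/libcudnn*
)
```

For the real install this would be called as `install_cudnn cuda /usr/local/cuda-8.0` from a root shell, since sudo cannot run a shell function directly.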
gedit ~/.bashrc # edit ~/.bashrc and add the following two lines
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
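The `${VAR:+:${VAR}}` part of these two lines expands to a colon plus the old value only when the variable is already set and non-empty. That avoids a trailing colon, which the shell and the dynamic linker would otherwise read as an empty entry meaning "current directory". A small illustration (prepend is a throwaway function for this note, not something from the CUDA docs):

```shell
# prepend DIR VALUE: print DIR, followed by ":VALUE" only if VALUE is
# non-empty. Same ${VAR:+:${VAR}} pattern as the two export lines.
prepend() (
    dir=$1
    value=$2
    echo "$dir${value:+:${value}}"
)

prepend /usr/local/cuda/lib64 ""          # prints /usr/local/cuda/lib64
prepend /usr/local/cuda/lib64 /opt/lib    # prints /usr/local/cuda/lib64:/opt/lib
```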
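The following did not work (presumably because appending puts the CUDA directories last, so an earlier entry for another CUDA version still takes precedence):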
export PATH="$PATH:/usr/local/cuda/bin"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64/"
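One plausible reason the appended form fails is search order: the first matching directory in PATH wins. The sketch below reproduces that with two fake nvcc scripts in a temporary directory; the directory layout and the first_nvcc helper are invented for the demonstration and do not touch the real toolkits.

```shell
# Build two fake toolkits, each with an nvcc that just announces itself.
demo=$(mktemp -d)
mkdir -p "$demo/cuda-8.0/bin" "$demo/cuda-9.1/bin"
for v in 8.0 9.1; do
    printf '#!/bin/sh\necho "fake nvcc %s"\n' "$v" > "$demo/cuda-$v/bin/nvcc"
    chmod +x "$demo/cuda-$v/bin/nvcc"
done

# first_nvcc PATHVALUE: run whichever nvcc the given PATH value resolves to.
first_nvcc() ( PATH=$1; nvcc )

# 8.0 prepended: it is found before anything else.
prepended=$(first_nvcc "$demo/cuda-8.0/bin:$demo/cuda-9.1/bin:$PATH")
# 8.0 appended while 9.1 is already at the front: 9.1 still wins.
appended=$(first_nvcc "$demo/cuda-9.1/bin:$PATH:$demo/cuda-8.0/bin")

echo "prepended -> $prepended"
echo "appended  -> $appended"
rm -rf "$demo"
```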
sudo gedit /etc/profile # /etc/profile must be changed as well, and the change only took effect after rebooting (source /etc/profile did not take effect)
export PATH=/usr/local/cuda-9.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64:$LD_LIBRARY_PATH
Change these two lines to:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
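Switching from CUDA 9.1 to CUDA 8.0:
sudo rm -rf /usr/local/cuda # remove the previously created symlink
sudo ln -s /usr/local/cuda-8.0 /usr/local/cuda # create the symlink to the new CUDA
Switching from CUDA 8.0 to CUDA 9.1:
sudo rm -rf /usr/local/cuda # remove the previously created symlink
sudo ln -s /usr/local/cuda-9.1 /usr/local/cuda # create the symlink to the new CUDA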
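The two-step switch can be folded into one function. This is a sketch rather than the method the referenced posts use: switch_cuda is a hypothetical name, and it swaps rm + ln for GNU `ln -sfn`, which replaces an existing symlink in place. It also refuses to touch /usr/local/cuda if that path is a real directory rather than a symlink.

```shell
# switch_cuda VERSION [ROOT]: point ROOT/cuda at ROOT/cuda-VERSION.
# ROOT defaults to /usr/local, where root privileges are required.
# (hypothetical helper; relies on GNU ln's -n option)
switch_cuda() (
    version=$1
    root=${2:-/usr/local}
    target="$root/cuda-$version"
    link="$root/cuda"
    [ -d "$target" ] || {
        echo "not installed: $target" >&2
        exit 1
    }
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "refusing to replace $link: it is not a symlink" >&2
        exit 1
    fi
    # -s symbolic, -f replace an existing link, -n do not descend into it
    ln -sfn "$target" "$link"
)
```

From a root shell, `switch_cuda 8.0` and `switch_cuda 9.1` then toggle between the two installs.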
Check that the switch took effect:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
which nvcc : show the location of nvcc
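To compare the active version in a script, only the release number is needed. A small sketch that pulls it out of the `nvcc --version` text (nvcc_release is a name made up here; it assumes the "release X.Y," wording shown in the output above):

```shell
# nvcc_release: read `nvcc --version` output on stdin and print only the
# release number. (hypothetical helper)
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Example using the last line of the output shown above:
echo 'Cuda compilation tools, release 9.0, V9.0.176' | nvcc_release   # prints 9.0
```

On a machine with CUDA installed, the usual call would be `nvcc --version | nvcc_release`.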
CUDA 8.0 + cuDNN 5.1: incomplete installation (installer output):
Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...
Installing the CUDA Samples in /home/human-machine ...
Copying samples to /home/human-machine/NVIDIA_CUDA-8.0_Samples now...
Finished copying samples.
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-8.0
Samples: Installed in /home/human-machine
Please make sure that
- PATH includes /usr/local/cuda-8.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver
Logfile is /tmp/cuda_install_1534.log
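The warning matters only if the driver already on the machine is older than 361.00. Since the driver here came with the CUDA 9 install, it should be new enough, and that can be checked with a version comparison. A sketch using GNU `sort -V` (version_at_least is a hypothetical helper, and 390.48 below is just an example value, not the driver actually installed on this machine):

```shell
# version_at_least HAVE NEED: succeed when version HAVE is >= NEED.
# (hypothetical helper; relies on GNU sort's -V version ordering)
version_at_least() (
    have=$1
    need=$2
    lowest=$(printf '%s\n%s\n' "$have" "$need" | sort -V | head -n 1)
    [ "$lowest" = "$need" ]
)

# Example value only; on a real system the driver version can be read with
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
if version_at_least 390.48 361.00; then
    echo "driver is new enough for CUDA 8.0"
fi
```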
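# Compile and test deviceQuery:
Change to the directory where the samples were installed, ~/NVIDIA_CUDA-8.0_Samples by default
Then run in the terminal : $ make
Run the compiled binaries
The compiled binaries are placed under ~/NVIDIA_CUDA-8.0_Samples/bin by default
Change directory : $ cd ~/NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release
Run in the terminal : $ ./deviceQuery
# Compile and test bandwidthTest: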
cd ../bandwidthTest
sudo make
./bandwidthTest
If both tests end with Result = PASS, CUDA was installed successfully.
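Checking for that final line can be automated instead of reading the output by eye. A minimal sketch (passed and run_sample are names invented here):

```shell
# passed: read a sample's output on stdin; succeed if it contains
# the line "Result = PASS". (hypothetical helper)
passed() {
    grep -q 'Result = PASS'
}

# run_sample BINARY: run one sample binary and report PASS or FAIL.
run_sample() (
    if "$1" | passed; then
        echo "PASS: $1"
    else
        echo "FAIL: $1"
        exit 1
    fi
)
```

From the release directory this would be used as `run_sample ./deviceQuery && run_sample ./bandwidthTest`.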
With CUDA 8.0 + cuDNN 5.1, building the samples reports the errors below, but tensorflow-gpu still runs and TensorBoard works as well.
human-machine@humanmachine-System-Product-Name:~/NVIDIA_CUDA-8.0_Samples$ make
make[1]: Entering directory '/home/human-machine/NVIDIA_CUDA-8.0_Samples/0_Simple/simpleVoteIntrinsics_nvrtc'
find: `/usr/local/cuda-8.0/lib64/stubs': No such file or directory
>>> WARNING - libcuda.so not found, CUDA Driver is not installed. Please re-install the driver. <<<
[@] g++ -I../../common/inc -I/usr/local/cuda-8.0/include -o simpleVoteIntrinsics.o -c simpleVoteIntrinsics.cpp
[@] g++ -L/usr/local/cuda-8.0/lib64 -L/usr/local/cuda-8.0/lib64/stubs -o simpleVoteIntrinsics_nvrtc simpleVoteIntrinsics.o -lcuda -lnvrtc
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp simpleVoteIntrinsics_nvrtc ../../bin/x86_64/linux/release
make[1]: Leaving directory '/home/human-machine/NVIDIA_CUDA-8.0_Samples/0_Simple/simpleVoteIntrinsics_nvrtc'
make[1]: Entering directory '/home/human-machine/NVIDIA_CUDA-8.0_Samples/0_Simple/matrixMul'
"/usr/local/cuda-8.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o matrixMul.o -c matrixMul.cu
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
cc1plus: fatal error: cuda_runtime.h: No such file or directory
compilation terminated.
Makefile:250: recipe for target 'matrixMul.o' failed
make[1]: *** [matrixMul.o] Error 1
make[1]: Leaving directory '/home/human-machine/NVIDIA_CUDA-8.0_Samples/0_Simple/matrixMul'
Makefile:52: recipe for target '0_Simple/matrixMul/Makefile.ph_build' failed
make: *** [0_Simple/matrixMul/Makefile.ph_build] Error 2
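The log points at files missing under /usr/local/cuda-8.0: the lib64/stubs directory and the cuda_runtime.h header. A quick way to confirm what is and is not there is sketched below. check_toolkit is a hypothetical helper, and the three paths it checks are simply the ones this build log refers to, not a complete list of what a toolkit should contain.

```shell
# check_toolkit [PREFIX]: report whether the paths this build needed exist
# under a CUDA toolkit directory. (hypothetical helper)
check_toolkit() (
    prefix=${1:-/usr/local/cuda-8.0}
    missing=0
    for path in bin/nvcc include/cuda_runtime.h lib64/stubs; do
        if [ -e "$prefix/$path" ]; then
            echo "ok       $prefix/$path"
        else
            echo "MISSING  $prefix/$path"
            missing=1
        fi
    done
    exit "$missing"
)
```

If `check_toolkit` reports the header as missing, the toolkit copy itself is incomplete, which would be consistent with the samples failing to build while the TensorFlow binary, which does not need the headers at runtime, still works.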
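CUDA 9.1 + cuDNN 7.1: the samples compile and test normally:
Very detailed tutorial on installing CUDA 7.5/8.0 on Ubuntu 14.04 (Linux公社) : follows the official documentation, solid content
Installing CUDA 8.0, CUDA 9.0 and CUDA 9.1 side by side on Ubuntu (七爷OK's blog, CSDN)
Setting up GTX 1080 + CUDA 9.0 + cuDNN 7.0.4 + TensorFlow 1.6.0 on Ubuntu 16.04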