Ubuntu16.04 + CUDA8.0 + CuDNN + OpenCV + caffe

Install the official NVIDIA driver first, then CUDA, then cuDNN, then OpenCV, and finally install and configure caffe.


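To install the official NVIDIA driver for my GTX1080, I read a number of posts, only to find that the recommended methods vary widely. Remembering the pain of getting stuck in a login loop the last time I installed a driver on my laptop, this time I was determined to research things properly before touching anything!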

The excerpt below, from Baidu Tieba, roughly summarizes the methods I came across for installing the NVIDIA driver:

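There are several ways to install a graphics driver on Ubuntu:
1. Download the driver package directly from the NVIDIA website and install it (cumbersome, with a high failure rate; if it fails you can no longer reach the desktop)
2. Install from a PPA, which can be official or private (safe and convenient, and it can install the latest driver)
3. Install from Additional Drivers (the simplest and safest option, but it cannot install the latest driver)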

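This section mainly covers installing the NVIDIA driver through a PPA:
Step 1: add the PPA
Step 2: update the package list
Step 3: find the driver
Step 4: install the driver
Each step is explained in detail below.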

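Step 1: before starting, determine the latest NVIDIA driver version that suits your card; you can look it up on the official website. Once you know the version number, press CTRL+ALT+T to open a terminal and run apt-cache search nvidia to check whether the list contains the version you want. If it does, there is no need to add a private PPA: run sudo apt-get install nvidia-367 and press Enter. The 367 here is the driver version number, so replace it with yours. After installation, reboot, then select NVIDIA under PRIME Profiles in NVIDIA X Server Settings.

If the official sources do not have the latest NVIDIA driver, you have to add a private PPA manually. Search for nvidia driver at https://launchpad.net/ubuntu/+ppas, find a PPA in the results that contains the driver you want, and add it following the instructions on the PPA page. For example, to add the PPA below, open a terminal, paste in these two commands, and press Enter: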
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
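Then install the driver by entering in the terminal:
sudo apt-get install nvidia-367 (replace 367 with the driver version you need)
Press Enter to start the installation. After it finishes, reboot, then select NVIDIA under PRIME Profiles in NVIDIA X Server Settings.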
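
The four steps above can be sketched as a small dry-run helper. This is only an illustration: the function name driver_install_plan is made up for this example, and it prints the commands instead of running them, so nothing is installed. Note that the package name is nvidia-375, with no space before the version number.

```shell
#!/usr/bin/env bash
# Print (do not run) the PPA-based install sequence for a given driver version.
# driver_install_plan is a hypothetical helper name used only in this sketch.
driver_install_plan() {
    local version="$1"
    # Reject anything that is not a plain number, e.g. "nvidia -375".
    case "$version" in
        ''|*[!0-9]*)
            echo "usage: driver_install_plan <numeric driver version>" >&2
            return 1
            ;;
    esac
    printf '%s\n' \
        "sudo add-apt-repository ppa:graphics-drivers/ppa" \
        "sudo apt update" \
        "apt-cache search nvidia-$version" \
        "sudo apt-get install nvidia-$version"
}

DRIVER_PLAN="$(driver_install_plan 375)"
echo "$DRIVER_PLAN"
```

Review the printed commands, then run them one at a time and reboot after the last one.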

The matching driver for my card is 375.

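Running apt-cache search nvidia did turn up the matching driver version.
However, sudo apt-get install nvidia-375 failed with package dependency problems, which I eventually resolved by switching to the Aliyun mirror!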

The following article covers switching between integrated and discrete graphics on Ubuntu:
http://blog.csdn.net/ichsonx/article/details/46941235

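The following article explains how to check whether the graphics driver installed successfully, and how to see whether the discrete or the integrated GPU is currently in use, along with driver information: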
http://blog.csdn.net/jay463261929/article/details/55098945

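This article also points out that if you do not usually play demanding games or watch HD video, using the integrated GPU saves more power: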
http://www.xitongzhijia.net/xtjc/20151215/63663.html

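Bizarre. After installing the driver, sudo nvidia-smi showed that the installation had succeeded, but NVIDIA X Server Settings had no PRIME Profiles option, so I could not switch to the integrated GPU. I fiddled with it for ages and never found a fix.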
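
A minimal sketch of the checks involved, assuming the nvidia-smi and prime-select command-line tools (prime-select ships in Ubuntu's nvidia-prime package). The function name gpu_report is made up for this example. It only reports what it finds and does not explain why the PRIME Profiles tab might be missing.

```shell
#!/usr/bin/env bash
# Report the driver state and, if available, the active PRIME profile.
# gpu_report is a hypothetical helper name used only in this sketch.
gpu_report() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Prints the GPU name and driver version when the driver is loaded.
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
            || echo "nvidia-smi: installed, but the driver is not responding"
    else
        echo "nvidia-smi: not installed"
    fi

    if command -v prime-select >/dev/null 2>&1; then
        echo "prime-select: $(prime-select query)"
    else
        echo "prime-select: not installed"
    fi
}

GPU_REPORT="$(gpu_report 2>&1)"
echo "$GPU_REPORT"
```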

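I could not afford to keep burning time on this, so I decided to use the discrete GPU for display for now and get on with installing CUDA and cuDNN.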


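To install CUDA, first download the matching CUDA version from the official website. The file is 1.4 GB, though, and download speeds on the education network are painfully slow. Fortunately, someone had already downloaded it and uploaded it to Baidu Cloud (see the link below). It is not the latest version, but it works.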
http://www.cnblogs.com/kingstrong/p/5959664.html

After downloading, follow the steps at
http://www.cnblogs.com/Qwells/p/6086773.html#undefined
one by one and you are done.

One more point deserves separate mention: after installation, the steps above set the environment variables like this:
[Figure 2: screenshot of the environment variable settings]

whereas the approach I have seen others use is to write to ~/.bashrc:
[Figure 3: screenshot of the lines added to ~/.bashrc]
Source: http://www.52nlp.cn/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E4%B8%BB%E6%9C%BA%E7%8E%AF%E5%A2%83%E9%85%8D%E7%BD%AE-ubuntu-16-04-nvidia-gtx-1080-cuda-8

Or it is done in two steps (setting environment variables and configuring the dynamic linker):
[Figure 4: screenshot of the two-step setup]

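In short, I was rather confused at this point. As far as I can recall, there are at least three ways to set these up:
1. Through ~/.bashrc
2. Through /etc/profile
3. By creating a new .conf file under /etc/ld.so.conf.d/, for example opencv.conf
I am not sure what the differences between them are.
This article may help: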
http://www.cnblogs.com/big-tree/p/5874336.html
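
As a rough guide to the difference: ~/.bashrc is read by one user's interactive bash shells, /etc/profile is read by login shells for every user, and a file under /etc/ld.so.conf.d/ is not an environment variable at all but a system-wide search path for the dynamic linker, which takes effect after running ldconfig. The sketch below shows the first method on a scratch file. The helper name append_once and the file name cuda.conf are made up for this example, and the /usr/local/cuda paths are assumed from the CUDA location used elsewhere in this post.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already present,
# so re-running the setup does not pile up duplicate export lines.
# append_once is a hypothetical helper name used only in this sketch.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo on a scratch file; point RC_FILE at ~/.bashrc (per user) or
# /etc/profile (all users, needs root) to apply it for real.
RC_FILE="$(mktemp)"
append_once 'export PATH=/usr/local/cuda/bin:$PATH' "$RC_FILE"
append_once 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' "$RC_FILE"
append_once 'export PATH=/usr/local/cuda/bin:$PATH' "$RC_FILE"   # already there, skipped
cat "$RC_FILE"

# Method 3, shown as comments because it needs root. The file name cuda.conf
# is an assumption; any name ending in .conf works:
#   echo /usr/local/cuda/lib64 | sudo tee /etc/ld.so.conf.d/cuda.conf
#   sudo ldconfig
```

Methods 1 and 2 only affect shells started afterwards (or after source), while method 3 applies to every program on the system regardless of which shell launched it.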

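Next comes cuDNN. It also has to be downloaded from the official website, but at only 70 to 80 MB that is manageable.
Once it is downloaded, extract it. With CUDA already installed, installing cuDNN is simple: copy a few files and add some symlinks (which makes upgrading easier if cuDNN is updated later).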

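After downloading and extracting cuDNN, copy the files (run each command from the extracted directory that contains those files):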
sudo cp lib* /usr/local/cuda/lib64/
sudo cp cudnn.h /usr/local/cuda/include/

Update the symlinks:
cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.5
sudo ln -s libcudnn.so.5.1.0 libcudnn.so.5
sudo ln -s libcudnn.so.5 libcudnn.so
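
To see what the symlink chain ends up looking like without touching /usr/local/cuda, it can be rebuilt in a scratch directory. This is only a sketch: the empty file stands in for the real library, and the version 5.1.0 is taken from the commands above.

```shell
#!/usr/bin/env bash
# Scratch directory standing in for /usr/local/cuda/lib64 (demo only).
LIB_DIR="$(mktemp -d)"
touch "$LIB_DIR/libcudnn.so.5.1.0"     # empty placeholder for the real library

# Same chain as above: libcudnn.so -> libcudnn.so.5 -> libcudnn.so.5.1.0
rm -f "$LIB_DIR/libcudnn.so" "$LIB_DIR/libcudnn.so.5"
ln -s libcudnn.so.5.1.0 "$LIB_DIR/libcudnn.so.5"
ln -s libcudnn.so.5 "$LIB_DIR/libcudnn.so"

ls -l "$LIB_DIR"
# Follow every link to the final target.
RESOLVED="$(basename "$(readlink -f "$LIB_DIR/libcudnn.so")")"
echo "libcudnn.so resolves to $RESOLVED"
```

On the real installation, the same readlink -f call against /usr/local/cuda/lib64/libcudnn.so is a quick way to confirm which cuDNN version the linker will pick up. When cuDNN is upgraded, only the middle link needs to be repointed.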

For details, see
http://blog.csdn.net/autoliuweijie/article/details/53173928 (I did not understand the last sentence of this post)
http://blog.csdn.net/jhszh418762259/article/details/52958287?locationNum=8&fps=1 (this one gets it exactly right!)


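For caffe, just follow the official tutorial:
http://caffe.berkeleyvision.org/install_apt.html
The dependencies mentioned on the official site are now installed. Tomorrow I will do the following (OpenCV can be installed first!):
http://caffe.berkeleyvision.org/installation.html#compilation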

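For installing OpenCV, the official site has a tutorial (link below)
http://docs.opencv.org/3.1.0/d7/d9f/tutorial_linux_install.html
Follow the tutorial step by step until make finishes, then take note! One blog post describes using checkinstall to make future upgrades and removal easier:
[Figure 5: screenshot of the checkinstall steps from that post]
Source: http://blog.csdn.net/hit2015spring/article/details/53510909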

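For an introduction to checkinstall, see also the links below (although the usage they describe seems to differ from the blog post above: they do not run sudo make install):
https://www.aliyun.com/zixun/content/3_12_513777.html
http://blog.sina.com.cn/s/blog_4178f4bf0101cmt7.html (this article cites two further links; the first is dead, the second is good)
The following link says the same thing as the two above
http://blog.csdn.net/talkxin/article/details/50660014
So the blog post above had the right idea in using checkinstall, but it did use it incorrectly.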

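checkinstall really is a very handy tool. From now on, anything installed from source can go through checkinstall, which makes managing and removing it easy. I remember having to use find to uninstall OpenCV in the past, which was a real hassle!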
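
As the links above describe it, checkinstall runs in place of sudo make install and wraps the installed files in a package that dpkg can later remove. A dry-run sketch follows. The helper name checkinstall_plan, the package name opencv-local, and the version 3.1.0 are assumptions made for this example. Nothing is installed, because the commands are only printed.

```shell
#!/usr/bin/env bash
# Print (do not run) a checkinstall-based install and removal sequence.
# checkinstall_plan is a hypothetical helper name used only in this sketch.
checkinstall_plan() {
    local name="$1" version="$2"
    printf '%s\n' \
        "make" \
        "sudo checkinstall -D --pkgname=$name --pkgversion=$version" \
        "sudo dpkg -r $name   # later, to uninstall"
}

# opencv-local and 3.1.0 are illustrative values, not taken from a real build.
CHECKINSTALL_PLAN="$(checkinstall_plan opencv-local 3.1.0)"
echo "$CHECKINSTALL_PLAN"
```

Giving the package an explicit name makes it easy to find afterwards with dpkg -l and to remove with dpkg -r, instead of hunting for installed files with find.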


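PS:
Partway through sudo checkinstall (which took a long time overall), I suddenly noticed that my root partition was almost full:
[Figure 6: screenshot of disk usage on the root partition]
Fortunately, once the installation finished and the temporary files were deleted, the usage went down (though it still looks fairly large). In future, when compiling and installing from source, I should put the build directory under home. See: https://www.zhihu.com/question/22484307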
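
A quick way to catch this before starting a long build is to check how full the root filesystem is. This is a minimal sketch: the helper name used_percent and the 90% warning threshold are arbitrary choices for this example.

```shell
#!/usr/bin/env bash
# Print the percentage of space in use on the filesystem that holds a path.
# used_percent is a hypothetical helper name used only in this sketch.
used_percent() {
    # df -P guarantees one line per filesystem; column 5 is the usage, e.g. "42%".
    df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

ROOT_USED="$(used_percent /)"
echo "root filesystem: ${ROOT_USED}% used"

# The 90% threshold is an arbitrary value chosen for illustration.
if [ "$ROOT_USED" -ge 90 ]; then
    echo "warning: root is nearly full; build under your home directory instead"
fi

# To see how much space a build tree takes, run this inside it:
#   du -sh .
```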

Finally, test whether OpenCV was installed successfully:
http://www.tuicool.com/articles/nYJrYra
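
Independently of the linked article, two quick checks can be sketched as follows. They assume the OpenCV build installed a pkg-config file named opencv and, if the Python bindings were built, a cv2 module. The function name check_opencv is made up for this example.

```shell
#!/usr/bin/env bash
# Report whether OpenCV is visible to pkg-config and to Python.
# check_opencv is a hypothetical helper name used only in this sketch.
check_opencv() {
    if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists opencv; then
        echo "pkg-config: OpenCV $(pkg-config --modversion opencv)"
    else
        echo "pkg-config: no opencv package found"
    fi

    if command -v python >/dev/null 2>&1 && python -c 'import cv2' >/dev/null 2>&1; then
        echo "python: cv2 $(python -c 'import cv2; print(cv2.__version__)')"
    else
        echo "python: cv2 module not importable"
    fi
}

OPENCV_REPORT="$(check_opencv)"
echo "$OPENCV_REPORT"
```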


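Once OpenCV is installed, configure and install caffe; for the dependencies, just follow the official site. Then, before running make all, check Makefile.config very carefully: a single mistake means running make clean and starting over. My Makefile.config follows the notes below.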

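A few points to watch, from top to bottom:
1. USE_CUDNN := 1: uncomment
2. # CPU_ONLY := 1: leave commented out
3. OPENCV_VERSION := 3: uncomment (if OpenCV 3 is already installed and configured)
4. BLAS := atlas: check whether atlas is the BLAS you installed earlier
5. PYTHON_INCLUDE and every later Python-related setting: choose either Ubuntu's default Python or Anaconda, not both
6. Thanks to a quirk specific to Ubuntu 16.04 (the header files live in different locations), in INCLUDE_DIRS, /usr/local/include has to become /usr/local/include /usr/include/hdf5/serial/
7. Likewise, in LIBRARY_DIRS, /usr/local/lib /usr/lib has to become /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial
8. If errors remain, search online as they come up. You can also refer to my earlier post on the CPU-only build of caffe, which has some good references. Programmers just have to get used to trial and error.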

## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
#   You should not set this flag if you will be reading LMDBs with any
#   possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3
OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
        -gencode arch=compute_20,code=sm_21 \
        -gencode arch=compute_30,code=sm_30 \
        -gencode arch=compute_35,code=sm_35 \
        -gencode arch=compute_50,code=sm_50 \
        -gencode arch=compute_52,code=sm_52 \
        -gencode arch=compute_60,code=sm_60 \
        -gencode arch=compute_61,code=sm_61 \
        -gencode arch=compute_61,code=compute_61

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas

# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
PYTHON_INCLUDE := /usr/include/python2.7 \
        /usr/lib/python2.7/dist-packages/numpy/core/include

## Above block, although it's not explicitly expressed in caffe installation instruction,
## I still comment it for I am using anaconda instead.
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda2
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
        # $(ANACONDA_HOME)/include/python2.7 \
        # $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include

# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
#                 /usr/lib/python3.5/dist-packages/numpy/core/include

# We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib

# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
# WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
# INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include  this line is changed into the following
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
# LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# NCCL acceleration switch (uncomment to build with NCCL)
# https://github.com/NVIDIA/nccl (last tested version: v1.2.3-1+cuda8.0)
# USE_NCCL := 1

# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1

# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
Q ?= @
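
The mechanical edits from notes 1, 3, 6 and 7 can also be applied with sed instead of by hand. The following is a sketch under stated assumptions: the function name patch_makefile_config is made up, the patterns assume the stock lines from Makefile.config.example, and the -i flag assumes GNU sed as shipped with Ubuntu. It is demonstrated on a scratch file, not a real caffe tree.

```shell
#!/usr/bin/env bash
# Apply the Ubuntu 16.04 edits to a caffe Makefile.config in place.
# patch_makefile_config is a hypothetical helper name used only in this sketch.
patch_makefile_config() {
    local cfg="$1"
    sed -i \
        -e 's|^# *USE_CUDNN := 1|USE_CUDNN := 1|' \
        -e 's|^# *OPENCV_VERSION := 3|OPENCV_VERSION := 3|' \
        -e 's|^\(INCLUDE_DIRS := \$(PYTHON_INCLUDE) /usr/local/include\)$|\1 /usr/include/hdf5/serial/|' \
        -e 's|^\(LIBRARY_DIRS := \$(PYTHON_LIB) /usr/local/lib /usr/lib\)$|\1 /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial|' \
        "$cfg"
}

# Demo on a scratch file holding only the relevant stock lines.
CFG="$(mktemp)"
cat > "$CFG" <<'EOF'
# USE_CUDNN := 1
# CPU_ONLY := 1
# OPENCV_VERSION := 3
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
EOF
patch_makefile_config "$CFG"
cat "$CFG"
```

Because the INCLUDE_DIRS and LIBRARY_DIRS patterns are anchored at the end of the line, running the function a second time leaves an already patched file unchanged. CPU_ONLY is deliberately left commented out.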

One more possible pitfall comes from the OpenCV settings in the Makefile; see: https://github.com/facebook/C3D/issues/253

================================================================
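Update, 2017/10/18
The Makefile itself is another major source of pitfalls. For a while, compiling caffe-ssd kept failing with errors related to the boost library. The cause was that I had replaced the Makefile in caffe-ssd with the one from the existing caffe on the server. One difference between the two is that the Makefile before the replacement originally contained the line

LIBRARIES += glog gflags protobuf boost_system boost_filesystem boost_regex m hdf5_hl hdf5

which after the replacement became

LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_hl hdf5

After fixing this line (restoring boost_regex), the build succeeded. The claim found online that caffe requires boost >= 1.55 did not hold here, since the boost used in this build is 1.54.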
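
A difference like this is easy to miss by eye. The sketch below lists the libraries named on LIBRARIES += lines in one Makefile but absent from another. The function names list_libs and missing_libs are made up for this example, and the two scratch files contain only the lines quoted above, not full Makefiles.

```shell
#!/usr/bin/env bash
# Print every library named on a "LIBRARIES +=" line, one per line, sorted.
# list_libs and missing_libs are hypothetical helper names for this sketch.
list_libs() {
    grep -E '^[[:space:]]*LIBRARIES[[:space:]]*\+=' "$1" \
        | sed -E 's/^[^=]*=//' \
        | tr -s ' \t' '\n' \
        | sed '/^$/d' \
        | sort -u
}

# Print libraries present in the first Makefile but absent from the second.
missing_libs() {
    local a b
    a="$(mktemp)"
    b="$(mktemp)"
    list_libs "$1" > "$a"
    list_libs "$2" > "$b"
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Scratch files holding only the two lines quoted above.
ORIG_MK="$(mktemp)"
REPL_MK="$(mktemp)"
echo 'LIBRARIES += glog gflags protobuf boost_system boost_filesystem boost_regex m hdf5_hl hdf5' > "$ORIG_MK"
echo 'LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_hl hdf5' > "$REPL_MK"

MISSING="$(missing_libs "$ORIG_MK" "$REPL_MK")"
echo "missing from the replacement Makefile: $MISSING"
```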

================================================================

Then run:
make all -j7
make test
make runtest
make pycaffe

Then add

export PYTHONPATH=/path/to/caffe/python:$PYTHONPATH

to ~/.bashrc, and run source ~/.bashrc afterwards.

After that, you can try import caffe and run the MNIST test example.
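
A minimal sketch of that final check follows. CAFFE_ROOT is a variable name introduced for this example, and /path/to/caffe is the same placeholder as in the export line above, so set it to the real checkout before relying on the result.

```shell
#!/usr/bin/env bash
# CAFFE_ROOT is a hypothetical variable for this sketch; /path/to/caffe is a
# placeholder and must be replaced with the real caffe checkout.
CAFFE_ROOT="${CAFFE_ROOT:-/path/to/caffe}"

# Prepend caffe's python directory; the ${VAR:+...} form avoids leaving a
# stray trailing colon when PYTHONPATH was previously empty.
export PYTHONPATH="$CAFFE_ROOT/python${PYTHONPATH:+:$PYTHONPATH}"

if command -v python >/dev/null 2>&1 && python -c 'import caffe' >/dev/null 2>&1; then
    CAFFE_STATUS="importable"
else
    CAFFE_STATUS="not importable"
fi
echo "caffe is $CAFFE_STATUS (PYTHONPATH=$PYTHONPATH)"
```

If the import fails even though make pycaffe succeeded, the usual things to check are whether CAFFE_ROOT points at the right directory and whether python here is the same interpreter (system Python or Anaconda) chosen in Makefile.config.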
