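About half a year ago I needed to use Caffe on Ubuntu for my studies. At the time I knew nothing about Linux or Caffe, so my attempts to install and configure Caffe failed many times, and I reinstalled the system more than ten times. The root cause was my own ignorance, but an indirect cause was that many blog posts about installing Caffe are nonsense copied from one another; I have seen several posts with exactly the same mistakes. Not contributing to the community is one thing, but misleading people is worse!
So I wrote down my own setup process in detail, first so that I can look it up the next time I configure a machine, and second so that others can avoid my detours.
To use an NVIDIA card for GPU acceleration you must install NVIDIA's driver. There are two ways to install it: with apt-get, or with the official driver installer.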
sudo apt-get install nvidia-367
nvidia-smi
nvidia-settings
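The only drawback of installing with apt-get is that the driver version tends to be older, although this hardly affects the later steps. Oddly, when I asked apt-get for nvidia-367, what ended up installed was version 375.39, which was very recent at the time; this happened to me twice. Since the driver version does not really affect the later steps, I did not look into it further.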
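Because the package name and the installed version can differ, as happened above, it may help to check which driver version is actually running. The following is a minimal sketch, assuming nvidia-smi is on the PATH; driver_major is a hypothetical helper name introduced here, not part of any NVIDIA tool.

```shell
# Hypothetical helper: print the major part of a driver version string,
# e.g. "375.39" -> "375".
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Query the running driver only when nvidia-smi is actually available.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    echo "Installed driver: ${installed} (major $(driver_major "${installed}"))"
else
    echo "nvidia-smi not found; the driver is probably not installed yet"
fi
```

Comparing the printed major version with the one in the package name shows quickly whether apt-get pulled in a newer driver than requested.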
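For the official installer, first download the driver that matches your card from the NVIDIA website (http://www.nvidia.cn/Download/index.aspx?lang=cn). The figure shows the information for my own card. Note that for a laptop you should pick the product series entry with the Notebooks suffix.
Copy the downloaded installer to a local directory on Ubuntu. Press Ctrl + Alt + F1 to switch to a text console and log in with your user name and password. Then run the following command to stop the graphical session.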
sudo service lightdm stop
sudo sh NVIDIA-Linux-*-334.21.run
CUDA is NVIDIA's parallel computing platform and programming model. First download the CUDA installer from the official website.
sudo apt-get install libxi-dev libxmu-dev
sudo sh cuda_8.0.61_375.26_linux.run
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
(y)es/(n)o/(q)uit: n
Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-8.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
[ default is /home/yangyuan ]:
Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
Installing the CUDA Samples in /home/yangyuan ...
Copying samples to /home/yangyuan/NVIDIA_CUDA-8.0_Samples now...
Finished copying samples.
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-8.0
Samples: Installed in /home/yangyuan, but missing recommended libraries
Please make sure that
- PATH includes /usr/local/cuda-8.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver
Logfile is /tmp/cuda_install_2421.log
gedit ~/.bashrc
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
sudo gedit /etc/profile
export PATH=/usr/local/cuda/bin:$PATH
sudo gedit /etc/ld.so.conf.d/cuda.conf
/usr/local/cuda/lib64
sudo ldconfig
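The ${PATH:+:${PATH}} expansion used in the export lines above adds the ":" separator only when the old value is non-empty, which avoids a stray trailing colon. Here is a small sketch of that idiom plus a check that the CUDA directory is really on the PATH; prepend_path and path_contains are hypothetical helper names chosen for illustration.

```shell
# Hypothetical helper: prepend a directory to a colon-separated list,
# adding the ":" separator only when the old list is non-empty.
prepend_path() {
    new_dir="$1"
    old_list="$2"
    printf '%s\n' "${new_dir}${old_list:+:${old_list}}"
}

# Hypothetical helper: succeed if a colon-separated list contains a directory.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains /usr/local/cuda-8.0/bin "$PATH"; then
    echo "CUDA bin directory is on PATH"
else
    echo "CUDA bin directory is missing; PATH would become:"
    prepend_path /usr/local/cuda-8.0/bin "$PATH"
fi
```

Running this in a new terminal after editing ~/.bashrc is a quick way to see whether the change took effect.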
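Another way to test is to build and run one of the CUDA samples (adjust the path to match your own machine); if it prints the GPU information, the configuration succeeded.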
cd ~/NVIDIA_CUDA-8.0_Samples/samples/1_Utilities/deviceQuery
make
sudo ./deviceQuery
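Note that the GCC on Ubuntu is 5.4.0 and CUDA 8.0 works with GCC 5, so no downgrade is needed. To install CUDA 7.5 from the runfile you would have to downgrade gcc and g++ to a version below 5, but if you are not going to use the latest CUDA 8.0 anyway, you can simply install CUDA with apt-get: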
sudo apt-get install nvidia-cuda-toolkit
In short, installing libraries with apt-get on Ubuntu is fast and convenient; the only drawback is that the versions are usually older.
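The GCC constraint mentioned above can be checked before choosing a CUDA version. This is only a sketch under the rule stated in the text (the CUDA 7.5 runfile wants a GCC major version below 5); gcc_ok_for_cuda75 is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a GCC version string has a major version
# below 5, which is what the CUDA 7.5 runfile expects according to the text.
gcc_ok_for_cuda75() {
    major="${1%%.*}"
    [ "$major" -lt 5 ]
}

if command -v gcc >/dev/null 2>&1; then
    version="$(gcc -dumpversion)"
    if gcc_ok_for_cuda75 "$version"; then
        echo "gcc ${version}: no downgrade needed for the CUDA 7.5 runfile"
    else
        echo "gcc ${version}: would need a downgrade for the CUDA 7.5 runfile"
    fi
fi
```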
sudo cp cudnn.h /usr/local/cuda/include/ # copy the header file
Then cd into the lib64 directory to copy the shared libraries and create the links:
sudo cp lib* /usr/local/cuda/lib64/ # copy the shared libraries
cd /usr/local/cuda/lib64/
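sudo ln -sf libcudnn.so.6.0.21 libcudnn.so.6 # create the symlink (-f replaces the copy made by cp)
sudo ln -sf libcudnn.so.6 libcudnn.so # create the symlink
(Use ls to check the exact version of libcudnn.so. In the screenshot the version is 5.1.10; adjust the commands to match your own files.)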
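Since the version suffix differs between cuDNN releases, the two link names can be derived from the full file name instead of being typed by hand. The sketch below does a dry run in a scratch directory; link_cudnn is a hypothetical helper, and the file name is just the example version used in the commands above.

```shell
# Hypothetical helper: given the full library file name, recreate the
# libcudnn.so.<major> and libcudnn.so links next to it.
link_cudnn() {
    dir="$1"
    full="$2"                      # e.g. libcudnn.so.6.0.21
    version="${full#libcudnn.so.}" # e.g. 6.0.21
    major="${version%%.*}"         # e.g. 6
    ( cd "$dir" &&
      ln -sf "$full" "libcudnn.so.${major}" &&
      ln -sf "libcudnn.so.${major}" libcudnn.so )
}

# Dry run in a scratch directory so nothing under /usr/local is touched.
scratch="$(mktemp -d)"
touch "${scratch}/libcudnn.so.6.0.21"
link_cudnn "$scratch" libcudnn.so.6.0.21
ls -l "$scratch"
```

For a real installation the same two ln commands would be run with sudo inside /usr/local/cuda/lib64, followed by sudo ldconfig.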
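If you are installing OpenCV only in order to build Caffe, I recommend installing it with apt-get, because Caffe uses very little of OpenCV: just reading, writing, and resizing images.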
sudo apt install libopencv-dev python-opencv
For building and installing the latest OpenCV, see my other post, "Ubuntu16.04 OpenCV安装笔记" (Ubuntu 16.04 OpenCV installation notes).
sudo add-apt-repository universe
sudo apt-get update -y
sudo apt-get install -y build-essential cmake git pkg-config
# General Dependencies
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler -y
sudo apt-get install --no-install-recommends libboost-all-dev -y
# BLAS
sudo apt-get install libatlas-base-dev -y
# Remaining Dependencies
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev -y
sudo apt-get install -y python-dev
sudo apt-get install -y python-numpy python-scipy python-setuptools python-pip cython python-skimage python-protobuf
git clone https://github.com/BVLC/caffe.git
cd caffe
cp Makefile.config.example Makefile.config
gedit Makefile.config
# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1
(cuDNN acceleration is not supported when the card's CUDA Capability is below 3.0. For example, my laptop's GeForce 610 has a CUDA Capability of 2.1 (you can look this up at https://developer.nvidia.com/cuda-gpus), so it cannot use cuDNN. I did not know this at first, which caused the later part of make runtest to fail, with MNIST reporting: caffe make runtest error(core dumped) Check failed: status == CUDNN_STATUS_SUCCESS (6 vs. 0))
# Uncomment if you're using OpenCV 3
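The capability threshold described above is easy to encode as a check before touching USE_CUDNN. This is a minimal sketch using the 3.0 threshold given in the text; cudnn_supported is a hypothetical helper, and the capability values in the loop are examples only.

```shell
# Hypothetical helper: succeed if a compute capability such as "2.1" or "6.1"
# is at least 3.0, the threshold given in the text for cuDNN.
cudnn_supported() {
    major="${1%%.*}"
    [ "$major" -ge 3 ]
}

for cap in 2.1 3.0 6.1; do
    if cudnn_supported "$cap"; then
        echo "capability ${cap}: USE_CUDNN := 1 should be fine"
    else
        echo "capability ${cap}: leave USE_CUDNN commented out"
    fi
done
```

The capability of your own card is printed by the deviceQuery sample built earlier, or can be looked up on the NVIDIA page linked above.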
OPENCV_VERSION := 3
# Uncomment to support layers written in Python (will link against Python libs)
WITH_PYTHON_LAYER := 1
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial
Ubuntu 16.04 changed where some headers and libraries live, in particular those of HDF5, so these paths have to be updated (otherwise the later make fails with: /usr/bin/ld: cannot find -lhdf5 and /usr/bin/ld: cannot find -lhdf5_hl).
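Before editing the two variables it can be worth confirming that the HDF5 directories really exist on your system. A small sketch, assuming the include path quoted above; first_existing_dir is a hypothetical helper.

```shell
# Hypothetical helper: print the first argument that is an existing directory.
first_existing_dir() {
    for candidate in "$@"; do
        if [ -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# The candidate location is the include path quoted in the text above.
if inc="$(first_existing_dir /usr/include/hdf5/serial)"; then
    echo "HDF5 headers found in ${inc}"
else
    echo "No serial HDF5 include directory; is libhdf5-serial-dev installed?"
fi
```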
PYTHON_INCLUDE := /usr/include/python2.7 \
/usr/lib/python2.7/dist-packages/numpy/core/include
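change it to the following (this applies when NumPy lives under /usr/local, for example after a pip install):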
PYTHON_INCLUDE := /usr/include/python2.7 \
/usr/local/lib/python2.7/dist-packages/numpy/core/include
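Rather than guessing which of the two NumPy paths is right, you can ask NumPy itself. The sketch below relies on numpy.get_include(), which returns the directory containing the NumPy headers; numpy_include_dir is a hypothetical wrapper, and "python" is assumed to be the interpreter Caffe will be built against.

```shell
# Hypothetical helper: ask a Python interpreter where NumPy keeps its headers.
# Fails (non-zero status) if the interpreter or NumPy is missing.
numpy_include_dir() {
    py="${1:-python}"
    "$py" -c 'import numpy; print(numpy.get_include())' 2>/dev/null
}

if dir="$(numpy_include_dir python)"; then
    echo "Use this directory in PYTHON_INCLUDE: ${dir}"
else
    echo "python with NumPy not found"
fi
```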
gedit Makefile
Replace
NVCCFLAGS += -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
with
NVCCFLAGS += -D_FORCE_INLINES -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
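The same Makefile change can be scripted so that it is repeatable and harmless if the flag is already there. This is a sketch, not the author's method: add_force_inlines is a hypothetical helper, and the demo runs on a scratch file containing only the one relevant line.

```shell
# Hypothetical helper: insert -D_FORCE_INLINES into the NVCCFLAGS line of a
# Makefile, doing nothing if the flag is already present.
add_force_inlines() {
    makefile="$1"
    if grep -q -- '-D_FORCE_INLINES' "$makefile"; then
        return 0
    fi
    sed 's/^NVCCFLAGS += -ccbin/NVCCFLAGS += -D_FORCE_INLINES -ccbin/' \
        "$makefile" > "${makefile}.tmp" && mv "${makefile}.tmp" "$makefile"
}

# Dry run on a scratch copy of the one relevant line.
demo="$(mktemp)"
echo 'NVCCFLAGS += -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)' > "$demo"
add_force_inlines "$demo"
cat "$demo"
```

To apply it for real, call add_force_inlines Makefile from inside the caffe directory after making a backup copy.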
Edit /usr/local/cuda/include/host_config.h and comment out line 115:
sudo gedit /usr/local/cuda/include/host_config.h
Change
#error -- unsupported GNU version! gcc versions later than 4.9 are not supported!
to
//#error -- unsupported GNU version! gcc versions later than 4.9 are not supported!
make -j8 all
sudo make -j8 runtest
(In my case sudo was needed here for the tests to run through completely.)
make pycaffe
cd python
python
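>>> import caffe  # verifies that the installation works
(Add an environment variable:
gedit ~/.bashrc
Add export PYTHONPATH=/home/<username>/caffe/python:$PYTHONPATH to the file.
Run source ~/.bashrc to apply the change.
After this, import caffe also works when Python is started from any other location. Otherwise, a Python file outside /home/<username>/caffe/python that runs import caffe fails with an error saying the module cannot be found.)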
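If you prefer not to open an editor, the export line can be appended from the shell, taking care not to add it twice. A minimal sketch: append_once is a hypothetical helper, $HOME/caffe is an assumed clone location, and the demo writes to a scratch file rather than the real ~/.bashrc.

```shell
# Hypothetical helper: append a line to a file unless that exact line exists.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Dry run against a scratch file; point it at ~/.bashrc for real use.
# $HOME/caffe is an assumed clone location.
rc="$(mktemp)"
append_once 'export PYTHONPATH=$HOME/caffe/python:$PYTHONPATH' "$rc"
append_once 'export PYTHONPATH=$HOME/caffe/python:$PYTHONPATH' "$rc"
cat "$rc"
```

The second call is a no-op, so the scratch file ends up with a single export line.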
That completes the Caffe development environment. We can test Caffe on the MNIST dataset as follows:
cd ~/caffe
./data/mnist/get_mnist.sh
./examples/mnist/create_mnist.sh
sudo ./examples/mnist/train_lenet.sh
During training you can see the loss and accuracy values, as shown in the figure:
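To keep those numbers instead of only watching them scroll by, the training output can be saved to a file and the last reported accuracy pulled out afterwards. This is a sketch under the assumption that the log contains lines of the form "Test net output #0: accuracy = ..."; last_accuracy is a hypothetical helper, and the sample lines in the demo are made up for illustration, not real training results.

```shell
# Hypothetical helper: print the last "accuracy = <value>" found in a log file.
last_accuracy() {
    grep 'accuracy = ' "$1" | tail -n 1 | sed 's/.*accuracy = //'
}

# Demo on made-up lines shaped like Caffe's test output (not real results).
log="$(mktemp)"
printf '%s\n' \
    'solver] Test net output #0: accuracy = 0.5' \
    'solver] Test net output #1: loss = 1.2 (* 1 = 1.2 loss)' \
    'solver] Test net output #0: accuracy = 0.75' > "$log"
echo "Last reported accuracy: $(last_accuracy "$log")"
```

For a real run, the log could be captured with ./examples/mnist/train_lenet.sh 2>&1 | tee train.log and then passed to last_accuracy train.log.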