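These are notes on the pitfalls I ran into while setting up a deep learning environment. The experiment uses Fast R-CNN (Caffe), Faster R-CNN (Caffe), and YOLO (runs in a C++ environment, so nothing needs to be set up). The task is to detect more than 50 small weld spots on the front and rear door panels of cars.
Thanks to senior labmates Zeng and Pei, and to Lao Mo, for their contributions to setting up the environment.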
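The experiment ran on Alibaba Cloud (no budget for a server, so this was a pay-by-the-hour instance). Keep important files on the system disk: the data disk is wiped at every login.
Choose the instance type and configuration according to what your algorithm needs. I chose a GPU instance with 12 GB of GPU memory (an NVIDIA M40) and the Ubuntu 16.04 image; the graphical desktop has to be installed separately. I do not recommend the Ubuntu 16.04 + NVIDIA image, because it does not disable the original graphics device for you, and you end up with a login loop or serious display problems.
If you can afford it, buy a good GPU of your own; a lab can apply to purchase a server. Alibaba Cloud turned out to be a money pit: about a month of use cost 2000 yuan, it is very unfriendly to individual users, and sometimes the resources were already taken by someone else. If you do rent a server, buy a discount coupon on Taobao first, which saves a few hundred yuan. Deep learning for image processing is demanding on the GPU, so pick the best one your situation allows. That way training will not fail with out-of-memory errors and each run is shorter. One training run of my Fast R-CNN with the VGG16 network took 7 hours, which is a very high time cost.
A word on GPU versus CPU: there are plenty of CPU-training tutorials online, but in my experience the CPU is not usable for training, or even for testing; it can only run the demo. So stick with the GPU.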
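I recommend Ubuntu 16.04. With the latest release, 18.04, packages frequently cannot be found and all kinds of odd problems appear. Also make sure the CUDA and cuDNN versions match: CUDA 9.0 goes with cuDNN 7.0, and CUDA 8.0 goes with cuDNN 5.0 or cuDNN 6.0. I installed both and simply switch the symbolic links when I need the other one.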
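The version pairing above can be written down as a small lookup so that a mismatch is caught before anything is built. This is only a sketch: the table holds just the pairs named in this post (other valid combinations may exist), and the names `STATED_PAIRS`, `major_minor`, and `is_stated_pair` are made up for illustration.

```python
# Pairings taken only from the text above; other valid combinations may exist.
STATED_PAIRS = {
    "9.0": {"7.0"},
    "8.0": {"5.0", "6.0"},
}


def major_minor(version):
    """Reduce a version string such as '6.0.21' to 'major.minor' ('6.0')."""
    parts = version.strip().split(".")
    if len(parts) < 2:
        parts.append("0")
    return ".".join(parts[:2])


def is_stated_pair(cuda_version, cudnn_version):
    """Return True if the CUDA/cuDNN pair is one of those listed in this post."""
    allowed = STATED_PAIRS.get(major_minor(cuda_version), set())
    return major_minor(cudnn_version) in allowed


if __name__ == "__main__":
    print(is_stated_pair("8.0", "6.0.21"))
    print(is_stated_pair("9.0", "5.0"))
```

Patch-level numbers are ignored on purpose, since the post only pairs versions at the major.minor level.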
For installing CUDA 8.0, see https://blog.csdn.net/jonms/article/details/79318566 . Lao Mo did that part, so I will not describe it in detail here. The steps for installing cuDNN follow.
sudo cp cudnn.h /usr/local/cuda/include/
sudo cp lib* /usr/local/cuda/lib64/
cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.5
sudo ln -s libcudnn.so.6.0.21 libcudnn.so.6
sudo ln -s libcudnn.so.6 libcudnn.so
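After switching the links it is worth confirming which real file `libcudnn.so` ends up pointing to. The sketch below follows the symlink chain with the standard library; `resolve_library` is a hypothetical helper name, and `/usr/local/cuda/lib64` is simply the directory used in the commands above.

```python
import os


def resolve_library(lib_dir, name="libcudnn.so"):
    """Follow symlinks from lib_dir/name and return the final file name.

    Returns None if the link is missing or dangling.
    """
    path = os.path.join(lib_dir, name)
    if not os.path.exists(path):  # False for a missing or broken symlink
        return None
    return os.path.basename(os.path.realpath(path))


if __name__ == "__main__":
    # Directory taken from the commands above; adjust if CUDA lives elsewhere.
    print(resolve_library("/usr/local/cuda/lib64"))
```

If this prints `None`, or a file from the cuDNN version you meant to replace, the `ln -s` steps need to be repeated.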
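Caffe depends on OpenCV. I am not sure whether installing only python-opencv is enough, so I suggest doing a proper OpenCV installation. I do not recommend installing OpenCV through Anaconda: Anaconda shadows some system paths, so some packages end up installed system-wide and others inside Anaconda, modules cannot be found, and the installation ultimately fails. Compiling from source is the most reliable route and is less error-prone later on.
The installation follows https://www.pyimagesearch.com/2016/10/24/ubuntu-16-04-how-to-install-opencv/ . All the steps are summarized below, with notes on the key points.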
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install build-essential cmake pkg-config
sudo apt-get install libjpeg8-dev libtiff5-dev libjasper-dev libpng12-dev
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
sudo apt-get install libxvidcore-dev libx264-dev
sudo apt-get install libgtk-3-dev
sudo apt-get install libatlas-base-dev gfortran
sudo apt-get install python2.7-dev
cd ~
wget -O opencv.zip https://github.com/Itseez/opencv/archive/3.1.0.zip
unzip opencv.zip
wget -O opencv_contrib.zip https://github.com/Itseez/opencv_contrib/archive/3.1.0.zip
unzip opencv_contrib.zip
Install pip, the Python package manager
cd ~
wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
Install NumPy
pip install numpy
Note that Fast R-CNN uses numpy==1.10. If you install a newer version here, you will have to modify parts of Fast R-CNN's Python files before training it.
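A quick way to know in advance whether those code changes will be needed is to compare the installed NumPy version against 1.10. The following is a minimal sketch; `version_tuple` and `newer_than_expected` are illustrative names, and the comparison is done on numeric tuples because comparing the strings directly would rank "1.9" above "1.10".

```python
import re


def version_tuple(version):
    """Turn a version string like '1.10.4' into the tuple (1, 10)."""
    numbers = []
    for part in version.split(".")[:2]:
        match = re.match(r"\d+", part)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers)


def newer_than_expected(installed, expected="1.10"):
    """Return True if the installed major.minor is newer than the expected one."""
    return version_tuple(installed) > version_tuple(expected)


if __name__ == "__main__":
    try:
        import numpy
    except ImportError:
        print("numpy is not installed for this interpreter")
    else:
        print(numpy.__version__, newer_than_expected(numpy.__version__))
```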
Build and install OpenCV with CMake
Following https://blog.kickview.com/building-a-digits-dev-machine-on-ubuntu-16-04/ , adjust the build options to what you need to run.
cd ~/opencv-3.1.0/
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D WITH_CUDA=ON \
-D ENABLE_FAST_MATH=1 \
-D CUDA_FAST_MATH=1 \
-D WITH_CUBLAS=1 \
-D INSTALL_PYTHON_EXAMPLES=ON \
-D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib-3.1.0/modules \
-D BUILD_opencv_dnn=OFF \
-D BUILD_EXAMPLES=ON \
-D PYTHON_EXECUTABLE=~/.virtualenvs/cv/bin/python ..
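Notes: BUILD_opencv_dnn=OFF is set because the Caffe installation already includes dnn, and building it a second time inside OpenCV causes problems. WITH_CUDA=ON builds against CUDA 8.0. PYTHON_EXECUTABLE must point to the Python you actually use: the command above shows the cv virtualenv path from the tutorial, while I used the system Python. When cmake finishes, read the configuration summary it prints.
Check that CUDA, OpenCL and so on are reported as supported, and that the Python 2 paths are correct (the tutorial's output shows a virtualenv; compare against your own installation directory).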
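The same check can be done after the fact by reading `CMakeCache.txt` in the build directory, where CMake records each option as `NAME:TYPE=VALUE`. The sketch below parses that format; `read_cmake_cache` is a hypothetical helper, the sample text is made up to mirror the options in the command above, and the exact variable names in a real cache may differ between OpenCV versions.

```python
import os


def read_cmake_cache(text):
    """Parse CMakeCache.txt content into a {name: value} dictionary.

    Entries look like NAME:TYPE=VALUE; lines starting with '#' or '//'
    are comments and are skipped.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "//")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        name = key.split(":", 1)[0]
        entries[name] = value
    return entries


# Made-up sample that mirrors the options passed to cmake above.
SAMPLE = """\
// hypothetical excerpt
WITH_CUDA:BOOL=ON
BUILD_opencv_dnn:BOOL=OFF
CMAKE_INSTALL_PREFIX:PATH=/usr/local
"""

if __name__ == "__main__":
    cache_path = os.path.expanduser("~/opencv-3.1.0/build/CMakeCache.txt")
    if os.path.exists(cache_path):
        with open(cache_path) as handle:
            cache = read_cmake_cache(handle.read())
    else:
        cache = read_cmake_cache(SAMPLE)
    for name in ("WITH_CUDA", "BUILD_opencv_dnn", "CMAKE_INSTALL_PREFIX"):
        print(name, "=", cache.get(name))
```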
Compile
make -j8
Install
sudo make install
sudo ldconfig
cd ~
workon cv
python
import cv2
cv2.__version__
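Beyond the version number, `cv2.getBuildInformation()` returns the build summary as text, which makes it possible to confirm that the installed module was really built with CUDA. The sketch below filters that text by keyword; `lines_mentioning` is an illustrative helper, and the wording of the CUDA line varies between OpenCV versions, so the output should be read rather than matched exactly.

```python
def lines_mentioning(build_info, keyword):
    """Return the stripped lines of build_info that contain keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [line.strip() for line in build_info.splitlines()
            if keyword in line.lower()]


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not importable from this interpreter")
    else:
        print(cv2.__version__)
        for line in lines_mentioning(cv2.getBuildInformation(), "cuda"):
            print(line)
```

Run it inside the same environment (for example after `workon cv`) that will later be used with Caffe.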
For the Caffe installation, see https://blog.csdn.net/ZWX2445205419/article/details/72673121 and the Caffe section of https://blog.kickview.com/building-a-digits-dev-machine-on-ubuntu-16-04/ .
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install -y build-essential cmake git pkg-config
sudo apt-get install -y libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install -y libatlas-base-dev
sudo apt-get install -y --no-install-recommends libboost-all-dev
sudo apt-get install -y libgflags-dev libgoogle-glog-dev liblmdb-dev
(Python general)
sudo apt-get install -y python-pip
(Python 2.7 development files)
sudo apt-get install -y python-dev
sudo apt-get install -y python-numpy python-scipy
(OpenCV 2.4)
sudo apt-get install -y libopencv-dev
cd ~
git clone https://github.com/BVLC/caffe
cd caffe
cp Makefile.config.example Makefile.config
Edit Makefile.config as follows:
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!
# cuDNN acceleration switch (uncomment to build with cuDNN).
# Enabled here: build with cuDNN
USE_CUDNN := 1
# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1 # leave commented out when using the GPU
# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0 # leave commented out to keep OpenCV enabled
# USE_LEVELDB := 0
# USE_LMDB := 0
# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
# You should not set this flag if you will be reading LMDBs with any
# possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1
# Uncomment if you're using OpenCV 3
# The installed OpenCV version is 3.1.0
OPENCV_VERSION := 3
# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++
# CUDA directory contains bin/ and lib/ directories that we need.
# Change this to the path of your CUDA installation
CUDA_DIR := /usr/local/cuda-8.0
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr
# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
-gencode arch=compute_20,code=sm_21 \
-gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_52,code=sm_52 \
-gencode arch=compute_60,code=sm_60 \
-gencode arch=compute_61,code=sm_61 \
-gencode arch=compute_61,code=compute_61
# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas
# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib
# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app
# MATLAB installation path; comment this line out if MATLAB is not installed
MATLAB_DIR := /usr/local/MATLAB/R2014a
# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
# Use the system Python and the installed numpy
PYTHON_INCLUDE := /usr/include/python2.7 \
    /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda2
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
# $(ANACONDA_HOME)/include/python2.7 \
# $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
# /usr/lib/python3.5/dist-packages/numpy/core/include
# We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib
# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib
# Uncomment to support layers written in Python (will link against Python libs)
# Uncommented here to enable Python layer support
WITH_PYTHON_LAYER := 1
# Whatever else you find you need goes here.
# Check that the added hdf5 paths are correct for your system
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial /usr/lib/x86_64-linux-gnu/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial
# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib
# NCCL acceleration switch (uncomment to build with NCCL)
# https://github.com/NVIDIA/nccl (last tested version: v1.2.3-1+cuda8.0)
# USE_NCCL := 1
# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1
# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute
# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1
# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0
# enable pretty build (comment to see full commands)
Q ?= @
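Because the file is long and mostly comments, it can help to list only the settings that are actually active before building. The sketch below is one way to do that under simple assumptions: it handles plain `NAME := value` style assignments and backslash continuations, not the full make syntax. `active_settings` and the sample text are hypothetical, and `~/caffe/Makefile.config` is the location implied by the commands in this post.

```python
import os
import re

ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*(:=|\?=|\+=|=)\s*(.*)$")


def active_settings(text):
    """Return {name: value} for uncommented assignments in Makefile.config text.

    Backslash continuations are joined, inline '#' comments are dropped,
    and '+=' appends to an earlier value of the same name.
    """
    joined = re.sub(r"\\\n[ \t]*", " ", text)  # join continuation lines
    settings = {}
    for line in joined.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = ASSIGNMENT.match(line)
        if not match:
            continue
        name, operator, value = match.groups()
        value = " ".join(value.split("#", 1)[0].split())
        if operator == "+=" and name in settings:
            value = (settings[name] + " " + value).strip()
        settings[name] = value
    return settings


# Made-up excerpt in the style of the file above.
SAMPLE = """\
USE_CUDNN := 1
# CPU_ONLY := 1
OPENCV_VERSION := 3
PYTHON_INCLUDE := /usr/include/python2.7 \\
    /usr/lib/python2.7/dist-packages/numpy/core/include
WITH_PYTHON_LAYER := 1
"""

if __name__ == "__main__":
    config_path = os.path.expanduser("~/caffe/Makefile.config")
    if os.path.exists(config_path):
        with open(config_path) as handle:
            found = active_settings(handle.read())
    else:
        found = active_settings(SAMPLE)
    for name in ("USE_CUDNN", "OPENCV_VERSION", "CUDA_DIR", "WITH_PYTHON_LAYER"):
        print(name, "=", found.get(name))
```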
cd ~/caffe/python
for req in $(cat requirements.txt); do pip install $req; done
cd ~/caffe
mkdir build
cd build
cmake ..
make all -j8
make install
make test
make runtest -j8
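As with OpenCV, check in the output of cmake that the supported libraries and paths are correct: the detected CUDA version, cuDNN version, OpenCV version, NumPy version, the other supported components, and the Python paths.
make pycaffe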
The installation process is similar to that of Caffe, so I will not repeat it here.
export PYTHONPATH=/path/to/caffe/python:$PYTHONPATH
python
import caffe
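The final import test can also be scripted, which is handy when several interpreters or environments are installed. The following is a small sketch: `can_import` is a made-up helper, and adding a directory to `sys.path` is used here as the in-process counterpart of setting `PYTHONPATH`. The path `/path/to/caffe/python` is the placeholder from the export line and must be replaced with the real location.

```python
import importlib
import sys


def can_import(module_name, extra_path=None):
    """Return True if module_name imports.

    If extra_path is given, it is first put at the front of sys.path,
    which has the same effect as listing it in PYTHONPATH.
    """
    if extra_path and extra_path not in sys.path:
        sys.path.insert(0, extra_path)
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Placeholder path from the export line above; replace with your own.
    print(can_import("caffe", "/path/to/caffe/python"))
```

A result of `False` usually means either the path is wrong or `make pycaffe` has not been run for this Python.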