Setting up the Caffe version of FlowNet2 under Python 3, with a collection of errors

Introduction

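Caffe currently has better support for Python 2.7, so the Caffe version of FlowNet2 is usually installed under Python 2.7. I recently needed FlowNet2 in a Python 3 environment, so this post describes how to install the GPU build of FlowNet2 under Python 3.
Code: https://github.com/lmb-freiburg/flownet2
Environment: Python 3.6 + CUDA 9.0 + cuDNN 7.4.2 + OpenCV 3 + Anaconda3
Install the components above yourself; their installation is not covered here.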

1、FlowNet2 environment setup

1、Install the dependencies

sudo apt-get install libboost-all-dev libsnappy-dev libgflags-dev libgoogle-glog-dev libleveldb-dev libopenblas-dev 
sudo apt-get install libprotobuf-dev libleveldb-dev libhdf5-serial-dev protobuf-compiler 
sudo apt-get install liblapack-dev libatlas-base-dev liblmdb-dev 
sudo apt-get install git cmake build-essential 

2、Create a conda virtual environment

conda create -n flownet2 python=3.6
conda activate flownet2
conda install numpy=1.17.0 cython scipy=1.0.0 scikit-image
pip install msgpack opencv-python protobuf
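
To confirm that the packages ended up in the intended environment, a quick check from Python can help. This is a minimal sketch and not part of the original setup: the helper name `missing_modules` is hypothetical, and the import names are assumptions derived from the packages installed above.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the module names that the active interpreter cannot locate."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # A dotted name whose parent package is absent raises
            # instead of returning None.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "at", sys.executable)
    # Assumed import names: scikit-image -> skimage, opencv-python -> cv2,
    # protobuf -> google.protobuf.
    wanted = ["numpy", "scipy", "skimage", "msgpack", "cv2", "google.protobuf"]
    print("missing:", missing_modules(wanted))
```

If the printed interpreter path does not point into the flownet2 environment, the packages were probably installed somewhere else.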

3、Edit Makefile.config and Makefile

Edit the following variables in Makefile.config:

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1
USE_OPENCV := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1
OPENCV_VERSION := 3

And also:

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
# open is chosen here because libopenblas-dev was installed earlier
BLAS := open
# Python-related variables are configured here
ANACONDA_HOME := $(HOME)/anaconda3/envs/flownet2
PYTHON_LIBRARIES := boost_python3 python3.6m
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
        $(ANACONDA_HOME)/include/python3.6m \
        $(ANACONDA_HOME)/lib/python3.6/site-packages/numpy/core/include \
        /usr/include/python3.6m
PYTHON_LIB := $(ANACONDA_HOME)/lib
LINKFLAGS := -Wl,-rpath,$(PYTHON_LIB)
INCLUDE_DIRS := \
	$(PYTHON_INCLUDE) \
	$(CUDA_DIR)/include \
	/usr/include \
	/usr/local/include \
	/usr/include/hdf5/serial/
LIBRARY_DIRS := \
	$(PYTHON_LIB) \
	/usr/lib \
	$(CUDA_DIR)/lib64 \
	/usr/lib/x86_64-linux-gnu/hdf5/serial \
	/usr/local/lib
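
The values for PYTHON_INCLUDE, PYTHON_LIB and the python3.6m suffix depend on the interpreter in use. As a rough aid, the sketch below asks the active interpreter for them through the standard `sysconfig` module; the helper name `python_build_paths` is hypothetical, and on some platforms the config variables may be `None`.

```python
import sysconfig


def python_build_paths():
    """Collect the paths that the Python-related Makefile.config variables need."""
    paths = sysconfig.get_paths()
    return {
        # Directory that contains Python.h
        "include": paths["include"],
        # Directory that contains the libpython shared library (may be None)
        "libdir": sysconfig.get_config_var("LIBDIR"),
        # Version suffix of libpython; "3.6m" is expected on a setup like
        # the one described here, which is an assumption to verify locally.
        "ldversion": sysconfig.get_config_var("LDVERSION"),
    }


if __name__ == "__main__":
    for key, value in python_build_paths().items():
        print(key, "=", value)
```

Run it with the interpreter of the flownet2 environment so that the printed paths match that environment.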

Edit the following variables in the Makefile:

LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial opencv_core opencv_highgui opencv_imgproc

NVCCFLAGS += -D_FORCE_INLINES -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)

4、Build FlowNet2

make -j 5 all tools pycaffe 

Set up the environment. This has to be done every time before using FlowNet2:

source set-env.sh 
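
To see whether the environment script took effect in the current shell, one option is to look for the caffe package on PYTHONPATH. This sketch assumes that set-env.sh adds the repository's python directory to PYTHONPATH, which should be checked against the script itself; the helper name `find_caffe_dir` is hypothetical.

```python
import os


def find_caffe_dir(search_path):
    """Return the first directory in search_path holding a caffe package, else None."""
    for entry in search_path.split(os.pathsep):
        if entry and os.path.isfile(os.path.join(entry, "caffe", "__init__.py")):
            return entry
    return None


if __name__ == "__main__":
    # Prints None when no entry of PYTHONPATH contains caffe/__init__.py.
    print(find_caffe_dir(os.environ.get("PYTHONPATH", "")))
```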

2、Errors encountered while installing FlowNet2

1、/include/caffe/solver.hpp:3:30: fatal error: boost/function.hpp

sudo apt-get install libboost-python-dev

2、/include/caffe/common.hpp:5:27: fatal error: gflags/gflags.h:

sudo apt-get install libgflags-dev

3、./include/caffe/common.hpp:6:26: fatal error: glog/logging.h:

 sudo apt-get install libgoogle-glog-dev

4、 src/caffe/net.cpp:8:18: fatal error: hdf5.h

In Makefile.config, change: INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
to: INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
In the Makefile, change: LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_hl hdf5
to: LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial

5、./include/caffe/util/cudnn.hpp:113:70: error: too few arguments to function ‘cudnnStatus_t cudnnSetConvolution2dDescriptor(cudnnConvolutionDescriptor_t, int, int, int, int, int, int, cudnnConvolutionMode_t, cudnnDataType_t)’

Copy the cudnn.hpp file from a Caffe source tree that compiled successfully into the corresponding location of the failing Caffe tree, then rebuild:

cp cudnn.hpp flownet2/include/caffe/util/

My cudnn.hpp file is provided here and can be copied over the original directly.

6、src/caffe/layers/custom_data_layer.cpp:4:24: fatal error: leveldb/db.h:

sudo apt-get install libleveldb-dev

7、./include/caffe/layers/custom_data_layer.hpp:10:18: fatal error: lmdb.h:

sudo apt-get install liblmdb-dev

8、/usr/bin/ld: cannot find -lboost_python3

With Python 3.6 there is no libboost_python-py3.6.so file, so you have to build one yourself, as follows.
Install boost_1_67_0.tar.gz:

wget http://sourceforge.net/projects/boost/files/boost/1.67.0/boost_1_67_0.tar.gz
tar -xzf boost_1_67_0.tar.gz
cd boost_1_67_0/
./bootstrap.sh --with-libraries=python --with-toolset=gcc
./b2 --with-python include="/usr/include/python3.6m"
./b2 install

ln -s /usr/local/lib/libboost_python36.so.1.67.0 /usr/local/lib/libboost_python3.so
vim ~/.bashrc
# add the following line to ~/.bashrc, then save and exit vim
export LD_LIBRARY_PATH="/usr/local/lib:$LD_LIBRARY_PATH"
source ~/.bashrc

If libboost_python3.so still cannot be found, copy it directly into build/lib:

cp /usr/local/lib/libboost_python36.so.1.67.0 flownet2/build/lib/
cp /usr/local/lib/libboost_python3.so flownet2/build/lib/
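
Before rebuilding, it can be useful to ask the system which Boost.Python libraries are visible at all. The sketch below uses `ctypes.util.find_library` from the standard library. That function searches the way the runtime loader does, which is close to but not identical to the linker's `-l` search, so treat the result as a hint. The helper name `locate` is hypothetical.

```python
import ctypes.util


def locate(lib_name):
    """Return the file name the system resolves for lib_name, or None if not found."""
    return ctypes.util.find_library(lib_name)


if __name__ == "__main__":
    # Library names taken from the symlink created above.
    for name in ("boost_python3", "boost_python36"):
        print(name, "->", locate(name))
```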

9、/usr/bin/ld: cannot find -lcblas /usr/bin/ld: cannot find -latlas

sudo apt-get install libopenblas-dev

Change the BLAS choice in Makefile.config to open:

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := open

10、ModuleNotFoundError: No module named 'google'

This error appears on import caffe. Install protobuf:

pip install protobuf

11、TypeError: expected bytes, str found

The following error occurs on import caffe:

Failed to include caffe_pb2, things might go wrong!
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xingchen/workplace/caffe/python/caffe/__init__.py", line 4, in <module>
    from .proto.caffe_pb2 import TRAIN, TEST
  File "/home/xingchen/workplace/caffe/python/caffe/proto/caffe_pb2.py", line 17, in <module>
    serialized_pb='\n\x0b\x63\x61\x66\x66\x65.proto\x12\x05\x63\x61\x66\x66\x65\"\x1c\n\tBlobShape\x12\x0f\n\x03\x64i...'
  File "/home/xingchen/anaconda3/lib/python3.6/site-packages/google/protobuf/descriptor.py", line 824, in __new__
    return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: expected bytes, str found

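Cause: the caffe_pb2.py generated during the build is faulty. The traceback shows serialized_pb written as a plain str literal, while protobuf under Python 3 expects bytes.
Fix: replace the original caffe/python/caffe/proto/caffe_pb2.py with a correct caffe_pb2.py: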

cp caffe_pb2.py flownet2/python/caffe/proto/caffe_pb2.py

My replacement caffe_pb2.py is attached here and can be used directly.
Link: https://pan.baidu.com/s/1sdZwKqe_Qf2pVlkv63XdMg
Extraction code: vr3e
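
To tell whether a given caffe_pb2.py is affected before swapping it, one can inspect how it spells the serialized_pb literal. The following is a heuristic sketch, not a protobuf API: the helper name `serialized_pb_kind` is hypothetical, and it assumes the generated file uses one of the forms `serialized_pb='...'`, `serialized_pb=b'...'` or `serialized_pb=_b('...')`.

```python
import re


def serialized_pb_kind(source_text):
    """Classify the serialized_pb literal in generated protobuf Python code."""
    match = re.search(r"serialized_pb\s*=\s*(_b\(\s*|b)?['\"]", source_text)
    if match is None:
        return "not found"
    # A b prefix or a _b(...) wrapper yields bytes under Python 3;
    # a bare quoted literal is a str and triggers the TypeError above.
    return "bytes" if match.group(1) else "str"


if __name__ == "__main__":
    # The path is an assumption; point it at the file you want to check.
    with open("flownet2/python/caffe/proto/caffe_pb2.py", encoding="utf-8") as fh:
        print(serialized_pb_kind(fh.read()))
```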

11、ImportError: libboost_python3.so.1.67.0: cannot open shared object file: No such file or directory

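Copy libboost_python3.so and the versioned Boost.Python library into build/lib. The commands below use /usr/local/lib, where the Boost build from error 8 installs them; adjust the path if your libraries live elsewhere (for example /usr/lib64):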

cp /usr/local/lib/libboost_python36.so.1.67.0 flownet2/build/lib/
cp /usr/local/lib/libboost_python3.so flownet2/build/lib/

12、from google.protobuf import symbol_database as _symbol_database

The following error occurs on import caffe:

from google.protobuf import symbol_database as _symbol_database
ImportError: cannot import name 'symbol_database'

protobuf is not installed correctly. Reinstall it:

pip install protobuf==3.11.2
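
Errors 10 and 12 both come down to the state of the protobuf package, so it can help to check what the interpreter actually sees. The helper name `check_protobuf` in this sketch is hypothetical; `google.protobuf.__version__` and the `symbol_database` submodule are part of the protobuf Python package.

```python
import importlib


def check_protobuf():
    """Return (version, has_symbol_database) for the visible protobuf package."""
    try:
        import google.protobuf as pb
    except ImportError:
        # Corresponds to error 10: the google.protobuf package is missing.
        return None, False
    try:
        importlib.import_module("google.protobuf.symbol_database")
        has_symbol_database = True
    except ImportError:
        # Corresponds to error 12: the installed protobuf lacks this module.
        has_symbol_database = False
    return pb.__version__, has_symbol_database


if __name__ == "__main__":
    print(check_protobuf())
```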

13、libtiff.so conflict

The following errors occur during the build:

/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFReadRGBAStrip@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFIsTiled@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFWriteScanline@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFGetField@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFScanlineSize@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFReadEncodedTile@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFReadRGBATile@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFClose@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFRGBAImageOK@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFOpen@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFReadEncodedStrip@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFSetField@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFSetWarningHandler@LIBTIFF_4.0'
/lib/../lib64/libopencv_highgui.so: undefined reference to `TIFFSetErrorHandler@LIBTIFF_4.0'

This is a libtiff.so conflict. Remove libtiff.so from the Anaconda environment; here that means the copy in the flownet2 environment:

rm $HOME/anaconda3/envs/flownet2/lib/libtiff.so*
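
Since deleting files is hard to undo, it may be worth listing what the wildcard matches first. The sketch below only lists the files and removes nothing; the helper name `conda_libtiff_files` is hypothetical and the environment path is the one assumed throughout this post.

```python
import glob
import os


def conda_libtiff_files(env_dir):
    """List the libtiff shared objects inside a conda environment's lib directory."""
    return sorted(glob.glob(os.path.join(env_dir, "lib", "libtiff.so*")))


if __name__ == "__main__":
    env = os.path.expanduser("~/anaconda3/envs/flownet2")
    for path in conda_libtiff_files(env):
        print(path)
```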

14、Warning! HDF5 library version mismatched error

Warning! HDF5 library version mismatched error
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
This can happen when an application was compiled by one version of HDF5 but
linked with a different version of static or shared HDF5 library.
You should recompile the application or check your shared library related
settings such as ‘LD_LIBRARY_PATH’.
You can, at your own risk, disable this warning by setting the environment
variable ‘HDF5_DISABLE_VERSION_CHECK’ to a value of ‘1’.
Setting it to 2 or higher will suppress the warning messages totally.

pip install h5py

If that still does not help, you can set HDF5_DISABLE_VERSION_CHECK to 1 in the environment. This is a last resort, but everything still runs:

export HDF5_DISABLE_VERSION_CHECK=1
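
The same variable can also be set from within a Python script, as long as that happens before the first import that loads the HDF5 library. A minimal sketch, with the caffe import left commented out because it assumes pycaffe is already built and importable:

```python
import os

# Must run before the first import that loads HDF5 (caffe or h5py, for example),
# because the library reads the variable when it performs its version check.
os.environ["HDF5_DISABLE_VERSION_CHECK"] = "1"

# import caffe  # assumption: pycaffe is built and on PYTHONPATH
```

This only silences the check; the underlying header and library mismatch is still present.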

15、nvcc This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.

Edit the Makefile to add C++11 support:

CXXFLAGS += -pthread -fPIC $(COMMON_FLAGS) $(WARNINGS) -std=c++11
NVCCFLAGS += -D_FORCE_INLINES -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS) -std=c++11
LINKFLAGS += -pthread -fPIC $(COMMON_FLAGS) $(WARNINGS) -std=c++11

16、redefinition of argument ‘compiler-bindir’

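The build reports a duplicate definition: redefinition of argument ‘compiler-bindir’.
The three lines added for error 15 conflict with the definitions already in the Makefile. Delete the earlier definitions of those three variables and keep only the three lines from error 15.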

17、Layer conv1 has unknown engine.

This occurs when running FlowNet2 to generate .flo optical flow files.
In the .prototxt.template file of the model being loaded, replace every CUDNN engine setting with CAFFE.
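
For templates with many layers, the replacement can be scripted. The sketch below assumes the setting appears in the usual Caffe prototxt form `engine: CUDNN`; the helper name `force_caffe_engine` and the file name in the usage part are hypothetical.

```python
import re


def force_caffe_engine(prototxt_text):
    """Rewrite every 'engine: CUDNN' setting to 'engine: CAFFE'."""
    return re.sub(r"(\bengine\s*:\s*)CUDNN\b", r"\1CAFFE", prototxt_text)


if __name__ == "__main__":
    # Hypothetical file name; use the template of the model you load.
    path = "deploy.prototxt.template"
    with open(path, encoding="utf-8") as fh:
        text = fh.read()
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(force_caffe_engine(text))
```

Keeping a copy of the original template makes it easy to switch back once cuDNN works.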

18、WARNING: Logging before InitGoogleLogging() is written to STDERR

This occurs when running FlowNet2 to generate .flo optical flow files:

WARNING: Logging before InitGoogleLogging() is written to STDERR
F0424 17:31:30.465745 30231 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0)  out of memory

The GPU does not have enough memory, which causes the out of memory failure. In that case a smaller model can be used, such as FlowNet2-cs.
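
To see how much memory is actually available before picking a model, the output of nvidia-smi can be read from Python. This is a sketch under the assumption that nvidia-smi is on PATH; the helper names `parse_memory_csv` and `query_gpus` are hypothetical, and the unused figure is simply total minus used.

```python
import subprocess


def parse_memory_csv(text):
    """Parse 'total, used' pairs in MiB, one GPU per line."""
    rows = []
    for line in text.strip().splitlines():
        total, used = (int(field) for field in line.split(","))
        rows.append({"total_mib": total, "used_mib": used, "unused_mib": total - used})
    return rows


def query_gpus():
    """Ask nvidia-smi for total and used memory of every GPU."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.total,memory.used",
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_memory_csv(out)


if __name__ == "__main__":
    for index, gpu in enumerate(query_gpus()):
        print("GPU", index, gpu)
```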
