Environment: Ubuntu 16.04, NVIDIA Drive PX2, AutoChauffeur, DriveWorks 6.x, CUDA 9.x
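First, as of 05/07/2018 Caffe2 has two repositories: the original one at https://github.com/caffe2/caffe2 and, after the merge with PyTorch, https://github.com/pytorch/pytorch
The official installation guide is at https://caffe2.ai/docs/getting-started.html?platform=mac&configuration=prebuilt. You can either install a prebuilt package or build from source, and several platforms are supported.
Our target is the NVIDIA PX2, whose CPU architecture is aarch64. Neither the Ubuntu option (which shows x86 install commands) nor the NVIDIA Tegra option is a good match for the PX2.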
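The prebuilt route installs through Anaconda, but Anaconda does not support aarch64. Nothing is absolute, of course: https://github.com/conda/conda/pull/5190#issuecomment-319774238 claims that Anaconda can be installed on ARM64. But even if Anaconda installed successfully on the PX2, it is unclear whether the prebuilt Caffe2 package installed by
conda install -c caffe2 caffe2-cuda9.0-cudnn7
is built for aarch64. I could not tell from the package page (https://anaconda.org/caffe2/caffe2-cuda9.0-cudnn7) whether it is an ARM build, so I gave up on the prebuilt Anaconda route.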
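Before choosing between the prebuilt and source routes, it can help to confirm what the board reports as its machine type. This is only a minimal sketch for a POSIX shell; the function name is_aarch64 is made up for illustration and is not part of conda or Caffe2.

```shell
# Hypothetical helper: succeeds when the given (or detected) machine type is aarch64.
# With no argument it falls back to what `uname -m` reports.
is_aarch64() {
    arch="${1:-$(uname -m)}"
    [ "$arch" = "aarch64" ]
}

if is_aarch64; then
    echo "aarch64 detected: the prebuilt conda package may not apply, consider building from source"
else
    echo "machine type is $(uname -m), not aarch64"
fi
```

On the PX2 the first branch is the one expected to print.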
I switched to the "build from source" route. The official site includes this note:
If you plan to use GPU instead of CPU only, then you should install NVIDIA CUDA 8 and cuDNN v5.1 or v6.0, a GPU-accelerated library of primitives for deep neural networks. NVIDIA’s detailed instructions or if you’re feeling lucky try the quick install set of commands below.
Update your graphics card drivers first! Otherwise you may suffer from a wide range of difficult to diagnose errors.
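At first I took this to mean that only CUDA 8 would work. Our environment is DriveWorks 6.x with CUDA 9, so I first tried to install CUDA 8 separately, then asked on NVIDIA DevTalk, where the answer was that CUDA on the Drive PX2 cannot be installed on its own. (https://devtalk.nvidia.com/default/topic/1037182/how-to-install-cuda8-instead-of-cuda9-without-reinstall-driveworks-drive-px2-/#) If you are on Linux x86, of course, you can find the matching CUDA installer on the official site.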
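To see which CUDA toolkit release is actually on the board before deciding what to build against, something like the following may help. It is a sketch only: cuda_release is a made-up helper name, and it assumes nvcc prints its usual "release X.Y," line and is on PATH (on some installs it lives under /usr/local/cuda/bin and has to be added first).

```shell
# Hypothetical helper: read `nvcc --version` output on stdin and print
# the number that follows "release", e.g. 9.0.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA toolkit release: $(nvcc --version | cuda_release)"
else
    echo "nvcc not found on PATH"
fi
```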
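I then kept CUDA 9 and carried on installing Caffe2, but still hit all sorts of odd errors, for example:
caffe2 OpenMPI found, but it is not built with CUDA support.
and so on. Many other, unrelated errors are accompanied by this same message, though.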
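Eventually I came across this topic: https://devtalk.nvidia.com/default/topic/1032628/?comment=5253931 and began to suspect a version problem: the Caffe2 version was too new relative to DriveWorks.
So I went straight back to the old Caffe2 repository, https://github.com/caffe2/caffe2, and looked for an older version that still supports the Detectron module.
After following the old installation steps (https://github.com/caffe2/caffe2/releases), the earlier errors disappeared.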
1. If you see the following error:
Cloning into 'third_party/eigen'...
Username for 'https://github.com': naruya
Password for 'https://naruya@github.com':
remote: Repository not found.
fatal: repository 'https://github.com/RLovelett/eigen.git/' not found
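The fix is described at https://github.com/onnx/onnx-coreml/issues/318, or you can download the latest Eigen from Bitbucket yourself and place it under third_party.
2. To check the Eigen version: cat /usr/include/eigen3/Eigen/src/Core/util/Macros.h | grep VERSION
If the version is too old, the build reports: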
/home/nvidia/me/caffe2/caffe2/operators/conv_op_eigen.cc:8:2: error: #error "Caffe2 requires Eigen to be at least 3.3.0.";
#error "Caffe2 requires Eigen to be at least 3.3.0.";
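The grep above prints the raw macro lines, so the comparison against 3.3.0 still has to be done by eye. A small sketch that does it in the shell is shown here; eigen_version and eigen_new_enough are hypothetical helper names, and the sketch assumes Macros.h defines EIGEN_WORLD_VERSION, EIGEN_MAJOR_VERSION and EIGEN_MINOR_VERSION on separate #define lines.

```shell
# Hypothetical helper: print "WORLD.MAJOR.MINOR" from an Eigen Macros.h file.
eigen_version() {
    awk '/^#define EIGEN_WORLD_VERSION/ {w=$3}
         /^#define EIGEN_MAJOR_VERSION/ {M=$3}
         /^#define EIGEN_MINOR_VERSION/ {m=$3}
         END {print w "." M "." m}' "$1"
}

# Hypothetical helper: succeeds when the version string is at least 3.3.0.
eigen_new_enough() {
    echo "$1" | awk -F. '{ exit !($1 > 3 || ($1 == 3 && $2 >= 3)) }'
}

# Only inspect the system header if it is actually present.
macros=/usr/include/eigen3/Eigen/src/Core/util/Macros.h
if [ -f "$macros" ]; then
    v=$(eigen_version "$macros")
    if eigen_new_enough "$v"; then
        echo "system Eigen $v should satisfy the 3.3.0 requirement"
    else
        echo "system Eigen $v is older than 3.3.0"
    fi
fi
```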
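3. If you want to use the Eigen under third_party rather than the system Eigen, change the Eigen configuration in caffe2/cmake/Dependencies.cmake so that the system lookup is commented out: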
#find_package(Eigen3)
#if(EIGEN3_FOUND)
# message(STATUS "Found system Eigen at " ${EIGEN3_INCLUDE_DIR})
# include_directories(${EIGEN3_INCLUDE_DIR})
#else()
message(STATUS "Did not find system Eigen. Using third party subdirectory.")
include_directories(${PROJECT_SOURCE_DIR}/third_party/eigen3)
#endif()
4. If you see the error: The source directory does not contain a CMakeLists.txt file
it is because you did not run git submodule update --init after cloning the caffe2 code.
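A quick way to spot this situation before running cmake is to look for empty directories under third_party, since that is how uninitialized submodules usually appear after a plain clone. The helper below is a sketch with a made-up name, empty_third_party_dirs, and takes the path of the caffe2 checkout as its argument.

```shell
# Hypothetical helper: print the names of directories under <checkout>/third_party
# that contain no files at all (typical for submodules that were never initialized).
empty_third_party_dirs() {
    for d in "$1"/third_party/*/; do
        [ -d "$d" ] || continue
        if [ -z "$(ls -A "$d")" ]; then
            basename "$d"
        fi
    done
}

# If anything is listed, run this from the checkout (the --recursive option
# also initializes nested submodules):
#   git submodule update --init --recursive
```

Run from the caffe2 checkout as empty_third_party_dirs . and the output should be empty once the submodules are in place.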
Finally, for any other errors, first check whether yours is listed here: https://caffe2.ai/docs/faq.html