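This article covers setting up the environment without root privileges (that is, without being able to use `sudo` and similar commands).
It is fairly long, since it records the ten bumpy days it took me to get everything working, and it is meant as a reference for later.
The project to set up is maskrcnn_benchmark, so let's first look at the installation guide in the original GitHub repository:
the INSTALL.md of the maskrcnn-benchmark project.
As for Ubuntu, I think 16.x, 18.x, or 20.x should all work without much trouble.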
## Installation
### Requirements:
- PyTorch 1.0 from a nightly release. It **will not** work with 1.0 nor 1.0.1. Installation instructions can be found in https://pytorch.org/get-started/locally/
- torchvision from master
- cocoapi
- yacs
- matplotlib
- GCC >= 4.9
- OpenCV
- CUDA >= 9.0
### Option 1: Step-by-step installation
```bash
# first, make sure that your conda is setup properly with the right environment
# for that, check that `which conda`, `which pip` and `which python` points to the
# right path. From a clean conda env, this is what you need to do
conda create --name maskrcnn_benchmark -y
conda activate maskrcnn_benchmark
# this installs the right pip and dependencies for the fresh python
conda install ipython pip
# maskrcnn_benchmark and coco api dependencies
pip install ninja yacs cython matplotlib tqdm opencv-python
# follow PyTorch installation in https://pytorch.org/get-started/locally/
# we give the instructions for CUDA 9.0
conda install -c pytorch pytorch-nightly torchvision cudatoolkit=9.0
export INSTALL_DIR=$PWD
# install pycocotools
cd $INSTALL_DIR
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install
# install cityscapesScripts
cd $INSTALL_DIR
git clone https://github.com/mcordts/cityscapesScripts.git
cd cityscapesScripts/
python setup.py build_ext install
# install apex
cd $INSTALL_DIR
git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext
# install PyTorch Detection
cd $INSTALL_DIR
git clone https://github.com/facebookresearch/maskrcnn-benchmark.git
cd maskrcnn-benchmark
# the following will install the lib with
# symbolic links, so that you can modify
# the files if you want and won't need to
# re-build it
python setup.py build develop
```

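Overall I followed the official procedure, but note that the official instructions still target PyTorch 1.0 and CUDA 9.0. A quick look at the NVIDIA website shows that times have changed and versions move fast, which set the tone for the ten days it took me to get a working setup.
The installation below is the one that actually worked for me. It is offered as a reference and is surely not the only way, but if you are worn out from fighting this, you may want to follow these steps.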
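The versions I ended up with:

- nvcc 9.0 (I believe the nvcc version is the biggest factor in getting this to work)
- cudatoolkit 9.0 (kept consistent with nvcc)
- torch 1.5.0
- torchvision 0.4.1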
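This is the GPU information of our lab server (for reference only).
At this point you may be confused about how the CUDA version, the nvcc version, cudatoolkit, and PyTorch relate to each other. I suggest searching for that first; I only have a rough understanding myself and cannot explain it precisely.
Do not assume that a small version difference from someone else's post is harmless. If their setup worked, reproduce it step by step. I learned this the hard way.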
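
Since so much hinges on which nvcc is actually picked up, here is a minimal sketch for checking it against the version listed above. It relies only on the `release X.Y` phrase that `nvcc --version` prints; the function names and the `EXPECTED` constant are my own illustrative choices, not part of any tool.

```python
import re
import subprocess

# Version this post reports as working; adjust for your own setup.
EXPECTED = (9, 0)


def parse_nvcc_release(text):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def nvcc_release(nvcc="nvcc"):
    """Run nvcc and return its (major, minor) release, or None if nvcc cannot be run."""
    try:
        result = subprocess.run(
            [nvcc, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:  # nvcc is not on PATH or is not executable
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    found = nvcc_release()
    status = "OK" if found == EXPECTED else "MISMATCH"
    print("nvcc release: {} (expected {}): {}".format(found, EXPECTED, status))
```

Running it inside the activated conda environment shows which toolkit the extension build would use, which is the thing that has to line up with cudatoolkit.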
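1. Install CUDA 9.0 as a non-root user: see the guide on installing CUDA without root privileges.
A lab machine's CUDA version will usually not be 9.0; on my lab server, `nvcc --version` initially reported 10.1.243. The CUDA and cuDNN packages you need can be downloaded from this link:
Extraction code: hsts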
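
A toolkit installed without root lives in a user directory, so it has to be put ahead of the system one before building anything. The sketch below shows one way to do that for a child process. The helper name `cuda_env` and the example directory are hypothetical; `CUDA_HOME`, `PATH`, and `LD_LIBRARY_PATH` are the usual variables, and I am assuming the standard `bin` and `lib64` layout of a Linux CUDA install.

```python
import os


def cuda_env(cuda_home, base_env=None):
    """Return a copy of the environment with a user-local CUDA toolkit placed first."""
    env = dict(os.environ if base_env is None else base_env)
    cuda_home = os.path.expanduser(cuda_home)
    env["CUDA_HOME"] = cuda_home
    # Prepend the toolkit's bin directory so that its nvcc is found first.
    env["PATH"] = os.pathsep.join(
        p for p in (os.path.join(cuda_home, "bin"), env.get("PATH", "")) if p
    )
    # Prepend the toolkit's libraries so that they are loaded first.
    env["LD_LIBRARY_PATH"] = os.pathsep.join(
        p for p in (os.path.join(cuda_home, "lib64"), env.get("LD_LIBRARY_PATH", "")) if p
    )
    return env


if __name__ == "__main__":
    # "~/cuda-9.0" is a hypothetical install location; use your own.
    env = cuda_env("~/cuda-9.0")
    print(env["CUDA_HOME"])
    print(env["PATH"].split(os.pathsep)[0])
```

The returned dictionary can be passed as `env=` to `subprocess.run`, for example when calling `python setup.py build develop`; exporting the same three variables in the shell has the same effect.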
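2. Install conda: labs usually keep the installer around; if not, you can download and install conda yourself.
3. Set up the PyTorch environment, one step at a time: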
```bash
conda create --name maskrcnn_benchmark python=3.7
conda activate maskrcnn_benchmark
conda install ipython
pip install ninja yacs cython matplotlib tqdm opencv-python
conda install cudatoolkit=9.0
pip install torchvision==0.4.1
pip install torch==1.5.0
```
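
Because a later `pip install` can silently replace an earlier package, it is worth checking what ended up in the environment. This is a minimal sketch under my own naming (`PINS`, `installed_version`, and `check_pins` are illustrative); it only reads each module's `__version__` attribute and compares it with the versions used in this post.

```python
import importlib

# Versions reported to work in this post; adjust if your setup differs.
PINS = {"torch": "1.5.0", "torchvision": "0.4.1"}


def installed_version(module_name):
    """Return module.__version__, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_pins(pins, lookup=installed_version):
    """Map each package name to a tuple (expected, found, matches)."""
    report = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Builds may carry a local suffix after "+"; compare the base version only.
        base = found.split("+")[0] if found else None
        report[name] = (expected, found, base == expected)
    return report


if __name__ == "__main__":
    for name, (expected, found, ok) in check_pins(PINS).items():
        print("{}: expected {}, found {}, {}".format(
            name, expected, found, "OK" if ok else "MISMATCH"))
```

A `MISMATCH` line right after the install commands is much cheaper to catch than a failed extension build later on.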
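`$PWD` is the directory where everything gets downloaded. If you are on a shared lab machine, it is best not to put it on the system disk.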
```bash
export INSTALL_DIR=$PWD
# install pycocotools
cd $INSTALL_DIR
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install
# install cityscapesScripts
cd $INSTALL_DIR
git clone https://github.com/mcordts/cityscapesScripts.git
cd cityscapesScripts/
python setup.py build_ext install
# install apex
cd $INSTALL_DIR
git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install
```
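(Note the difference from the official instructions here: I install apex without the `--cuda_ext --cpp_ext` flags.)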
```bash
# install PyTorch Detection
cd $INSTALL_DIR
git clone https://github.com/facebookresearch/maskrcnn-benchmark.git
cd maskrcnn-benchmark
python setup.py build develop
```
Compiling at this step may produce errors such as:

`error: identifier "AT_CHECK" is undefined`

Replacing `AT_CHECK` with `TORCH_CHECK` on the reported lines fixes it.
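
If the error shows up in many files, editing them by hand gets tedious. The following is a small sketch that does the same replacement in bulk. The function name, the set of file suffixes, and the assumption that the affected sources sit under one directory (for example the repository's `maskrcnn_benchmark/csrc`) are mine; check with `git diff` afterwards.

```python
import re
import sys
from pathlib import Path

# File types assumed to contain the macro; extend if needed.
SOURCE_SUFFIXES = {".cu", ".cuh", ".cpp", ".h"}
# Whole-identifier match, so names that merely contain AT_CHECK stay untouched.
PATTERN = re.compile(r"\bAT_CHECK\b")


def replace_at_check(root):
    """Rewrite AT_CHECK to TORCH_CHECK in sources under root; return the changed paths."""
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8")
        new_text = PATTERN.sub("TORCH_CHECK", text)
        if new_text != text:
            path.write_text(new_text, encoding="utf-8")
            changed.append(path)
    return changed


if __name__ == "__main__":
    if len(sys.argv) > 1:
        for path in replace_at_check(sys.argv[1]):
            print("patched", path)
    else:
        print("usage: python patch_at_check.py <source-directory>")
```

After patching, rerun `python setup.py build develop` from the repository root.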
```bash
cd demo
python webcam.py --min-image-size 300 MODEL.DEVICE cpu
```
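(This works if your computer has a camera.)
I connect to the server remotely, so the camera cannot be opened and the script fails. In that case, change the code so that it runs detection on an image file instead.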
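Some errors you may run into at this step:
1. `ImportError: cannot import name 'PILLOW_VERSION' from 'PIL'` (`PILLOW_VERSION` was removed in Pillow 7.0, so installing an older Pillow release usually resolves it)
2. `cannot connect to X server` / `couldn't connect to display` (typically raised by `cv2.imshow` when no display is available, for example over a plain SSH session)
3. Python version mismatch: install Python 3.7 in the environment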
The modified part of `webcam.py`, with the original webcam loop commented out:

```python
# Original webcam loop, disabled because no camera is available:
# cam = cv2.VideoCapture(0)
# while True:
#     start_time = time.time()
#     ret_val, img = cam.read()
#     composite = coco_demo.run_on_opencv_image(img)
#     print("Time: {:.2f} s / img".format(time.time() - start_time))
#     cv2.imshow("COCO detections", composite)
#     if cv2.waitKey(1) == 27:
#         break  # esc to quit
# cv2.destroyAllWindows()

# Run detection on a single image file instead
path = "3.jpg"
img = cv2.imread(path)
composite = coco_demo.run_on_opencv_image(img)
cv2.imshow("COCO detections", composite)
cv2.waitKey(0)
```
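
On a server without any display, showing a window fails with the X server error, so one option is to save the result to a file instead. The sketch below illustrates that choice. The helper names are hypothetical, and checking the `DISPLAY` variable is only a heuristic for X11 sessions, not something the demo itself does.

```python
import os


def can_show_windows(env=None):
    """Heuristic: X11 GUI windows need the DISPLAY variable to be set."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY"))


def present(composite, out_path, show_fn, save_fn, env=None):
    """Show the image when a display seems available, otherwise save it to out_path."""
    if can_show_windows(env):
        show_fn("COCO detections", composite)
        return "shown"
    save_fn(out_path, composite)
    return "saved"


if __name__ == "__main__":
    # Stand-in callables so that the sketch runs without OpenCV installed.
    action = present(
        composite="<image array>",
        out_path="3_out.jpg",
        show_fn=lambda title, image: print("would show:", title),
        save_fn=lambda path, image: print("would save to:", path),
    )
    print(action)
```

In the demo, `cv2.imshow` and `cv2.imwrite` could be passed as `show_fn` and `save_fn`, since they take (window name, image) and (file name, image) respectively. When the image is shown, a `cv2.waitKey(0)` call is still needed afterwards to keep the window open.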
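The steps above are the ones that finally worked. A few small errors still come up along the way, but they can all be resolved. The trial-and-error was painful enough that I left most of it out above. I hope everyone gets it running; as for me, I now have to go on and figure out how to train it on my own dataset.
Trial-and-error notes:
0. Do not run the demo before the environment is installed; it will fail because `maskrcnn_benchmark` cannot be found. Install everything step by step first, then run the demo.
1. With CUDA 10.1 and PyTorch installed through conda as the official site suggests, I ran into version mismatches and all sorts of other problems.
2. I went through countless posts and kept swapping PyTorch and cudatoolkit versions, and apex kept failing. Following advice found online, I changed the apex command to plain `python setup.py install`; that succeeded, but building maskrcnn still failed, and building apex with the extensions gave: `41 errors detected in the compilation of "/tmp/tmpxft_00206a36_00000000-6_multi_tensor_lamb_stage_2.cpp1.ii".` I was numb at that point. I filed this as an issue on GitHub and got no reply, which was discouraging.
3. I also tried changing the gcc version, which failed as well. At one point I wanted to give up. Every day I told myself it was the last day I would try, but I just could not let it go.
4. For ten days I kept cycling between `conda deactivate`, `conda create -n`, and `pip`. I still do not know why the earlier version combinations failed while this one works, and I have no explanation for it. Newer environments not being backward compatible is frustrating.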