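Reproducing the faster-rcnn code with PyTorch on Ubuntu.
Below are the various odd problems I ran into along the way, and how I solved them.
Almost everything follows this repository: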
https://github.com/ruotianluo/pytorch-faster-rcnn
Parts of it follow this post:
https://blog.csdn.net/xzy5210123/article/details/81530993
git clone https://github.com/ruotianluo/pytorch-faster-rcnn.git
cd pytorch-faster-rcnn/data
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
make
cd ../../..
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
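It should have this basic structure:
$VOCdevkit/           # development kit
$VOCdevkit/VOCcode/   # VOC utility code
$VOCdevkit/VOC2007    # image sets, annotations, etc.
# ... and several other directories ...
Create a symbolic link for the PASCAL VOC dataset ($VOCdevkit stands for the path of the extracted VOCdevkit directory):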
cd ~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn/data
ln -s $VOCdevkit VOCdevkit2007
Using a symlink is a good idea because you will probably want to share the same PASCAL dataset installation between multiple projects.
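Before linking, it can help to confirm that the extracted directory really has the structure listed above. The following is a minimal sketch, not part of the repository: the helper names `missing_voc_subdirs` and `link_devkit` are made up for illustration, and the demo runs in a temporary directory rather than on a real dataset.

```python
import os
import tempfile

# Subdirectories named in the structure listing above.
EXPECTED_SUBDIRS = ("VOCcode", "VOC2007")


def missing_voc_subdirs(devkit_dir):
    """Return the expected subdirectories that are absent from devkit_dir."""
    return [name for name in EXPECTED_SUBDIRS
            if not os.path.isdir(os.path.join(devkit_dir, name))]


def link_devkit(devkit_dir, data_dir, link_name="VOCdevkit2007"):
    """Create data_dir/link_name -> devkit_dir, refusing an incomplete devkit."""
    missing = missing_voc_subdirs(devkit_dir)
    if missing:
        raise FileNotFoundError(
            "missing in %s: %s" % (devkit_dir, ", ".join(missing)))
    link_path = os.path.join(data_dir, link_name)
    if not os.path.islink(link_path):
        # Equivalent of: ln -s $VOCdevkit VOCdevkit2007
        os.symlink(os.path.abspath(devkit_dir), link_path)
    return link_path


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        devkit = os.path.join(tmp, "VOCdevkit")
        for name in EXPECTED_SUBDIRS:
            os.makedirs(os.path.join(devkit, name))
        data = os.path.join(tmp, "data")
        os.makedirs(data)
        print(link_devkit(devkit, data))
```

With a real dataset, `devkit_dir` would be the directory the three tar files were extracted into and `data_dir` the repository's data folder.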
-----------------------------------------------------------------------------------
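Next step:
Demo and test with pre-trained models
Download the pre-trained model
(Optional) Instead of downloading the repository author's pre-trained or converted models, you can also convert from tf-faster-rcnn models. Download the tensorflow pre-trained models from tf-faster-rcnn, then run:
python tools/convert_from_tensorflow.py --tensorflow_model resnet_model.ckpt
python tools/convert_from_tensorflow_vgg.py --tensorflow_model vgg_model.ckpt
This script creates a .pth file with the same name in the same folder as the tensorflow model.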
---------------------------------------------------------------------------------------
Run the following in order:
(base) twinkle@twinkle-ubuntu:~$ conda activate rcnn36
(rcnn36) twinkle@twinkle-ubuntu:~$ cd ~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn/data
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn/data$ ln -s $VOCdevkit VOCdevkit2007
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn/data$ cd ..
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ NET=vgg16
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ TRAIN_IMDB=voc_2007_trainval
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ mkdir -p output/${NET}/${TRAIN_IMDB}
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ cd output/${NET}/${TRAIN_IMDB}
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn/output/vgg16/voc_2007_trainval$ ln -s ../../../data/voc_2007_trainval ./default
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn/output/vgg16/voc_2007_trainval$ cd ../../..
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ python ./tools/demo.py --net vgg16 --dataset pascal_voc
-----------------------------------------------------------------------------------
No module named 'easydict'
pip install easydict
-----------------------------------------------------------------------------------
No module named 'cv2'
pip install opencv-python
RuntimeError: Detected that PyTorch and torchvision were compiled with different CUDA versions. PyTorch has CUDA Version=9.0 and torchvision has CUDA Version=10.0. Please reinstall the torchvision that matches your PyTorch install.
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ python
Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.1.0'
>>> torch.version.cuda
'9.0.176'
>>> quit()
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ cat /usr/local/cuda/version.txt
CUDA Version 10.0.130
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ pip uninstall torch
Successfully uninstalled torch-1.1.0
torchvision 0.3.0 has requirement torch>=1.1.0, but you'll have torch 1.0.1 which is incompatible.
Successfully installed torch-1.0.1
pip uninstall torch
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
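The check done by hand above (comparing `torch.version.cuda` with the contents of `/usr/local/cuda/version.txt`) comes down to comparing the major.minor part of two version strings. A small sketch of that comparison, using only the standard library; the function names are hypothetical and the sample strings are the ones printed in this walkthrough:

```python
import re


def cuda_major_minor(text):
    """Extract (major, minor) from text such as '9.0.176' or 'CUDA Version 10.0.130'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no CUDA version found in %r" % text)
    return int(match.group(1)), int(match.group(2))


def cuda_versions_match(first, second):
    """True when both strings name the same CUDA major.minor release."""
    return cuda_major_minor(first) == cuda_major_minor(second)


if __name__ == "__main__":
    # Before the reinstall: torch reported 9.0.176, the system toolkit 10.0.130.
    print(cuda_versions_match("9.0.176", "CUDA Version 10.0.130"))   # False
    # After the conda reinstall: torch reported 10.0.130.
    print(cuda_versions_match("10.0.130", "CUDA Version 10.0.130"))  # True
```

In a live environment the first argument would come from `torch.version.cuda` and the second from reading the version file.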
-----------------------------------------------------------------------------------
conda update -n base -c defaults conda
-----------------------------------------------------------------------------------
No module named 'matplotlib'
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ python
Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.1.0'
>>> torch.version.cuda
'10.0.130'
>>> quit()
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ pip install matplotlib
-----------------------------------------------------------------------------------
No module named 'tensorboardX'
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ pip install tensorboardX
-----------------------------------------------------------------------------------
No module named 'scipy'
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ pip install scipy
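Rather than hitting these "No module named ..." errors one at a time, the modules that came up missing in this walkthrough can be checked in one pass. This is only a sketch: the list covers just the modules mentioned in these notes (not necessarily everything the repository imports), and `missing_modules` is a name made up here.

```python
import importlib.util

# Import name -> pip package name, as used in this walkthrough.
PIP_NAMES = {
    "easydict": "easydict",
    "cv2": "opencv-python",
    "matplotlib": "matplotlib",
    "tensorboardX": "tensorboardX",
    "scipy": "scipy",
    "yaml": "pyyaml",
}


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(PIP_NAMES)
    if missing:
        print("pip install " + " ".join(PIP_NAMES[name] for name in missing))
    else:
        print("all modules found")
```

Run it inside the activated conda environment so that it inspects the same interpreter that will run demo.py.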
-----------------------------------------------------------------------------------
ImportError: cannot import name 'imresize'
cannot import name 'imresize' from 'scipy.misc'
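Problem: running the following Python statement fails with: cannot import name 'imresize' from 'scipy.misc'
from scipy.misc import imresize
Causes and fixes:
Cause 1: a missing dependency. Answers to similar questions online say that PIL needs to be installed, i.e.
pip install pillow
But if your scipy was installed through conda, conda installs the required dependencies automatically, so there is no need to install PIL separately.
Cause 2: the SciPy version. imresize was deprecated in SciPy 1.0.0 and removed in 1.3.0. The official explanation and fix are:
imresize is deprecated! imresize is deprecated in SciPy 1.0.0, and will be removed in 1.3.0. Use Pillow instead: numpy.array(Image.fromarray(arr).resize()).
See the official SciPy documentation.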
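The Pillow replacement suggested by that deprecation notice can be wrapped in a small function. This is a sketch under stated assumptions, not a drop-in clone of the old function: the name `imresize_pillow` is made up, it only accepts uint8 arrays and a `(height, width)` tuple, and it always uses bilinear filtering, whereas the old `imresize` also accepted fractional or percentage sizes and rescaled other dtypes.

```python
import numpy as np
from PIL import Image


def imresize_pillow(arr, size):
    """Resize a uint8 image array to size=(height, width) with bilinear filtering."""
    if arr.dtype != np.uint8:
        raise TypeError("this sketch only handles uint8 arrays")
    height, width = size
    # Pillow expects (width, height), the reverse of the NumPy shape order.
    resized = Image.fromarray(arr).resize((width, height), Image.BILINEAR)
    return np.array(resized)


if __name__ == "__main__":
    image = np.zeros((4, 6, 3), dtype=np.uint8)
    print(imresize_pillow(image, (8, 12)).shape)  # (8, 12, 3)
```

The swapped argument order is the easiest thing to get wrong when porting old `imresize` calls, which is why the sketch unpacks the tuple explicitly.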
(base) twinkle@twinkle-ubuntu:~$ cd /home/twinkle/Downloads/mypycharm/bin
(base) twinkle@twinkle-ubuntu:~/Downloads/mypycharm/bin$ ./pycharm.sh
-----------------------------------------------------------------------------------
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb8 in position 2: ordinal not in range(128)
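# Create a folder named voc_2007_trainval inside the data folder and put the downloaded, extracted pre-trained model files in it
# Run the following commands from the faster rcnn root directory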
NET=vgg16
TRAIN_IMDB=voc_2007_trainval
mkdir -p output/${NET}/${TRAIN_IMDB}
cd output/${NET}/${TRAIN_IMDB}
ln -s ../../../data/voc_2007_trainval ./default
cd ../../..
NET=res101
TRAIN_IMDB=voc_2007_trainval+voc_2012_trainval
mkdir -p output/${NET}/${TRAIN_IMDB}
cd output/${NET}/${TRAIN_IMDB}
ln -s ../../../data/voc_2007_trainval+voc_2012_trainval ./default
cd ../../..
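The two command groups above follow one pattern: create `output/<NET>/<TRAIN_IMDB>` and point a `default` symlink inside it at `data/<TRAIN_IMDB>`. The same layout can be sketched in Python; `link_pretrained` is a hypothetical helper, and the demo builds everything in a temporary directory instead of the real repository root.

```python
import os
import tempfile


def link_pretrained(root, net, train_imdb):
    """Create output/<net>/<train_imdb>/default -> data/<train_imdb> under root."""
    out_dir = os.path.join(root, "output", net, train_imdb)
    os.makedirs(out_dir, exist_ok=True)  # mkdir -p output/${NET}/${TRAIN_IMDB}
    link_path = os.path.join(out_dir, "default")
    # Relative target, exactly as in: ln -s ../../../data/<train_imdb> ./default
    target = os.path.join("..", "..", "..", "data", train_imdb)
    if not os.path.islink(link_path):
        os.symlink(target, link_path)
    return link_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for net, imdb in [("vgg16", "voc_2007_trainval"),
                          ("res101", "voc_2007_trainval+voc_2012_trainval")]:
            os.makedirs(os.path.join(root, "data", imdb))
            link = link_pretrained(root, net, imdb)
            # isdir follows the link, so True means the link resolves.
            print(link, "->", os.readlink(link), os.path.isdir(link))
```

Because the target is relative, the link keeps working if the whole repository directory is moved.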
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ python ./tools/demo.py
Traceback (most recent call last):
  File "./tools/demo.py", line 164, in <module>
    torch.load(saved_model, map_location=lambda storage, loc: storage))
  File "/home/twinkle/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn/tools/../lib/nets/network.py", line 522, in load_state_dict
    for k, v in state_dict.items() if k in self.state_dict()}
  File "/home/twinkle/Downloads/myanaconda3/anaconda3/envs/rcnn36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for vgg16:
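I was using PyTorch 0.4.1, but the Jupyter Notebook I loaded used 0.4.0, so I added strict=False to load_state_dict():
model.load_state_dict(checkpoint['state_dict'], strict=False)
With that change demo.py ran, but the detections were clearly wrong, so I reverted it. In the end the cause turned out to be a wrong model download: I had mistakenly used the tf model.
I had downloaded this tf one:
https://drive.google.com/drive/folders/0B1_fAEgxdnvJSmF3YUlZcHFqWTQ
when I should have downloaded the pytorch one:
https://drive.google.com/drive/folders/0B7fNdx_jAqhtNE10TDZDbFRuU0E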
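A load_state_dict error like this one is about parameter names that do not line up, and strict=False only silences it. Comparing the two sets of names first shows whether the checkpoint belongs to the model at all. The sketch below uses plain lists of names so it runs without PyTorch; `compare_keys` and every key name in the demo are hypothetical, chosen only for illustration.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Report which parameter names are missing, unexpected, or shared."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        "missing": sorted(model_keys - checkpoint_keys),      # model wants, file lacks
        "unexpected": sorted(checkpoint_keys - model_keys),   # file has, model ignores
        "shared": sorted(model_keys & checkpoint_keys),
    }


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    model = ["features.0.weight", "features.0.bias", "cls_score.weight"]
    checkpoint = ["conv1.weights", "conv1.biases"]
    report = compare_keys(model, checkpoint)
    print(report)
    if not report["shared"]:
        print("no keys in common: probably the wrong checkpoint file")
```

With PyTorch installed, the two arguments would typically be `model.state_dict().keys()` and the keys of the dictionary returned by `torch.load(path, map_location="cpu")`, assuming the file stores a plain state dict. An empty "shared" list is a strong hint that the file is wrong, which is what happened here.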
--------------------------------------------------------------------------------
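If you use the .pth model, run:
GPU_ID=0
CUDA_VISIBLE_DEVICES=${GPU_ID} ./tools/demo.py
This error appeared:
RuntimeError: Error(s) in loading state_dict for vgg16:
It turned out that the model I had downloaded the day before was the tf one. After re-downloading the correct pytorch model, I got:
ModuleNotFoundError: No module named 'yaml'
With the correct model downloaded, the state_dict problem above disappeared.
****Demo for testing on custom images****: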
GPU_ID=0
CUDA_VISIBLE_DEVICES=${GPU_ID} ./tools/demo.py
-------------------------------------------------------------------------------
****Run the test set with the pre-trained model****:
Run:
GPU_ID=0
./experiments/scripts/test_faster_rcnn.sh $GPU_ID pascal_voc vgg16
This produced:
ModuleNotFoundError: No module named 'yaml'
----------------------------------------------------------------------------------
(rcnn36) twinkle@twinkle-ubuntu:~/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn$ sudo -H pip install pyyaml --upgrade
Collecting pyyaml
Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('
Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('
--------------------------------------------------------------------------------------
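To convert from the tensorflow models, you need to
create a new environment, tftorch36, in which tensorflow-gpu works. A quick check (TensorFlow 1.x API):
import tensorflow as tf
matrix1 = tf.constant([[3., 3.]])
matrix2 = tf.constant([[2.], [2.]])
product = tf.matmul(matrix1, matrix2)
sess = tf.Session()
print(sess.run(product))  # evaluates the 1x2 by 2x1 product: [[12.]]
sess.close()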
---------------------------------------------------------------------------------------
****Run the test set with the pre-trained model****:
Run:
GPU_ID=0
./experiments/scripts/test_faster_rcnn.sh $GPU_ID pascal_voc vgg16
This produced:
im_detect: 4952/4952 0.115s 0.006s
Evaluating detections
Writing aeroplane VOC results file
Writing bicycle VOC results file
Writing bird VOC results file
Writing boat VOC results file
Writing bottle VOC results file
Writing bus VOC results file
Writing car VOC results file
Writing cat VOC results file
Writing chair VOC results file
Writing cow VOC results file
Writing diningtable VOC results file
Writing dog VOC results file
Writing horse VOC results file
Writing motorbike VOC results file
Writing person VOC results file
Writing pottedplant VOC results file
Writing sheep VOC results file
Writing sofa VOC results file
Writing train VOC results file
Writing tvmonitor VOC results file
VOC07 metric? Yes
Reading annotation for 1/4952
...
Reading annotation for 4901/4952
Saving cached annotations to /home/twinkle/Documents/A_RCNN/A_codeTemplate/pytorch-faster-rcnn/data/VOCdevkit2007/VOC2007/ImageSets/Main/test.txt_annots.pkl
AP for aeroplane = 0.6400
AP for bicycle = 0.7051
AP for bird = 0.6265
AP for boat = 0.4797
AP for bottle = 0.5371
AP for bus = 0.6961
AP for car = 0.8145
AP for cat = 0.8036
AP for chair = 0.4351
AP for cow = 0.7608
AP for diningtable = 0.4895
AP for dog = 0.7412
AP for horse = 0.7771
AP for motorbike = 0.7003
AP for person = 0.7477
AP for pottedplant = 0.4081
AP for sheep = 0.6935
AP for sofa = 0.6323
AP for train = 0.6423
AP for tvmonitor = 0.7078
Mean AP = 0.6519
~~~~~~~~
Results:
0.640
0.705
0.627
0.480
0.537
0.696
0.814
0.804
0.435
0.761
0.489
0.741
0.777
0.700
0.748
0.408
0.693
0.632
0.642
0.708
0.652
~~~~~~~~
--------------------------------------------------------------
Results computed with the **unofficial** Python eval code.
Results should be very close to the official MATLAB eval code.
Recompute with `./tools/reval.py --matlab ...` for your paper.
-- Thanks, The Management
--------------------------------------------------------------
850.14user 117.08system 19:06.14elapsed 84%CPU (0avgtext+0avgdata 13603272maxresident)k
2003248inputs+80536outputs (13960major+4071345minor)pagefaults 0swaps
What a relief.
Mean AP = 0.6519. It is supposedly meant to be 0.70+, but I think this is good enough. Hahaha.
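As a sanity check, the reported Mean AP is just the arithmetic mean of the 20 per-class AP values in the log above. The sketch below recomputes it from those log lines; the parsing regex and the function name `mean_ap` are my own, not part of the evaluation code.

```python
import re

# Per-class lines copied from the evaluation log above.
AP_LOG = """
AP for aeroplane = 0.6400
AP for bicycle = 0.7051
AP for bird = 0.6265
AP for boat = 0.4797
AP for bottle = 0.5371
AP for bus = 0.6961
AP for car = 0.8145
AP for cat = 0.8036
AP for chair = 0.4351
AP for cow = 0.7608
AP for diningtable = 0.4895
AP for dog = 0.7412
AP for horse = 0.7771
AP for motorbike = 0.7003
AP for person = 0.7477
AP for pottedplant = 0.4081
AP for sheep = 0.6935
AP for sofa = 0.6323
AP for train = 0.6423
AP for tvmonitor = 0.7078
"""


def mean_ap(log_text):
    """Parse 'AP for <class> = <value>' lines and return (class count, mean AP)."""
    values = [float(v) for v in re.findall(r"AP for \w+ = ([0-9.]+)", log_text)]
    if not values:
        raise ValueError("no AP lines found")
    return len(values), sum(values) / len(values)


if __name__ == "__main__":
    count, mean = mean_ap(AP_LOG)
    print("classes:", count)                 # classes: 20
    print("Mean AP = {:.4f}".format(mean))   # Mean AP = 0.6519
```

The weakest classes in this run (pottedplant, chair, boat, diningtable) pull the average down the most, so they are the first place to look when comparing against a reference result.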