pip install tensorflow-tensorboard
First, download the code from https://github.com/tensorflow/models/tree/r1.5
Then open https://github.com/tensorflow/models/tree/r1.5/research/object_detection, which contains a README.md
1. Installation:
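Set up the GPU environment. You will probably run into many problems at this step; search online (for example on Baidu) and experiment. Then run the following commands in a terminal, in order: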
pip install tensorflow-gpu
sudo apt-get install protobuf-compiler python-pil python-lxml
sudo pip install jupyter
sudo pip install matplotlib
sudo pip install pillow
sudo pip install lxml
Next, from tensorflow/models/research/ (that is, inside the research directory), run:
protoc object_detection/protos/*.proto --python_out=.
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
python object_detection/builders/model_builder_test.py
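The export line only affects the current shell, so a new terminal loses it. As a rough check, the sketch below computes the two entries that line appends (the research directory and its slim subdirectory) and reports which of them are missing from a given search path. The helper names pythonpath_entries and missing_entries are hypothetical and are not part of the Object Detection API.

```python
import os
import sys


def pythonpath_entries(research_dir):
    """Return the two directories the export line appends to PYTHONPATH."""
    research_dir = os.path.abspath(research_dir)
    return [research_dir, os.path.join(research_dir, "slim")]


def missing_entries(research_dir, search_path=None):
    """List the required directories that are absent from the search path."""
    if search_path is None:
        search_path = sys.path
    present = {os.path.abspath(p) for p in search_path if p}
    return [d for d in pythonpath_entries(research_dir) if d not in present]


if __name__ == "__main__":
    # Run from tensorflow/models/research/, like the commands above.
    for directory in missing_entries(os.getcwd()):
        print("not on the Python path:", directory)
```

If this prints anything, re-running the export line in the same terminal should clear it.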
When it finishes, the test results are displayed.
2. Configuring an object detection pipeline:
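Defining an object_detection pipeline means giving it a configuration file that states which model to use. See object_detection/samples/configs, which holds many standard config files that you can adapt to your needs. In my experience, before training on your own data it is best to run one of the official models end to end, to confirm that the framework itself works.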
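Take faster_rcnn_resnet101_voc07.config as an example. It has these sections:
model: the hyperparameters of the model itself
train_config: the hyperparameters used during training
train_input_reader: where the training data is located
eval_config: the hyperparameters used during evaluation
eval_input_reader: where the evaluation data is located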
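To make the five sections concrete, here is a minimal sketch that lists the top-level blocks of a pipeline config. SAMPLE_CONFIG is a heavily abbreviated stand-in with placeholder values, not the real faster_rcnn_resnet101_voc07.config, and top_level_sections is a hypothetical helper that relies on top-level blocks starting at column 0.

```python
import re

# Heavily abbreviated stand-in for a pipeline config; the values are placeholders.
SAMPLE_CONFIG = """\
model {
  faster_rcnn {
    num_classes: 20
  }
}
train_config: {
  batch_size: 1
}
train_input_reader: {
  label_map_path: "object_detection/data/pascal_label_map.pbtxt"
}
eval_config: {
  num_examples: 100
}
eval_input_reader: {
  label_map_path: "object_detection/data/pascal_label_map.pbtxt"
}
"""


def top_level_sections(config_text):
    """Return the names of blocks that open at column 0, in file order."""
    return re.findall(r"^(\w+)\s*:?\s*\{", config_text, flags=re.MULTILINE)


print(top_level_sections(SAMPLE_CONFIG))
```

Nested blocks such as faster_rcnn are indented, so the pattern skips them. This is only a quick way to inspect a file, not a substitute for the protobuf parser the framework uses.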
3. Preparing inputs
Download the dataset directly from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
Then extract it with tar -xvf VOCtrainval_11-May-2012.tar
Then run commands in the following form, once for each split:
python object_detection/dataset_tools/create_pascal_tf_record.py \
--label_map_path=object_detection/data/pascal_label_map.pbtxt \
--data_dir=VOCdevkit --year=VOC2012 --set=train \
--output_path=pascal_train.record
python object_detection/dataset_tools/create_pascal_tf_record.py \
--label_map_path=object_detection/data/pascal_label_map.pbtxt \
--data_dir=VOCdevkit --year=VOC2012 --set=val \
--output_path=pascal_val.record
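The two commands differ only in the --set value and the output file name, so they can be generated from one template. The following is a sketch under that assumption; build_command and convert_all are hypothetical helpers, and the flags are exactly the ones used above.

```python
import subprocess

SCRIPT = "object_detection/dataset_tools/create_pascal_tf_record.py"
LABEL_MAP = "object_detection/data/pascal_label_map.pbtxt"


def build_command(split, data_dir="VOCdevkit", year="VOC2012"):
    """Build the argument list for one split ('train' or 'val')."""
    return [
        "python", SCRIPT,
        "--label_map_path=" + LABEL_MAP,
        "--data_dir=" + data_dir,
        "--year=" + year,
        "--set=" + split,
        "--output_path=pascal_{}.record".format(split),
    ]


def convert_all(splits=("train", "val"), dry_run=True):
    """Print each command; also execute it when dry_run is False."""
    for split in splits:
        command = build_command(split)
        print(" ".join(command))
        if not dry_run:
            # Must be run from tensorflow/models/research/ with PYTHONPATH set.
            subprocess.check_call(command)


if __name__ == "__main__":
    convert_all()
```

By default this only prints the commands. Pass dry_run=False to run them, and pass data_dir to build_command if VOCdevkit lives elsewhere.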
python object_detection/dataset_tools/create_pascal_tf_record.py \
--label_map_path=object_detection/data/pascal_label_map.pbtxt \
--data_dir=/home/shnu/demo/models-master/research/data/VOCdevkit --year=VOC2012 --set=train \
--output_path=pascal_train.record
~/demo/models-master/research$ python object_detection/dataset_tools/create_pascal_tf_record.py \
> --label_map_path=object_detection/data/pascal_label_map.pbtxt \
> --data_dir=/home/shnu/demo/models-master/research/data/VOCdevkit --year=VOC2012 --set=train \
> --output_path=pascal_train.record
Traceback (most recent call last):
  File "object_detection/dataset_tools/create_pascal_tf_record.py", line 37, in <module>
    from object_detection.utils import dataset_util
ImportError: No module named object_detection.utils
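If this error appears, the earlier setup commands (the protoc and export PYTHONPATH lines from step 1) must be entered again in the current terminal. Then run the script again: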
python object_detection/dataset_tools/create_pascal_tf_record.py \
--label_map_path=object_detection/data/pascal_label_map.pbtxt \
--data_dir=/home/shnu/demo/models-master/research/data/VOCdevkit --year=VOC2012 --set=val \
--output_path=pascal_val.record
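The record files (pascal_train.record and pascal_val.record) will then appear in the research directory.
Once the data is ready, you can run locally.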
Running locally
Recommended Directory Structure for Training and Evaluation
pascal_label_map.pbtxt is in the object_detection/data folder
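As an illustration of such a layout, the sketch below creates a data directory (for the label map and the record files) and one directory per model with train and eval subdirectories. This layout is an assumption based on the README and on the models/faster_rcnn_resnet101/eval path used later in these notes; create_layout is a hypothetical helper, and the code assumes Python 3.

```python
import os
import tempfile


def create_layout(root, model_name="faster_rcnn_resnet101"):
    """Create data/ and models/<model_name>/{train,eval} under root."""
    model_dir = os.path.join(root, "models", model_name)
    paths = {
        "data": os.path.join(root, "data"),
        "train": os.path.join(model_dir, "train"),
        "eval": os.path.join(model_dir, "eval"),
    }
    for path in paths.values():
        os.makedirs(path, exist_ok=True)
    return paths


if __name__ == "__main__":
    # Demonstrate in a throwaway directory rather than the real checkout.
    with tempfile.TemporaryDirectory() as root:
        for name, path in sorted(create_layout(root).items()):
            print(name, "->", os.path.relpath(path, root))
```

The pipeline config file would sit in the model directory next to train and eval, so the training and evaluation directories can be passed directly to the commands further down.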
Then update the corresponding path settings in the config file.
Then train. Again from the research directory, run the following command:
# From the tensorflow/models/research/ directory
python object_detection/train.py \
--logtostderr \
--pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
--train_dir=${PATH_TO_TRAIN_DIR}
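However, I hit a pitfall at startup.
This turned out to be a cuDNN version incompatibility on my machine. The key part of the error message is what follows "but": but source was compiled with 7004 (compatibility version 7000). First check which CUDA and cuDNN versions are installed; cuDNN releases are archived at https://developer.nvidia.com/rdp/cudnn-archive
The installed version was clearly too new, so cuDNN had to be downgraded. I could not find a Linux build of the matching version at first, so I switched to 7.0.5.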
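The numbers in the message are packed version codes. Assuming the encoding used by the cuDNN 7 header (major * 1000 + minor * 100 + patch level), 7004 reads as 7.0.4 and 7000 as 7.0.0. The sketch below decodes such a number and, as a second hypothetical helper, reads the version defines from the text of cudnn.h.

```python
import re


def decode_cudnn_version(number):
    """Split an integer such as 7004 into (major, minor, patch)."""
    major, rest = divmod(number, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def version_from_header(header_text):
    """Read the CUDNN_MAJOR/MINOR/PATCHLEVEL defines from cudnn.h text."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            raise ValueError("missing " + name)
        parts.append(int(match.group(1)))
    return tuple(parts)


print(decode_cudnn_version(7004))  # (7, 0, 4)
print(decode_cudnn_version(7000))  # (7, 0, 0)
```

To inspect the installed version, pass the contents of /usr/local/cuda/include/cudnn.h to version_from_header. That path is the one removed in the uninstall step; your installation may place the header elsewhere.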
First, uninstall the existing cuDNN version:
sudo rm -rf /usr/local/cuda/include/cudnn.h
sudo rm -rf /usr/local/cuda/lib64/libcudnn*
Then install the replacement version by following its installation guide.
Then run training again:
python object_detection/train.py \
--logtostderr \
--pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
--train_dir=${PATH_TO_TRAIN_DIR}
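If you see the error below, re-run the protoc and export PYTHONPATH commands from step 1 in the current terminal, then start training again: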
~/demo/models-master/research$ python object_detection/train.py \
> --logtostderr \
> --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
> --train_dir=${PATH_TO_TRAIN_DIR}
Traceback (most recent call last):
  File "object_detection/train.py", line 49, in <module>
    from object_detection import trainer
ImportError: No module named object_detection
Training then starts and prints its progress.
Running the Evaluation Job: evaluate the trained model on the validation data
Again from the research directory:
# From the tensorflow/models/research/ directory
python object_detection/eval.py \
--logtostderr \
--pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
--checkpoint_dir=${PATH_TO_TRAIN_DIR} \
--eval_dir=${PATH_TO_EVAL_DIR}
PATH_TO_EVAL_DIR=/home/shnu/demo/models-master/research/models/faster_rcnn_resnet101/eval
The environment has to be set up again first:
protoc object_detection/protos/*.proto --python_out=.
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
python object_detection/builders/model_builder_test.py
That is as far as I have gotten for now!