Table of Contents
1. Raw materials
2. Directory structure
3. Generated files
4. Scripts that generate the files
4.1 Generating test.txt and trainval.txt
4.2 Generating the test_name_size.txt file
4.3 Generating trainval_lmdb and test_lmdb
0.1 caffe ssd GitHub page
0.2 Making a caffe object-detection lmdb from a Pascal VOC dataset
Making the lmdb database files used for caffe SSD object-detection training and testing.
The data is split into a training/validation set (trainval) and a test set (test). Raw materials:
1. All image files (.jpg);
2. All label files: Pascal VOC-format XML, one XML file per image, with the label file named the same as its image.
Under caffe's data directory, create a new folder named insulator_detect (this is also the name of the database).
Inside insulator_detect, create three folders: Annotations, ImageSets, JPEGImages.
Annotations contains trainval and test subfolders: all training/validation XML label files go in trainval, and all test XML files go in test. In addition, copy every trainval and test XML directly into Annotations itself (this makes the later scripts easier to write).
JPEGImages likewise contains trainval and test subfolders: all training/validation images go in trainval, and all test images go in test. As with Annotations, a copy of every image also goes directly into JPEGImages, so that the paths written into the txt files resolve.
ImageSets holds the script outputs and support files.
ImageSets contains: test.txt, trainval.txt, test_name_size.txt, and labelmap_insulator.prototxt.
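The directory layout described above can be created with a few commands. A sketch, assuming $CAFFE_ROOT points at your caffe checkout (defaulting to ~/caffe here):

```shell
# Create the insulator_detect dataset skeleton under $CAFFE_ROOT/data.
# CAFFE_ROOT is an assumption; point it at your own caffe checkout.
CAFFE_ROOT=${CAFFE_ROOT:-$HOME/caffe}
mkdir -p $CAFFE_ROOT/data/insulator_detect/Annotations/trainval
mkdir -p $CAFFE_ROOT/data/insulator_detect/Annotations/test
mkdir -p $CAFFE_ROOT/data/insulator_detect/JPEGImages/trainval
mkdir -p $CAFFE_ROOT/data/insulator_detect/JPEGImages/test
mkdir -p $CAFFE_ROOT/data/insulator_detect/ImageSets
```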
The labelmap_insulator.prototxt file is as follows:
item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "insulator1"
  label: 1
  display_name: "insulator1"
}
item {
  name: "insulator2"
  label: 2
  display_name: "insulator2"
}
Label 0 is the background class.
Here I have two target classes: insulator1 and insulator2.
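caffe parses this file as a LabelMap protobuf message (create_annoset.py does this later). As a rough illustration of the name→label mapping it encodes, here is a plain-Python sketch that does not require caffe; the regex is a stand-in for the real protobuf text parser:

```python
import re

# The labelmap file content from above, inlined as a string
labelmap_text = """
item{
name:"none_of_the_above"
label:0
display_name:"background"
}
item{
name:"insulator1"
label:1
display_name:"insulator1"
}
item{
name:"insulator2"
label:2
display_name:"insulator2"
}
"""

# Pull out (name, label) pairs: a name immediately followed by its label field
pairs = re.findall(r'name:"([^"]+)"\s*label:(\d+)', labelmap_text)
label_of = {name: int(label) for name, label in pairs}
print(label_of)  # {'none_of_the_above': 0, 'insulator1': 1, 'insulator2': 2}
```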
Now three of these files need to be generated by script: test.txt, trainval.txt, and test_name_size.txt.
A few lines of test.txt look like this:
insulator_detect/JPEGImages/022656248_K1052896_1155_1_22.jpg insulator_detect/Annotations/022656248_K1052896_1155_1_22.xml
insulator_detect/JPEGImages/022421887_K1050044_1003_1_09.jpg insulator_detect/Annotations/022421887_K1050044_1003_1_09.xml
insulator_detect/JPEGImages/022638387_K1052556_1141_1_22.jpg insulator_detect/Annotations/022638387_K1052556_1141_1_22.xml
Each line of test.txt is the relative path of a test image, a single space, then the relative path of that image's XML label file.
trainval.txt has the same format: each line is a training/validation image path, one space, and that image's XML path.
How many images go into each set is up to you; for example, a split of trainval.txt : test.txt = 8 : 2.
The exact paths don't matter in themselves; what matters is that they match the paths used in the scripts below.
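The 8:2 split itself can be sketched in a few lines. This is only an illustration with made-up names, not part of the scripts below:

```python
import random

# Hypothetical list of image base names (no extension)
names = ["img_%03d" % i for i in range(10)]

random.seed(0)   # fix the shuffle so the split is reproducible
random.shuffle(names)

split = int(len(names) * 0.8)     # trainval : test = 8 : 2
trainval_names = names[:split]
test_names = names[split:]

print(len(trainval_names), len(test_names))  # 8 2
```

The shuffled names in each list are then what get copied into the trainval/ and test/ subfolders.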
test_name_size.txt is paired with test.txt: its lines correspond one-to-one with the lines of test.txt.
test_name_size.txt looks like this:
023016690_K1056793_1327_1_09 4400 6600
022938850_K1056079_1295_1_09 4400 6600
022906708_K1055361_1257_1_22 4400 6600
Each line of test_name_size.txt is a test image's name (without extension), the image height, then the image width.
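Note the field order: height comes before width, which is the reverse of PIL's (width, height). A one-line illustration using the first sample above:

```python
# Build one test_name_size.txt line: "<name> <height> <width>"
name = "023016690_K1056793_1327_1_09"
width, height = 6600, 4400   # PIL's Image.size returns (width, height)

line = "{} {} {}".format(name, height, width)
print(line)  # 023016690_K1056793_1327_1_09 4400 6600
```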
Correspondence between image files and XML label files (the original post showed screenshots of the image folder and the XML folder here).
XML file format, e.g. 012500997_K974833_1_1_05.xml (standard Pascal VOC layout; only the fields recoverable from this extract are shown):
<annotation>
	<folder>insulator1</folder>
	<filename>012500997_K974833_1_1_05.jpg</filename>
	<path>D:\CaiShilv_label\insulator1\012500997_K974833_1_1_05.jpg</path>
	<size>
		<width>6600</width>
		<height>4400</height>
		<depth>1</depth>
	</size>
	<segmented>0</segmented>
	...
</annotation>
4.1 Then use the following Python script to generate test.txt and trainval.txt:
generate_trainval_text_txt.py:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Jan  8 10:43:19 2019
@author: yang
"""
# Each line of trainval.txt / test.txt is:
#   <image path> <path of that image's XML label file>
import os
import glob

# Trainval and test image directories
trainval_dir = "/home/yang/caffe_ssd/caffe/data/insulator_detect/JPEGImages/trainval"
test_dir = "/home/yang/caffe_ssd/caffe/data/insulator_detect/JPEGImages/test"

# Collect the base names (no extension) of every .jpg under trainval
trainval_img_lists = glob.glob(trainval_dir + '/*.jpg')
trainval_img_names = []
for item in trainval_img_lists:
    temp1, temp2 = os.path.splitext(os.path.basename(item))
    trainval_img_names.append(temp1)

# Same for test
test_img_lists = glob.glob(test_dir + '/*.jpg')
test_img_names = []
for item in test_img_lists:
    temp1, temp2 = os.path.splitext(os.path.basename(item))
    test_img_names.append(temp1)

# Image and XML paths as written into the txt files, relative to caffe/data.
# Besides the trainval/test subfolders, JPEGImages and Annotations each hold a
# copy of every file, so the paths only need to go down to those two folders.
dist_img_dir = "insulator_detect/JPEGImages"
dist_anno_dir = "insulator_detect/Annotations"

# Output locations of the two list files
trainval_fd = open("/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets/trainval.txt", 'w')
test_fd = open("/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets/test.txt", 'w')
for item in trainval_img_names:
    trainval_fd.write(dist_img_dir + '/' + item + '.jpg' + ' ' + dist_anno_dir + '/' + item + '.xml\n')
for item in test_img_names:
    test_fd.write(dist_img_dir + '/' + item + '.jpg' + ' ' + dist_anno_dir + '/' + item + '.xml\n')
trainval_fd.close()
test_fd.close()
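A quick sanity check on the format of the generated lines: each must split on a single space into a .jpg path and a .xml path sharing the same base name. A self-contained sketch using a sample line from test.txt above:

```python
import os

# One line as written by generate_trainval_text_txt.py
line = "insulator_detect/JPEGImages/022656248_K1052896_1155_1_22.jpg insulator_detect/Annotations/022656248_K1052896_1155_1_22.xml"

img_path, xml_path = line.split(" ")   # exactly one space per line
img_name = os.path.splitext(os.path.basename(img_path))[0]
xml_name = os.path.splitext(os.path.basename(xml_path))[0]
assert img_path.endswith(".jpg") and xml_path.endswith(".xml")
assert img_name == xml_name            # image and XML must pair up
print("ok")
```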
Use the following Python script to generate test_name_size.txt:
generate_test_size.py:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Jan  8 16:01:17 2019
@author: yang
"""
import os
import glob
from PIL import Image  # used to read image sizes

# Test image directory
img_dir = "/home/yang/caffe_ssd/caffe/data/insulator_detect/JPEGImages/test"
# All .jpg files under that directory
img_lists = glob.glob(img_dir + '/*.jpg')
# Output file
test_name_size = open('/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets/test_name_size.txt', 'w')
for item in img_lists:
    img = Image.open(item)
    width, height = img.size  # PIL returns (width, height)
    # basename() drops the directories; splitext() splits off the extension
    temp1, temp2 = os.path.splitext(os.path.basename(item))
    # Each line: <name> <height> <width>
    test_name_size.write(temp1 + ' ' + str(height) + ' ' + str(width) + '\n')
test_name_size.close()
At this point the four files test.txt, trainval.txt, test_name_size.txt, and labelmap_insulator.prototxt are all in place.
Next, use them to generate the corresponding lmdb databases, trainval_lmdb and test_lmdb,
with this script: create_data.sh
#!/bin/bash
cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir='/home/yang/caffe_ssd/caffe'  # this script lives under $CAFFE_ROOT/data/<dataset>, so the caffe root is two levels up
cd $root_dir

redo=1
data_root_dir="/home/yang/caffe_ssd/caffe/data"
txtFileDir="/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets"
lmdbFile='/home/yang/caffe_ssd/caffe/data/insulator_detect/lmdb'
lmdbLinkDir='/home/yang/caffe_ssd/caffe/data/insulator_detect/lmdbLinkDir'
dataset_name="insulator_detect"
# the labelmap file defines label 0 (background) plus the object classes
mapfile="/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets/labelmap_insulator.prototxt"
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
width=0
height=0
extra_cmd="--encode-type=jpg --encoded"
if [ $redo ]
then
  extra_cmd="$extra_cmd --redo"
fi
for subset in test trainval
do
  # adjust the paths below to your own setup
  python $root_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir $txtFileDir/$subset.txt $lmdbFile/$subset"_"$db $lmdbLinkDir
done
An earlier version of this post had an error here.
Pay particular attention to:
width=0
height=0
I had initially set these to the network input size:
width=300
height=300
and the generated lmdb files came out wrong (the lmdb was far smaller than the total size of the images), so at the time I concluded they had to be:
width=0
height=0
Correction:
the
width=0
height=0
in create_data.sh can in fact be set to the input image size the network will eventually be trained with.
For example, SSD trains on 300*300 inputs, so the images can be resized to 300*300 while building the dataset.
The final create_data.sh is therefore:
#!/bin/bash
cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir='/home/yang/caffe_ssd/caffe'  # this script lives under $CAFFE_ROOT/data/<dataset>, so the caffe root is two levels up
cd $root_dir

redo=1
data_root_dir="/home/yang/caffe_ssd/caffe/data"
txtFileDir="/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets"
lmdbFile='/home/yang/caffe_ssd/caffe/data/insulator_detect/lmdb'
lmdbLinkDir='/home/yang/caffe_ssd/caffe/data/insulator_detect/lmdbLinkDir'
dataset_name="insulator_detect"
# the labelmap file defines label 0 (background) plus the object classes
mapfile="/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets/labelmap_insulator.prototxt"
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
width=300
height=300
extra_cmd="--encode-type=jpg --encoded"
if [ $redo ]
then
  extra_cmd="$extra_cmd --redo"
fi
for subset in test trainval
do
  # adjust the paths below to your own setup
  python $root_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir $txtFileDir/$subset.txt $lmdbFile/$subset"_"$db $lmdbLinkDir
done
After resizing to 300*300, the resulting lmdb can actually be smaller than the total size of the originals if the source images are large,
because the images are also compressed and encoded: extra_cmd="--encode-type=jpg --encoded"
A basic back-of-envelope calculation:
a 1024*1024, 3-channel image loaded into memory as a raw matrix occupies
1024*1024*3 B = 3 MB;
after JPEG encoding, with a compression ratio of up to ~20x, the encoded image takes only about 3 MB / 20 ≈ 153 KB.
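The same arithmetic in code:

```python
# Raw size of a 1024x1024, 3-channel uint8 image in memory
raw_bytes = 1024 * 1024 * 3
print(raw_bytes / (1024 * 1024))   # 3.0  (MB)

# Assuming roughly 20x JPEG compression
compressed_kb = raw_bytes / 20 / 1024
print(round(compressed_kb, 1))     # 153.6  (KB)
```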
The shell script mainly calls the ssd-caffe helper script: python $root_dir/scripts/create_annoset.py
create_annoset.py is shown below.
The front of this script is just argument parsing, so the shell script above only needs to pass arguments that match these definitions one-to-one:
import argparse
import os
import shutil
import subprocess
import sys
from caffe.proto import caffe_pb2
from google.protobuf import text_format
if __name__ == "__main__":
  parser = argparse.ArgumentParser(description="Create AnnotatedDatum database")
  parser.add_argument("root",
      help="The root directory which contains the images and annotations.")
  parser.add_argument("listfile",
      help="The file which contains image paths and annotation info.")
  parser.add_argument("outdir",
      help="The output directory which stores the database file.")
  parser.add_argument("exampledir",
      help="The directory to store the link of the database files.")
  parser.add_argument("--redo", default = False, action = "store_true",
      help="Recreate the database.")
  parser.add_argument("--anno-type", default = "classification",
      help="The type of annotation {classification, detection}.")
  parser.add_argument("--label-type", default = "xml",
      help="The type of label file format for detection {xml, json, txt}.")
  parser.add_argument("--backend", default = "lmdb",
      help="The backend {lmdb, leveldb} for storing the result")
  parser.add_argument("--check-size", default = False, action = "store_true",
      help="Check that all the datum have the same size.")
  parser.add_argument("--encode-type", default = "",
      help="What type should we encode the image as ('png','jpg',...).")
  parser.add_argument("--encoded", default = False, action = "store_true",
      help="The encoded image will be save in datum.")
  parser.add_argument("--gray", default = False, action = "store_true",
      help="Treat images as grayscale ones.")
  parser.add_argument("--label-map-file", default = "",
      help="A file with LabelMap protobuf message.")
  parser.add_argument("--min-dim", default = 0, type = int,
      help="Minimum dimension images are resized to.")
  parser.add_argument("--max-dim", default = 0, type = int,
      help="Maximum dimension images are resized to.")
  parser.add_argument("--resize-height", default = 0, type = int,
      help="Height images are resized to.")
  parser.add_argument("--resize-width", default = 0, type = int,
      help="Width images are resized to.")
  parser.add_argument("--shuffle", default = False, action = "store_true",
      help="Randomly shuffle the order of images and their labels.")
  parser.add_argument("--check-label", default = False, action = "store_true",
      help="Check that there is no duplicated name/label.")

  args = parser.parse_args()
  root_dir = args.root
  list_file = args.listfile
  out_dir = args.outdir
  example_dir = args.exampledir

  redo = args.redo
  anno_type = args.anno_type
  label_type = args.label_type
  backend = args.backend
  check_size = args.check_size
  encode_type = args.encode_type
  encoded = args.encoded
  gray = args.gray
  label_map_file = args.label_map_file
  min_dim = args.min_dim
  max_dim = args.max_dim
  resize_height = args.resize_height
  resize_width = args.resize_width
  shuffle = args.shuffle
  check_label = args.check_label

  # check if root directory exists
  if not os.path.exists(root_dir):
    print("root directory: {} does not exist".format(root_dir))
    sys.exit()
  # add "/" to root directory if needed
  if root_dir[-1] != "/":
    root_dir += "/"
  # check if list file exists
  if not os.path.exists(list_file):
    print("list file: {} does not exist".format(list_file))
    sys.exit()
  # check list file format is correct
  with open(list_file, "r") as lf:
    for line in lf.readlines():
      img_file, anno = line.strip("\n").split(" ")
      if not os.path.exists(root_dir + img_file):
        print("image file: {} does not exist".format(root_dir + img_file))
      if anno_type == "classification":
        if not anno.isdigit():
          print("annotation: {} is not an integer".format(anno))
      elif anno_type == "detection":
        if not os.path.exists(root_dir + anno):
          print("annotation file: {} does not exist".format(root_dir + anno))
          sys.exit()
      break
  # check if label map file exist
  if anno_type == "detection":
    if not os.path.exists(label_map_file):
      print("label map file: {} does not exist".format(label_map_file))
      sys.exit()
    label_map = caffe_pb2.LabelMap()
    lmf = open(label_map_file, "r")
    try:
      text_format.Merge(str(lmf.read()), label_map)
    except:
      print("Cannot parse label map file: {}".format(label_map_file))
      sys.exit()

  out_parent_dir = os.path.dirname(out_dir)
  if not os.path.exists(out_parent_dir):
    os.makedirs(out_parent_dir)
  if os.path.exists(out_dir) and not redo:
    print("{} already exists and I do not hear redo".format(out_dir))
    sys.exit()
  if os.path.exists(out_dir):
    shutil.rmtree(out_dir)

  # get caffe root directory
  caffe_root = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
  if anno_type == "detection":
    cmd = "{}/build/tools/convert_annoset" \
        " --anno_type={}" \
        " --label_type={}" \
        " --label_map_file={}" \
        " --check_label={}" \
        " --min_dim={}" \
        " --max_dim={}" \
        " --resize_height={}" \
        " --resize_width={}" \
        " --backend={}" \
        " --shuffle={}" \
        " --check_size={}" \
        " --encode_type={}" \
        " --encoded={}" \
        " --gray={}" \
        " {} {} {}" \
        .format(caffe_root, anno_type, label_type, label_map_file, check_label,
            min_dim, max_dim, resize_height, resize_width, backend, shuffle,
            check_size, encode_type, encoded, gray, root_dir, list_file, out_dir)
  elif anno_type == "classification":
    cmd = "{}/build/tools/convert_annoset" \
        " --anno_type={}" \
        " --min_dim={}" \
        " --max_dim={}" \
        " --resize_height={}" \
        " --resize_width={}" \
        " --backend={}" \
        " --shuffle={}" \
        " --check_size={}" \
        " --encode_type={}" \
        " --encoded={}" \
        " --gray={}" \
        " {} {} {}" \
        .format(caffe_root, anno_type, min_dim, max_dim, resize_height,
            resize_width, backend, shuffle, check_size, encode_type, encoded,
            gray, root_dir, list_file, out_dir)
  print(cmd)
  process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
  output = process.communicate()[0]

  if not os.path.exists(example_dir):
    os.makedirs(example_dir)
  link_dir = os.path.join(example_dir, os.path.basename(out_dir))
  if os.path.exists(link_dir):
    os.unlink(link_dir)
  os.symlink(out_dir, link_dir)
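With the parameters used in create_data.sh, the convert_annoset command that create_annoset.py prints looks roughly like the string below. This sketch rebuilds it with only a subset of the flags, so treat it as illustrative rather than the exact output:

```python
# Reconstruct (part of) the detection command line the way create_annoset.py
# does, using values from the blog's create_data.sh. Paths are assumptions.
caffe_root = "/home/yang/caffe_ssd/caffe"
flags = {
    "anno_type": "detection",
    "label_type": "xml",
    "label_map_file": caffe_root + "/data/insulator_detect/ImageSets/labelmap_insulator.prototxt",
    "resize_height": 300,
    "resize_width": 300,
    "backend": "lmdb",
    "encode_type": "jpg",
}
cmd = "{}/build/tools/convert_annoset".format(caffe_root)
for k, v in flags.items():
    cmd += " --{}={}".format(k, v)
# Positional arguments: data root, list file, output lmdb directory
cmd += " {root}/data/ {lst} {out}".format(
    root=caffe_root,
    lst=caffe_root + "/data/insulator_detect/ImageSets/test.txt",
    out=caffe_root + "/data/insulator_detect/lmdb/test_lmdb")
print(cmd)
```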
0. caffe ssd GitHub page: https://github.com/weiliu89/caffe/tree/ssd
Download VOC2007 and VOC2012 dataset. By default, we assume the data is stored in $HOME/data/
# Download the data.
cd $HOME/data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
# Extract the data.
tar -xvf VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_06-Nov-2007.tar
tar -xvf VOCtest_06-Nov-2007.tar
Download and extract the archives;
then prepare the LMDB database:
Create the LMDB file:
cd $CAFFE_ROOT
# Create the trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/
./data/VOC0712/create_list.sh
# You can modify the parameters in create_data.sh if needed.
# It will create lmdb files for trainval and test with encoded original image:
# - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
# - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
# and make soft links at examples/VOC0712/
./data/VOC0712/create_data.sh
Object-detection datasets are generally converted following the Pascal VOC format.