Download darknet and compile it (GPU build). I won't cover the compilation itself here; note that video detection depends on OpenCV, and the GPU build depends on CUDA and cuDNN.
The following walks through training two classes, human face and human hand, as an example:
1. Put the images containing hands and faces into build\darknet\x64\data\train_images and build\darknet\x64\data\val_images; train_images holds the training images and val_images holds the validation images.
2. Label the images with labelImg. The goal is to produce one txt file per image containing the class index and coordinates of each bounding box.
labelImg saves an xml file for each image, so the xml files need to be converted to txt.
xml-to-txt conversion script:
# This script sits in the same directory as the data folder
import glob
import os
import xml.etree.ElementTree as ET

# Class names; must match the labels used in labelImg
class_names = ['human face', 'human hand']

# Folders containing the xml files for the training and validation images
path1 = 'data/train_images/'
path2 = 'data/val_images/'

# Convert a single xml file to a YOLO-format txt file
def single_xml_to_txt(xml_file):
    tree = ET.parse(xml_file)
    root = tree.getroot()
    # Path of the txt file to write (same name as the xml, different extension)
    txt_path = os.path.splitext(xml_file)[0] + '.txt'
    with open(txt_path, 'w') as txt_file:
        for member in root.findall('object'):
            picture_width = int(root.find('size').find('width').text)
            picture_height = int(root.find('size').find('height').text)
            class_name = member.find('name').text
            # Index of this class name
            class_num = class_names.index(class_name)
            bndbox = member.find('bndbox')
            box_x_min = int(bndbox.find('xmin').text)  # top-left x
            box_y_min = int(bndbox.find('ymin').text)  # top-left y
            box_x_max = int(bndbox.find('xmax').text)  # bottom-right x
            box_y_max = int(bndbox.find('ymax').text)  # bottom-right y
            # Convert to relative center position and relative width/height
            x_center = (box_x_min + box_x_max) / (2 * picture_width)
            y_center = (box_y_min + box_y_max) / (2 * picture_height)
            width = (box_x_max - box_x_min) / picture_width
            height = (box_y_max - box_y_min) / picture_height
            print(class_num, x_center, y_center, width, height)
            txt_file.write(str(class_num) + ' ' + str(x_center) + ' ' + str(y_center) + ' ' + str(width) + ' ' + str(height) + '\n')

# Convert every xml file in a folder
def dir_xml_to_txt(path):
    for xml_file in glob.glob(path + '*.xml'):
        single_xml_to_txt(xml_file)

dir_xml_to_txt(path1)
dir_xml_to_txt(path2)
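Each generated txt file has one line per bounding box in the form "class_index x_center y_center width height", with all four box values normalized to 0-1 by the image size. A hypothetical example for an image containing one face and one hand (the numbers here are made up for illustration):
0 0.513672 0.348958 0.207031 0.28125
1 0.291016 0.703125 0.158203 0.1875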
3. Generate the train.txt and val.txt files; each line in these files is the path of one image:
import os

# Run this script from the directory that contains the data folder.
# It writes txt_file (e.g. data/train.txt) with one image path per line.
def generate_train_and_val(image_path, txt_file):
    image_files = []
    os.chdir(os.path.join("data", image_path))
    for filename in os.listdir(os.getcwd()):
        if filename.endswith(".jpg"):
            image_files.append("data/" + image_path + "/" + filename)
    os.chdir("..")  # back up to the data directory, where the txt file is written
    with open(txt_file, "w") as outfile:
        for image in image_files:
            outfile.write(image)
            outfile.write("\n")
    os.chdir("..")  # back to the original working directory

generate_train_and_val("train_images", "train.txt")
generate_train_and_val("val_images", "val.txt")
4. Create a human.names file with the following content (make sure it contains no tab characters):
human face
human hand
5. Create a human.data file with the following content:
classes= 2
train = data/train.txt
valid = data/val.txt
names = data/human.names
backup = backup/
classes = 2 means two classes are being trained (human face and human hand).
6. Create a human.cfg file, copy the contents of yolov3.cfg into it, and then modify it.
For how to modify it, see https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects; a sketch of the typical edits for two classes follows.
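As a rough sketch only (following the linked README; verify against the exact yolov3.cfg you copied): set classes=2 in each of the three [yolo] layers, and set filters=(classes+5)*3 = 21 in the [convolutional] layer directly above each [yolo] layer. The README also recommends raising max_batches in the [net] section (roughly classes*2000, subject to its minimum floor) and setting steps to 80% and 90% of max_batches. The per-layer changes look like this:
[convolutional]
filters=21          # was 255; (classes + 5) * 3 = (2 + 5) * 3
activation=linear

[yolo]
classes=2           # was 80; repeat this pair of edits for all three [yolo] heads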
7. Training and detection commands:
Training:
darknet.exe detector train data/human.data cfg/human.cfg darknet53.conv.74
darknet53.conv.74 is the pre-trained weights file downloaded beforehand.
Detecting an image:
darknet.exe detector test data/human.data cfg/human.cfg backup/human_final.weights data/test.jpg
Detecting a video:
darknet.exe detector demo data/human.data cfg/human.cfg backup/human_final.weights test.mp4
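Two further commands can be handy. These assume the AlexeyAB fork (the same one the commands above target), which names checkpoints in backup/ after the cfg file, so the exact weight file names may differ on your machine.
Check mAP on the validation set (the valid entry in human.data):
darknet.exe detector map data/human.data cfg/human.cfg backup/human_final.weights
Resume interrupted training from the most recent checkpoint:
darknet.exe detector train data/human.data cfg/human.cfg backup/human_last.weights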
References:
YOLO3 darknet训练自己的数据 (YOLO3 darknet: training on your own data)
https://github.com/AlexeyAB/darknet