Installing Darknet (official site)
This article is mainly based on https://www.cnblogs.com/nanzhao/p/Sailon.html
git clone https://github.com/pjreddie/darknet.git
cd darknet
Edit the Makefile to match your build configuration:
GPU=1 # 0 or 1
CUDNN=1 # 0 or 1
OPENCV=0 # 0 or 1
OPENMP = 0
DEBUG = 0
After editing, run make to compile.
Download the pretrained weights
wget https://pjreddie.com/media/files/yolov3.weights
Run a test
./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg
Example detection output
layer filters size input output
0 conv 32 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 32 0.299 BFLOPs
1 conv 64 3 x 3 / 2 416 x 416 x 32 -> 208 x 208 x 64 1.595 BFLOPs
.......
105 conv 255 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 255 0.353 BFLOPs
106 detection
truth_thresh: Using default '1.000000'
Loading weights from yolov3.weights...Done!
data/dog.jpg: Predicted in 0.029329 seconds.
dog: 99%
truck: 93%
bicycle: 99%
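Darknet expects one .txt file per image, containing one line for each annotated box in the form:
<object-class> <x> <y> <width> <height>
where x, y, width, and height are relative to the image width and height.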
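To make the normalization concrete, here is a minimal sketch that turns one label line back into pixel coordinates. The function name label_to_pixels and the sample values are illustrative and not part of Darknet; the sketch assumes each line holds the class index followed by the box center and box size, all as fractions of the image dimensions.

```python
def label_to_pixels(line, img_w, img_h):
    """Convert one label line (class x_center y_center width height, all
    normalized to [0, 1]) into a class id and pixel-space corner coordinates."""
    fields = line.split()
    cls_id = int(fields[0])
    xc, yc, w, h = (float(v) for v in fields[1:5])
    xmin = (xc - w / 2.0) * img_w
    xmax = (xc + w / 2.0) * img_w
    ymin = (yc - h / 2.0) * img_h
    ymax = (yc + h / 2.0) * img_h
    return cls_id, xmin, ymin, xmax, ymax


if __name__ == "__main__":
    # A box centered in a 400x200 image that covers half of each dimension.
    print(label_to_pixels("0 0.5 0.5 0.5 0.5", 400, 200))
```

Drawing the recovered corners on a few images is a quick way to check that the generated labels line up with the objects.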
To generate these files, run the voc_label.py script in Darknet's scripts/ directory.
voc_label.py
Four places need to be changed (marked with comments):
import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join

sets=[('2007', 'train'), ('2007', 'val'), ('2007', 'test')]  # replace with your own dataset splits
classes = ["head", "eye", "nose"]  # change to your own classes

def convert(size, box):
    # box is (xmin, xmax, ymin, ymax) in pixels; returns the normalized (x, y, w, h)
    dw = 1./(size[0])
    dh = 1./(size[1])
    x = (box[0] + box[1])/2.0 - 1
    y = (box[2] + box[3])/2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)

def convert_annotation(year, image_id):
    in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id))  # place the dataset in the current directory
    out_file = open('VOCdevkit/VOC%s/labels/%s.txt'%(year, image_id), 'w')
    tree=ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        # skip classes that are not listed and objects marked as difficult
        if cls not in classes or int(difficult)==1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w,h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

wd = getcwd()

for year, image_set in sets:
    if not os.path.exists('VOCdevkit/VOC%s/labels/'%(year)):
        os.makedirs('VOCdevkit/VOC%s/labels/'%(year))
    image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'%(year, image_set)).read().strip().split()
    list_file = open('%s_%s.txt'%(year, image_set), 'w')
    for image_id in image_ids:
        list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n'%(wd, year, image_id))
        convert_annotation(year, image_id)
    list_file.close()

os.system("cat 2007_train.txt 2007_val.txt > train.txt")  # change to the list files of your own dataset used for training
Run:
#wget https://pjreddie.com/media/files/voc_label.py
python voc_label.py
This script generates all the required files. Most of them are label files, written in large numbers to VOCdevkit/VOC2007/labels/.
The directory then contains:
pc:~/darknet/scripts$ ls
2007_test.txt #0 dice_label.sh imagenet_label.sh VOCdevkit_original
2007_train.txt #1 gen_tactic.sh train.txt #3 voc_label.py
2007_val.txt #2 get_coco_dataset.sh VOCdevkit
The text file 2007_train.txt lists the image files for that year and image set.
Darknet needs a single text file that lists all the images you want to train on.
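The cat command at the end of voc_label.py builds that file. As a portable alternative, the following is a minimal sketch that merges several list files in Python. The helper name merge_image_lists is hypothetical, and dropping blank lines and duplicate paths is a choice made for this sketch rather than something the original script does.

```python
import os
import tempfile


def merge_image_lists(list_paths, out_path):
    """Concatenate several image-list files into one, skipping blank lines
    and duplicate entries. Returns the number of image paths written."""
    seen = set()
    count = 0
    with open(out_path, "w") as out:
        for list_path in list_paths:
            with open(list_path) as f:
                for line in f:
                    path = line.strip()
                    if path and path not in seen:
                        seen.add(path)
                        out.write(path + "\n")
                        count += 1
    return count


if __name__ == "__main__":
    # Demonstration on throwaway files; in practice pass 2007_train.txt and 2007_val.txt.
    tmp = tempfile.mkdtemp()
    first = os.path.join(tmp, "first.txt")
    second = os.path.join(tmp, "second.txt")
    with open(first, "w") as f:
        f.write("/data/a.jpg\n/data/b.jpg\n")
    with open(second, "w") as f:
        f.write("/data/b.jpg\n/data/c.jpg\n")
    print(merge_image_lists([first, second], os.path.join(tmp, "train.txt")))
```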
Edit the cfg/voc.data configuration file so that it points to your own data:
pc:~/darknet/cfg$ cat voc.data
classes= 3 # change to your number of classes
train = /home/learner/darknet/data/voc/train.txt # change to your own path, or /home/learner/darknet/scripts/train.txt
valid = /home/learner/darknet/data/voc/2007_test.txt # change to your own path, or /home/learner/darknet/scripts/2007_test.txt
names = /home/learner/darknet/data/voc.names # edit as described for voc.names
backup = /home/learner/darknet/backup # change to your own path; the output weights are stored here
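Before training it can help to read the file back and confirm that each entry is what you expect. The sketch below parses the key = value layout shown above. The function name parse_data_cfg is hypothetical, and stripping everything after # is a convenience of this sketch: Darknet's own parser may not ignore trailing comments on path lines, so it is safest to keep the real file free of them.

```python
def parse_data_cfg(text):
    """Parse the contents of a Darknet-style .data file into a dict.
    Everything after '#' on a line is treated as a comment in this sketch."""
    options = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


if __name__ == "__main__":
    sample = """
    classes= 3
    train = /home/learner/darknet/data/voc/train.txt
    valid = /home/learner/darknet/data/voc/2007_test.txt
    names = /home/learner/darknet/data/voc.names
    backup = /home/learner/darknet/backup
    """
    for key, value in parse_data_cfg(sample).items():
        print(key, "->", value)
```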
Edit voc.names with one class per line, for example:
head
eye
nose
Replace the directory that holds the VOC data with your own VOC dataset.
Use the darknet53 model weights pretrained on ImageNet:
wget https://pjreddie.com/media/files/darknet53.conv.74
Then edit cfg/yolov3-voc.cfg as follows:
[net]
# Testing
batch=64
subdivisions=32 # images per forward pass = batch/subdivisions; adjust to your GPU memory and increase subdivisions if memory runs out
# Training
# batch=64
# subdivisions=16
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.001
burn_in=1000
max_batches = 50200 # number of training iterations
policy=steps
steps=40000,45000 # iterations at which the learning rate is scaled down
scales=.1,.1
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
# Downsample
[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
# Downsample
[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
# Downsample
[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
# Downsample
[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
# Downsample
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
######################
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=24 # filters = 3 * (classes + 5); here 3 * (3 + 5) = 24
activation=linear
[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=3 # change to your number of classes
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1
[route]
layers = -4
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[upsample]
stride=2
[route]
layers = -1, 61
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=24 # filters = 3 * (classes + 5); here 3 * (3 + 5) = 24
activation=linear
[yolo]
mask = 3,4,5
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=3 # change to your number of classes
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1
[route]
layers = -4
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[upsample]
stride=2
[route]
layers = -1, 36
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=24 # filters = 3 * (classes + 5); here 3 * (3 + 5) = 24
activation=linear
[yolo]
mask = 0,1,2
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=3 # change to your number of classes
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1
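The two values in this file that depend on your setup can be checked with a few lines of arithmetic. The sketch below restates the filters = 3 * (classes + 5) rule from the comments in the config and the batch/subdivisions relationship; the function names are hypothetical and exist only for illustration.

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters for the convolutional layer directly before each [yolo] layer.
    Each anchor predicts 4 box coordinates, 1 objectness score and one score per class."""
    return anchors_per_scale * (num_classes + 5)


def images_per_forward_pass(batch, subdivisions):
    """Number of images sent through the network at once."""
    if batch % subdivisions != 0:
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


if __name__ == "__main__":
    print(yolo_filters(3))                  # the three-class example in this article
    print(images_per_forward_pass(64, 32))  # batch=64, subdivisions=32
```

With three classes this gives 24 filters, matching the config above, and with 80 classes it gives 255, which matches the last convolutional layer in the yolov3.cfg output shown earlier.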
Single-GPU training (a specific GPU can be selected with ./darknet -i <gpu index>):
./darknet detector train cfg/voc.data cfg/yolov3-voc.cfg darknet53.conv.74
Multi-GPU training; pass the GPU indices in the form -gpus 0,1,2,3:
./darknet detector train cfg/voc.data cfg/yolov3-voc.cfg darknet53.conv.74 -gpus 0,1,2,3
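While training runs, the learning rate follows the learning_rate, burn_in, steps and scales values set in the [net] section. The following is a rough sketch of that schedule as I understand Darknet's steps policy. The function name is hypothetical, and the burn-in exponent of 4 is an assumption about Darknet's default rather than something stated in this article.

```python
def learning_rate_at(iteration, base_lr=0.001, burn_in=1000,
                     steps=(40000, 45000), scales=(0.1, 0.1), power=4):
    """Approximate learning rate at a given training iteration."""
    # Warm-up: ramp the rate from 0 to base_lr over the first burn_in iterations.
    if iteration < burn_in:
        return base_lr * (iteration / burn_in) ** power
    # Steps policy: multiply by each scale once its step has been reached.
    rate = base_lr
    for step, scale in zip(steps, scales):
        if step > iteration:
            return rate
        rate *= scale
    return rate


if __name__ == "__main__":
    for it in (0, 500, 1000, 39999, 40000, 45000, 50200):
        print(it, learning_rate_at(it))
```

With the values from the config, the rate stays at 0.001 until iteration 40000, then drops by a factor of ten there and again at 45000.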
Test a single image (with your own model)
learner@learner-pc:~/darknet$ ./darknet detector test cfg/voc.data cfg/yolov3-voc.cfg backup/yolov3-voc_20000.weights Eminem.jpg
layer filters size input output
conv 32 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 32 0.299 BFLOPs
conv 64 3 x 3 / 2 416 x 416 x 32 -> 208 x 208 x 64 1.595 BFLOPs
conv 32 1 x 1 / 1 208 x 208 x 64 -> 208 x 208 x 32 0.177 BFLOPs
conv 64 3 x 3 / 1 208 x 208 x 32 -> 208 x 208 x 64 1.595 BFLOPs
res 1 208 x 208 x 64 -> 208 x 208 x 64
conv 128 3 x 3 / 2 208 x 208 x 64 -> 104 x 104 x 128 1.595 BFLOPs
conv 64 1 x 1 / 1 104 x 104 x 128 -> 104 x 104 x 64 0.177 BFLOPs
conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BFLOPs
res 5 104 x 104 x 128 -> 104 x 104 x 128
conv 64 1 x 1 / 1 104 x 104 x 128 -> 104 x 104 x 64 0.177 BFLOPs
conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BFLOPs
res 8 104 x 104 x 128 -> 104 x 104 x 128
conv 256 3 x 3 / 2 104 x 104 x 128 -> 52 x 52 x 256 1.595 BFLOPs
conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
res 12 52 x 52 x 256 -> 52 x 52 x 256
conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
res 15 52 x 52 x 256 -> 52 x 52 x 256
conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
res 18 52 x 52 x 256 -> 52 x 52 x 256
conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
res 21 52 x 52 x 256 -> 52 x 52 x 256
conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
res 24 52 x 52 x 256 -> 52 x 52 x 256
conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
res 27 52 x 52 x 256 -> 52 x 52 x 256
conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
res 30 52 x 52 x 256 -> 52 x 52 x 256
conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
res 33 52 x 52 x 256 -> 52 x 52 x 256
conv 512 3 x 3 / 2 52 x 52 x 256 -> 26 x 26 x 512 1.595 BFLOPs
conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
res 37 26 x 26 x 512 -> 26 x 26 x 512
conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
res 40 26 x 26 x 512 -> 26 x 26 x 512
conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
res 43 26 x 26 x 512 -> 26 x 26 x 512
conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
res 46 26 x 26 x 512 -> 26 x 26 x 512
conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
res 49 26 x 26 x 512 -> 26 x 26 x 512
conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
res 52 26 x 26 x 512 -> 26 x 26 x 512
conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
res 55 26 x 26 x 512 -> 26 x 26 x 512
conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
res 58 26 x 26 x 512 -> 26 x 26 x 512
conv 1024 3 x 3 / 2 26 x 26 x 512 -> 13 x 13 x1024 1.595 BFLOPs
conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BFLOPs
conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BFLOPs
res 62 13 x 13 x1024 -> 13 x 13 x1024
conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BFLOPs
conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BFLOPs
res 65 13 x 13 x1024 -> 13 x 13 x1024
conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BFLOPs
conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BFLOPs
res 68 13 x 13 x1024 -> 13 x 13 x1024
conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BFLOPs
conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BFLOPs
res 71 13 x 13 x1024 -> 13 x 13 x1024
conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BFLOPs
conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BFLOPs
conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BFLOPs
conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BFLOPs
conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BFLOPs
conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BFLOPs
conv 24 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 24 0.008 BFLOPs
yolo
route 79
conv 256 1 x 1 / 1 13 x 13 x 512 -> 13 x 13 x 256 0.044 BFLOPs
upsample 2x 13 x 13 x 256 -> 26 x 26 x 256
route 85 61
conv 256 1 x 1 / 1 26 x 26 x 768 -> 26 x 26 x 256 0.266 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BFLOPs
conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BFLOPs
conv 24 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 24 0.017 BFLOPs
yolo
route 91
conv 128 1 x 1 / 1 26 x 26 x 256 -> 26 x 26 x 128 0.044 BFLOPs
upsample 2x 26 x 26 x 128 -> 52 x 52 x 128
route 97 36
conv 128 1 x 1 / 1 52 x 52 x 384 -> 52 x 52 x 128 0.266 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
conv 24 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 24 0.033 BFLOPs
yolo
Loading weights from backup/yolov3-voc_20000.weights...Done!
Eminem.jpg: Predicted in 0.049594 seconds.
eye: 91%
head: 83%
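Batch-test images and save the results to a custom folder
In yolov3-voc.cfg (in the cfg folder), both batch and subdivisions must be set to 1.
Replace the test_detector function in detector.c (in the examples folder) with the code below. Note that three paths must be changed to your own.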
void test_detector(char *datacfg, char *cfgfile, char *weightfile, char *filename, float thresh, float hier_thresh, char *outfile, int fullscreen)
{
list *options = read_data_cfg(datacfg);
char *name_list = option_find_str(options, "names", "data/names.list");
char **names = get_labels(name_list);
image **alphabet = load_alphabet();
network *net = load_network(cfgfile, weightfile, 0);
set_batch_network(net, 1);
srand(2222222);
double time;
char buff[256];
char *input = buff;
float nms=.45;
int i=0;
while(1){
if(filename){
strncpy(input, filename, 256);
image im = load_image_color(input,0,0);
image sized = letterbox_image(im, net->w, net->h);
//image sized = resize_image(im, net->w, net->h);
//image sized2 = resize_max(im, net->w);
//image sized = crop_image(sized2, -((net->w - sized2.w)/2), -((net->h - sized2.h)/2), net->w, net->h);
//resize_network(net, sized.w, sized.h);
layer l = net->layers[net->n-1];
float *X = sized.data;
time=what_time_is_it_now();
network_predict(net, X);
printf("%s: Predicted in %f seconds.\n", input, what_time_is_it_now()-time);
int nboxes = 0;
detection *dets = get_network_boxes(net, im.w, im.h, thresh, hier_thresh, 0, 1, &nboxes);
//printf("%d\n", nboxes);
//if (nms) do_nms_obj(boxes, probs, l.w*l.h*l.n, l.classes, nms);
if (nms) do_nms_sort(dets, nboxes, l.classes, nms);
draw_detections(im, dets, nboxes, thresh, names, alphabet, l.classes);
free_detections(dets, nboxes);
if(outfile)
{
save_image(im, outfile);
}
else{
save_image(im, "predictions");
#ifdef OPENCV
cvNamedWindow("predictions", CV_WINDOW_NORMAL);
if(fullscreen){
cvSetWindowProperty("predictions", CV_WND_PROP_FULLSCREEN, CV_WINDOW_FULLSCREEN);
}
show_image(im, "predictions");
cvWaitKey(0);
cvDestroyAllWindows();
#endif
}
free_image(im);
free_image(sized);
if (filename) break;
}
else {
printf("Enter Image Path: ");
fflush(stdout);
input = fgets(input, 256, stdin);
if(!input) return;
strtok(input, "\n");
list *plist = get_paths(input);
char **paths = (char **)list_to_array(plist);
printf("Start Testing!\n");
int m = plist->size;
if(access("/home/learner/darknet/data/out",0)==-1)// change "/home/learner/darknet/data" to your own path
{
if (mkdir("/home/learner/darknet/data/out",0777))// change "/home/learner/darknet/data" to your own path
{
printf("Failed to create the output folder!\n");
}
}
for(i = 0; i < m; ++i){
char *path = paths[i];
image im = load_image_color(path,0,0);
image sized = letterbox_image(im, net->w, net->h);
//image sized = resize_image(im, net->w, net->h);
//image sized2 = resize_max(im, net->w);
//image sized = crop_image(sized2, -((net->w - sized2.w)/2), -((net->h - sized2.h)/2), net->w, net->h);
//resize_network(net, sized.w, sized.h);
layer l = net->layers[net->n-1];
float *X = sized.data;
time=what_time_is_it_now();
network_predict(net, X);
printf("Try Very Hard:");
printf("%s: Predicted in %f seconds.\n", path, what_time_is_it_now()-time);
int nboxes = 0;
detection *dets = get_network_boxes(net, im.w, im.h, thresh, hier_thresh, 0, 1, &nboxes);
//printf("%d\n", nboxes);
//if (nms) do_nms_obj(boxes, probs, l.w*l.h*l.n, l.classes, nms);
if (nms) do_nms_sort(dets, nboxes, l.classes, nms);
draw_detections(im, dets, nboxes, thresh, names, alphabet, l.classes);
free_detections(dets, nboxes);
if(outfile){
save_image(im, outfile);
}
else{
char b[2048];
sprintf(b,"/home/learner/darknet/data/out/%s",GetFilename(path));// change "/home/learner/darknet/data" to your own path
save_image(im, b);
printf("save %s successfully!\n",GetFilename(path));
#ifdef OPENCV
cvNamedWindow("predictions", CV_WINDOW_NORMAL);
if(fullscreen){
cvSetWindowProperty("predictions", CV_WND_PROP_FULLSCREEN, CV_WINDOW_FULLSCREEN);
}
show_image(im, "predictions");
cvWaitKey(0);
cvDestroyAllWindows();
#endif
}
free_image(im);
free_image(sized);
if (filename) break;
}
}
}
}
Also add the following headers to detector.c:
#include <unistd.h> /* Many POSIX functions (but not all, by a large margin) */
#include <fcntl.h> /* open(), creat() - and fcntl() */
Add the GetFilename(char *p) function near the top of the file:
#include "darknet.h"
#include <sys/stat.h> // header that needs to be added
#include <stdio.h>
#include <time.h>
#include <sys/types.h> // header that needs to be added
static int coco_ids[] = {1,2,3,4,5,6,7,8,9,10,11,13,14,15,16,17,18,19,20,21,22,23,24,25,27,28,31,32,33,34,35,36,37,38,39,40,41,42,43,44,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,67,70,72,73,74,75,76,77,78,79,80,81,82,84,85,86,87,88,89,90};
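char *GetFilename(char *p)
{
static char name[20]={""};
char *slash = strrchr(p,'/');
char *q = slash ? slash + 1 : p; /* fall back to the whole string when the path has no '/' */
strncpy(name,q,6); /* keeps only the first 6 characters, e.g. "000001" for VOC image names */
name[6] = '\0';
return name;
}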
Rebuild in the darknet directory:
make clean
make
Run the batch test with the following command:
./darknet detector test cfg/voc.data cfg/yolov3-voc.cfg backup/yolov3-voc_20000.weights
layer filters size input output
conv 32 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 32 0.299 BFLOPs
conv 64 3 x 3 / 2 416 x 416 x 32 -> 208 x 208 x 64 1.595 BFLOPs
.......
conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
conv 255 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 255 0.353 BFLOPs
detection
Loading weights from yolov3.weights...Done!
Enter Image Path:
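After the Enter Image Path: prompt, enter the path of your txt file (a single txt file that lists the paths of all your test images). You can simply copy the path that follows valid in voc.data, for example: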
/home/xxx/darknet/data/voc/2007_test.txt
All the output images are then saved in the data/out folder.
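If your test images are not yet listed in a txt file, one can be generated from a folder. This is a minimal sketch: the helper name write_image_list and the set of extensions are assumptions for illustration, not part of Darknet.

```python
import os
import tempfile


def write_image_list(image_dir, list_path, extensions=(".jpg", ".jpeg", ".png")):
    """Write the absolute path of every image in image_dir to list_path,
    one per line, sorted by file name. Returns the number of images listed."""
    image_dir = os.path.abspath(image_dir)
    names = sorted(n for n in os.listdir(image_dir)
                   if n.lower().endswith(extensions))
    with open(list_path, "w") as f:
        for name in names:
            f.write(os.path.join(image_dir, name) + "\n")
    return len(names)


if __name__ == "__main__":
    # Demonstration on an empty temporary folder; point image_dir at your own images.
    demo_dir = tempfile.mkdtemp()
    print(write_image_list(demo_dir, os.path.join(demo_dir, "test_list.txt")))
```

The resulting file can be given at the Enter Image Path: prompt in the same way as 2007_test.txt.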
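To generate detection result files, run the command below. The terminal only reports the elapsed time; the detections are saved in ./results/comp4_det_test_[class name].txt.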
./darknet detector valid cfg/voc.data cfg/yolov3-voc.cfg backup/yolov3-voc_20000.weights
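The result files can then be post-processed in Python. The sketch below assumes that each line has the layout image_id confidence xmin ymin xmax ymax, which is my understanding of the VOC-style output and should be checked against your own files. The function name parse_detection_results is hypothetical.

```python
from collections import defaultdict


def parse_detection_results(lines, min_confidence=0.0):
    """Group detections by image id.
    Each line is assumed to be: image_id confidence xmin ymin xmax ymax.
    Lines that do not have exactly six fields are skipped."""
    by_image = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue
        image_id = parts[0]
        conf, xmin, ymin, xmax, ymax = map(float, parts[1:])
        if conf >= min_confidence:
            by_image[image_id].append((conf, xmin, ymin, xmax, ymax))
    return dict(by_image)


if __name__ == "__main__":
    sample = [
        "000001 0.91 48.0 30.5 210.0 340.2",
        "000001 0.02 10.0 10.0 20.0 20.0",
        "000002 0.55 5.0 6.0 70.0 80.0",
    ]
    for image_id, dets in parse_detection_results(sample, min_confidence=0.3).items():
        print(image_id, dets)
```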
To run detection on a video, use the following command:
./darknet detector demo cfg/voc.data cfg/yolov3-voc.cfg backup/yolov3-voc_40000.weights Cow_video.mp4
For video detection and saving the output, see: https://blog.csdn.net/cgt19910923/article/details/80525366
Video testing goes through detector demo, so the main change is to the demo function in demo.c.
./darknet detector demo ./cfg/voc.data ./cfg/yolov3-voc.cfg ./results/yolov3-voc_final.weights 1.mp4 -gpus 0,1
Define the video-saving function in image.c:
void save_video(image p, CvVideoWriter *mVideoWriter)
{
image copy = copy_image(p);
if(p.c == 3) rgbgr_image(copy);
int x,y,k;
IplImage *disp = cvCreateImage(cvSize(p.w,p.h), IPL_DEPTH_8U, p.c);
int step = disp->widthStep;
for(y = 0; y < p.h; ++y){
for(x = 0; x < p.w; ++x){
for(k= 0; k < p.c; ++k){
disp->imageData[y*step + x*p.c + k] = (unsigned char)(get_pixel(copy,x,y,k)*255);
}
}
}
cvWriteFrame(mVideoWriter,disp);
cvReleaseImage(&disp);
free_image(copy);
}
Modify demo.c:
#define DEMO 1
#define SAVEVIDEO
#ifdef OPENCV
#ifdef SAVEVIDEO
static CvVideoWriter *mVideoWriter;
#endif
void demo(char *cfgfile, char *weightfile, float thresh, int cam_index, const char *filename, char **names, int classes, int delay, char *prefix, int avg_frames, float hier, int w, int h, int frames, int fullscreen)
{
//demo_frame = avg_frames;
image **alphabet = load_alphabet();
demo_names = names;
demo_alphabet = alphabet;
demo_classes = classes;
demo_thresh = thresh;
demo_hier = hier;
printf("Demo\n");
net = load_network(cfgfile, weightfile, 0);
set_batch_network(net, 1);
pthread_t detect_thread;
pthread_t fetch_thread;
srand(2222222);
int i;
demo_total = size_network(net);
predictions = calloc(demo_frame, sizeof(float*));
for (i = 0; i < demo_frame; ++i){
predictions[i] = calloc(demo_total, sizeof(float));
}
avg = calloc(demo_total, sizeof(float));
if(filename){
printf("video file: %s\n", filename);
cap = cvCaptureFromFile(filename);
#ifdef SAVEVIDEO
if(cap){
//int mfps = cvGetCaptureProperty(cap,CV_CAP_PROP_FPS); //local video file,needn't change
int mfps = 200;
mVideoWriter=cvCreateVideoWriter("Output.avi",CV_FOURCC('M','J','P','G'),mfps,cvSize(cvGetCaptureProperty(cap,CV_CAP_PROP_FRAME_WIDTH),cvGetCaptureProperty(cap,CV_CAP_PROP_FRAME_HEIGHT)),1);
}
#endif
}else{
cap = cvCaptureFromCAM(cam_index);
#ifdef SAVEVIDEO
if(cap){
//int mfps = cvGetCaptureProperty(cap,CV_CAP_PROP_FPS); //webcam video file,need change.
int mfps = 25; //the output video FPS,you can set here.
mVideoWriter=cvCreateVideoWriter("Output_webcam.avi",CV_FOURCC('M','J','P','G'),mfps,cvSize(cvGetCaptureProperty(cap,CV_CAP_PROP_FRAME_WIDTH),cvGetCaptureProperty(cap,CV_CAP_PROP_FRAME_HEIGHT)),1);
}
#endif
if(w){
cvSetCaptureProperty(cap, CV_CAP_PROP_FRAME_WIDTH, w);
}
if(h){
cvSetCaptureProperty(cap, CV_CAP_PROP_FRAME_HEIGHT, h);
}
if(frames){
cvSetCaptureProperty(cap, CV_CAP_PROP_FPS, frames);
}
}
if(!cap) error("Couldn't connect to webcam.\n");
buff[0] = get_image_from_stream(cap);
buff[1] = copy_image(buff[0]);
buff[2] = copy_image(buff[0]);
buff_letter[0] = letterbox_image(buff[0], net->w, net->h);
buff_letter[1] = letterbox_image(buff[0], net->w, net->h);
buff_letter[2] = letterbox_image(buff[0], net->w, net->h);
ipl = cvCreateImage(cvSize(buff[0].w,buff[0].h), IPL_DEPTH_8U, buff[0].c);
int count = 0;
if(!prefix){
cvNamedWindow("Demo", CV_WINDOW_NORMAL);
if(fullscreen){
cvSetWindowProperty("Demo", CV_WND_PROP_FULLSCREEN, CV_WINDOW_FULLSCREEN);
} else {
cvMoveWindow("Demo", 0, 0);
cvResizeWindow("Demo", 1352, 1013);
}
}
demo_time = what_time_is_it_now();
while(!demo_done){
buff_index = (buff_index + 1) %3;
if(pthread_create(&fetch_thread, 0, fetch_in_thread, 0)) error("Thread creation failed");
if(pthread_create(&detect_thread, 0, detect_in_thread, 0)) error("Thread creation failed");
if(!prefix){
#ifdef SAVEVIDEO
save_video(buff[0],mVideoWriter);
#endif
fps = 1./(what_time_is_it_now() - demo_time);
demo_time = what_time_is_it_now();
display_in_thread(0);
}else{
char name[256];
sprintf(name, "%s_%08d", prefix, count);
#ifdef SAVEVIDEO
save_video(buff[0],mVideoWriter);
#else
save_image(buff[(buff_index + 1)%3], name);
#endif
}
pthread_join(fetch_thread, 0);
pthread_join(detect_thread, 0);
++count;
}
}
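This is used to test a recorded video. mfps is the frame rate at which the processed output is saved, and the result is written to Output.avi.
Replace the validate_detector_recall function in examples/detector.c with the following, and pass the additional datacfg argument when calling it.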
void validate_detector_recall(char *datacfg, char *cfgfile, char *weightfile)
{
/*
network net = parse_network_cfg_custom(cfgfile, 1); // set batch=1
if (weightfile) {
load_weights(&net, weightfile);
}
//set_batch_network(&net, 1);
fuse_conv_batchnorm(net);
srand(time(0));
*/
network *net = load_network(cfgfile, weightfile, 0);
set_batch_network(net, 1);
fprintf(stderr, "Learning Rate: %g, Momentum: %g, Decay: %g\n", net->learning_rate, net->momentum, net->decay);
srand(time(0));
//list *plist = get_paths("data/coco_val_5k.list");
list *options = read_data_cfg(datacfg);
char *valid_images = option_find_str(options, "valid", "data/train.txt");
list *plist = get_paths(valid_images);
char **paths = (char **)list_to_array(plist);
//layer l = net.layers[net.n - 1];
layer l = net->layers[net->n-1];
int j, k;
int m = plist->size;
int i = 0;
float thresh = .001;
float iou_thresh = .5;
float nms = .4;
int total = 0;
int correct = 0;
int proposals = 0;
float avg_iou = 0;
for (i = 0; i < m; ++i) {
char *path = paths[i];
image orig = load_image_color(path, 0, 0);
// image orig = load_image(path, 0, 0, net.c);
image sized = resize_image(orig, net->w, net->h);
char *id = basecfg(path);
network_predict(net, sized.data);
int nboxes = 0;
detection *dets = get_network_boxes(net, sized.w, sized.h, thresh, .5, 0, 1, &nboxes);
if (nms) do_nms_obj(dets, nboxes, 1, nms);
char labelpath[4096];
// replace_image_to_label(path, labelpath);
find_replace(path, "images", "labels", labelpath);
find_replace(labelpath, "JPEGImages", "labels", labelpath);
find_replace(labelpath, ".jpg", ".txt", labelpath);
find_replace(labelpath, ".JPEG", ".txt", labelpath);
int num_labels = 0;
box_label *truth = read_boxes(labelpath, &num_labels);
for (k = 0; k < nboxes; ++k) {
if (dets[k].objectness > thresh) {
++proposals;
}
}
for (j = 0; j < num_labels; ++j) {
++total;
box t = { truth[j].x, truth[j].y, truth[j].w, truth[j].h };
float best_iou = 0;
for (k = 0; k < nboxes; ++k) { // the key part is here
float iou = box_iou(dets[k].bbox, t);
if (dets[k].objectness > thresh && iou > best_iou) {
best_iou = iou;
}
}
avg_iou += best_iou;
if (best_iou > iou_thresh) {
++correct;
}
}
//fprintf(stderr, " %s - %s - ", paths[i], labelpath);
fprintf(stderr, "%5d %5d %5d\tRPs/Img: %.2f\tIOU: %.2f%%\tRecall:%.2f%%\n", i, correct, total, (float)proposals / (i + 1), avg_iou * 100 / total, 100.*correct / total);
free(id);
free_image(orig);
free_image(sized);
}
}
Change the call as follows:
validate_detector_recall(datacfg, cfg, weights);
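The counting logic in the loop above is easier to follow in a few lines of Python. This is a minimal sketch rather than a port of the C code: boxes are given as (x_center, y_center, width, height), the names box_iou and recall_stats are used here only for illustration, and the default thresholds mirror the thresh and iou_thresh values in the function.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x_center, y_center, width, height)."""
    def overlap(c1, w1, c2, w2):
        left = max(c1 - w1 / 2.0, c2 - w2 / 2.0)
        right = min(c1 + w1 / 2.0, c2 + w2 / 2.0)
        return right - left

    w = overlap(a[0], a[2], b[0], b[2])
    h = overlap(a[1], a[3], b[1], b[3])
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union


def recall_stats(truths, detections, thresh=0.001, iou_thresh=0.5):
    """truths: list of boxes; detections: list of (objectness, box).
    Returns (correct, total, avg_iou) over the ground-truth boxes."""
    total = correct = 0
    iou_sum = 0.0
    for t in truths:
        total += 1
        best = 0.0
        # Best IoU among all detections whose objectness exceeds the threshold.
        for objectness, box in detections:
            if objectness > thresh:
                best = max(best, box_iou(box, t))
        iou_sum += best
        if best > iou_thresh:
            correct += 1
    avg_iou = iou_sum / total if total else 0.0
    return correct, total, avg_iou


if __name__ == "__main__":
    truths = [(0.5, 0.5, 0.2, 0.2), (0.1, 0.1, 0.1, 0.1)]
    detections = [(0.9, (0.5, 0.5, 0.2, 0.2))]
    correct, total, avg_iou = recall_stats(truths, detections)
    print("recall: %.2f%%, average IoU: %.2f%%" % (100.0 * correct / total, 100.0 * avg_iou))
```

Recall is then correct / total, which is the last value printed on each line by the C function.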
============================================
Download the third-party library, then recompile and run:
git clone https://github.com/LianjiLi/yolo-compute-map.git
Modify validate_detector() in darknet/examples/detector.c:
char *valid_images = option_find_str(options, "valid", "./data/2007_test.txt");// change to the path of your own test list file
if(!outfile) outfile = "comp4_det_test_";
fps = calloc(classes, sizeof(FILE *));
for(j = 0; j < classes; ++j){
snprintf(buff, 1024, "%s/%s.txt", prefix, names[j]);// remove the outfile argument and its corresponding %s
fps[j] = fopen(buff, "w");
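Run from the darknet folder (replace the weights path with the path to your own model):
./darknet detector valid cfg/voc.data cfg/yolov3-tiny.cfg backup/yolov3tiny_164000.weights
Then run python compute_mAP.py in this folder.
The test.txt file used by compute_mAP.py contains only the image file names, without absolute paths and without extensions.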
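A list in that form can be derived from an existing list of full image paths such as 2007_test.txt. The following is a minimal sketch; the function name image_ids_from_list is hypothetical, and writing the result to test.txt is left to the caller.

```python
import os


def image_ids_from_list(paths):
    """Strip the directory and the extension from each image path,
    skipping blank lines."""
    ids = []
    for path in paths:
        path = path.strip()
        if path:
            ids.append(os.path.splitext(os.path.basename(path))[0])
    return ids


if __name__ == "__main__":
    sample = [
        "/home/xxx/darknet/scripts/VOCdevkit/VOC2007/JPEGImages/000001.jpg\n",
        "/home/xxx/darknet/scripts/VOCdevkit/VOC2007/JPEGImages/000002.jpg\n",
    ]
    print("\n".join(image_ids_from_list(sample)))
```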
YOLOv3: batch-testing images and saving them to a custom folder
https://blog.csdn.net/mieleizhi0522/article/details/79989754
YOLO official site
https://pjreddie.com/darknet/yolo/
DarkNet-YOLOv3: training on your own dataset (Ubuntu 16.04 + CUDA 8.0)
https://zhuanlan.zhihu.com/p/35490655
DarkNet-YOLO usage guide
https://clavichord93.wordpress.com/2017/05/11/darknetyolo-shi-yong-zhi-nan/
Source code
https://github.com/pjreddie/darknet
YOLOv3: training on your own data (GPU version)
https://blog.csdn.net/u012135425/article/details/80294884
A fairly comprehensive guide
https://blog.csdn.net/runner668/article/details/80579063
Video detection
https://blog.csdn.net/cgt19910923/article/details/80525366
https://blog.csdn.net/sinat_33718563/article/details/79964758
Improving YOLOv3 training and detection
https://github.com/AlexeyAB/darknet
Reference for the recall implementation
https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
General references:
https://clavichord93.wordpress.com/2017/05/11/darknetyolo-shi-yong-zhi-nan/
https://blog.csdn.net/eloise_29/article/details/70215338
Further references
https://pjreddie.com/darknet/yolo/#demo
https://blog.csdn.net/helloworld1213800/article/details/79749359
https://blog.csdn.net/cgt19910923/article/details/79725875
https://blog.csdn.net/jahonn/article/details/80824014
https://blog.csdn.net/lilai619/article/details/79695109
https://maozezhong.github.io/2018/04/29/yolo系列/yolo系列(1):使用yolov3检测红绿灯/
https://blog.csdn.net/Patrick_Lxc/article/details/80615433
https://blog.csdn.net/zhangjunbob/article/details/52769381
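For visualizing Darknet's shallow-layer features, see: https://www.cnblogs.com/pprp/p/10146355.html
For the optimization tips summarized by AlexeyAB, see: https://www.cnblogs.com/pprp/p/10204480.html
For using Darknet for classification, see: https://www.cnblogs.com/pprp/p/10342335.html
For a Darknet loss visualization tool, see: https://www.cnblogs.com/pprp/p/10248436.html
How to design and modify the YOLO network structure: https://pprp.github.io/2018/09/20/tricks.html
A detailed summary of YOLO improvements: https://pprp.github.io/2018/06/20/yolo.html
A walkthrough of key SSD source code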
https://zhuanlan.zhihu.com/p/25100992