This project implements object detection and counting using the MindSpore framework, a YOLOv3-DarkNet53 model, and the VisDrone dataset.
1. Project repository
GitHub: whitewings-hub/mindspore-yolov3-vehicle_counting (training a MindSpore YOLOv3 model and counting vehicles)
2. Environment
MindSpore version 1.5 is used.
3. Dataset preparation
Download the VisDrone dataset from http://aiskyeye.com/download/object-detection-2/
The raw VisDrone annotations must be converted to COCO format and stored in a local directory.
The conversion is done with VisDrone2coco.py from the repository; run python VisDrone2coco.py.
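The full conversion logic lives in VisDrone2coco.py; as a rough sketch of the idea (the function name and structure here are illustrative, not the script's actual code), each VisDrone annotation line maps to one COCO annotation dict:

```python
import json

# Hypothetical minimal converter: the real logic is in VisDrone2coco.py.
# A VisDrone annotation line is: x,y,w,h,score,category,truncation,occlusion.
# COCO uses the same top-left [x, y, w, h] bbox convention, so the box carries over.
def visdrone_line_to_coco_ann(line, image_id, ann_id):
    x, y, w, h, score, cat = [int(v) for v in line.split(",")[:6]]
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": cat,
        "bbox": [x, y, w, h],
        "area": w * h,
        "iscrowd": 0,
    }

ann = visdrone_line_to_coco_ann("684,8,273,116,0,4,0,0", image_id=1, ann_id=1)
print(json.dumps(ann))
```

The real script additionally collects the per-image entries and the category table into the single COCO JSON file that the training code reads.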
4. Data augmentation with albumentations
The augmentation code is in transforms.py in the repository.
Three transforms from the albumentations library are used: RandomBrightnessContrast randomly adjusts brightness and contrast, HueSaturationValue randomly shifts the hue and saturation of the input image, and Cutout masks out square regions of the image. The Compose method from albumentations chains the three transforms so they are applied in sequence to each image as it is read.
transform = A.Compose([
    A.RandomBrightnessContrast(p=0.5),
    A.HueSaturationValue(),
    A.Cutout(num_holes=10, max_h_size=20, max_w_size=20, fill_value=0, p=0.5)
])
5. DIoU-NMS
Plain NMS is replaced with DIoU-NMS. As in standard NMS, candidate boxes are first sorted by confidence score in descending order and the highest-scoring box is selected. The additional quantities are the squared diagonal length of the smallest box enclosing the two boxes and the squared distance between their center points. DIoU is then computed from these according to the DIoU formula; boxes whose DIoU with the selected box exceeds the threshold are suppressed, the highest-scoring box is kept, and the procedure recurses on the remaining boxes.
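The two added quantities correspond to the standard DIoU definition, where ρ²(b_i, b_j) is the squared distance between the two box centers (inter_diag in the code below) and c² is the squared diagonal of the smallest enclosing box (outer_diag):

```latex
\mathrm{DIoU}(b_i, b_j) = \mathrm{IoU}(b_i, b_j) - \frac{\rho^2(b_i, b_j)}{c^2}
```

The distance penalty pushes the score of a far-away box down, so spatially separated boxes are less likely to be suppressed than under plain IoU-based NMS.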
def _diou_nms(self, dets, thresh=0.6):
    """DIoU-NMS over dets of shape (N, 5): [x, y, w, h, score]."""
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = x1 + dets[:, 2]
    y2 = y1 + dets[:, 3]
    scores = dets[:, 4]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]  # box indices sorted by score, descending
    keep = []
    while order.size > 0:
        i = order[0]  # highest-scoring remaining box
        keep.append(i)
        # intersection of box i with every other remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)  # IoU
        # squared distance between box centers
        center_x1 = (x1[i] + x2[i]) / 2
        center_x2 = (x1[order[1:]] + x2[order[1:]]) / 2
        center_y1 = (y1[i] + y2[i]) / 2
        center_y2 = (y1[order[1:]] + y2[order[1:]]) / 2
        inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
        # squared diagonal of the smallest box enclosing both boxes
        out_max_x = np.maximum(x2[i], x2[order[1:]])
        out_max_y = np.maximum(y2[i], y2[order[1:]])
        out_min_x = np.minimum(x1[i], x1[order[1:]])
        out_min_y = np.minimum(y1[i], y1[order[1:]])
        outer_diag = (out_max_x - out_min_x) ** 2 + (out_max_y - out_min_y) ** 2
        diou = ovr - inter_diag / outer_diag
        diou = np.clip(diou, -1, 1)
        # keep only boxes whose DIoU with box i is at or below the threshold
        inds = np.where(diou <= thresh)[0]
        order = order[inds + 1]
    return keep
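As a quick sanity check, the same logic can be exercised outside the class; the standalone function below is a condensed copy of the _diou_nms method above, and the example boxes are made up:

```python
import numpy as np

def diou_nms(dets, thresh=0.6):
    """Condensed standalone version of _diou_nms above.

    dets: (N, 5) array of [x, y, w, h, score]; returns kept box indices.
    """
    x1, y1 = dets[:, 0], dets[:, 1]
    x2, y2 = x1 + dets[:, 2], y1 + dets[:, 3]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = dets[:, 4].argsort()[::-1]
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        # IoU of box i against every remaining box
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        # squared center distance and squared enclosing-box diagonal
        inter_diag = ((x1[rest] + x2[rest] - x1[i] - x2[i]) / 2) ** 2 \
                   + ((y1[rest] + y2[rest] - y1[i] - y2[i]) / 2) ** 2
        outer_diag = (np.maximum(x2[i], x2[rest]) - np.minimum(x1[i], x1[rest])) ** 2 \
                   + (np.maximum(y2[i], y2[rest]) - np.minimum(y1[i], y1[rest])) ** 2
        diou = np.clip(iou - inter_diag / outer_diag, -1, 1)
        order = rest[diou <= thresh]
    return keep

# two heavily overlapping boxes plus one far away
dets = np.array([[0.0, 0.0, 100.0, 100.0, 0.9],
                 [5.0, 5.0, 100.0, 100.0, 0.8],
                 [300.0, 300.0, 50.0, 50.0, 0.7]])
print(diou_nms(dets))  # the near-duplicate of the top box is suppressed
```

With the default threshold of 0.6 the second box (DIoU ≈ 0.82 with the first) is suppressed, while the distant third box (negative DIoU) survives.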
6. Training, evaluation, and inference
Training, evaluation, and inference correspond to train.py, eval.py, and predict.py respectively; see the repository's README.md for details.
7. Vehicle detection and counting
Counting builds on the detection step, which draws each detected box on the image along with its class label. A list with one entry per class, initialized to zero, records the counts: whenever a box is drawn, the entry for its class is incremented. After all boxes are drawn, the list holds the per-class counts; a string listing each class and its count is then built and rendered onto the image with cv2.putText.
def draw_boxes_in_image(self, img_path):
    """Draw detection boxes on the image and overlay per-class counts."""
    num_record = [0 for i in range(12)]  # one counter per class in label_list
    img = cv2.imread(img_path, 1)
    for i in range(len(self.det_boxes)):
        x = int(self.det_boxes[i]['bbox'][0])
        y = int(self.det_boxes[i]['bbox'][1])
        w = int(self.det_boxes[i]['bbox'][2])
        h = int(self.det_boxes[i]['bbox'][3])
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 225, 0), 1)
        score = round(self.det_boxes[i]['score'], 3)
        classname = self.det_boxes[i]['category_id']
        text = classname + ', ' + str(score)
        cv2.putText(img, text, (x, y), cv2.FONT_HERSHEY_PLAIN, 2, (0, 0, 225), 2)
        # increment the counter for this box's class
        num_record[label_list.index(classname)] += 1
    # build a "class:count" summary string for the non-empty classes
    result_str = ""
    for ii in range(12):
        current_name = label_list[ii]
        current_num = num_record[ii]
        if current_num != 0:
            result_str = result_str + "{}:{} ".format(current_name, current_num)
    font = cv2.FONT_HERSHEY_SIMPLEX
    img = cv2.putText(img, result_str, (20, 20), font, 0.5, (255, 0, 0), 2)
    return img
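The fixed-size list plus label_list.index lookup can also be written with collections.Counter, which tallies categories directly; a minimal sketch with made-up detections in the same dict format as self.det_boxes:

```python
from collections import Counter

# hypothetical detection results in the same format used above
det_boxes = [
    {"category_id": "car", "bbox": [10, 10, 50, 30], "score": 0.91},
    {"category_id": "car", "bbox": [80, 40, 60, 35], "score": 0.88},
    {"category_id": "bus", "bbox": [150, 20, 90, 60], "score": 0.76},
]

# tally boxes per class, then build the same "class:count" summary string
counts = Counter(det["category_id"] for det in det_boxes)
result_str = " ".join("{}:{}".format(name, num) for name, num in counts.items())
print(result_str)  # → car:2 bus:1
```

Counter avoids the hard-coded class count of 12 and the repeated list lookups, while producing the same summary string that is drawn onto the image.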
See the repository's README.md for detailed instructions on running the code.
Example results are shown below:
Before detection
After detection (boxes and counts overlaid)
8. References
MindSpore models repository (Gitee)
GitHub: leonwanghui/ms-yolov3-basketball (a tutorial on training a MindSpore YOLOv3-DarkNet53 model to detect basketball games)