Full source code: https://download.csdn.net/download/bibinGee/12278566
The YOLO principles and datasets can be found at this link: https://pjreddie.com/darknet/yolo/
This post shows how to perform object detection with YOLO and OpenCV. The resources needed are coco.names, yolov3.cfg and yolov3.weights, all of which can be found on the Darknet site or on GitHub: https://github.com/pjreddie/darknet
coco.names
yolov3.cfg
yolov3.weights
The required Python libraries are:
import numpy as np
import argparse
import imutils
import time
import cv2
import os
The project file structure is as follows:
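A plausible layout (the directory and script names here are assumptions, chosen to be consistent with the args["yolo"] path used in the code below) would be:

.
├── yolo-coco/
│   ├── coco.names
│   ├── yolov3.cfg
│   └── yolov3.weights
├── images/
│   └── example.jpg
└── yolo.py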
First, load the yolov3.weights and yolov3.cfg files:
# derive the paths to the YOLO weights and model configuration
weightsPath = os.path.sep.join([args["yolo"], "yolov3.weights"])
configPath = os.path.sep.join([args["yolo"], "yolov3.cfg"])
# load our YOLO object detector trained on COCO dataset (80 classes)
print("[INFO] loading YOLO from disk...")
net = cv2.dnn.readNetFromDarknet(configPath, weightsPath)
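The snippet above assumes an args dictionary from argparse, and later snippets also use LABELS and COLORS for drawing, plus the image dimensions W and H. A minimal sketch of that setup (the flag names and the random-color scheme are assumptions, not necessarily the original script's):

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="path to input image")
ap.add_argument("-y", "--yolo", required=True, help="base path to YOLO directory")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
ap.add_argument("-t", "--threshold", type=float, default=0.3,
    help="threshold for non-maximum suppression")
args = vars(ap.parse_args())

# load the COCO class labels (one per line in coco.names)
labelsPath = os.path.sep.join([args["yolo"], "coco.names"])
LABELS = open(labelsPath).read().strip().split("\n")

# pick a random color per class for drawing boxes
np.random.seed(42)
COLORS = np.random.randint(0, 255, size=(len(LABELS), 3), dtype="uint8")

# load the input image and grab its spatial dimensions
image = cv2.imread(args["image"])
(H, W) = image.shape[:2]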
This uses OpenCV's dnn.readNetFromDarknet(), part of OpenCV's deep neural network (dnn) module. Interestingly, it appears to have been written specifically for the Darknet framework, as the function's docstring below shows.
""
readNetFromDarknet(cfgFile[, darknetModel]) -> retval
. @brief Reads a network model stored in Darknet model files.
. * @param cfgFile path to the .cfg file with text description of the network architecture.
. * @param darknetModel path to the .weights file with learned network.
. * @returns Network object that ready to do forward, throw an exception in failure cases.
. * @returns Net object.
""
Next, preprocess the input image:
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
This uses the blobFromImage() function, which performs three main operations:
1. mean subtraction
2. scaling
3. channel swapping
Now the input image can be run through the detector:
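The loop below consumes layerOutputs, which is produced by a forward pass through YOLO's unconnected output layers; a minimal sketch of that step (note that the index handling of getUnconnectedOutLayers varies slightly across OpenCV versions), including the lists the loop fills:

# determine the names of the YOLO output layers
# (OpenCV layer indices are 1-based, hence the i - 1)
ln = net.getLayerNames()
ln = [ln[i - 1] for i in net.getUnconnectedOutLayers().flatten()]

# run a forward pass and time it
start = time.time()
layerOutputs = net.forward(ln)
print("[INFO] YOLO took {:.6f} seconds".format(time.time() - start))

# lists filled in by the detection loop below
boxes = []
confidences = []
classIDs = []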
# loop over each of the layer outputs
for output in layerOutputs:
    # loop over each of the detections
    for detection in output:
        # extract the class ID and confidence (i.e., probability) of
        # the current object detection
        scores = detection[5:]
        classID = np.argmax(scores)
        confidence = scores[classID]
        # filter out weak predictions by ensuring the detected
        # probability is greater than the minimum probability
        if confidence > args["confidence"]:
            # scale the bounding box coordinates back relative to the
            # size of the image, keeping in mind that YOLO actually
            # returns the center (x, y)-coordinates of the bounding
            # box followed by the boxes' width and height
            box = detection[0:4] * np.array([W, H, W, H])
            (centerX, centerY, width, height) = box.astype("int")
            # use the center (x, y)-coordinates to derive the top
            # and left corner of the bounding box
            x = int(centerX - (width / 2))
            y = int(centerY - (height / 2))
            # update our list of bounding box coordinates, confidences,
            # and class IDs
            boxes.append([x, y, int(width), int(height)])
            confidences.append(float(confidence))
            classIDs.append(classID)
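The idxs used in the next step comes from non-maximum suppression, which discards weak, overlapping boxes that describe the same object, keeping only the most confident one. OpenCV provides this as cv2.dnn.NMSBoxes (the args["threshold"] flag is an assumption carried over from the argparse sketch above):

# apply non-maximum suppression to suppress weak, overlapping boxes
idxs = cv2.dnn.NMSBoxes(boxes, confidences, args["confidence"], args["threshold"])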
Once that is done, the detected objects can be annotated on the image:
# ensure at least one detection exists
if len(idxs) > 0:
    # loop over the indexes we are keeping
    for i in idxs.flatten():
        # extract the bounding box coordinates
        (x, y) = (boxes[i][0], boxes[i][1])
        (w, h) = (boxes[i][2], boxes[i][3])
        # draw a bounding box rectangle and label on the image
        color = [int(c) for c in COLORS[classIDs[i]]]
        cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
        text = "{}: {:.4f}".format(LABELS[classIDs[i]], confidences[i])
        cv2.putText(image, text, (x, y - 5), cv2.FONT_ITALIC, 0.5, [0, 0, 0], 2)
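Finally, the annotated image can be displayed (or saved with cv2.imwrite); a minimal closing sketch:

# show the output image until a key is pressed
cv2.imshow("Image", image)
cv2.waitKey(0)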
Below are the objects detected on YOLO's example images using the pretrained COCO weights; the accuracy is quite high.
Things do not always go smoothly, though. When the input contains objects that were not in the training data, the detector misclassifies them. The two gentlemen below are a funny example, and crucially the model reported confidences of 99% and 88% for them. It goes to show how much a dedicated, task-specific training set really matters.