Object tracking is the process of taking an initial set of object detections, creating a unique ID for each of those initial detections, and then tracking each object as it moves around the frames of a video while maintaining its ID assignment.
In addition, object tracking allows us to apply a unique ID to each tracked object, making it possible to count unique objects in a video. Object tracking is critical to building a person counter.
An ideal object tracking algorithm will only require the object detection phase to be run once (rather than on every frame), will be significantly faster than the object detector itself, and will handle objects that are lost, disappear, or leave the field of view, picking them back up when they reappear.
This is a tall order for any computer vision or image processing algorithm, and there are a variety of tricks we can use to help improve our object tracker.
In today's blog post, you will learn how to implement centroid tracking with OpenCV, an easy-to-understand yet highly effective tracking algorithm.
Centroid tracking relies on the Euclidean distance between (1) existing object centroids (i.e., objects the centroid tracker has already seen before) and (2) new object centroids from subsequent frames of the video.
We'll review the centroid tracking algorithm in more depth in the next section. From there we'll implement a Python class to contain our centroid tracking algorithm and then create a Python script to actually run the object tracker and apply it to input video.
Finally, we'll run our object tracker and examine the results, noting both the strengths and weaknesses of the algorithm.
The centroid tracking algorithm is a multi-step process. We will review each of the tracking steps in this section.
To build a simple object tracking algorithm using centroid tracking, the first step is to accept bounding box coordinates from an object detector and use them to compute centroids.
The centroid tracking algorithm assumes that we are passing in a set of bounding box (x, y)-coordinates for each detected object in every single frame.
These bounding boxes can be produced by any type of object detector you would like (color thresholding + contour extraction, Haar cascades, HOG + Linear SVM, SSDs, Faster R-CNN, etc.).
Once we have the bounding box coordinates, we must compute the "centroid", or more simply, the center (x, y)-coordinates of the bounding box. The figure above demonstrates accepting a set of bounding box coordinates and computing the centroid.
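To make that concrete, the centroid is just the midpoint of the box along each axis. Here is a minimal sketch (the coordinate values are made up purely for illustration):

# a bounding box in (startX, startY, endX, endY) format
(startX, startY, endX, endY) = (50, 80, 150, 220)

# the centroid is the midpoint of the box along each axis
cX = int((startX + endX) / 2.0)
cY = int((startY + endY) / 2.0)
print((cX, cY))  # (100, 150)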
Since this is the first initial set of bounding boxes presented to our algorithm, we assign each of them a unique ID.
There are three objects present in this image. We need to compute the Euclidean distance between each pair of original centroids (purple) and new centroids (yellow).
For every subsequent frame in the video stream, we apply Step #1 of computing the object centroids; however, instead of assigning a new unique ID to each detected object (which would defeat the purpose of object tracking), we first need to determine whether we can associate the new object centroids (yellow) with the old object centroids (purple). To accomplish this process, we compute the Euclidean distance (highlighted with green or red arrows) between each pair of existing object centroids and input object centroids.
We then compute the Euclidean distance between each pair of original centroids (purple) and new centroids (yellow). But how do we use the Euclidean distances between these points to actually match and associate them?
The answer is in Step #3.
Our simple centroid object tracking method associates each pair of centroids with the minimum distance between them. But what do we do about the object in the bottom-left of the figure above?
The primary assumption of the centroid tracking algorithm is that a given object may move in between subsequent frames, but the distance between its centroids in those frames will be smaller than all other distances between objects.
Therefore, if we choose to associate centroids with minimum distances between subsequent frames, we can build our object tracker.
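The following is a minimal sketch of that association idea, assuming a couple of toy centroid values (they are not from the original post) and SciPy's cdist to build the pairwise distance matrix:

import numpy as np
from scipy.spatial import distance as dist

# existing object centroids (one per tracked object) and new input centroids
objectCentroids = np.array([(100, 150), (300, 120)])
inputCentroids = np.array([(105, 155), (298, 118), (50, 400)])

# D[i, j] is the Euclidean distance between existing centroid i and input centroid j
D = dist.cdist(objectCentroids, inputCentroids)

# associate each existing object with its nearest input centroid
for i, j in enumerate(D.argmin(axis=1)):
    print("existing object {} -> input centroid {} (distance {:.1f})".format(i, j, D[i, j]))

# any input centroid left unmatched (here, (50, 400)) would be registered as a new object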
But what about the lonely point in the bottom-left?
In our example of object tracking with Python and OpenCV, we have a new object that did not match any existing object, so it is registered as object ID #3.
In the event that there are more input detections than existing objects being tracked, we need to register the new object. "Registering" simply means that we add the new object to our list of tracked objects by assigning it a new object ID and storing the centroid of its bounding box coordinates.
We can then go back to Step #2 and repeat the pipeline of steps for every frame in the video stream.
Any reasonable object tracking algorithm needs to be able to handle situations in which an object has been lost, has disappeared, or has left the field of view.
Exactly how you handle these situations really depends on where your object tracker is meant to be deployed, but for this implementation, we will deregister old objects when they cannot be matched to any existing objects for a total of N subsequent frames.
To see today's project structure in your terminal, simply use the tree command:
$ tree --dirsfirst
.
├── pyimagesearch
│   ├── __init__.py
│   └── centroidtracker.py
├── object_tracker.py
├── deploy.prototxt
└── res10_300x300_ssd_iter_140000.caffemodel
Before we can apply object tracking to our input video stream, we first need to implement the centroid tracking algorithm. While you digest this centroid tracker script, keep Steps 1-5 above in mind and review the steps as necessary.
As you will see, translating the steps into code requires quite a bit of thought, and while we perform all of the steps, they are not linear due to the nature of our various data structures and code constructs.
I would suggest reviewing the algorithm steps above, reading through the code explanation, and then going back over the steps again until you are comfortable with them.
Once you are sure you understand the steps of the centroid tracking algorithm, open up centroidtracker.py inside the pyimagesearch module and let's review the code:
# import the necessary packages
from scipy.spatial import distance as dist
from collections import OrderedDict
import numpy as np
class CentroidTracker():
def __init__(self, maxDisappeared=50):
# initialize the next unique object ID along with two ordered
# dictionaries used to keep track of mapping a given object
# ID to its centroid and number of consecutive frames it has
# been marked as "disappeared", respectively
self.nextObjectID = 0
self.objects = OrderedDict()
self.disappeared = OrderedDict()
# store the number of maximum consecutive frames a given
# object is allowed to be marked as "disappeared" until we
# need to deregister the object from tracking
self.maxDisappeared = maxDisappeared
We import our required packages and modules: distance, OrderedDict, and numpy.
First we define the CentroidTracker class. The constructor accepts a single parameter: the maximum number of consecutive frames a given object is allowed to be lost/disappeared for before the tracker removes it.
Our constructor builds four instance variables:
- nextObjectID: a counter used to assign unique IDs to each object. If an object leaves the frame and does not come back for maxDisappeared frames, a new (next) object ID will be assigned.
- objects: a dictionary that uses the object ID as the key and the centroid (x, y)-coordinates as the value.
- disappeared: maintains the number of consecutive frames (value) a particular object ID (key) has been marked as "lost".
- maxDisappeared: the number of consecutive frames an object is allowed to be marked as "lost/disappeared" until we deregister the object.
Let's define the register method, which is responsible for adding new objects to our tracker:
def register(self, centroid):
# when registering an object we use the next available object
# ID to store the centroid
self.objects[self.nextObjectID] = centroid
self.disappeared[self.nextObjectID] = 0
self.nextObjectID += 1
The register method accepts a centroid and adds it to the objects dictionary using the next available object ID. The number of times the object has disappeared is initialized to 0 in the disappeared dictionary. Finally, we increment nextObjectID so that if a new object comes into view, it will be associated with a unique ID. Similar to our register method, we also need a deregister method:
def deregister(self, objectID):
# to deregister an object ID we delete the object ID from
# both of our respective dictionaries
del self.objects[objectID]
del self.disappeared[objectID]
Just like we can add new objects to our tracker, we also need the ability to remove old objects that have been lost or have disappeared from our input frames.
The deregister method simply deletes the objectID from both the objects and disappeared dictionaries, respectively.
The heart of our centroid tracker implementation lives inside the update method:
def update(self, rects):
# check to see if the list of input bounding box rectangles
# is empty
if len(rects) == 0:
# loop over any existing tracked objects and mark them
# as disappeared
for objectID in list(self.disappeared.keys()):
self.disappeared[objectID] += 1
# if we have reached a maximum number of consecutive
# frames where a given object has been marked as
# missing, deregister it
if self.disappeared[objectID] > self.maxDisappeared:
self.deregister(objectID)
# return early as there are no centroids or tracking info
# to update
return self.objects
The update method accepts a list of bounding box rectangles, presumably from an object detector (Haar cascade, HOG + Linear SVM, SSD, Faster R-CNN, etc.). The rects parameter is assumed to be a list of tuples with the structure (startX, startY, endX, endY).
If there are no detections, we loop over all object IDs and increment their disappeared count. We also check whether we have reached the maximum number of consecutive frames a given object has been marked as missing; if so, we remove it from our tracking system. Since there is no tracking information to update, we go ahead and return early.
Otherwise, we have quite a bit of work to do in the next seven code blocks of the update method:
# initialize an array of input centroids for the current frame
inputCentroids = np.zeros((len(rects), 2), dtype="int")
# loop over the bounding box rectangles
for (i, (startX, startY, endX, endY)) in enumerate(rects):
# use the bounding box coordinates to derive the centroid
cX = int((startX + endX) / 2.0)
cY = int((startY + endY) / 2.0)
inputCentroids[i] = (cX, cY)
We initialize a NumPy array, inputCentroids, to store the centroid for each rect.
We then loop over the bounding box rectangles, compute each centroid, and store it in the inputCentroids list.
If there are currently no objects we are tracking, we register each of the new objects:
# if we are currently not tracking any objects take the input
# centroids and register each of them
if len(self.objects) == 0:
for i in range(0, len(inputCentroids)):
self.register(inputCentroids[i])
Otherwise, we need to update any existing object (x, y)-coordinates based on the centroid locations that minimize the Euclidean distance between them:
# otherwise, we are currently tracking objects so we need to
# try to match the input centroids to existing object
# centroids
else:
# grab the set of object IDs and corresponding centroids
objectIDs = list(self.objects.keys())
objectCentroids = list(self.objects.values())
# compute the distance between each pair of object
# centroids and input centroids, respectively -- our
# goal will be to match an input centroid to an existing
# object centroid
D = dist.cdist(np.array(objectCentroids), inputCentroids)
# in order to perform this matching we must (1) find the
# smallest value in each row and then (2) sort the row
# indexes based on their minimum values so that the row
# with the smallest value is at the *front* of the index
# list
rows = D.min(axis=1).argsort()
# next, we perform a similar process on the columns by
# finding the smallest value in each column and then
# sorting using the previously computed row index list
cols = D.argmin(axis=1)[rows]
The update of existing tracked objects begins in the else block. The goal is to keep tracking the objects while maintaining correct object IDs; this is accomplished by computing the Euclidean distances between all pairs of objectCentroids and inputCentroids, and then associating the object IDs that minimize those distances.
Inside the else block, we:
- Grab the objectID and objectCentroid values.
- Compute the distance between each pair of existing object centroids and new input centroids. The output shape of the distance map D will be (# of object centroids, # of input centroids).
The next step is to use the distances to see whether we can associate object IDs:
# in order to determine if we need to update, register,
# or deregister an object we need to keep track of which
# of the rows and column indexes we have already examined
usedRows = set()
usedCols = set()
# loop over the combination of the (row, column) index
# tuples
for (row, col) in zip(rows, cols):
# if we have already examined either the row or
# column value before, ignore it
if row in usedRows or col in usedCols:
continue
# otherwise, grab the object ID for the current row,
# set its new centroid, and reset the disappeared
# counter
objectID = objectIDs[row]
self.objects[objectID] = inputCentroids[col]
self.disappeared[objectID] = 0
# indicate that we have examined each of the row and
# column indexes, respectively
usedRows.add(row)
usedCols.add(col)
In the above code block, we:
- Initialize two sets, usedRows and usedCols, to keep track of which row and column indexes we have already examined.
- Loop over the combinations of (row, col) index tuples in order to update our object centroids: if the combination has already been used, we ignore it; otherwise, we grab the object ID for the current row, set its new centroid, reset its disappeared counter, and add row and col to their respective usedRows and usedCols sets.
There may still be row and column indexes missing from our usedRows and usedCols sets that we have not examined yet:
# compute both the row and column index we have NOT yet
# examined
unusedRows = set(range(0, D.shape[0])).difference(usedRows)
unusedCols = set(range(0, D.shape[1])).difference(usedCols)
So we must determine which centroid indexes we have not examined yet and store them in two new sets (unusedRows and unusedCols).
Our final check handles any objects that have been lost or have potentially disappeared:
# in the event that the number of object centroids is
# equal or greater than the number of input centroids
# we need to check and see if some of these objects have
# potentially disappeared
if D.shape[0] >= D.shape[1]:
# loop over the unused row indexes
for row in unusedRows:
# grab the object ID for the corresponding row
# index and increment the disappeared counter
objectID = objectIDs[row]
self.disappeared[objectID] += 1
# check to see if the number of consecutive
# frames the object has been marked "disappeared"
# for warrants deregistering the object
if self.disappeared[objectID] > self.maxDisappeared:
self.deregister(objectID)
In this final block, we:
- Check whether the number of existing object centroids is greater than or equal to the number of input centroids; if so, we loop over the unused row indexes and increment each corresponding object's disappeared count.
- Check whether the disappeared count exceeds the maxDisappeared threshold, and if so, deregister the object.
Otherwise, the number of input centroids is greater than the number of existing object centroids, so we have new objects to register and track:
# otherwise, if the number of input centroids is greater
# than the number of existing object centroids we need to
# register each new input centroid as a trackable object
else:
for col in unusedCols:
self.register(inputCentroids[col])
# return the set of trackable objects
return self.objects
We loop over the unusedCols indexes and register each new centroid. Finally, we return the set of trackable objects to the calling method.
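To see the whole update cycle end to end, here is a small, self-contained toy example (the bounding box values are made up purely for illustration, not taken from the original post) that feeds the tracker a few frames of fake detections and shows the IDs persisting across frames:

from pyimagesearch.centroidtracker import CentroidTracker

ct = CentroidTracker(maxDisappeared=5)

# frame 1: two detections in (startX, startY, endX, endY) format
print(ct.update([(50, 80, 150, 220), (300, 60, 380, 180)]))
# e.g. OrderedDict([(0, array([100, 150])), (1, array([340, 120]))])

# frame 2: the same two objects, slightly shifted -- IDs 0 and 1 are kept
print(ct.update([(55, 85, 155, 225), (305, 65, 385, 185)]))

# frame 3: no detections -- the disappeared counters start increasing
print(ct.update([]))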
Our centroid tracking implementation is quite long, and admittedly it is the most confusing aspect of the algorithm.
If you are having trouble following along with what the code is doing, you should consider opening up a Python shell and performing the following experiment:
>>> from scipy.spatial import distance as dist
>>> import numpy as np
>>> np.random.seed(42)
>>> objectCentroids = np.random.uniform(size=(2, 2))
>>> centroids = np.random.uniform(size=(3, 2))
>>> D = dist.cdist(objectCentroids, centroids)
>>> D
array([[0.82421549, 0.32755369, 0.33198071],
[0.72642889, 0.72506609, 0.17058938]])
The result is a distance matrix D with two rows (# of existing object centroids) and three columns (# of new input centroids).
Just like we did earlier in the script, let's find the minimum distance in each row and sort the row indexes based on that value:
>>> D.min(axis=1)
array([0.32755369, 0.17058938])
>>> rows = D.min(axis=1).argsort()
>>> rows
array([1, 0])
First we find the minimum value for each row, allowing us to figure out which existing object is closest to each new input centroid. By then sorting these values, we obtain the indexes of those rows.
We apply a similar process to the columns:
>>> D.argmin(axis=1)
array([1, 2])
>>> cols = D.argmin(axis=1)[rows]
>>> cols
array([2, 1])
We first find, for each row, the index of the column containing the minimum value, and we then order these column indexes using our previously computed rows.
Let's print the results and analyze them:
>>> print(list(zip(rows, cols)))
[(1, 2), (0, 1)]
Analyzing the results, we find that:
- D[1, 2] has the smallest Euclidean distance, implying that the second existing object will be matched against the third input centroid.
- D[0, 1] has the next-smallest Euclidean distance, implying that the first existing object will be matched against the second input centroid.
Now that we have implemented the CentroidTracker class, let's put it to work with an object tracking driver script.
In the driver script you may use any object detector you like, provided it produces a set of bounding boxes. This could be a Haar cascade, HOG + Linear SVM, YOLO, SSD, Faster R-CNN, etc. For this example script I will be using OpenCV's deep learning face detector, but feel free to make your own version of the script that implements a different detector.
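Whichever detector you choose, the only contract with the tracker is the list of (startX, startY, endX, endY) boxes passed to ct.update. Here is a minimal sketch of that pattern; detect_objects is a hypothetical placeholder for whatever detector you plug in, and the returned box is a dummy value:

import numpy as np
from pyimagesearch.centroidtracker import CentroidTracker

ct = CentroidTracker()

def detect_objects(frame):
    # placeholder: run your favorite detector here (Haar cascade, HOG + SVM, SSD, YOLO, ...)
    # and return a list of (startX, startY, endX, endY) boxes
    return [(50, 80, 150, 220)]

frame = np.zeros((300, 400, 3), dtype="uint8")  # stand-in for a real video frame
objects = ct.update(detect_objects(frame))      # the tracker assigns and maintains the IDs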
In this script, we will:
- Work with a VideoStream object to grab frames from your webcam.
- Instantiate a CentroidTracker and use it to track face objects in the video stream.
When you're ready, open up object_tracker.py and let's proceed:
# import the necessary packages
from pyimagesearch.centroidtracker import CentroidTracker
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
help="minimum probability to filter weak detections")
args = vars(ap.parse_args())
First we specify our imports. Most notably, we are using the CentroidTracker class that we just reviewed. We will also be using VideoStream from imutils and OpenCV.
We have three command line arguments, all of which are related to our deep learning face detector:
- --prototxt: the path to the Caffe "deploy" prototxt file.
- --model: the path to the pre-trained Caffe model.
- --confidence: the probability threshold used to filter weak detections. I found that the default value of 0.5 is sufficient.
Next, let's perform our initializations:
# initialize our centroid tracker and frame dimensions
ct = CentroidTracker()
(H, W) = (None, None)
# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
# initialize the video stream and allow the camera sensor to warmup
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)
In the block above, we:
- Instantiate our CentroidTracker, ct. Recall from the explanation in the previous section that this object has three methods: (1) register, (2) deregister, and (3) update. We are only going to use the update method, as it registers and deregisters objects automatically. We also initialize H and W (our frame dimensions) to None.
- Instantiate our VideoStream, vs. Using vs, we will be able to capture frames from our camera in the upcoming while loop. We allow our camera 2.0 seconds to warm up.
Now let's begin our while loop and start tracking face objects:
# loop over the frames from the video stream
while True:
# read the next frame from the video stream and resize it
frame = vs.read()
frame = imutils.resize(frame, width=400)
# if the frame dimensions are None, grab them
if W is None or H is None:
(H, W) = frame.shape[:2]
# construct a blob from the frame, pass it through the network,
# obtain our output predictions, and initialize the list of
# bounding box rectangles
blob = cv2.dnn.blobFromImage(frame, 1.0, (W, H),
(104.0, 177.0, 123.0))
net.setInput(blob)
detections = net.forward()
rects = []
We loop over the frames and resize them to a fixed width (while preserving the aspect ratio). Our frame dimensions are grabbed as needed.
We then pass the frame through the CNN object detector to obtain predictions and object locations, and we initialize rects, a list to hold our bounding box rectangles.
# loop over the detections
for i in range(0, detections.shape[2]):
# filter out weak detections by ensuring the predicted
# probability is greater than a minimum threshold
if detections[0, 0, i, 2] > args["confidence"]:
# compute the (x, y)-coordinates of the bounding box for
# the object, then update the bounding box rectangles list
box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
rects.append(box.astype("int"))
# draw a bounding box surrounding the object so we can
# visualize it
(startX, startY, endX, endY) = box.astype("int")
cv2.rectangle(frame, (startX, startY), (endX, endY),
(0, 255, 0), 2)
We begin looping over the detections. If a detection exceeds our confidence threshold, indicating a valid detection, we:
- Compute the bounding box coordinates and append them to the rects list.
- Draw a bounding box around the object so we can visualize it.
Finally, let's call update on our centroid tracker object, ct:
# update our centroid tracker using the computed set of bounding
# box rectangles
objects = ct.update(rects)
# loop over the tracked objects
for (objectID, centroid) in objects.items():
# draw both the ID of the object and the centroid of the
# object on the output frame
text = "ID {}".format(objectID)
cv2.putText(frame, text, (centroid[0] - 10, centroid[1] - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
cv2.circle(frame, (centroid[0], centroid[1]), 4, (0, 255, 0), -1)
# show the output frame
cv2.imshow("Frame", frame)
key = cv2.waitKey(1) & 0xFF
# if the `q` key was pressed, break from the loop
if key == ord("q"):
break
# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()
The ct.update call handles the heavy lifting in our simple object tracker script with Python and OpenCV. If we did not care about visualization, we would be done here and ready to loop back to the top.
We display each centroid as a filled circle along with the unique object ID as text. Now we will be able to visualize the results and check whether our CentroidTracker properly keeps track of our objects by associating the correct IDs with them in the video stream.
We display the frame until the quit key ("q") is pressed, at which point we break out of the loop and perform cleanup.
centroidtracker.py
# import the necessary packages
from scipy.spatial import distance as dist
from collections import OrderedDict
import numpy as np
class CentroidTracker():
def __init__(self, maxDisappeared=50):
# initialize the next unique object ID along with two ordered
# dictionaries used to keep track of mapping a given object
# ID to its centroid and number of consecutive frames it has
# been marked as "disappeared", respectively
self.nextObjectID = 0
self.objects = OrderedDict()
self.disappeared = OrderedDict()
# store the number of maximum consecutive frames a given
# object is allowed to be marked as "disappeared" until we
# need to deregister the object from tracking
self.maxDisappeared = maxDisappeared
def register(self, centroid):
# when registering an object we use the next available object
# ID to store the centroid
self.objects[self.nextObjectID] = centroid
self.disappeared[self.nextObjectID] = 0
self.nextObjectID += 1
def deregister(self, objectID):
# to deregister an object ID we delete the object ID from
# both of our respective dictionaries
del self.objects[objectID]
del self.disappeared[objectID]
def update(self, rects):
# check to see if the list of input bounding box rectangles
# is empty
if len(rects) == 0:
# loop over any existing tracked objects and mark them
# as disappeared
for objectID in list(self.disappeared.keys()):
self.disappeared[objectID] += 1
# if we have reached a maximum number of consecutive
# frames where a given object has been marked as
# missing, deregister it
if self.disappeared[objectID] > self.maxDisappeared:
self.deregister(objectID)
# return early as there are no centroids or tracking info
# to update
return self.objects
# initialize an array of input centroids for the current frame
inputCentroids = np.zeros((len(rects), 2), dtype="int")
# loop over the bounding box rectangles
for (i, (startX, startY, endX, endY)) in enumerate(rects):
# use the bounding box coordinates to derive the centroid
cX = int((startX + endX) / 2.0)
cY = int((startY + endY) / 2.0)
inputCentroids[i] = (cX, cY)
# if we are currently not tracking any objects take the input
# centroids and register each of them
if len(self.objects) == 0:
for i in range(0, len(inputCentroids)):
self.register(inputCentroids[i])
# otherwise, we are currently tracking objects so we need to
# try to match the input centroids to existing object
# centroids
else:
# grab the set of object IDs and corresponding centroids
objectIDs = list(self.objects.keys())
objectCentroids = list(self.objects.values())
# compute the distance between each pair of object
# centroids and input centroids, respectively -- our
# goal will be to match an input centroid to an existing
# object centroid
D = dist.cdist(np.array(objectCentroids), inputCentroids)
# in order to perform this matching we must (1) find the
# smallest value in each row and then (2) sort the row
# indexes based on their minimum values so that the row
# with the smallest value is at the *front* of the index
# list
rows = D.min(axis=1).argsort()
# next, we perform a similar process on the columns by
# finding the smallest value in each column and then
# sorting using the previously computed row index list
cols = D.argmin(axis=1)[rows]
# in order to determine if we need to update, register,
# or deregister an object we need to keep track of which
# of the rows and column indexes we have already examined
usedRows = set()
usedCols = set()
# loop over the combination of the (row, column) index
# tuples
for (row, col) in zip(rows, cols):
# if we have already examined either the row or
# column value before, ignore it
if row in usedRows or col in usedCols:
continue
# otherwise, grab the object ID for the current row,
# set its new centroid, and reset the disappeared
# counter
objectID = objectIDs[row]
self.objects[objectID] = inputCentroids[col]
self.disappeared[objectID] = 0
# indicate that we have examined each of the row and
# column indexes, respectively
usedRows.add(row)
usedCols.add(col)
# compute both the row and column index we have NOT yet
# examined
unusedRows = set(range(0, D.shape[0])).difference(usedRows)
unusedCols = set(range(0, D.shape[1])).difference(usedCols)
# in the event that the number of object centroids is
# equal or greater than the number of input centroids
# we need to check and see if some of these objects have
# potentially disappeared
if D.shape[0] >= D.shape[1]:
# loop over the unused row indexes
for row in unusedRows:
# grab the object ID for the corresponding row
# index and increment the disappeared counter
objectID = objectIDs[row]
self.disappeared[objectID] += 1
# check to see if the number of consecutive
# frames the object has been marked "disappeared"
# for warrants deregistering the object
if self.disappeared[objectID] > self.maxDisappeared:
self.deregister(objectID)
# otherwise, if the number of input centroids is greater
# than the number of existing object centroids we need to
# register each new input centroid as a trackable object
else:
for col in unusedCols:
self.register(inputCentroids[col])
# return the set of trackable objects
return self.objects
object_tracker.py
# USAGE
# python object_tracker.py --prototxt deploy.prototxt --model res10_300x300_ssd_iter_140000.caffemodel
# import the necessary packages
from pyimagesearch.centroidtracker import CentroidTracker
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
help="minimum probability to filter weak detections")
args = vars(ap.parse_args())
# initialize our centroid tracker and frame dimensions
ct = CentroidTracker()
(H, W) = (None, None)
# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
# initialize the video stream and allow the camera sensor to warmup
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)
# loop over the frames from the video stream
while True:
# read the next frame from the video stream and resize it
frame = vs.read()
frame = imutils.resize(frame, width=400)
# if the frame dimensions are None, grab them
if W is None or H is None:
(H, W) = frame.shape[:2]
# construct a blob from the frame, pass it through the network,
# obtain our output predictions, and initialize the list of
# bounding box rectangles
blob = cv2.dnn.blobFromImage(frame, 1.0, (W, H),
(104.0, 177.0, 123.0))
net.setInput(blob)
detections = net.forward()
rects = []
# loop over the detections
for i in range(0, detections.shape[2]):
# filter out weak detections by ensuring the predicted
# probability is greater than a minimum threshold
if detections[0, 0, i, 2] > args["confidence"]:
# compute the (x, y)-coordinates of the bounding box for
# the object, then update the bounding box rectangles list
box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
rects.append(box.astype("int"))
# draw a bounding box surrounding the object so we can
# visualize it
(startX, startY, endX, endY) = box.astype("int")
cv2.rectangle(frame, (startX, startY), (endX, endY),
(0, 255, 0), 2)
# update our centroid tracker using the computed set of bounding
# box rectangles
objects = ct.update(rects)
# loop over the tracked objects
for (objectID, centroid) in objects.items():
# draw both the ID of the object and the centroid of the
# object on the output frame
text = "ID {}".format(objectID)
cv2.putText(frame, text, (centroid[0] - 10, centroid[1] - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
cv2.circle(frame, (centroid[0], centroid[1]), 4, (0, 255, 0), -1)
# show the output frame
cv2.imshow("Frame", frame)
key = cv2.waitKey(1) & 0xFF
# if the `q` key was pressed, break from the loop
if key == ord("q"):
break
# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()
Open up a terminal and execute the following command:
$ python object_tracker.py --prototxt deploy.prototxt \
--model res10_300x300_ssd_iter_140000.caffemodel
[INFO] loading model...
[INFO] starting video stream...
Notice how even when the second face was "lost" as I moved the book cover outside the camera's field of view, our object tracking was able to pick the face back up when it came back into view. If the face had remained outside the field of view for more than 50 frames, the object would have been deregistered.
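That 50-frame window comes from the default maxDisappeared=50 argument in the CentroidTracker constructor. If your deployment needs objects to be dropped sooner or kept around longer, you can simply pass a different value; the numbers below are arbitrary examples, assuming roughly 30 FPS video:

from pyimagesearch.centroidtracker import CentroidTracker

# drop an object after about half a second of missed detections at 30 FPS
ct_fast = CentroidTracker(maxDisappeared=15)

# keep an object registered through roughly four seconds of missed detections
ct_patient = CentroidTracker(maxDisappeared=120)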
While our centroid tracker worked great in this example, there are two primary drawbacks of this object tracking algorithm.
The first is that it requires the object detection step to be run on every frame of the input video.
The second drawback is related to the underlying assumption of the centroid tracking algorithm itself: the centroids must lie close together between subsequent frames.
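To see why that assumption matters, consider a toy case (values invented for illustration) where two objects are closer to each other than the distance they move in a single frame; the minimum-distance matching can then confuse the IDs:

import numpy as np
from scipy.spatial import distance as dist

# previous centroids for objects 0 and 1, and the new centroids after a large jump
objectCentroids = np.array([(100, 100), (140, 100)])
inputCentroids = np.array([(150, 100), (190, 100)])  # both objects moved +50 px to the right

D = dist.cdist(objectCentroids, inputCentroids)
print(D.argmin(axis=1))  # [0 0] -> both objects claim the same new centroid, so IDs get confused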
As long as you keep these assumptions and limitations in mind when performing centroid tracking, the algorithm will work wonderfully for you.
The following implementation performs multi-object tracking based on YOLOv3 and the centroid tracking algorithm:
# import the necessary packages
from CentroidTracking.centroidtracker import CentroidTracker
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2
import os
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--input", required=True,
help="path to input video")
# ap.add_argument("-o", "--output", required=True,
# help="path to output video")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
help="minimum probability to filter weak detections")
ap.add_argument("-t", "--threshold", type=float, default=0.3,
help="threshold when applying non-maxima suppression")
args = vars(ap.parse_args())
ct = CentroidTracker()
# load the COCO class labels, our YOLO model was trained on
labelsPath = os.path.sep.join(["yolo-coco", "coco.names"])
LABELS = open(labelsPath).read().strip().split("\n")
# initialize a list of colors to represent each possible class label
np.random.seed(42)
COLORS = np.random.randint(0, 255, size=(len(LABELS), 3),dtype="uint8")
# derive the paths to the YOLO weights and model configuration
weightsPath = os.path.sep.join(["yolo-coco", "yolov3.weights"])
configPath = os.path.sep.join(["yolo-coco", "yolov3.cfg"])
# load our YOLO object detector trained on COCO dataset (80 classes)
print("[INFO] loading YOLO from disk...")
net = cv2.dnn.readNetFromDarknet(configPath, weightsPath)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
writer = None
if args["input"] == 'camera':
cap = cv2.VideoCapture(0)
else:
cap = cv2.VideoCapture(args["input"])
# try to determine the total number of frames in the video file
try:
prop = cv2.cv.CV_CAP_PROP_FRAME_COUNT if imutils.is_cv2() \
else cv2.CAP_PROP_FRAME_COUNT
total = int(cap.get(prop))
print("[INFO] {} total frames in video".format(total))
# an error occurred while trying to determine the total
# number of frames in the video file
except:
print("[INFO] could not determine # of frames in video")
print("[INFO] no approx. completion time can be provided")
total = -1
print(cap.isOpened())
print("starting-----------------------------------------------------------")
begin = time.time()
while (cap.isOpened()):
ret, image = cap.read()
# load our input image and grab its spatial dimension
if ret == True:
(H, W) = image.shape[:2]
# determine only the *output* layer names that we need from YOLO
ln = net.getLayerNames()
ln = [ln[i[0] - 1] for i in net.getUnconnectedOutLayers()]
# construct a blob from the input image and then perform a forward
# pass of the YOLO object detector, giving us our bounding boxes and
# associated probabilities
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416),
swapRB=True, crop=False)
net.setInput(blob)
start = time.time()
layerOutputs = net.forward(ln)
end = time.time()
# show timing information on YOLO
print("[INFO] YOLO took {:.6f} seconds".format(end - start))
# initialize our lists of detected bounding boxes, confidences, and
# class IDs, respectively
boxes = []
boxes_c = []
confidences = []
classIDs = []
rects = []
# loop over each of the layer outputs
for output in layerOutputs:
# loop over each of the detections
for detection in output:
# extract the class ID and confidence (i.e., probability) of
# the current object detection
scores = detection[5:]
classID = np.argmax(scores)
confidence = scores[classID]
# filter out weak predictions by ensuring the detected
# probability is greater than the minimum probability
if confidence > args["confidence"]:
# scale the bounding box coordinates back relative to the
# size of the image, keeping in mind that YOLO actually
# returns the center (x, y)-coordinates of the bounding
# box followed by the boxes' width and height
box = detection[0:4] * np.array([W, H, W, H])
(centerX, centerY, width, height) = box.astype("int")
# use the center (x, y)-coordinates to derive the top and
# and left corner of the bounding box
x = int(centerX - (width / 2))
y = int(centerY - (height / 2))
# update our list of bounding box coordinates, confidences,
# and class IDs
boxes.append([x, y, int(width), int(height)])
boxes_c.append([centerX - int(width/2), centerY - int(height/2), centerX + int(width/2), centerY + int(height/2)])
confidences.append(float(confidence))
classIDs.append(classID)
# apply non-maxima suppression to suppress weak, overlapping bounding
# boxes
idxs = cv2.dnn.NMSBoxes(boxes, confidences, args["confidence"],args["threshold"])
if len(idxs) > 0:
for i in idxs.flatten():
rects.append(boxes_c[i])
# update our centroid tracker using the computed set of bounding
# box rectangles
objects = ct.update(rects)
# loop over the tracked objects
for (objectID, centroid) in objects.items():
# draw both the ID of the object and the centroid of the
# object on the output frame
text = "ID {}".format(objectID)
cv2.putText(image, text, (centroid[0] - 10, centroid[1] - 10),cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
cv2.circle(image, (centroid[0], centroid[1]), 4, (0, 255, 0), -1)
# ensure at least one detection exists
if len(idxs) > 0:
# loop over the indexes we are keeping
for i in idxs.flatten():
# extract the bounding box coordinates
(x, y) = (boxes[i][0], boxes[i][1])
(w, h) = (boxes[i][2], boxes[i][3])
# draw a bounding box rectangle and label on the image
color = [int(c) for c in COLORS[classIDs[i]]]
cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
text = "{}: {:.4f}".format(LABELS[classIDs[i]], confidences[i])
cv2.putText(image, text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX,
0.5, color, 2)
# writer.write(image)
# # show the output image
# cv2.imshow("Image", image)
# check if the video writer is None
if writer is None:
# initialize our video writer
fourcc = cv2.VideoWriter_fourcc(*"MJPG")
writer = cv2.VideoWriter("output.avi", fourcc, 30,(image.shape[1], image.shape[0]), True)
# some information on processing single frame
if total > 0:
elap = (end - start)
print("[INFO] single frame took {:.4f} seconds".format(elap))
print("[INFO] estimated total time to finish: {:.4f}".format(elap * total))
cv2.imshow("Live", image)
# write the output frame to disk
writer.write(image)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
else:
break
# release the file pointers
print("[INFO] cleaning up...")
writer.release()
cap.release()
cv2.destroyAllWindows()
finish = time.time()
print(f"Total time taken : {finish - begin}")
Link: https://pan.baidu.com/s/1UX_HmwwJLtHJ9e5tx6hwOg?pwd=123a
Extraction code: 123a
https://pyimagesearch.com/2018/07/23/simple-object-tracking-with-opencv/