[1]. Babenko, B., M. Yang and S. Belongie. Visual tracking with online multiple instance learning. in Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. 2009: IEEE.
[2]. Babenko, B., M. Yang and S. Belongie, Robust object tracking with online multiple instance learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011. 33(8): p. 1619-1632.
The SVM's output space is the score space: after features are extracted from each patch, the score is obtained by passing them through the kernel function and multiplying by W. The online SVM is trained with SMO, which is fast.
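The kernelized scoring step just described can be sketched as follows. This is a toy illustration with made-up support patches, weights, and a Gaussian kernel, not the actual tracker implementation:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=0.5):
    """RBF kernel between two feature vectors."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def score(patch_feat, support, alpha, sigma=0.5):
    """Map a patch's feature vector into score space: a weighted sum of
    kernel responses against the learned support set (the 'kernel times W'
    step described above)."""
    return sum(a * gaussian_kernel(sv, patch_feat, sigma)
               for a, sv in zip(alpha, support))

# Toy usage: 3 support vectors in a 4-D feature space.
rng = np.random.default_rng(0)
X_sv = rng.standard_normal((3, 4))
alpha = np.array([0.5, -0.2, 0.7])
print(score(X_sv[0], X_sv, alpha))  # score of a known support patch
```

At test time every candidate patch is pushed through this mapping and the patch with the highest score is taken as the new target location.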
[1]. Hare, S., A. Saffari and P.H. Torr. Struck: Structured output tracking with kernels. in 2011 International Conference on Computer Vision. 2011: IEEE.
[2]. Hare, S., et al., Struck: Structured output tracking with kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015. 38(10): p. 2096-2109.
[1]. Zhang, J., S. Ma and S. Sclaroff. MEEM: robust tracking via multiple experts using entropy minimization. in European Conference on Computer Vision. 2014: Springer.
KCF
KCF is a very important algorithm. Correlation filters were first applied to visual tracking by the MOSSE algorithm in 2010, and KCF pushes the correlation-filter approach to its limit. Its core idea is to multiply the image by a circulant matrix, cyclically shifting it to generate a large number of samples; stacking these shifted samples row by row forms a circulant data matrix.
Circulant matrices have a series of elegant properties under the DFT. Through a series of derivations, the sample space is transformed into the Fourier domain, which makes fast training and learning on this large sample set possible. After the DFT, circular convolution becomes an elementwise product, reducing the time complexity from O(n^3) to O(n log n). KCF is both fast and accurate; it has been adopted by many later tracking algorithms and has spawned numerous variants.
In more detail, KCF generates a large number of samples by translation and assigns each sample a label according to its distance from the center, described by a Gaussian distribution; the label can be read as a confidence. In addition, the samples are weighted by a cosine window before shifting, so that the strong artificial edges created by the cyclic shifts do not cause trouble.
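The circulant-matrix trick described above can be checked numerically on a toy 1-D signal. This sketches only the mathematical property, not the full tracker: all cyclic shifts of a base sample form a circulant matrix, and the DFT diagonalizes it, so products with that matrix collapse to elementwise products in the Fourier domain:

```python
import numpy as np

n = 8
rng = np.random.default_rng(1)
x = rng.standard_normal(n)  # base sample
w = rng.standard_normal(n)  # filter coefficients

# Data matrix whose rows are all cyclic shifts of x (the "dense sampling").
C = np.stack([np.roll(x, i) for i in range(n)])

# Direct computation with the full data matrix.
direct = C @ w

# Same result via the DFT: conjugate of FFT(x) times FFT(w), elementwise,
# which is what makes O(n log n) training possible.
fourier = np.real(np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(w)))

print(np.allclose(direct, fourier))  # True
```

In KCF this identity is what lets the regression over every shifted sample be solved without ever materializing the circulant matrix.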
The derivation involving the DFT (discrete Fourier transform) and circulant matrices is the core of this algorithm; it is worked out at the end of the paper. This paper deserves several careful readings, since many later methods build on it.
[1]. Henriques, J.F., et al. Exploiting the circulant structure of tracking-by-detection with kernels. in European conference on computer vision. 2012: Springer.
[2]. Henriques, J.F., et al., High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015. 37(3): p. 583-596.
This algorithm, SRDCF, improves the quality of the tracking model through spatial regularization: a spatial penalty factor with a Gaussian-shaped profile imposes penalties of different weights at different positions. Visualizing the regularization together with the output shows that responses near the boundary are clearly suppressed. SRDCF folds the spatial regularization term directly into the loss and then solves a straightforward least-squares regression, which is convenient.
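The spatial penalty described above can be sketched as a weight map. The Gaussian-shaped profile and its parameter values here are illustrative assumptions, not the ones used in the paper:

```python
import numpy as np

def spatial_weight(size, mu=0.1, sigma=0.25):
    """Inverted-Gaussian penalty map over a size x size filter support:
    small at the target center, growing toward the borders, so filter
    coefficients near the boundary are suppressed."""
    ys, xs = np.mgrid[0:size, 0:size]
    cy = cx = (size - 1) / 2.0
    d2 = ((ys - cy) ** 2 + (xs - cx) ** 2) / (sigma * size) ** 2
    return mu + (1.0 - np.exp(-0.5 * d2))

W = spatial_weight(31)
print(W[15, 15] < W[0, 0])  # center penalized less than the corner -> True
```

In the SRDCF loss this map multiplies the filter coefficients elementwise inside the regularization term, replacing the constant lambda of a standard correlation filter.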
The second paper, deepSRDCF, simply replaces the feature-extraction stage with a CNN; everything else is the same as SRDCF. It focuses on analyzing the effect of using features from different layers and finds that the first layer's features are sufficient, with the deeper layers contributing little. This conflicts with the conclusions of some other papers, and I remain skeptical of it as well.
[1]. Danelljan, M., et al. Learning spatially regularized correlation filters for visual tracking. in Proceedings of the IEEE International Conference on Computer Vision. 2015.
[2]. Danelljan, M., et al. Convolutional features for correlation filter based visual tracking. in Proceedings of the IEEE International Conference on Computer Vision Workshops. 2015.
[1]. Danelljan, M., et al. Adaptive decontamination of the training set: A unified formulation for discriminative visual tracking. in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[1]. Bertinetto, L., et al. Staple: Complementary learners for real-time tracking. in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[1]. Zhu, G., F. Porikli and H. Li. Beyond local search: Tracking objects everywhere with instance-specific proposals. in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
III. Deep Learning for Object Tracking
[1]. Wang, N. and D. Yeung. Learning a deep compact image representation for visual tracking. in Advances in neural information processing systems. 2013.
I tested the algorithm from the paper: it runs at 23 fps on my GTX 1070, and with CPU only (a 6600K overclocked to 4.24 GHz) it manages just 6 fps, far from real time. I also modified the code to use only the Conv3, Conv4, or Conv5 features individually, and found that accuracy dropped only slightly. Do we really need features from all these layers? That question is still debated. In any case, this paper is introduced here mainly as a lead-in to the next one.
[1]. Ma, C., et al. Hierarchical convolutional features for visual tracking. in Proceedings of the IEEE International Conference on Computer Vision. 2015.
[1]. Qi, Y., et al. Hedged Deep Tracking. in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[1]. Wang, L., et al. Stct: Sequentially training convolutional networks for visual tracking. in Computer Vision and Pattern Recognition (CVPR), 2016 IEEE Conference on. 2016.
[1]. Danelljan, M., et al. Beyond correlation filters: Learning continuous convolution operators for visual tracking. in European Conference on Computer Vision. 2016: Springer.
This paper has since been accepted to CVPR 2017; the reference below still points to the arXiv preprint rather than the CVPR 2017 version.
[1]. Danelljan, M., et al. ECO: Efficient Convolution Operators for Tracking. arXiv preprint arXiv:1611.09224, 2016.
In summary, since the human targets have already been detected, all that remains is to decide which person in an earlier frame each person in a later frame corresponds to, so a method combining CNN features with HOG features is used to decide target identity.
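That association step can be sketched as follows. A tiny gradient-orientation histogram stands in for the real HOG/CNN features here, and every name and pattern below is made up for illustration:

```python
import numpy as np

def grad_hist(patch, bins=9):
    """Gradient-orientation histogram of a grayscale patch (a toy HOG),
    L2-normalized so a dot product acts as cosine similarity."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-9)

def assign_identity(new_patch, gallery):
    """Return the index of the stored identity whose feature vector is
    most similar to the newly detected patch."""
    f = grad_hist(new_patch)
    sims = [float(f @ grad_hist(g)) for g in gallery]
    return int(np.argmax(sims))

# Toy gallery: three 16x16 patches with distinct texture orientations.
cols = np.arange(16.0)
patt_v = np.tile(np.sin(cols), (16, 1))    # vertical stripes
patt_h = patt_v.T                          # horizontal stripes
patt_d = np.sin(np.add.outer(cols, cols))  # diagonal stripes
gallery = [patt_v, patt_h, patt_d]

# A noisy view of identity 0 should be assigned back to identity 0.
rng = np.random.default_rng(2)
query = patt_v + 0.05 * rng.random((16, 16))
print(assign_identity(query, gallery))  # → 0
```

In the real system the gallery entries would be CNN/HOG descriptors of previously seen people, updated as tracking proceeds.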
Attached below is an OpenCV implementation of KCF object tracking:
import cv2
import sys

(major_ver, minor_ver, subminor_ver) = (cv2.__version__).split('.')

if __name__ == '__main__':

    # Set up tracker.
    # Instead of KCF, you can also use any of the others below.
    tracker_types = ['BOOSTING', 'MIL', 'KCF', 'TLD', 'MEDIANFLOW', 'GOTURN']
    tracker_type = tracker_types[2]

    # cv2.Tracker_create() was removed in OpenCV 3.3; newer versions use
    # the per-type factory functions.
    if int(major_ver) == 3 and int(minor_ver) < 3:
        tracker = cv2.Tracker_create(tracker_type)
    else:
        if tracker_type == 'BOOSTING':
            tracker = cv2.TrackerBoosting_create()
        elif tracker_type == 'MIL':
            tracker = cv2.TrackerMIL_create()
        elif tracker_type == 'KCF':
            tracker = cv2.TrackerKCF_create()
        elif tracker_type == 'TLD':
            tracker = cv2.TrackerTLD_create()
        elif tracker_type == 'MEDIANFLOW':
            tracker = cv2.TrackerMedianFlow_create()
        elif tracker_type == 'GOTURN':
            tracker = cv2.TrackerGOTURN_create()

    # Read video
    video = cv2.VideoCapture("E:/sample/qiusai.mp4")

    # Exit if video not opened.
    if not video.isOpened():
        print("Could not open video")
        sys.exit()

    # Read first frame.
    ok, frame = video.read()
    if not ok:
        print("Cannot read video file")
        sys.exit()

    # Define an initial bounding box
    # bbox = (52, 28, 184, 189)  # top-left x, y, then width, height
    # Or select one interactively:
    bbox = cv2.selectROI(frame, False)

    # Initialize tracker with first frame and bounding box
    ok = tracker.init(frame, bbox)

    num = 0
    while True:
        num = num + 1
        # Every 500 frames, re-create the tracker and re-select the target.
        if num % 500 == 0:
            tracker = cv2.TrackerKCF_create()
            bbox = cv2.selectROI(frame, False)
            ok = tracker.init(frame, bbox)

        # Read a new frame
        ok, frame = video.read()
        if not ok:
            break

        # Start timer
        timer = cv2.getTickCount()

        # Update tracker
        ok, bbox = tracker.update(frame)

        # Calculate frames per second (FPS)
        fps = cv2.getTickFrequency() / (cv2.getTickCount() - timer)

        # Draw bounding box
        if ok:
            # Tracking success
            p1 = (int(bbox[0]), int(bbox[1]))
            p2 = (int(bbox[0] + bbox[2]), int(bbox[1] + bbox[3]))
            cv2.rectangle(frame, p1, p2, (255, 0, 0), 2, 1)
        else:
            # Tracking failure
            cv2.putText(frame, "Tracking failure detected", (100, 80),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 0, 255), 2)

        # Display tracker type on frame
        cv2.putText(frame, tracker_type + " Tracker", (100, 20),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.75, (50, 170, 50), 2)

        # Display FPS on frame
        cv2.putText(frame, "FPS : " + str(int(fps)), (100, 50),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.75, (50, 170, 50), 2)

        # Display result
        cv2.imshow("Tracking", frame)

        # Exit if ESC pressed
        k = cv2.waitKey(1) & 0xff
        if k == 27:
            break
Reposted from: https://blog.51cto.com/yixianwei/2096041