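Non-maximum suppression (NMS) is widely used in object detection. Its main purpose is to eliminate redundant bounding boxes and keep the best detection for each object.
The idea is to sort all boxes by confidence and select the box A with the highest confidence as the reference. Every other box B is compared against A under a preset threshold: if the overlap between B and A exceeds the threshold, B is discarded. The box with the highest confidence among the remaining boxes is then selected, and the procedure repeats until no boxes are left.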
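The overlap between two boxes is usually measured as intersection over union (IoU). The following is a minimal sketch of that measure for a single pair of boxes, assuming boxes are given as (xmin, ymin, xmax, ymax) in inclusive pixel coordinates (hence the +1 terms). The helper name `box_iou` is hypothetical and used only for illustration.

```python
def box_iou(a, b):
    """IoU of two boxes (xmin, ymin, xmax, ymax), inclusive pixel coordinates.

    Hypothetical helper for illustration only.
    """
    # Width and height of the intersection rectangle, clamped at zero
    # so that disjoint boxes yield an empty intersection.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = iw * ih

    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)

    # Union = sum of both areas minus the part counted twice.
    return inter / (area_a + area_b - inter)


# Two heavily overlapping boxes: IoU is roughly 0.73.
print(box_iou((255, 118, 360, 235), (257, 118, 380, 250)))
```

With a threshold of 0.3, a pair like this would count as the same object, so the lower-confidence box of the two would be discarded.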
import numpy as np
dets = np.array([
    [204, 102, 358, 250, 0.5],
    [257, 118, 380, 250, 0.7],
    [280, 135, 400, 250, 0.6],
    [255, 118, 360, 235, 0.7]])
thresh = 0.3
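Here, each row of the two-dimensional array dets describes one box, and its columns are xmin, ymin, xmax, ymax, confidence.
In the Python code, the core of the NMS implementation lives in lib/nms/py_cpu_nms.py. The rest of this post walks through it in detail (the values in the comments are for the first loop iteration on the sample data above):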
import numpy as np

def py_cpu_nms(dets, thresh):
    """Pure Python NMS baseline."""
    x1 = dets[:, 0]      # xmin
    y1 = dets[:, 1]      # ymin
    x2 = dets[:, 2]      # xmax
    y2 = dets[:, 3]      # ymax
    scores = dets[:, 4]  # confidence

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)  # area of every bbox (inclusive pixel coordinates)
    order = scores.argsort()[::-1]  # indices sorted by decreasing confidence: array([3, 1, 2, 0])

    keep = []  # indices of the boxes that survive
    while order.size > 0:
        i = order[0]    # index of the remaining bbox with the highest confidence
        keep.append(i)  # it is always kept

        # Corners of the intersection between box i and every other remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])  # array([ 257., 280., 255.])
        yy1 = np.maximum(y1[i], y1[order[1:]])  # array([ 118., 135., 118.])
        xx2 = np.minimum(x2[i], x2[order[1:]])  # array([ 360., 360., 358.])
        yy2 = np.minimum(y2[i], y2[order[1:]])  # array([ 235., 235., 235.])

        # Clamp at zero so that non-overlapping boxes get an empty intersection
        w = np.maximum(0.0, xx2 - xx1 + 1)  # array([ 104., 81., 104.])
        h = np.maximum(0.0, yy2 - yy1 + 1)  # array([ 118., 101., 118.])
        inter = w * h                       # array([ 12272., 8181., 12272.])

        # IoU = intersection / (area of box i + area of other box - intersection)
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        # Keep only the boxes whose overlap with box i is at most thresh
        inds = np.where(ovr <= thresh)[0]
        order = order[inds + 1]  # +1 because inds indexes into order[1:]

    return keep
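To see the loop in action, here is a minimal sketch that runs the same greedy procedure on the sample `dets` from above. The function `nms` is a compact restatement written for this example, not the code from the repository. It uses `kind="stable"` in `argsort` as an added assumption, so that the tie between the two boxes with confidence 0.7 is broken deterministically.

```python
import numpy as np


def nms(dets, thresh):
    """Greedy NMS; a compact restatement so that this block runs on its own."""
    x1, y1, x2, y2, scores = dets.T
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    # kind="stable" is an assumption added here to make tie-breaking deterministic.
    order = scores.argsort(kind="stable")[::-1]
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        # Intersection of box i with every remaining box.
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[rest] - inter)
        # Drop every box that overlaps box i by more than thresh.
        order = rest[ovr <= thresh]
    return keep


dets = np.array([
    [204, 102, 358, 250, 0.5],
    [257, 118, 380, 250, 0.7],
    [280, 135, 400, 250, 0.6],
    [255, 118, 360, 235, 0.7]])

print(nms(dets, 0.3))  # [3]
print(nms(dets, 0.6))  # [3, 2, 0]
```

With thresh = 0.3, all three other boxes overlap box 3 by more than the threshold (IoU of about 0.73, 0.45 and 0.53), so only box 3 survives. Raising the threshold to 0.6 tolerates more overlap, and boxes 2 and 0 are kept as well.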