The Precise RoI Pooling (PrRoI Pooling) method was proposed by Megvii (旷视科技) at ECCV 2018 as part of their paper Acquisition of Localization Confidence for Accurate Object Detection. Its main idea is as follows:
Given an image feature map $\mathcal{F}$, let $(i,j)$ be a discrete coordinate on the feature map and $w_{i,j}$ the feature value stored at position $(i,j)$. Bilinear interpolation is used to avoid any quantization, so the feature map can be treated as continuous:

$$f(x, y)=\sum_{i, j} IC(x, y, i, j) \times w_{i, j}$$

where $IC(x, y, i, j)=\max (0,1-|x-i|) \times \max (0,1-|y-j|)$ is the interpolation coefficient. Note that both $w$ and $f$ are feature maps: $w$ is the discrete feature map (feature maps are normally discrete), while $f$ is the continuous feature map obtained by interpolation.
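The interpolation formula above can be sketched in a few lines of plain Python. This is only an illustration, not the authors' implementation; the names `interp_coeff` and `f`, and the convention that `w[i][j]` is stored with the first index paired with $x$ and the second with $y$, are assumptions made for this example.

```python
def interp_coeff(x, y, i, j):
    """Interpolation coefficient IC(x, y, i, j) from the formula above."""
    return max(0.0, 1.0 - abs(x - i)) * max(0.0, 1.0 - abs(y - j))


def f(w, x, y):
    """Continuous feature value f(x, y): the IC-weighted sum over all w[i][j].

    Assumption: w is a list of lists and w[i][j] sits at integer coordinate
    (i, j), so the first index pairs with x and the second with y.
    """
    return sum(
        interp_coeff(x, y, i, j) * w_ij
        for i, row in enumerate(w)
        for j, w_ij in enumerate(row)
    )


w = [[1.0, 2.0],
     [3.0, 4.0]]
print(f(w, 0.0, 0.0))  # on a grid point, f reproduces w[0][0]
print(f(w, 0.5, 0.5))  # cell centre: mean of the four surrounding values
```

Because $IC$ vanishes whenever $|x-i| \ge 1$ or $|y-j| \ge 1$, at most four grid points contribute to any $f(x, y)$; the full double loop is kept here only for readability.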
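Why convert the discrete feature map into a continuous one? The main reason is to eliminate the accumulated error introduced by quantization. See the article at https://www.jianshu.com/p/2a5ffca8b861 for a detailed analysis.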
Now take one bin of an RoI, $bin=\left\{\left(x_{1}, y_{1}\right),\left(x_{2}, y_{2}\right)\right\}$, where $\left(x_{1}, y_{1}\right)$ and $\left(x_{2}, y_{2}\right)$ are the top-left and bottom-right corners of the rectangle, as shown in the figure below (these coordinates are continuous values, since the feature map has already been interpolated):

Given this bin (with continuous coordinates) and the original feature map $\mathcal{F}$, we can define a pooling operation in terms of a double integral:

$$\operatorname{PrPool}(bin, \mathcal{F})=\frac{\int_{y_1}^{y_2} \int_{x_1}^{x_2} f(x, y)\,\mathrm{d}x\,\mathrm{d}y}{\left(x_{2}-x_{1}\right) \times\left(y_{2}-y_{1}\right)}$$
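As the formula shows, PrPool integrates the continuous feature map over the bin (the continuous analogue of summing the values inside it) and divides by the bin's area. In other words, it is the exact mean of $f$ over the bin.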
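Since $IC$ is a product of a term in $x$ and a term in $y$, the double integral separates into two one-dimensional integrals of the hat function $\max(0, 1-|t|)$, which have a simple closed form. The sketch below uses that observation to evaluate PrPool exactly. It is an illustration under stated assumptions, not the reference implementation: the helper names `hat_cdf` and `pr_pool` are made up here, and `w[i][j]` is assumed to pair its first index with $x$.

```python
def hat_cdf(t):
    """Integral of the hat function max(0, 1 - |s|) for s from -1 up to t."""
    if t <= -1.0:
        return 0.0
    if t <= 0.0:
        return 0.5 * (t + 1.0) ** 2
    if t <= 1.0:
        return 0.5 + t - 0.5 * t * t
    return 1.0


def pr_pool(w, x1, y1, x2, y2):
    """Exact mean of the bilinearly interpolated map over [x1, x2] x [y1, y2].

    Assumption: w[i][j] is the feature at integer coordinate (i, j), with the
    first index along x. Outside the grid the interpolated map is zero, which
    follows from the sum in f(x, y) only running over existing grid points.
    """
    total = 0.0
    for i, row in enumerate(w):
        # Integral over x of max(0, 1 - |x - i|) between x1 and x2.
        wx = hat_cdf(x2 - i) - hat_cdf(x1 - i)
        if wx == 0.0:
            continue
        for j, w_ij in enumerate(row):
            # Integral over y of max(0, 1 - |y - j|) between y1 and y2.
            wy = hat_cdf(y2 - j) - hat_cdf(y1 - j)
            total += w_ij * wx * wy
    return total / ((x2 - x1) * (y2 - y1))


w = [[1.0, 2.0],
     [3.0, 4.0]]
print(pr_pool(w, 0.0, 0.0, 1.0, 1.0))  # mean of f over the single grid cell
```

A practical implementation would restrict the loops to the grid points near the bin; iterating over the whole map keeps the sketch short.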
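Here is the figure from the paper comparing RoI Pooling, RoI Align, and PrRoI Pooling:

In the figure, the red dashed line marks the position of the region proposal on the feature map. RoI Pooling is the most basic of the three: it simply rounds the coordinates to integers, which loses precision. RoI Align interpolates first, then samples $N$ points inside each bin ($N = 4$ in the figure, drawn as four solid red dots) and averages the interpolated values at those points. Like RoI Align, PrRoI Pooling interpolates the discrete feature map into a continuous space. Unlike RoI Align, it does not sample a fixed set of points; it pools by taking the double integral and dividing by the area. Compared with RoI Align, the main benefit of PrRoI Pooling is that it removes the sampling count $N$, which is a fixed hyperparameter that does not adapt to the size of the bin.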
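To make the role of $N$ concrete, the sketch below compares a simplified RoI Align-style average (sampling the interpolated map at the centres of an $n \times n$ grid inside one bin) with the exact integral mean. This is a toy comparison, not the reference RoI Align code; the function names, the feature map values, and the bin coordinates are all assumptions chosen for illustration.

```python
def hat(t):
    """One-dimensional interpolation kernel max(0, 1 - |t|)."""
    return max(0.0, 1.0 - abs(t))


def hat_cdf(t):
    """Integral of hat(s) for s from -1 up to t."""
    if t <= -1.0:
        return 0.0
    if t <= 0.0:
        return 0.5 * (t + 1.0) ** 2
    if t <= 1.0:
        return 0.5 + t - 0.5 * t * t
    return 1.0


def f(w, x, y):
    """Bilinearly interpolated feature value; w[i][j] has i along x (assumed)."""
    return sum(hat(x - i) * hat(y - j) * w_ij
               for i, row in enumerate(w)
               for j, w_ij in enumerate(row))


def sampled_average(w, x1, y1, x2, y2, n):
    """RoI Align-style estimate: mean of f at the centres of n x n sub-cells."""
    dx = (x2 - x1) / n
    dy = (y2 - y1) / n
    total = 0.0
    for a in range(n):
        for b in range(n):
            total += f(w, x1 + (a + 0.5) * dx, y1 + (b + 0.5) * dy)
    return total / (n * n)


def pr_pool(w, x1, y1, x2, y2):
    """Exact integral mean of f over the bin, using the separable closed form."""
    total = 0.0
    for i, row in enumerate(w):
        wx = hat_cdf(x2 - i) - hat_cdf(x1 - i)
        for j, w_ij in enumerate(row):
            total += w_ij * wx * (hat_cdf(y2 - j) - hat_cdf(y1 - j))
    return total / ((x2 - x1) * (y2 - y1))


# Toy feature map with a ridge along x = 1, and an arbitrary bin.
w = [[0.0, 0.0, 0.0],
     [1.0, 1.0, 1.0],
     [0.0, 0.0, 0.0]]
bin_ = (0.2, 0.1, 1.7, 1.9)
exact = pr_pool(w, *bin_)
for n in (1, 2, 4, 16):
    approx = sampled_average(w, *bin_, n)
    print(n, approx, abs(approx - exact))
```

The sampled estimate depends on `n`, whereas the integral has no such parameter, which is the point the comparison above makes.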
In addition, $\operatorname{PrPool}(bin, \mathcal{F})$ is differentiable with respect to the bin coordinates. For example, its partial derivative with respect to $x_1$ is:
$$
\begin{aligned}
\frac{\partial \operatorname{PrPool}(bin,\mathcal{F})}{\partial x_1}
&= \frac{\partial}{\partial x_1}\left[\frac{\int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y}{(x_2-x_1)\times(y_2-y_1)}\right] \\
&= \frac{\frac{\partial \int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y}{\partial x_1}\times(x_2-x_1)\times(y_2-y_1) - \int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y\times\frac{\partial\left[(x_2-x_1)\times(y_2-y_1)\right]}{\partial x_1}}{\left[(x_2-x_1)\times(y_2-y_1)\right]^2} \\
&= \frac{\frac{\partial \int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y}{\partial x_1}\times(x_2-x_1)\times(y_2-y_1)}{\left[(x_2-x_1)\times(y_2-y_1)\right]^2} - \frac{\int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y\times\frac{\partial\left[(x_2-x_1)\times(y_2-y_1)\right]}{\partial x_1}}{\left[(x_2-x_1)\times(y_2-y_1)\right]^2} \\
&= \frac{\frac{\partial \int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y}{\partial x_1}}{(x_2-x_1)\times(y_2-y_1)} - \frac{\int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y\times\left[-1\times(y_2-y_1)\right]}{\left[(x_2-x_1)\times(y_2-y_1)\right]^2} \\
&= \frac{\frac{\partial \int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y}{\partial x_1}}{(x_2-x_1)\times(y_2-y_1)} - \frac{\int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y}{(x_2-x_1)\times(y_2-y_1)}\times\frac{-1\times(y_2-y_1)}{(x_2-x_1)\times(y_2-y_1)} \\
&= \frac{\frac{\partial \int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y}{\partial x_1}}{(x_2-x_1)\times(y_2-y_1)} - \operatorname{PrPool}(bin,\mathcal{F})\times\frac{-1\times(y_2-y_1)}{(x_2-x_1)\times(y_2-y_1)} \\
&= \frac{\frac{\partial \int_{y_1}^{y_2}\int_{x_1}^{x_2} f(x,y)\,\mathrm{d}x\,\mathrm{d}y}{\partial x_1}}{(x_2-x_1)\times(y_2-y_1)} + \frac{\operatorname{PrPool}(bin,\mathcal{F})}{x_2-x_1} \\
&= -\frac{\int_{y_1}^{y_2} f(x_1,y)\,\mathrm{d}y}{(x_2-x_1)\times(y_2-y_1)} + \frac{\operatorname{PrPool}(bin,\mathcal{F})}{x_2-x_1} \\
&= \frac{\operatorname{PrPool}(bin,\mathcal{F})}{x_2-x_1} - \frac{\int_{y_1}^{y_2} f(x_1,y)\,\mathrm{d}y}{(x_2-x_1)\times(y_2-y_1)}
\end{aligned}
$$
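One way to sanity-check the final expression is to compare it with a central finite difference of PrPool in $x_1$. The sketch below does this in plain Python. It is only a numerical check under assumptions made for this example: the helper names, the feature map values, the bin, and the step size `eps` are all illustrative, and `w[i][j]` is assumed to pair its first index with $x$.

```python
def hat(t):
    """One-dimensional interpolation kernel max(0, 1 - |t|)."""
    return max(0.0, 1.0 - abs(t))


def hat_cdf(t):
    """Integral of hat(s) for s from -1 up to t."""
    if t <= -1.0:
        return 0.0
    if t <= 0.0:
        return 0.5 * (t + 1.0) ** 2
    if t <= 1.0:
        return 0.5 + t - 0.5 * t * t
    return 1.0


def pr_pool(w, x1, y1, x2, y2):
    """Exact integral mean of the interpolated map over the bin."""
    total = 0.0
    for i, row in enumerate(w):
        wx = hat_cdf(x2 - i) - hat_cdf(x1 - i)
        for j, w_ij in enumerate(row):
            total += w_ij * wx * (hat_cdf(y2 - j) - hat_cdf(y1 - j))
    return total / ((x2 - x1) * (y2 - y1))


def grad_x1(w, x1, y1, x2, y2):
    """Closed-form dPrPool/dx1 taken from the last line of the derivation."""
    edge = 0.0  # integral of f(x1, y) over y from y1 to y2
    for i, row in enumerate(w):
        for j, w_ij in enumerate(row):
            edge += w_ij * hat(x1 - i) * (hat_cdf(y2 - j) - hat_cdf(y1 - j))
    width = x2 - x1
    return pr_pool(w, x1, y1, x2, y2) / width - edge / (width * (y2 - y1))


w = [[1.0, 2.0, 0.5],
     [3.0, 4.0, 1.5],
     [0.0, 2.5, 1.0]]
x1, y1, x2, y2 = 0.3, 0.4, 1.8, 1.6
eps = 1e-5  # finite-difference step (illustrative choice)
numeric = (pr_pool(w, x1 + eps, y1, x2, y2)
           - pr_pool(w, x1 - eps, y1, x2, y2)) / (2.0 * eps)
analytic = grad_x1(w, x1, y1, x2, y2)
print(analytic, numeric)  # the two values should agree closely
```

The derivatives with respect to $x_2$, $y_1$, and $y_2$ can be checked the same way once their closed forms are written out.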
References: