This post documents my process of learning about DB and provides a PyTorch implementation of it.
Fine-grained Recognition: Accounting for Subtle Differences between Similar Classes
After reading the paper, I find its idea of "suppressing the most salient feature to force the network to learn other features" genuinely inspiring; unfortunately, the corresponding code was never open-sourced.
DB (diversification block) suppresses the most salient regions, forcing the network to learn other parts.
Peak Suppression randomly suppresses peak locations: for the positions where the feature map is maximal, a Bernoulli draw decides whether to mask them.
(Does P_peak = 1 in the paper mean the peak is always suppressed?)
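A minimal sketch of what peak suppression could look like with an explicit Bernoulli draw (the function name peak_suppression and parameter p_peak are mine, not from the paper's code):

import torch

def peak_suppression(feature_map, p_peak=0.5):
    # B'_c: 1 at every maximal location of a 2D map, kept or dropped
    # as a whole by a single Bernoulli trial with probability p_peak.
    mask = (feature_map == feature_map.max()).float()
    keep = torch.bernoulli(torch.tensor(p_peak))
    return mask * keep

Under this reading, p_peak = 1 would mean the peak is always masked, which would answer the question above.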
Patch Suppression targets regions that are not peaks but are still worth suppressing: the feature map is first divided into patches, and each patch is then masked independently according to a Bernoulli draw.
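A corresponding sketch for patch suppression, assuming one independent Bernoulli trial per patch (patch_suppression and p_patch are my names; r and c follow the reference code below):

import torch

def patch_suppression(feature_map, r=3, c=4, p_patch=0.1):
    # B''_c: split a 2D map into an r x c grid and mask each
    # patch independently with probability p_patch.
    rows, cols = feature_map.shape
    block_r, block_c = rows // r, cols // c
    mask = torch.zeros_like(feature_map)
    draws = torch.bernoulli(torch.full((r, c), p_patch))  # one trial per patch
    for i in range(r):
        for j in range(c):
            if draws[i, j] == 1:
                # the last row/column of patches absorbs any remainder
                end_r = rows if i == r - 1 else (i + 1) * block_r
                end_c = cols if j == c - 1 else (j + 1) * block_c
                mask[i * block_r:end_r, j * block_c:end_c] = 1
    return mask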
Finally, Activation Suppression decides whether to suppress and how strongly, according to the formula given in the paper.
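A minimal sketch, assuming the suppression multiplies masked activations by a factor alpha (suppress and alpha are my names; the exact formula is in the paper):

import torch

def suppress(feature_map, mask, alpha=0.1):
    # Keep unmasked activations as-is; scale masked ones by alpha.
    # The multiplicative form and alpha are assumptions here, not
    # necessarily the paper's exact Activation Suppression formula.
    return feature_map * (1 - mask) + alpha * feature_map * mask

The combined mask would be torch.clamp(b1 + b2, 0, 1), with b1 and b2 coming from the two steps above.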
Reference code: https://github.com/JerryMazeyu/fine-grained2019AAAI
The reference code does not use pk (P_peak in the paper), nor does it implement P_patch or the Bernoulli sampling from the paper.
import os
import sys

# Locate the project root so that local modules resolve.
project_index = os.getcwd().find('fine-grained2019AAAI')
root = os.getcwd()[0:project_index] + 'fine-grained2019AAAI'
sys.path.append(root)

import torch
from torch import nn
import numpy as np


class DiversificationBlock(nn.Module):
    def __init__(self, pk=0.5, r=3, c=4):
        """
        Implements the diversification block from the paper: takes a 3D feature
        map and returns a list of numpy arrays to be used as masks.
        :param pk: probability of random masking in B'_c (unused in this reference code)
        :param r: number of row blocks in B''_c
        :param c: number of column blocks in B''_c
        """
        super(DiversificationBlock, self).__init__()
        self.pk = pk
        self.r = r
        self.c = c

    def forward(self, feature_maps):
        def helperb1(feature_map):
            # B'_c: mask that is 1 at every peak (maximum) location of a 2D map.
            row, col = torch.where(feature_map == torch.max(feature_map))
            b1 = torch.zeros_like(feature_map)
            for i in range(len(row)):
                r, c = int(row[i]), int(col[i])
                b1[r, c] = 1
            return b1

        def from_num_to_block(mat, r, c, num):
            # B''_c: mask that is 1 over patch number `num` of an r x c grid
            # (patches are numbered 1..r*c in row-major order).
            assert len(mat.shape) == 2, "Feature map shape is wrong!"
            res = np.zeros_like(mat)
            row, col = mat.shape
            block_r, block_c = int(row / r), int(col / c)
            index = np.arange(r * c) + 1
            index = index.reshape(r, c)
            index_r, index_c = np.argwhere(index == num)[0]
            # The last row/column of patches absorbs any remainder.
            end_c = col if index_c + 1 == c else (index_c + 1) * block_c
            end_r = row if index_r + 1 == r else (index_r + 1) * block_r
            res[index_r * block_r:end_r, index_c * block_c:end_c] = 1
            return res

        if len(feature_maps.shape) == 3:
            resb1 = []
            resb2 = []
            feature_maps_list = torch.split(feature_maps, 1)
            for feature_map in feature_maps_list:
                feature_map = feature_map.squeeze(0)
                resb1.append(helperb1(feature_map))
                # NOTE: the patch index is hardcoded to 3 here; the paper draws
                # one Bernoulli trial per patch instead.
                resb2.append(from_num_to_block(feature_map, self.r, self.c, 3))
        elif len(feature_maps.shape) == 2:
            resb1 = [helperb1(feature_maps)]
            resb2 = [from_num_to_block(feature_maps, self.r, self.c, 3)]
        else:
            raise ValueError("Expected a 2D or 3D feature map!")

        # Merge B'_c and B''_c into one binary mask per feature map.
        res = [np.clip(resb1[x].numpy() + resb2[x], 0, 1) for x in range(len(resb1))]
        return res


if __name__ == '__main__':
    feature_maps = torch.rand([3, 3, 4])
    print("feature maps is: ", feature_maps)
    db = DiversificationBlock()
    res = db(feature_maps)
    print(res, len(res))
torch.split(feature_maps, 1) splits feature_maps along dim=0 (the default) into chunks of size 1, one tensor per feature map.
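For example, with the same shape as the __main__ test above:

>>> import torch
>>> parts = torch.split(torch.rand(3, 3, 4), 1)
>>> len(parts), parts[0].shape
(3, torch.Size([1, 3, 4]))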
torch.where(feature_map == torch.max(feature_map)) returns the coordinates of the maximum value(s) in feature_map.
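When the maximum appears more than once, every peak location is returned, which is why helperb1 loops over the results:

>>> import torch
>>> fm = torch.tensor([[1., 5., 2.], [5., 0., 3.]])
>>> torch.where(fm == torch.max(fm))
(tensor([0, 1]), tensor([1, 0]))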
np.argwhere(index == num)[0] returns the coordinates of the first position whose index value equals num.
>>> x = np.arange(6).reshape(2,3)
>>> x
array([[0, 1, 2],
[3, 4, 5]])
>>> np.argwhere(x>1)
array([[0, 2],
[1, 0],
[1, 1],
[1, 2]])