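The link below collects all of my notes on fast-reid (BoT person re-identification). If you spot any mistakes, please point them out and I will correct them right away. If you are interested, feel free to add me on WeChat (a944284742) to discuss the technical details. If this helped you, please remember to give it a like, since that is my biggest encouragement.
Person Re-identification 02-00: fast-reid (BoT) - Table of Contents - the latest, most thorough walkthrough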
In fastreid\engine\train_loop.py, find the class SimpleTrainer(TrainerBase), where you will see the following code:
class SimpleTrainer(TrainerBase):
    ......
    def run_step(self):
        # Forward pass. `data` is the current training batch
        # (how it is fetched is omitted from this excerpt).
        outputs, targets = self.model(data)
        # Compute loss
        if isinstance(self.model, DistributedDataParallel):
            loss_dict = self.model.module.losses(outputs, targets)
        else:
            loss_dict = self.model.losses(outputs, targets)
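As shown here, the data is first fed into the constructed model for a forward pass, and once the predictions are available the loss is computed. So which model is this, and how is it built? In fact, the model here is an object created from class Baseline(nn.Module) in fastreid\modeling\meta_arch\baseline.py.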
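The isinstance branch in run_step is needed because a DistributedDataParallel wrapper only forwards calls to forward(); a custom method such as losses stays on the wrapped model and is reached through .module. A minimal sketch of that pattern, using plain-Python stand-ins so it runs without PyTorch (ToyModel, ToyWrapper, unwrap, and this run_step are hypothetical names for illustration, not fast-reid or PyTorch APIs):

```python
class ToyModel:
    """Stand-in for Baseline: forward via __call__, plus a custom losses()."""

    def __call__(self, batch):
        # Pretend the "prediction" is the input itself and return the targets too.
        return batch["images"], batch["targets"]

    def losses(self, outputs, targets):
        # Mean absolute error, just so there is a number to look at.
        diffs = [abs(o - t) for o, t in zip(outputs, targets)]
        return {"loss_toy": sum(diffs) / len(diffs)}


class ToyWrapper:
    """Stand-in for a parallel wrapper: exposes only __call__ and .module."""

    def __init__(self, module):
        self.module = module

    def __call__(self, batch):
        return self.module(batch)


def unwrap(model):
    # Same idea as the isinstance check in run_step: reach the real model.
    return model.module if isinstance(model, ToyWrapper) else model


def run_step(model, batch):
    outputs, targets = model(batch)                # forward pass
    return unwrap(model).losses(outputs, targets)  # loss on the real model


batch = {"images": [1.0, 2.0, 4.0], "targets": [1.0, 3.0, 2.0]}
plain = run_step(ToyModel(), batch)
wrapped = run_step(ToyWrapper(ToyModel()), batch)
print(plain, wrapped)
```

Both calls give the same loss dictionary; calling losses directly on the wrapper would fail, because the wrapper does not define that method.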
My annotated version of class Baseline(nn.Module) is as follows:
@META_ARCH_REGISTRY.register()
class Baseline(nn.Module):
    def __init__(self, cfg):
        super().__init__()
        self._cfg = cfg
        # Data preprocessing parameters: per-channel mean and std, shaped for broadcasting
        assert len(cfg.MODEL.PIXEL_MEAN) == len(cfg.MODEL.PIXEL_STD)
        self.register_buffer("pixel_mean", torch.tensor(cfg.MODEL.PIXEL_MEAN).view(1, -1, 1, 1))
        self.register_buffer("pixel_std", torch.tensor(cfg.MODEL.PIXEL_STD).view(1, -1, 1, 1))
        # backbone: build the backbone network from the config, e.g. ResNet50
        self.backbone = build_backbone(cfg)
        # head: read the pooling type for the head, then build the matching pooling layer
        pool_type = cfg.MODEL.HEADS.POOL_LAYER
        if pool_type == 'fastavgpool': pool_layer = FastGlobalAvgPool2d()
        elif pool_type == 'avgpool': pool_layer = nn.AdaptiveAvgPool2d(1)
        elif pool_type == 'maxpool': pool_layer = nn.AdaptiveMaxPool2d(1)
        elif pool_type == 'gempool': pool_layer = GeneralizedMeanPoolingP()
        elif pool_type == "avgmaxpool": pool_layer = AdaptiveAvgMaxPool2d()
        elif pool_type == "identity": pool_layer = nn.Identity()
        else:
            raise KeyError(f"{pool_type} is invalid, please choose from "
                           f"'avgpool', 'maxpool', 'gempool', 'avgmaxpool' and 'identity'.")
        # Number of input channels for the head, and number of classes output by the fully connected layer
        in_feat = cfg.MODEL.HEADS.IN_FEAT
        num_classes = cfg.MODEL.HEADS.NUM_CLASSES
        # Build the head from these parameters
        self.heads = build_reid_heads(cfg, in_feat, num_classes, pool_layer)

    @property
    def device(self):
        return self.pixel_mean.device
    def forward(self, batched_inputs):
        # Preprocess the input data
        images = self.preprocess_image(batched_inputs)
        # Extract features with the backbone
        features = self.backbone(images)
        # Training mode
        if self.training:
            assert "targets" in batched_inputs, "Person ID annotation are missing in training!"
            # Fetch the labels
            targets = batched_inputs["targets"].long().to(self.device)
            # PreciseBN flag, When do preciseBN on different dataset, the number of classes in new dataset
            # may be larger than that in the original dataset, so the circle/arcface will
            # throw an error. We just set all the targets to 0 to avoid this problem.
            if targets.sum() < 0: targets.zero_()
            # Feed the backbone features into the head, which returns the tuple
            # (cls_outputs, pred_class_logits, feat)
            return self.heads(features, targets), targets
        else:
            return self.heads(features)

    def preprocess_image(self, batched_inputs):
        """
        Preprocess the input data by normalizing it.
        Normalize and batch the input images.
        """
        if isinstance(batched_inputs, dict):
            images = batched_inputs["images"].to(self.device)
        elif isinstance(batched_inputs, torch.Tensor):
            images = batched_inputs.to(self.device)
        # Note: any other input type leaves `images` undefined at this point.
        # In-place per-channel normalization: (x - mean) / std
        images.sub_(self.pixel_mean).div_(self.pixel_std)
        return images
    def losses(self, outputs, gt_labels):
        r"""
        Compute loss from modeling's outputs, the loss function input arguments
        must be the same as the outputs of the model forwarding.
        """
        # Unpack the model outputs: cls_outputs is the classifier output used for the
        # classification loss, pred_class_logits holds the per-class scores used to log
        # accuracy, and pred_features is the predicted feature vector
        cls_outputs, pred_class_logits, pred_features = outputs
        loss_dict = {}
        # Names of the losses to compute
        loss_names = self._cfg.MODEL.LOSSES.NAME
        # Log prediction accuracy
        CrossEntropyLoss.log_accuracy(pred_class_logits.detach(), gt_labels)
        # Cross-entropy loss
        if "CrossEntropyLoss" in loss_names:
            loss_dict['loss_cls'] = CrossEntropyLoss(self._cfg)(cls_outputs, gt_labels)
        # Triplet loss
        if "TripletLoss" in loss_names:
            loss_dict['loss_triplet'] = TripletLoss(self._cfg)(pred_features, gt_labels)
        # Circle loss
        if "CircleLoss" in loss_names:
            loss_dict['loss_circle'] = CircleLoss(self._cfg)(pred_features, gt_labels)
        return loss_dict
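The pool_type branch in __init__ decides how the backbone's H×W feature map is reduced to a single value per channel. Here is a rough sketch of three of those reductions on one channel, written in plain Python rather than PyTorch. The function names and the dictionary dispatch are illustrative assumptions, the GeM formula is the standard generalized-mean definition rather than a copy of GeneralizedMeanPoolingP, and p = 3 is just an assumed example value:

```python
def avg_pool(values):
    """Global average pooling over one channel."""
    return sum(values) / len(values)


def max_pool(values):
    """Global max pooling over one channel."""
    return max(values)


def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized mean: (mean(x^p))^(1/p), with inputs clamped to stay positive."""
    clamped = [max(v, eps) for v in values]
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)


# Hypothetical name-to-function table, mirroring the if/elif chain.
POOLS = {"avgpool": avg_pool, "maxpool": max_pool, "gempool": gem_pool}


def build_pool(pool_type):
    if pool_type not in POOLS:
        raise KeyError(f"{pool_type} is invalid, please choose from {sorted(POOLS)}")
    return POOLS[pool_type]


channel = [0.5, 1.0, 2.0, 4.0]  # one channel of a 2x2 feature map, flattened

for name in ("avgpool", "gempool", "maxpool"):
    print(name, build_pool(name)(channel))
```

With p = 1 the generalized mean reduces to average pooling, and as p grows it approaches max pooling, so GeM sits between the two.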
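The structure above is straightforward, so I will not walk through it here. However, for the self.heads built by:
self.heads = build_reid_heads(cfg, in_feat, num_classes, pool_layer)
my annotations are given below.
self.heads is an object created from class BNneckHead(nn.Module) in fastreid\modeling\heads\bnneck_head.py: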
@REID_HEADS_REGISTRY.register()
class BNneckHead(nn.Module):
    def __init__(self, cfg, in_feat, num_classes, pool_layer):
        super().__init__()
        # From the config: which feature forward() returns, the one "before" or "after" the BN neck
        self.neck_feat = cfg.MODEL.HEADS.NECK_FEAT
        # Store the pooling layer, which is passed in from outside
        self.pool_layer = pool_layer
        # Build the normalization layer from the config; this corresponds to the BN layer in the paper
        self.bnneck = get_norm(cfg.MODEL.HEADS.NORM, in_feat, cfg.MODEL.HEADS.NORM_SPLIT, bias_freeze=True)
        # Initialize the bnneck weights
        self.bnneck.apply(weights_init_kaiming)
        # identity classification layer: choose the classifier type from the config
        cls_type = cfg.MODEL.HEADS.CLS_LAYER
        if cls_type == 'linear': self.classifier = nn.Linear(in_feat, num_classes, bias=False)
        elif cls_type == 'arcSoftmax': self.classifier = ArcSoftmax(cfg, in_feat, num_classes)
        elif cls_type == 'circleSoftmax': self.classifier = CircleSoftmax(cfg, in_feat, num_classes)
        elif cls_type == 'amSoftmax': self.classifier = AMSoftmax(cfg, in_feat, num_classes)
        else:
            raise KeyError(f"{cls_type} is invalid, please choose from "
                           f"'linear', 'arcSoftmax', 'amSoftmax' and 'circleSoftmax'.")
        # Initialize the classifier weights
        self.classifier.apply(weights_init_classifier)
    def forward(self, features, targets=None):
        """
        See :class:`ReIDHeads.forward`.
        """
        # Pool the features from the backbone; this corresponds to ft in the paper
        global_feat = self.pool_layer(features)
        # Normalize with bnneck; this corresponds to the BN layer in the paper
        bn_feat = self.bnneck(global_feat)
        # Drop the trailing 1x1 spatial dimensions
        bn_feat = bn_feat[..., 0, 0]
        # Evaluation: in eval mode, return bn_feat directly
        if not self.training: return bn_feat
        # Training: feed bn_feat into the classifier; if the classifier's forward
        # requires the targets as well, the call is retried with them
        try: cls_outputs = self.classifier(bn_feat)
        except TypeError: cls_outputs = self.classifier(bn_feat, targets)
        # Plain linear logits from the classifier weights, used to predict the identity
        pred_class_logits = F.linear(bn_feat, self.classifier.weight)
        # If self.neck_feat == "before", return global_feat, which is ft in the paper
        if self.neck_feat == "before": feat = global_feat[..., 0, 0]
        # If self.neck_feat == "after", return bn_feat, which is fi in the paper
        elif self.neck_feat == "after": feat = bn_feat
        else:
            raise KeyError("MODEL.HEADS.NECK_FEAT value is invalid, must choose from ('after' & 'before')")
        return cls_outputs, pred_class_logits, feat
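The data flow in BNneckHead.forward (pooled feature, then BN, then a bias-free linear layer, with either the pre-BN or post-BN feature returned) can be traced on a tiny example. This is only a plain-Python sketch under simplifying assumptions: the features are already pooled, batch normalization uses the current batch statistics with scale 1 and shift 0, and batch_norm, linear_no_bias, toy_bnneck_head, and the weight values are all hypothetical, not fast-reid code:

```python
import math


def batch_norm(batch, eps=1e-5):
    """Normalize each feature dimension over the batch (scale 1, shift 0)."""
    n, dim = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dim)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dim)]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dim)]
        for row in batch
    ]


def linear_no_bias(batch, weight):
    """logits[i][c] = dot(batch[i], weight[c]), i.e. a linear layer without bias."""
    return [[sum(x * w for x, w in zip(row, w_row)) for w_row in weight] for row in batch]


def toy_bnneck_head(global_feat, weight, neck_feat="before"):
    bn_feat = batch_norm(global_feat)         # BN neck
    logits = linear_no_bias(bn_feat, weight)  # identity logits from the BN feature
    if neck_feat == "before":
        feat = global_feat                    # ft in the paper
    elif neck_feat == "after":
        feat = bn_feat                        # fi in the paper
    else:
        raise KeyError("neck_feat must be 'before' or 'after'")
    return logits, feat


# 3 samples, 2 feature dimensions with very different scales (already pooled).
global_feat = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 identities, made-up values

logits, feat_before = toy_bnneck_head(global_feat, weight, "before")
_, feat_after = toy_bnneck_head(global_feat, weight, "after")
print(feat_before[0], feat_after[0])
```

The two input dimensions differ in scale by a factor of ten, yet after the BN step they take nearly identical values, while the "before" feature is returned untouched.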
With that, we now understand how the BoT network is built.
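Both Baseline and BNneckHead carry a ...REGISTRY.register() decorator, which suggests that the builder functions look a class up by a name read from the config. A minimal sketch of that decorator-registry pattern, assuming a simple name-to-class dictionary (ToyRegistry, ToyBNneckHead, build_head, and the config keys are hypothetical and do not reproduce fast-reid's actual implementation):

```python
class ToyRegistry:
    """Maps class names to classes so a config string can select one."""

    def __init__(self, name):
        self._name = name
        self._obj_map = {}

    def register(self):
        def decorator(cls):
            if cls.__name__ in self._obj_map:
                raise KeyError(f"{cls.__name__} is already registered in {self._name}")
            self._obj_map[cls.__name__] = cls
            return cls  # hand the class back unchanged
        return decorator

    def get(self, name):
        if name not in self._obj_map:
            raise KeyError(f"No object named {name} found in {self._name} registry")
        return self._obj_map[name]


HEADS = ToyRegistry("HEADS")


@HEADS.register()
class ToyBNneckHead:
    def __init__(self, in_feat, num_classes):
        self.in_feat = in_feat
        self.num_classes = num_classes


def build_head(cfg):
    # Look the class up by the name stored in the config, then instantiate it.
    head_cls = HEADS.get(cfg["HEAD_NAME"])
    return head_cls(cfg["IN_FEAT"], cfg["NUM_CLASSES"])


head = build_head({"HEAD_NAME": "ToyBNneckHead", "IN_FEAT": 2048, "NUM_CLASSES": 751})
print(type(head).__name__, head.in_feat, head.num_classes)
```

The appeal of this design is that adding a new head only requires decorating the new class; the builder and the training loop stay unchanged.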