frootguo

[PyTorch] Mask R-CNN Official Source Code Walkthrough (III)

Model Definition (modeling): Key Components

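Both the training script and the test script covered earlier call build_detection_model(cfg) to create the model. This function assembles different model types from the configuration file. To understand how the model is defined internally, we need to go through the files under ./maskrcnn_benchmark/modeling/: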

detector: the entry point of the model definition

detectors.py walkthrough:
Instantiates a GeneralizedRCNN object from the given configuration.

from .generalized_rcnn import GeneralizedRCNN

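# Dictionary of meta-architectures. It holds a single key/value pair for now, but makes later extension easy
_DETECTION_META_ARCHITECTURES = {"GeneralizedRCNN": GeneralizedRCNN}
# Entry point for model creation, and the only model-creation function
def build_detection_model(cfg):
    # Look up the meta-architecture class named in the config
    meta_arch = _DETECTION_META_ARCHITECTURES[cfg.MODEL.META_ARCHITECTURE]
    return meta_arch(cfg)
    # When META_ARCHITECTURE is "GeneralizedRCNN", the statement above is equivalent to
    # return GeneralizedRCNN(cfg)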

The code above uses the configuration cfg to instantiate the GeneralizedRCNN class, which is defined in ./maskrcnn_benchmark/modeling/detector/generalized_rcnn.py.
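
The dictionary lookup behind build_detection_model can be illustrated with a small framework-free sketch. ToyRCNN, TOY_META_ARCHITECTURES, build_toy_model, and the plain-dict config are hypothetical stand-ins for illustration only; they are not part of maskrcnn_benchmark.

```python
# Minimal sketch of string-keyed model dispatch (hypothetical names throughout).
class ToyRCNN:
    """Stand-in for GeneralizedRCNN; it only remembers its config."""

    def __init__(self, cfg):
        self.cfg = cfg


# Maps the name found in the config to the class that implements it.
TOY_META_ARCHITECTURES = {"ToyRCNN": ToyRCNN}


def build_toy_model(cfg):
    # Look up the class by name, then instantiate it with the config.
    meta_arch = TOY_META_ARCHITECTURES[cfg["META_ARCHITECTURE"]]
    return meta_arch(cfg)


model = build_toy_model({"META_ARCHITECTURE": "ToyRCNN"})
print(type(model).__name__)  # prints "ToyRCNN"
```

Supporting another architecture would then only require one more entry in the dictionary, which is the extensibility argument made in the comment above.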

generalized_rcnn.py walkthrough:

import torch
from torch import nn

from maskrcnn_benchmark.structures.image_list import to_image_list

from ..backbone import build_backbone
from ..rpn.rpn import build_rpn
from ..roi_heads.roi_heads import build_roi_heads


class GeneralizedRCNN(nn.Module):
    """
    Main class for Generalized R-CNN. Currently supports boxes and masks.
    It consists of three main parts:
    - backbone
    - rpn
    - heads: takes the features + the proposals from the RPN and computes
        detections / masks from it.
    """

    def __init__(self, cfg): # initialize the model from the config
        super(GeneralizedRCNN, self).__init__()

        self.backbone = build_backbone(cfg) # build the backbone network from the config
        self.rpn = build_rpn(cfg, self.backbone.out_channels) # build the RPN from the config
        self.roi_heads = build_roi_heads(cfg, self.backbone.out_channels) # build the roi_heads from the config

    def forward(self, images, targets=None):
        """
        Defines the forward pass of the model.
        Arguments:
            images (list[Tensor] or ImageList): images to be processed
            targets (list[BoxList]): ground-truth boxes present in the image (optional)
        Returns:
            result (list[BoxList] or dict[Tensor]): the output from the model.
                During training, it returns a dict[Tensor] which contains the losses.
                During testing, it returns list[BoxList] contains additional fields
                like `scores`, `labels` and `mask` (for Mask R-CNN models).
        """
        # In training mode, targets must be provided
        if self.training and targets is None:
            raise ValueError("In training mode, targets should be passed")
        images = to_image_list(images) # convert the input images to an ImageList
        features = self.backbone(images.tensors) # extract image features with the backbone
        # Run the RPN to get the proposals and the corresponding losses
        proposals, proposal_losses = self.rpn(images, features, targets)
        if self.roi_heads: # if roi_heads is non-empty, compute its outputs
            x, result, detector_losses = self.roi_heads(features, proposals, targets)
        else:
            # RPN-only models don't have roi_heads
            x = features
            result = proposals
            detector_losses = {}

        if self.training: # in training mode, return the losses
            losses = {}
            losses.update(detector_losses)
            losses.update(proposal_losses)
            return losses

        return result # outside training mode, return the model's predictions

As shown above, model creation in maskrcnn_benchmark relies mainly on three functions: build_backbone(cfg), build_rpn(cfg, in_channels), and build_roi_heads(cfg, in_channels).
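
The branching in GeneralizedRCNN.forward (a loss dict when training, predictions otherwise) can be sketched without PyTorch. Everything here is a hypothetical stand-in: toy_rpn and toy_roi_heads return fixed values, and the loss key names are placeholders chosen for illustration.

```python
# Framework-free sketch of the train/inference branching in GeneralizedRCNN.forward.
# toy_rpn and toy_roi_heads are hypothetical stand-ins that return fixed values.
def toy_rpn(features, targets):
    proposals = ["proposal_a", "proposal_b"]
    losses = {"loss_objectness": 0.5, "loss_rpn_box_reg": 0.2} if targets else {}
    return proposals, losses


def toy_roi_heads(features, proposals, targets):
    losses = {"loss_classifier": 0.7, "loss_box_reg": 0.3} if targets else {}
    detections = [p + "_refined" for p in proposals]
    return features, detections, losses


def toy_forward(features, targets=None, training=False, use_roi_heads=True):
    if training and targets is None:
        raise ValueError("In training mode, targets should be passed")
    proposals, proposal_losses = toy_rpn(features, targets)
    if use_roi_heads:
        _, result, detector_losses = toy_roi_heads(features, proposals, targets)
    else:
        # RPN-only model: the proposals are the final result
        result, detector_losses = proposals, {}
    if training:
        # Merge the losses of both parts into a single dict
        losses = {}
        losses.update(detector_losses)
        losses.update(proposal_losses)
        return losses
    return result


train_out = toy_forward("feats", targets=["gt"], training=True)
eval_out = toy_forward("feats")
print(sorted(train_out))
print(eval_out)
```

The point of the sketch is that the same call returns two different types depending on the mode, so calling code has to know which mode the model is in.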

The backbone directory: definition of the model backbone

backbone.py walkthrough:

from collections import OrderedDict # ordered dictionary

from torch import nn

# Registry that manages module registration, so that modules can be looked up like dictionary entries
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.modeling.make_layers import conv_with_kaiming_uniform
from . import fpn as fpn_module
from . import resnet

# Builds the ResNet backbone; called by build_backbone() below, depending on the config
@registry.BACKBONES.register("R-50-C4")
@registry.BACKBONES.register("R-50-C5")
@registry.BACKBONES.register("R-101-C4")
@registry.BACKBONES.register("R-101-C5")
def build_resnet_backbone(cfg):
    body = resnet.ResNet(cfg) # class ResNet from resnet.py
    model = nn.Sequential(OrderedDict([("body", body)])) # wrap the body in an nn.Sequential
    model.out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
    return model

# Builds the ResNet + FPN backbone; called by build_backbone() below, depending on the config
@registry.BACKBONES.register("R-50-FPN")
@registry.BACKBONES.register("R-101-FPN")
@registry.BACKBONES.register("R-152-FPN")
def build_resnet_fpn_backbone(cfg):
    body = resnet.ResNet(cfg) # build the ResNet first
    # channel settings needed by the FPN
    in_channels_stage2 = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
    out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
    fpn = fpn_module.FPN( # build the FPN with class FPN from fpn.py
        in_channels_list=[
            in_channels_stage2,
            in_channels_stage2 * 2,
            in_channels_stage2 * 4,
            in_channels_stage2 * 8,
        ],
        out_channels=out_channels,
        conv_block=conv_with_kaiming_uniform(
            cfg.MODEL.FPN.USE_GN, cfg.MODEL.FPN.USE_RELU
        ),
        top_blocks=fpn_module.LastLevelMaxPool(),
    )
    model = nn.Sequential(OrderedDict([("body", body), ("fpn", fpn)]))
    model.out_channels = out_channels
    return model


@registry.BACKBONES.register("R-50-FPN-RETINANET")
@registry.BACKBONES.register("R-101-FPN-RETINANET")
def build_resnet_fpn_p3p7_backbone(cfg):
    body = resnet.ResNet(cfg)
    in_channels_stage2 = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
    out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
    in_channels_p6p7 = in_channels_stage2 * 8 if cfg.MODEL.RETINANET.USE_C5 \
        else out_channels
    fpn = fpn_module.FPN(
        in_channels_list=[
            0,
            in_channels_stage2 * 2,
            in_channels_stage2 * 4,
            in_channels_stage2 * 8,
        ],
        out_channels=out_channels,
        conv_block=conv_with_kaiming_uniform(
            cfg.MODEL.FPN.USE_GN, cfg.MODEL.FPN.USE_RELU
        ),
        top_blocks=fpn_module.LastLevelP6P7(in_channels_p6p7, out_channels),
    )
    model = nn.Sequential(OrderedDict([("body", body), ("fpn", fpn)]))
    model.out_channels = out_channels
    return model

# Creates the backbone by dispatching to one of the builder functions above
def build_backbone(cfg):
    assert cfg.MODEL.BACKBONE.CONV_BODY in registry.BACKBONES, \
        "cfg.MODEL.BACKBONE.CONV_BODY: {} are not registered in registry".format(
            cfg.MODEL.BACKBONE.CONV_BODY
        )
    return registry.BACKBONES[cfg.MODEL.BACKBONE.CONV_BODY](cfg)
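
The decorator-based registration used above can be sketched as follows. ToyRegistry is a simplified illustration written for this walkthrough, not the library's actual Registry class; TOY_BACKBONES, build_toy_resnet, and build_toy_backbone are likewise hypothetical.

```python
# Simplified sketch of a decorator-based registry (not the library's actual class).
class ToyRegistry(dict):
    def register(self, name):
        # Returns a decorator that stores the function under `name`
        # and hands the function back unchanged, so decorators can be stacked.
        def decorator(fn):
            self[name] = fn
            return fn
        return decorator


TOY_BACKBONES = ToyRegistry()


@TOY_BACKBONES.register("R-50-C4")
@TOY_BACKBONES.register("R-101-C4")
def build_toy_resnet(cfg):
    return "resnet body for " + cfg["CONV_BODY"]


def build_toy_backbone(cfg):
    # Same shape as build_backbone: check the key, then dispatch.
    assert cfg["CONV_BODY"] in TOY_BACKBONES, \
        "unknown backbone: {}".format(cfg["CONV_BODY"])
    return TOY_BACKBONES[cfg["CONV_BODY"]](cfg)


print(build_toy_backbone({"CONV_BODY": "R-101-C4"}))  # prints "resnet body for R-101-C4"
```

Because each decorator returns the function unchanged, several names can point at the same builder, which is how one function serves R-50-C4, R-50-C5, R-101-C4, and R-101-C5 in backbone.py.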

resnet.py walkthrough:

from collections import namedtuple

import torch
import torch.nn.functional as F
from torch import nn

from maskrcnn_benchmark.layers import FrozenBatchNorm2d
from maskrcnn_benchmark.layers import Conv2d
from maskrcnn_benchmark.layers import DFConv2d
from maskrcnn_benchmark.modeling.make_layers import group_norm
from maskrcnn_benchmark.utils.registry import Registry


# ResNet stage specification
StageSpec = namedtuple(
    "StageSpec",
    [
        "index",  # Index of the stage, eg 1, 2, ..,. 5
        "block_count",  # Number of residual blocks in the stage
        "return_features",  # True => return the last feature map from this stage
    ],
)

# -----------------------------------------------------------------------------
# Standard ResNet models
# -----------------------------------------------------------------------------
# ResNet-50 (including all stages)
ResNet50StagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, False), (4, 3, True))
)
# ResNet-50 up to stage 4 (excludes stage 5)
ResNet50StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, True))
)
# ResNet-101 (including all stages)
ResNet101StagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 23, False), (4, 3, True))
)
# ResNet-101 up to stage 4 (excludes stage 5)
ResNet101StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 23, True))
)
# ResNet-50-FPN (including all stages)
ResNet50FPNStagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, True), (2, 4, True), (3, 6, True), (4, 3, True))
)
# ResNet-101-FPN (including all stages)
ResNet101FPNStagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, True), (2, 4, True), (3, 23, True), (4, 3, True))
)
# ResNet-152-FPN (including all stages)
ResNet152FPNStagesTo5 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, True), (2, 8, True), (3, 36, True), (4, 3, True))
)

class ResNet(nn.Module):
    # Initialization
    def __init__(self, cfg):
        super(ResNet, self).__init__()
        # If cfg is needed in forward(), store a copy of it for later use:
        # self.cfg = cfg.clone()

        # Translate the strings in the config into concrete implementations. The three lookups
        # below use the registries defined at the end of this file
        # Stem implementation, i.e. conv1, the first stage of ResNet
        # e.g. cfg.MODEL.RESNETS.STEM_FUNC = "StemWithFixedBatchNorm"
        stem_module = _STEM_MODULES[cfg.MODEL.RESNETS.STEM_FUNC]
        # Stage specification for conv2_x ~ conv5_x
        # e.g. cfg.MODEL.BACKBONE.CONV_BODY = "R-50-FPN"
        stage_specs = _STAGE_SPECS[cfg.MODEL.BACKBONE.CONV_BODY]
        # residual transformation function
        # e.g. cfg.MODEL.RESNETS.TRANS_FUNC = "BottleneckWithFixedBatchNorm"
        transformation_module = _TRANSFORMATION_MODULES[cfg.MODEL.RESNETS.TRANS_FUNC]

        # With the building blocks resolved, the model can be assembled
        # Construct the stem module (stage 1 of ResNet, i.e. conv1)
        self.stem = stem_module(cfg)
        

        # Gather the settings needed to build the remaining ResNet stages
        # num_groups = 1 gives ResNet; num_groups > 1 gives ResNeXt
        num_groups = cfg.MODEL.RESNETS.NUM_GROUPS
        width_per_group = cfg.MODEL.RESNETS.WIDTH_PER_GROUP
        # in_channels is the channel count fed into the second stage, i.e. the stem output (64 by default)
        in_channels = cfg.MODEL.RESNETS.STEM_OUT_CHANNELS
        stage2_bottleneck_channels = num_groups * width_per_group
        # Output channels of the second stage. In the standard ResNet family the channel counts of
        # the later stages follow from this value: with 256 they are 512, 1024, 2048; with 64 they are 128, 256, 512
        stage2_out_channels = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
        # Create an empty list of stage names and a dict recording which stages return their features
        self.stages = []
        self.return_features = {}
        for stage_spec in stage_specs:
            name = "layer" + str(stage_spec.index)
            # Channel multiplier relative to stage 2; it doubles with every stage
            stage2_relative_factor = 2 ** (stage_spec.index - 1)
            # Bottleneck (compressed) channel count inside the blocks of this stage
            bottleneck_channels = stage2_bottleneck_channels * stage2_relative_factor
            # Output channel count of this stage
            out_channels = stage2_out_channels * stage2_relative_factor
            stage_with_dcn = cfg.MODEL.RESNETS.STAGE_WITH_DCN[stage_spec.index -1]
            # With all parameters gathered, call this file's _make_stage function,
            # which builds the module for the stage from the arguments it receives
            module = _make_stage(
                transformation_module,
                in_channels, # input channels
                bottleneck_channels, # compressed (bottleneck) channels
                out_channels, # output channels
                stage_spec.block_count, # number of residual blocks in this stage
                num_groups, # 1 for ResNet, > 1 for ResNeXt
                cfg.MODEL.RESNETS.STRIDE_IN_1X1,
                # stages 3~5 downsample at their first block with stride=2
                first_stride=int(stage_spec.index > 1) + 1,
                dcn_config={
                    "stage_with_dcn": stage_with_dcn,
                    "with_modulated_dcn": cfg.MODEL.RESNETS.WITH_MODULATED_DCN,
                    "deformable_groups": cfg.MODEL.RESNETS.DEFORMABLE_GROUPS,
                }
            )
            # The input channel count of the next stage is the output channel count of this one
            in_channels = out_channels
            # Register the stage module on the model
            self.add_module(name, module)
            # Record the stage name
            self.stages.append(name)
            # Record whether this stage's output should be returned
            self.return_features[name] = stage_spec.return_features

        # Optionally freeze (requires_grad=False) parts of the backbone,
        # as selected by the config
        self._freeze_backbone(cfg.MODEL.BACKBONE.FREEZE_CONV_BODY_AT)

    # Set requires_grad = False on the selected parameters
    def _freeze_backbone(self, freeze_at):
        # Freeze the stem and the stages before index freeze_at so that they are not updated
        if freeze_at < 0:
            return
        for stage_index in range(freeze_at):
            if stage_index == 0:
                m = self.stem  # stage 0 is the stem
            else:
                m = getattr(self, "layer" + str(stage_index))
            # mark every parameter of m as not trainable
            for p in m.parameters():
                p.requires_grad = False

    # Forward pass of ResNet
    def forward(self, x):
        outputs = []
        x = self.stem(x) # the stem (stage 1) comes first
        # then stages 2~5 in order
        for stage_name in self.stages:
            x = getattr(self, stage_name)(x)
            if self.return_features[stage_name]:
                # collect the feature maps of the stages flagged with return_features
                outputs.append(x)
        # outputs is a list with one feature map per returned stage, which is exactly what FPN takes as input
        return outputs


class ResNetHead(nn.Module):
    def __init__(
        self,
        block_module,
        stages,
        num_groups=1,
        width_per_group=64,
        stride_in_1x1=True,
        stride_init=None,
        res2_out_channels=256,
        dilation=1,
        dcn_config={}
    ):
        super(ResNetHead, self).__init__()
        # Channel multiplier of the first stage of this head relative to stage 2
        stage2_relative_factor = 2 ** (stages[0].index - 1)
        # Bottleneck (compressed) channels of stage 2
        stage2_bottleneck_channels = num_groups * width_per_group
        # Output channels
        out_channels = res2_out_channels * stage2_relative_factor
        # Input channels
        in_channels = out_channels // 2
        # Bottleneck (compressed) channels of this head
        bottleneck_channels = stage2_bottleneck_channels * stage2_relative_factor

        block_module = _TRANSFORMATION_MODULES[block_module]

        self.stages = []
        stride = stride_init
        for stage in stages:
            name = "layer" + str(stage.index)
            if not stride:
                # stages 3~5 downsample at their first block with stride=2
                stride = int(stage.index > 1) + 1
            module = _make_stage(
                block_module,
                in_channels,
                bottleneck_channels,
                out_channels,
                stage.block_count,
                num_groups,
                stride_in_1x1,
                first_stride=stride,
                dilation=dilation,
                dcn_config=dcn_config
            )
            stride = None
            self.add_module(name, module)
            self.stages.append(name)
        self.out_channels = out_channels

    # Forward pass
    def forward(self, x):
        for stage in self.stages:
            x = getattr(self, stage)(x)
        return x

# Build one ResNet stage as a sequence of residual blocks
def _make_stage(
    transformation_module,
    in_channels,
    bottleneck_channels,
    out_channels,
    block_count,
    num_groups,
    stride_in_1x1,
    first_stride,
    dilation=1,
    dcn_config={}
):
    blocks = []
    stride = first_stride
    for _ in range(block_count):
        blocks.append(
            transformation_module(
                in_channels,
                bottleneck_channels,
                out_channels,
                num_groups,
                stride_in_1x1,
                stride,
                dilation=dilation,
                dcn_config=dcn_config
            )
        )
        stride = 1
        in_channels = out_channels
    return nn.Sequential(*blocks)

# A single ResNet bottleneck block
# In ResNet-50, stages 2~5 contain 3, 4, 6, and 3 bottleneck blocks respectively, and the channel count doubles from one stage to the next, so the channels of any stage can easily be derived from those of another
class Bottleneck(nn.Module):
    def __init__(
        self,
        in_channels, # input channels of the bottleneck
        bottleneck_channels, # compressed channels inside the bottleneck
        out_channels, # output channels of the current stage
        num_groups,
        stride_in_1x1,
        stride,
        dilation,
        norm_func,
        dcn_config
    ):
        super(Bottleneck, self).__init__()
        # downsample: when the input and output channel counts of the block differ,
        # a projection shortcut is used, i.e. a learned mapping that brings the
        # input to the output channel count
        self.downsample = None
        # In that case an extra 1x1 conv maps the input channels to the output channels
        if in_channels != out_channels:
            down_stride = stride if dilation == 1 else 1
            self.downsample = nn.Sequential(
                Conv2d(
                    in_channels, out_channels,
                    kernel_size=1, stride=down_stride, bias=False
                ),
                norm_func(out_channels), # followed by the normalization layer (frozen BN or GN, depending on norm_func)
            )
            for modules in [self.downsample,]:
                for l in modules.modules():
                    if isinstance(l, Conv2d):
                        nn.init.kaiming_uniform_(l.weight, a=1)

        if dilation > 1:
            stride = 1 # reset to be 1

        # The original MSRA ResNet models have stride in the first 1x1 conv
        # The subsequent fb.torch.resnet and Caffe2 ResNe[X]t implementations have
        # stride in the 3x3 conv
        stride_1x1, stride_3x3 = (stride, 1) if stride_in_1x1 else (1, stride)
        # With the parameters resolved, create the conv layers of the block
        self.conv1 = Conv2d(
            in_channels,
            bottleneck_channels,
            kernel_size=1,
            stride=stride_1x1,
            bias=False,
        )
        self.bn1 = norm_func(bottleneck_channels) # followed by a normalization layer
        # TODO: specify init for the above (open question for the DCN layer)
        with_dcn = dcn_config.get("stage_with_dcn", False)
        if with_dcn:
            deformable_groups = dcn_config.get("deformable_groups", 1)
            with_modulated_dcn = dcn_config.get("with_modulated_dcn", False)
            self.conv2 = DFConv2d(
                bottleneck_channels,
                bottleneck_channels,
                with_modulated_dcn=with_modulated_dcn,
                kernel_size=3,
                stride=stride_3x3,
                groups=num_groups,
                dilation=dilation,
                deformable_groups=deformable_groups,
                bias=False
            )
        else:
            # the second conv layer of the bottleneck
            self.conv2 = Conv2d(
                bottleneck_channels,
                bottleneck_channels,
                kernel_size=3,
                stride=stride_3x3,
                padding=dilation,
                bias=False,
                groups=num_groups,
                dilation=dilation
            )
            nn.init.kaiming_uniform_(self.conv2.weight, a=1)

        self.bn2 = norm_func(bottleneck_channels) # followed by a normalization layer

        # The last conv layer of the bottleneck: a 1x1 conv, so no padding is needed
        self.conv3 = Conv2d(
            bottleneck_channels, out_channels, kernel_size=1, bias=False
        )
        self.bn3 = norm_func(out_channels)

        for l in [self.conv1, self.conv3,]:
            nn.init.kaiming_uniform_(l.weight, a=1)

    def forward(self, x):
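        # One forward call runs one bottleneck block
        # By default it has three conv layers, each followed by a normalization layer, plus a shortcut connection
        # Note: ReLU follows the first two norm layers; the last ReLU is applied after the shortcut is added
        identity = x # shortcut branch: the identity mapping by default
        # conv1, bn1
        out = self.conv1(x)
        out = self.bn1(out)
        out = F.relu_(out)
        # conv2, bn2
        out = self.conv2(out)
        out = self.bn2(out)
        out = F.relu_(out)
        # conv3, bn3
        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            # when the input and output channel counts differ, project the input to match
            identity = self.downsample(x)

        out += identity
        out = F.relu_(out) # the final activation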

        return out

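# The first stage of ResNet. In ResNet-50 it is essentially a single 7x7 conv. For convenience, the maskrcnn_benchmark implementation also places the max pooling layer that opens the second stage inside the stem's forward function (parameter-free layers are generally implemented in forward)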
class BaseStem(nn.Module):
    def __init__(self, cfg, norm_func):
        super(BaseStem, self).__init__()
        # ResNet-50: out_channels = 64
        out_channels = cfg.MODEL.RESNETS.STEM_OUT_CHANNELS
        # 3 input channels, 64 output channels
        self.conv1 = Conv2d(
            3, out_channels, kernel_size=7, stride=2, padding=3, bias=False
        )
        # normalization layer (frozen BN or GN, depending on norm_func)
        self.bn1 = norm_func(out_channels)
        # weight initialization
        for l in [self.conv1,]:
            nn.init.kaiming_uniform_(l.weight, a=1)
    # Forward pass
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.relu_(x) # in-place activation; it has no parameters, so it lives in forward rather than in the model definition
        x = F.max_pool2d(x, kernel_size=3, stride=2, padding=1)
        return x

# Variants that use frozen (fixed-parameter) BN
class BottleneckWithFixedBatchNorm(Bottleneck):
    def __init__(
        self,
        in_channels,
        bottleneck_channels,
        out_channels,
        num_groups=1,
        stride_in_1x1=True,
        stride=1,
        dilation=1,
        dcn_config={}
    ):
        super(BottleneckWithFixedBatchNorm, self).__init__(
            in_channels=in_channels,
            bottleneck_channels=bottleneck_channels,
            out_channels=out_channels,
            num_groups=num_groups,
            stride_in_1x1=stride_in_1x1,
            stride=stride,
            dilation=dilation,
            norm_func=FrozenBatchNorm2d,
            dcn_config=dcn_config
        )


class StemWithFixedBatchNorm(BaseStem):
    def __init__(self, cfg):
        super(StemWithFixedBatchNorm, self).__init__(
            cfg, norm_func=FrozenBatchNorm2d
        )


class BottleneckWithGN(Bottleneck):
    def __init__(
        self,
        in_channels,
        bottleneck_channels,
        out_channels,
        num_groups=1,
        stride_in_1x1=True,
        stride=1,
        dilation=1,
        dcn_config={}
    ):
        super(BottleneckWithGN, self).__init__(
            in_channels=in_channels,
            bottleneck_channels=bottleneck_channels,
            out_channels=out_channels,
            num_groups=num_groups,
            stride_in_1x1=stride_in_1x1,
            stride=stride,
            dilation=dilation,
            norm_func=group_norm,
            dcn_config=dcn_config
        )


class StemWithGN(BaseStem):
    def __init__(self, cfg):
        super(StemWithGN, self).__init__(cfg, norm_func=group_norm)


# Registries of this file. The strings in the config decide which class or stage specification is used
_TRANSFORMATION_MODULES = Registry({
    "BottleneckWithFixedBatchNorm": BottleneckWithFixedBatchNorm,
    "BottleneckWithGN": BottleneckWithGN,
})

_STEM_MODULES = Registry({
    "StemWithFixedBatchNorm": StemWithFixedBatchNorm,
    "StemWithGN": StemWithGN,
})

_STAGE_SPECS = Registry({
    "R-50-C4": ResNet50StagesTo4,
    "R-50-C5": ResNet50StagesTo5,
    "R-101-C4": ResNet101StagesTo4,
    "R-101-C5": ResNet101StagesTo5,
    "R-50-FPN": ResNet50FPNStagesTo5,
    "R-50-FPN-RETINANET": ResNet50FPNStagesTo5,
    "R-101-FPN": ResNet101FPNStagesTo5,
    "R-101-FPN-RETINANET": ResNet101FPNStagesTo5,
    "R-152-FPN": ResNet152FPNStagesTo5,
})
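
The channel and stride arithmetic in ResNet.__init__ is easy to check by hand. The sketch below reproduces it with plain integers; stage_channels and conv_out_size are hypothetical helper names, and the numeric settings (num_groups=1, width_per_group=64, res2_out_channels=256, a 224x224 input) are assumed for illustration rather than read from a real config.

```python
# Channel and stride arithmetic from ResNet.__init__, reproduced with plain integers.
# Assumed settings: num_groups=1, width_per_group=64, res2_out_channels=256.
def stage_channels(index, num_groups=1, width_per_group=64, res2_out_channels=256):
    factor = 2 ** (index - 1)                       # doubles with every stage
    bottleneck = num_groups * width_per_group * factor
    out = res2_out_channels * factor
    first_stride = int(index > 1) + 1               # 1 for the first stage, 2 afterwards
    return bottleneck, out, first_stride


def conv_out_size(n, kernel, stride, padding):
    # Standard output-size formula for a conv or pooling layer (dilation 1).
    return (n + 2 * padding - kernel) // stride + 1


for index in (1, 2, 3, 4):
    print("layer{}".format(index), stage_channels(index))

# Stem: 7x7 conv (stride 2, padding 3) followed by 3x3 max pool (stride 2, padding 1).
stem_size = conv_out_size(conv_out_size(224, 7, 2, 3), 3, 2, 1)
print("stem output size for a 224 input:", stem_size)  # prints 56
```

Under these assumptions layer1 to layer4 end at 256, 512, 1024, and 2048 channels, which matches the in_channels_list passed to FPN in backbone.py, and the stem reduces the spatial size by a factor of 4.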

fpn.py walkthrough:

In backbone.py, build_resnet_fpn_backbone(cfg) creates an instance of the FPN class with fpn = fpn_module.FPN(...), chains ResNet and FPN into a single model with nn.Sequential(), and returns it. The FPN implementation lives in ./maskrcnn_benchmark/modeling/backbone/fpn.py and is walked through below:

import torch
import torch.nn.functional as F
from torch import nn


class FPN(nn.Module):
    """
    Adds an FPN on top of a list of feature maps (in practice, the last output of each of stages 2~5).
    The feature maps are assumed to be in increasing depth order, and must be consecutive (stage-wise).
    """

    def __init__(
        self, in_channels_list, out_channels, conv_block, top_blocks=None
    ):
        """
        Arguments:
            in_channels_list (list[int]): number of channels for each feature map that
                will be fed
            out_channels (int): number of channels of the FPN representation;
                every feature map is eventually converted to this channel count
            top_blocks (nn.Module or None): if provided, an extra operation will
                be performed on the output of the last (smallest resolution)
                FPN output, and the result will extend the result list
        """
        super(FPN, self).__init__()
        # Create two empty lists of module names
        self.inner_blocks = []
        self.layer_blocks = []
        # With the R-50-FPN configuration, in_channels_list is
        # [256, 512, 1024, 2048]
        for idx, in_channels in enumerate(in_channels_list, 1): # indices start at 1
            # Names derived from the index: fpn_inner1, fpn_inner2, fpn_inner3, fpn_inner4
            inner_block = "fpn_inner{}".format(idx)
            # fpn_layer1, fpn_layer2, fpn_layer3, fpn_layer4
            layer_block = "fpn_layer{}".format(idx)

            if in_channels == 0:
                continue
            # Create the inner_block module; in_channels is the output channel count of the stage
            # out_channels is 256 here, as set in the config
            # A 1x1 conv whose main job is to change the channel count to out_channels (dimension reduction)
            inner_block_module = conv_block(in_channels, out_channels, 1)
            # After the channel change, a 3x3 conv is applied on each level's feature map; the channel count stays the same
            layer_block_module = conv_block(out_channels, out_channels, 3, 1)
            # Register both modules for the current feature level
            self.add_module(inner_block, inner_block_module)
            self.add_module(layer_block, layer_block_module)
            # Record the module names for this level in the corresponding lists
            self.inner_blocks.append(inner_block)
            self.layer_blocks.append(layer_block)
        # Keep top_blocks as a member of FPN
        self.top_blocks = top_blocks

    def forward(self, x):
        """
        Arguments:
            x (list[Tensor]): feature maps for each feature level.
            The output of ResNet matches this input format, which is why the two can be chained with nn.Sequential
        Returns:
            results (tuple[Tensor]): feature maps after FPN layers.
                They are ordered from highest resolution first.
        """
        # Start with the last (lowest resolution) feature map
        last_inner = getattr(self, self.inner_blocks[-1])(x[-1])
        # Create an empty result list
        results = []
        # Append the output for the last level
        results.append(getattr(self, self.layer_blocks[-1])(last_inner))
        # [:-1] takes all but the last item (the first three here); [::-1] slices with step -1, i.e. reverses the list
        # For example, self.inner_blocks[:-1][::-1] evaluates to
        # [fpn_inner3, fpn_inner2, fpn_inner1]
        for feature, inner_block, layer_block in zip(
            x[:-1][::-1], self.inner_blocks[:-1][::-1], self.layer_blocks[:-1][::-1]
        ):
            if not inner_block:
                continue
            # Rescale the feature map by the given scale factor; scale_factor=2 here, so it is upsampled
            inner_top_down = F.interpolate(last_inner, scale_factor=2, mode="nearest")
            # Lateral connection: output of inner_block on the current feature map
            inner_lateral = getattr(self, inner_block)(feature)
            # TODO use size instead of scale to make it robust to different sizes
            # inner_top_down = F.upsample(last_inner, size=inner_lateral.shape[-2:],
            # mode='bilinear', align_corners=False)
            # Sum the two; the sum is the merged map of this level and the top-down input of the next level
            last_inner = inner_lateral + inner_top_down
            # Append this level's output to the results, after running it through layer_block
            # Insert at position 0 so that the highest resolution comes first
            results.insert(0, getattr(self, layer_block)(last_inner))

        # If top_blocks is set, run the extra op below
        if isinstance(self.top_blocks, LastLevelP6P7):
            last_results = self.top_blocks(x[-1], results[-1])
            results.extend(last_results)
        elif isinstance(self.top_blocks, LastLevelMaxPool):
            last_results = self.top_blocks(results[-1])
            results.extend(last_results) # append the new outputs to the list
        # Return as a (read-only) tuple
        return tuple(results)

# Max pooling layer applied on the last level
class LastLevelMaxPool(nn.Module):
    def forward(self, x):
        return [F.max_pool2d(x, 1, 2, 0)]


class LastLevelP6P7(nn.Module):
    """
    This module is used in RetinaNet to generate extra layers, P6 and P7.
    """
    def __init__(self, in_channels, out_channels):
        super(LastLevelP6P7, self).__init__()
        self.p6 = nn.Conv2d(in_channels, out_channels, 3, 2, 1)
        self.p7 = nn.Conv2d(out_channels, out_channels, 3, 2, 1)
        for module in [self.p6, self.p7]:
            nn.init.kaiming_uniform_(module.weight, a=1)
            nn.init.constant_(module.bias, 0)
        self.use_P5 = in_channels == out_channels

    def forward(self, c5, p5):
        x = p5 if self.use_P5 else c5
        p6 = self.p6(x)
        p7 = self.p7(F.relu(p6))
        return [p6, p7]
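
The top-down pathway in FPN.forward can be followed on tiny single-channel grids in pure Python. This is a sketch under a strong simplification: the 1x1 lateral convs and the 3x3 output convs are treated as identity, so only the "upsample by 2 and add" data flow is shown. upsample_nearest_2x, add, and toy_fpn are hypothetical helper names.

```python
# Pure-Python sketch of the FPN top-down pathway on tiny single-channel grids.
# Assumption: the lateral and output convs are replaced by identity mappings.
def upsample_nearest_2x(grid):
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]   # repeat each column twice
        out.append(wide)
        out.append(list(wide))                      # repeat each row twice
    return out


def add(a, b):
    # Element-wise sum of two grids with the same shape.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def toy_fpn(features):
    # features are ordered from highest to lowest resolution
    last_inner = features[-1]
    results = [last_inner]
    for feature in features[:-1][::-1]:
        last_inner = add(feature, upsample_nearest_2x(last_inner))
        results.insert(0, last_inner)               # keep the highest resolution first
    return results


c_fine = [[1] * 4 for _ in range(4)]     # 4x4 map
c_mid = [[10] * 2 for _ in range(2)]     # 2x2 map
c_coarse = [[100]]                       # 1x1 map
pyramid = toy_fpn([c_fine, c_mid, c_coarse])
for level in pyramid:
    print(level)
```

Each finer level ends up carrying the accumulated coarse information (111 at the finest level, 110 in the middle, 100 at the top), which is the effect the real module achieves with learned convolutions around the same sum.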

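roi_heads

Once the backbone and the RPN that produce the feature maps and proposals are built, the RoIs have to be processed on those feature maps. The entry point of this part is the build_roi_heads function in roi_heads/roi_heads.py.

Entry function build_roi_heads: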

def build_roi_heads(cfg, in_channels):
    # individually create the heads, that will be combined together
    # afterwards
    roi_heads = []
    if cfg.MODEL.RETINANET_ON:
        return []
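    # Conceptually, the heads below are independent and can be enabled together
    # (Mask R-CNN, for example, uses both the box head and the mask head)
    if not cfg.MODEL.RPN_ONLY: # build the box head unless the model is RPN-only
        roi_heads.append(("box", build_roi_box_head(cfg, in_channels)))
    if cfg.MODEL.MASK_ON: # enable the mask head
        roi_heads.append(("mask", build_roi_mask_head(cfg, in_channels)))
    if cfg.MODEL.KEYPOINT_ON: # enable the keypoint head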
        roi_heads.append(("keypoint", build_roi_keypoint_head(cfg, in_channels)))

    # combine individual heads in a single module
    if roi_heads:
        roi_heads = CombinedROIHeads(cfg, roi_heads)

    return roi_heads
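
Which heads end up in the combined module depends only on a few flags. The sketch below mirrors that logic with a plain dict standing in for cfg.MODEL; enabled_heads is a hypothetical helper, and treating missing flags as False is an assumption of this sketch.

```python
# Sketch of which heads build_roi_heads would assemble for a given set of flags.
# The plain-dict config is a hypothetical stand-in for the real cfg object.
def enabled_heads(cfg):
    if cfg.get("RETINANET_ON", False):
        return []
    heads = []
    if not cfg.get("RPN_ONLY", False):
        heads.append("box")
    if cfg.get("MASK_ON", False):
        heads.append("mask")
    if cfg.get("KEYPOINT_ON", False):
        heads.append("keypoint")
    return heads


print(enabled_heads({"MASK_ON": True}))     # prints ['box', 'mask']
print(enabled_heads({"RPN_ONLY": True}))    # prints []
```

An empty list is falsy, which is why GeneralizedRCNN.forward can test `if self.roi_heads:` to detect an RPN-only model.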

roi_heads/box_head/box_head.py:

class ROIBoxHead(torch.nn.Module):
    """
    Generic Box Head class.
    """

    def __init__(self, cfg, in_channels):
        super(ROIBoxHead, self).__init__()
        # defined in roi_box_feature_extractors.py
        self.feature_extractor = make_roi_box_feature_extractor(cfg, in_channels)
        # defined in roi_box_predictors.py
        self.predictor = make_roi_box_predictor(
            cfg, self.feature_extractor.out_channels)
        self.post_processor = make_roi_box_post_processor(cfg) # defined in inference.py
        self.loss_evaluator = make_roi_box_loss_evaluator(cfg) # defined in loss.py

    def forward(self, features, proposals, targets=None):
        """
        Arguments:
            features (list[Tensor]): feature-maps from possibly several levels
            proposals (list[BoxList]): proposal boxes
            targets (list[BoxList], optional): the ground-truth targets.
        Returns:
            x (Tensor): the result of the feature extractor
            proposals (list[BoxList]): during training, the subsampled proposals
                are returned. During testing, the predicted boxlists are returned
            losses (dict[Tensor]): During training, returns the losses for the
                head. During testing, returns an empty dict.
        """

        if self.training:
            # Faster R-CNN subsamples during training the proposals with a fixed
            # positive / negative ratio
            with torch.no_grad():
                proposals = self.loss_evaluator.subsample(proposals, targets)

        # extract features that will be fed to the final classifier. The
        # feature_extractor generally corresponds to the pooler + heads
        x = self.feature_extractor(features, proposals)
        # final classifier that converts the features into predictions
        class_logits, box_regression = self.predictor(x)

        if not self.training:
            result = self.post_processor((class_logits, box_regression), proposals)
            return x, result, {}

        loss_classifier, loss_box_reg = self.loss_evaluator(
            [class_logits], [box_regression]
        )
        return (
            x,
            proposals,
            dict(loss_classifier=loss_classifier, loss_box_reg=loss_box_reg),
        )
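
The comment in forward mentions that proposals are subsampled with a fixed positive / negative ratio during training. A rough, framework-free sketch of that idea follows. The function name, the batch size of 8, the positive fraction of 0.25, and the convention that label 0 means background are all assumptions made for illustration; the real sampler is not shown in this section.

```python
import random


# Sketch of fixed-ratio subsampling of proposals (all parameter values are assumed).
def subsample(labels, batch_size=8, positive_fraction=0.25, seed=0):
    rng = random.Random(seed)
    positives = [i for i, label in enumerate(labels) if label > 0]
    negatives = [i for i, label in enumerate(labels) if label == 0]
    # Cap the positives at the requested fraction, then fill up with negatives.
    num_pos = min(len(positives), int(batch_size * positive_fraction))
    num_neg = min(len(negatives), batch_size - num_pos)
    pos_idx = sorted(rng.sample(positives, num_pos))
    neg_idx = sorted(rng.sample(negatives, num_neg))
    return pos_idx, neg_idx


labels = [1, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0]   # class id per proposal, 0 = background
pos_idx, neg_idx = subsample(labels)
print(len(pos_idx), len(neg_idx))  # prints "2 6"
```

Capping the positives keeps the easy, abundant background proposals from dominating the loss, while the sampling itself needs no gradients, consistent with the torch.no_grad() block above.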

Model Definition (modeling): the RPN Network

The RPN was first introduced in Faster R-CNN; it generates the candidate region boxes needed by the detection task. In maskrcnn_benchmark, the RPN is defined under ./maskrcnn_benchmark/modeling/rpn/, which contains four files: rpn.py, anchor_generator.py, inference.py, and loss.py. The GeneralizedRCNN class creates the RPN with self.rpn = build_rpn(cfg, self.backbone.out_channels); that function is located in ./maskrcnn_benchmark/modeling/rpn/rpn.py.
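
Candidate regions in an RPN start from anchors: reference boxes of several aspect ratios placed at each feature-map location. The sketch below generates the anchors for one location in the usual Faster R-CNN style. It is an illustration only: make_anchors, the size of 32, and the ratios (0.5, 1.0, 2.0) are assumed values, and the code is not taken from anchor_generator.py.

```python
import math


# Sketch of anchor generation at a single location (assumed size and ratios).
def make_anchors(cx, cy, size, aspect_ratios=(0.5, 1.0, 2.0)):
    anchors = []
    for ratio in aspect_ratios:          # ratio = height / width
        w = size / math.sqrt(ratio)      # keeps the area close to size * size
        h = size * math.sqrt(ratio)
        anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = make_anchors(cx=64, cy=64, size=32)
for box in anchors:
    print(tuple(round(v, 1) for v in box))
```

All three boxes share the same center and roughly the same area; only their shape changes, which lets the RPN head predict offsets relative to a box that already resembles the object.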

The rpn.py file:

def build_rpn(cfg, in_channels):
    return RPNModule(cfg, in_channels)

The core of the RPN is defined in class RPNModule:

class RPNModule(torch.nn.Module):
    """
    Module for RPN computation. Takes feature maps from the backbone and outputs
    RPN proposals and losses. Works for both FPN and non-FPN.
    """

    def __init__(self, cfg, in_channels):
        super(RPNModule, self).__init__()

        self.cfg = cfg.clone()
        # Build the anchor generator from the config
        anchor_generator = make_anchor_generator(cfg)
        # Build the RPN head
        rpn_head = registry.RPN_HEADS[cfg.MODEL.RPN.RPN_HEAD]
        head = rpn_head(
            cfg, in_channels, anchor_generator.num_anchors_per_location()[0]
        )
        # BoxCoder encodes bounding boxes into a form that is easier to train on
        rpn_box_coder = BoxCoder(weights=(1.0, 1.0, 1.0, 1.0))
        # Post-processor that selects suitable boxes during training
        box_selector_train = make_rpn_postprocessor(cfg, rpn_box_coder, is_train=True)
        # Post-processor that selects suitable boxes at test time
        box_selector_test = make_rpn_postprocessor(cfg, rpn_box_coder, is_train=False)
        # Loss evaluator for the RPN
        loss_evaluator = make_rpn_loss_evaluator(cfg, rpn_box_coder)
        # Store everything as members
        self.anchor_generator = anchor_generator
        self.head = head
        self.box_selector_train = box_selector_train
        self.box_selector_test = box_selector_test
        self.loss_evaluator = loss_evaluator

    # Forward pass
    def forward(self, images, features, targets=None):
        """
        Arguments:
            images (ImageList): images for which we want to compute the predictions
            features (list[Tensor]): features computed from the images that are
                used for computing the predictions. Each tensor in the list
                correspond to different feature levels
            targets (list[BoxList]): ground-truth boxes present in the image (optional)
        Returns:
            boxes (list[BoxList]): the predicted boxes from the RPN, one BoxList per
                image.
            losses (dict[Tensor]): the losses for the model during training. During
                testing, it is an empty dict.
        """
        # Compute the RPN outputs (objectness and box regression) from the feature maps
        objectness, rpn_box_regression = self.head(features)
        # Generate the anchors on the images
        anchors = self.anchor_generator(images, features)
        # Call _forward_train() in training mode and _forward_test() in inference mode
        if self.training:
            return self._forward_train(anchors, objectness, rpn_box_regression, targets)
        else:
            return self._forward_test(anchors, objectness, rpn_box_regression)
    # Forward pass in training mode
    def _forward_train(self, anchors, objectness, rpn_box_regression, targets):
        if self.cfg.MODEL.RPN_ONLY:
            # When training an RPN-only model, the loss is determined by the
            # predicted objectness and rpn_box_regression values and there is
            # no need to transform the anchors into predicted boxes; this is an
            # optimization that avoids the unnecessary transformation.
            boxes = anchors
        else:
            # For end-to-end models, anchors must be transformed into boxes and
            # sampled into a training batch for the detection heads. Note that
            # this step runs under torch.no_grad(), so no gradients flow through it.
            with torch.no_grad():
                boxes = self.box_selector_train(
                    anchors, objectness, rpn_box_regression, targets
                )
        # Compute the RPN losses
        loss_objectness, loss_rpn_box_reg = self.loss_evaluator(
            anchors, objectness, rpn_box_regression, targets
        )
        losses = {
            "loss_objectness": loss_objectness,
            "loss_rpn_box_reg": loss_rpn_box_reg,
        }
        return boxes, losses
    # Forward pass in test mode
    def _forward_test(self, anchors, objectness, rpn_box_regression):
        # Turn the anchors into the corresponding boxes
        boxes = self.box_selector_test(anchors, objectness, rpn_box_regression)
        if self.cfg.MODEL.RPN_ONLY:
            # For end-to-end models, the RPN proposals are an intermediate state
            # and we don't bother to sort them in decreasing score order. For
            # RPN-only models, the proposals are the final output and we return
            # them in high-to-low confidence order.
            inds = [
                box.get_field("objectness").sort(descending=True)[1] for box in boxes
            ]
            boxes = [box[ind] for box, ind in zip(boxes, inds)]
        return boxes, {}

class RPNModule uses class RPNHead as its head:

@registry.RPN_HEADS.register("SingleConvRPNHead")
class RPNHead(nn.Module):
    """
    Adds a simple RPN Head with classification and regression heads
    """

    def __init__(self, cfg, in_channels, num_anchors):
        """
        Arguments:
            cfg              : config
            in_channels (int): number of channels of the input feature
            num_anchors (int): number of anchors to be predicted
        """
        super(RPNHead, self).__init__()
        # 3x3 conv that keeps the number of channels unchanged
        self.conv = nn.Conv2d(
            in_channels, in_channels, kernel_size=3, stride=1, padding=1
        )
        # Objectness layer: one output channel per anchor (each location has K anchors)
        self.cls_logits = nn.Conv2d(in_channels, num_anchors, kernel_size=1, stride=1)
        # Box regression layer: 4 values per anchor
        self.bbox_pred = nn.Conv2d(
            in_channels, num_anchors * 4, kernel_size=1, stride=1
        )
        # Initialize the parameters of the layers defined above
        for l in [self.conv, self.cls_logits, self.bbox_pred]:
            torch.nn.init.normal_(l.weight, std=0.01)
            torch.nn.init.constant_(l.bias, 0)
    # Forward pass of the RPN head
    def forward(self, x):
        logits = []
        bbox_reg = []

        for feature in x:
            # Shared conv + ReLU
            t = F.relu(self.conv(feature))
            # Predict objectness from the shared features
            logits.append(self.cls_logits(t))
            # Predict the box regression from the shared features
            bbox_reg.append(self.bbox_pred(t))
        return logits, bbox_reg
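
To make the channel bookkeeping of the head concrete, here is a shape-only sketch in plain Python (no PyTorch). The helper name rpn_head_output_shapes is hypothetical and introduced only for illustration; it assumes what the class above shows, namely a 3x3 conv with stride 1 and padding 1 (spatial size preserved) followed by two 1x1 convs.

```python
def rpn_head_output_shapes(feature_shapes, num_anchors):
    """Return (objectness_shape, bbox_shape) for each (N, C, H, W) feature map.

    Hypothetical helper: it only mirrors the shape arithmetic of RPNHead.
    """
    shapes = []
    for n, _c, h, w in feature_shapes:
        # The 3x3 conv (stride 1, padding 1) keeps H and W;
        # the 1x1 convs only change the number of channels.
        objectness = (n, num_anchors, h, w)
        bbox_deltas = (n, num_anchors * 4, h, w)
        shapes.append((objectness, bbox_deltas))
    return shapes


# Two feature levels for a batch of 2 images, 3 anchors per location
# (the sizes are made-up example values).
levels = [(2, 256, 200, 304), (2, 256, 100, 152)]
for obj_shape, box_shape in rpn_head_output_shapes(levels, num_anchors=3):
    print(obj_shape, box_shape)
```

The same head is applied to every feature level, so the output lists have one entry per level.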

When defining RPNModule, the functions make_anchor_generator(), make_rpn_postprocessor(), and make_rpn_loss_evaluator() build the module's anchor_generator, box_selector, and loss_evaluator. These three functions are defined in three other files, which we go through below in the order they are called.

anchor_generator.py generates the anchors:

# ./maskrcnn_benchmark/modeling/rpn/anchor_generator.py

# Imports
from maskrcnn_benchmark.structures.bounding_box import BoxList
# ...

class BufferList(nn.Module):
    # Similar to nn.ParameterList, but for buffers

    def __init__(self, buffers=None):
        # Constructor
        # ...

    def extend(self, buffers):
        # Append more buffers
        # ...

    def __len__(self):
        # Number of buffers
        return len(self._buffers)

    def __iter__(self):
        # Iterator over the buffers
        return iter(self._buffers.values())

class AnchorGenerator(nn.Module):
    # For a set of image sizes and feature maps, computes the corresponding anchors

    def __init__(...):
        # Constructor
        # ...

    def num_anchors_per_location(self):
        # Number of anchors at each location
        return [len(cell_anchors) for cell_anchors in self.cell_anchors]

    def grid_anchors(self, grid_sizes):
        # Generate the anchors
        # ...

    def add_visibility_to(self, boxlist):
        # Flag whether each anchor is kept or discarded when it extends outside the image
        # ...

    def forward(self, image_list, feature_maps):
        # Forward pass
        # ...


def make_anchor_generator(config):
    # Create an AnchorGenerator instance from the config
    # ...

def generate_anchors(...):
    # Return a matrix of anchor boxes for the given stride, sizes, aspect_ratios
    # ...

def _generate_anchors(base_size, scales, aspect_ratios):
    # Return the anchor (reference) windows
    # ...

def _whctrs(anchor):
    # Return the width, height and center coordinates of an anchor
    # ...

def _mkanchors(ws, hs, x_ctr, y_ctr):
    # Given widths and heights around a center, return the corresponding anchors
    # ...

The make_anchor_generator() function:

def make_anchor_generator(config):
    # Anchor sizes used by the RPN (the square root of each anchor's area, in pixels)
    # Default: (32, 64, 128, 256, 512)
    anchor_sizes = config.MODEL.RPN.ANCHOR_SIZES
    # Anchor aspect ratios (height / width) used by the RPN
    # Default: (0.5, 1.0, 2.0)
    aspect_ratios = config.MODEL.RPN.ASPECT_RATIOS
    # Stride of the feature map(s) the RPN runs on
    # Default: (16,)
    anchor_stride = config.MODEL.RPN.ANCHOR_STRIDE
    # Prune anchors that extend more than STRADDLE_THRESH pixels outside the image
    # Default: 0; set it to -1 or a large value to disable pruning
    straddle_thresh = config.MODEL.RPN.STRADDLE_THRESH

    if config.MODEL.RPN.USE_FPN:
        # With FPN, the RPN and FPN parameters must match: one anchor size per stride
        assert len(anchor_stride) == len(
            anchor_sizes
        ), "FPN should have len(ANCHOR_STRIDE) == len(ANCHOR_SIZES)"
    else:
        assert len(anchor_stride) == 1, "Non-FPN should have a single ANCHOR_STRIDE"
    # With the parameters collected, create an AnchorGenerator instance and return it
    anchor_generator = AnchorGenerator(
        anchor_sizes, aspect_ratios, anchor_stride, straddle_thresh
    )
    return anchor_generator
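
The two assertions encode a simple rule: without FPN there is a single stride shared by all anchor sizes, and with FPN each level pairs one stride with one size. The stand-alone sketch below restates that rule; check_anchor_config is a hypothetical helper, and the FPN strides in the usage lines are typical example values rather than something read from a config here.

```python
def check_anchor_config(anchor_sizes, anchor_strides, use_fpn):
    """Hypothetical stand-alone version of the checks in make_anchor_generator."""
    if use_fpn:
        # One (stride, size) pair per pyramid level.
        if len(anchor_strides) != len(anchor_sizes):
            raise ValueError("FPN should have len(ANCHOR_STRIDE) == len(ANCHOR_SIZES)")
    elif len(anchor_strides) != 1:
        # A single feature map carries all the sizes.
        raise ValueError("Non-FPN should have a single ANCHOR_STRIDE")
    return True


# C4-style: one stride, five sizes on a single feature map.
check_anchor_config((32, 64, 128, 256, 512), (16,), use_fpn=False)
# FPN-style: one size per level (assumed example strides).
check_anchor_config((32, 64, 128, 256, 512), (4, 8, 16, 32, 64), use_fpn=True)
```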

As the function above shows, make_anchor_generator(config) creates an AnchorGenerator instance from the config. We therefore look at class AnchorGenerator(nn.Module) next:

class AnchorGenerator(nn.Module):
    """
    For a set of image sizes and feature maps, computes a set
    of anchors
    """

    def __init__(
        self,
        sizes=(128, 256, 512),
        aspect_ratios=(0.5, 1.0, 2.0),
        anchor_strides=(8, 16, 32),
        straddle_thresh=0,
    ):
        super(AnchorGenerator, self).__init__()

        if len(anchor_strides) == 1:
            # A single stride means there is no FPN, so generate one set of cell anchors
            anchor_stride = anchor_strides[0]
            # generate_anchors() is defined in this same file
            cell_anchors = [
                generate_anchors(anchor_stride, sizes, aspect_ratios).float()
            ]
        else:
            if len(anchor_strides) != len(sizes):
                raise RuntimeError("FPN should have #anchor_strides == #sizes")
            # With FPN, call generate_anchors() once per (stride, size) pair
            cell_anchors = [
                generate_anchors(
                    anchor_stride,
                    size if isinstance(size, (tuple, list)) else (size,),
                    aspect_ratios
                ).float()
                for anchor_stride, size in zip(anchor_strides, sizes)
            ]
        # Store strides, cell_anchors and straddle_thresh as members
        self.strides = anchor_strides
        self.cell_anchors = BufferList(cell_anchors) # uses the BufferList class
        self.straddle_thresh = straddle_thresh

    # Number of anchors at each location, per feature level
    def num_anchors_per_location(self):
        return [len(cell_anchors) for cell_anchors in self.cell_anchors]
    # Generate the anchors for every feature map; called by forward()
    def grid_anchors(self, grid_sizes):
        # Start with an empty list of anchors
        anchors = []
        # One iteration per feature level
        for size, stride, base_anchors in zip(
            grid_sizes, self.strides, self.cell_anchors
        ):
            # Grid size of this level and the device of base_anchors
            grid_height, grid_width = size
            device = base_anchors.device
            # x offsets, spaced by the stride
            shifts_x = torch.arange(
                0, grid_width * stride, step=stride, dtype=torch.float32, device=device
            )
            # y offsets, spaced by the stride
            shifts_y = torch.arange(
                0, grid_height * stride, step=stride, dtype=torch.float32, device=device
            )
            # Meshgrid of shifts_y and shifts_x (a grid_height x grid_width grid)
            shift_y, shift_x = torch.meshgrid(shifts_y, shifts_x)
            # Flatten both to 1-D
            shift_x = shift_x.reshape(-1)
            shift_y = shift_y.reshape(-1)
            shifts = torch.stack((shift_x, shift_y, shift_x, shift_y), dim=1)

            anchors.append(
                (shifts.view(-1, 1, 4) + base_anchors.view(1, -1, 4)).reshape(-1, 4)
            )

        return anchors

    def add_visibility_to(self, boxlist):
        # Flag each anchor as visible or not, depending on how far it extends outside the image
        image_width, image_height = boxlist.size
        anchors = boxlist.bbox
        if self.straddle_thresh >= 0:
            inds_inside = (
                (anchors[..., 0] >= -self.straddle_thresh)
                & (anchors[..., 1] >= -self.straddle_thresh)
                & (anchors[..., 2] < image_width + self.straddle_thresh)
                & (anchors[..., 3] < image_height + self.straddle_thresh)
            )
        else:
            device = anchors.device
            inds_inside = torch.ones(anchors.shape[0], dtype=torch.bool, device=device)
        boxlist.add_field("visibility", inds_inside)

    def forward(self, image_list, feature_maps):
        grid_sizes = [feature_map.shape[-2:] for feature_map in feature_maps]
        anchors_over_all_feature_maps = self.grid_anchors(grid_sizes)
        anchors = []
        for i, (image_height, image_width) in enumerate(image_list.image_sizes):
            anchors_in_image = []
            for anchors_per_feature_map in anchors_over_all_feature_maps:
                boxlist = BoxList(
                    anchors_per_feature_map, (image_width, image_height), mode="xyxy"
                )
                self.add_visibility_to(boxlist)
                anchors_in_image.append(boxlist)
            anchors.append(anchors_in_image)
        return anchors
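
The tensor arithmetic in grid_anchors() and add_visibility_to() can be hard to picture, so here is a plain-Python sketch of the same idea for a single feature level. It is an illustration under the assumption that boxes are plain (x1, y1, x2, y2) tuples; the function names simply mirror the methods above and are not the library's API.

```python
def grid_anchors(base_anchors, grid_height, grid_width, stride):
    """Shift every base anchor to every cell of a grid_height x grid_width map."""
    anchors = []
    for y in range(grid_height):      # rows first, like meshgrid + reshape(-1)
        for x in range(grid_width):
            dx, dy = x * stride, y * stride
            for x1, y1, x2, y2 in base_anchors:
                anchors.append((x1 + dx, y1 + dy, x2 + dx, y2 + dy))
    return anchors


def visibility(anchors, image_width, image_height, straddle_thresh=0):
    """Flag anchors that stay within straddle_thresh pixels of the image."""
    if straddle_thresh < 0:
        # A negative threshold disables pruning: everything is visible.
        return [True] * len(anchors)
    t = straddle_thresh
    return [
        x1 >= -t and y1 >= -t and x2 < image_width + t and y2 < image_height + t
        for x1, y1, x2, y2 in anchors
    ]


cells = grid_anchors([(0, 0, 3, 3)], grid_height=2, grid_width=2, stride=4)
print(cells)
print(visibility(cells, image_width=7, image_height=7))
```

With a 7x7 image only the top-left anchor is fully inside, which is why the other three are flagged as not visible.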

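class AnchorGenerator uses generate_anchors() to produce the anchors; this function is the entry point for anchor generation. Generating anchors takes a few computations and conversions. The overall flow and the functions that implement each step are as follows: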

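First, collect the parameters needed to generate anchors: stride, sizes, and aspect_ratios. stride is the base size of an anchor on the feature map, and sizes are the anchor sizes in the original image, in pixels, so the scale factor applied to the base window is sizes/stride. aspect_ratios are the height-to-width ratios of the anchors. The number of anchors returned is therefore the number of sizes times the number of aspect_ratios (the base window on the feature map is fixed, and the different scales cover objects of different sizes);

Given the base_size (stride) on the feature map, the base window is written as an [x1, y1, x2, y2] box in coordinates local to the anchor, not image coordinates. For stride=4, for example, it is [0, 0, 3, 3]. This is implemented in _generate_anchors(...)

Next, aspect_ratios determines the shapes of the anchor boxes. For the stride=4 base anchor and aspect_ratios = [0.5, 1.0, 2.0], the function returns three boxes with roughly the same area and height-to-width ratios of 0.5, 1.0, and 2.0, listed below. Under the inclusive pixel convention used by the code (w = x2 - x1 + 1), these boxes are 6×3, 4×4, and 3×6, so the ratios are met exactly here; the raw coordinate differences are 5:2, 3:3, and 2:5, which is why the boxes can look off at first glance. Because widths and heights are rounded to integers, the area is only approximately preserved (18 instead of 16 for the first and last box). This is implemented in _ratio_enum();
[[-1. 0.5 4. 2.5] [ 0. 0. 3. 3. ] [ 0.5 -1. 2.5 4. ]]

With the boxes for each aspect ratio in hand, scales = sizes/stride maps them to the original image by enlarging each box about its center. For the example above, scales = 32/4 = 8, and the resulting boxes are listed below. This is implemented in _scale_enum().
[[-22., -10., 25., 13.], [-14., -14., 17., 17.], [-10., -22., 13., 25.]]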
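
The numbers in this worked example can be reproduced with a few lines of plain Python. This is a sketch that follows the steps described above, not the library code itself: whctrs, mkanchor, and make_anchors are illustrative names, and Python's built-in round() stands in for the NumPy rounding used in the real implementation.

```python
import math


def whctrs(box):
    """Width, height and center of an (x1, y1, x2, y2) box, inclusive pixels."""
    x1, y1, x2, y2 = box
    w = x2 - x1 + 1
    h = y2 - y1 + 1
    return w, h, x1 + 0.5 * (w - 1), y1 + 0.5 * (h - 1)


def mkanchor(w, h, x_ctr, y_ctr):
    """Box of width w and height h centered on (x_ctr, y_ctr)."""
    return (
        x_ctr - 0.5 * (w - 1),
        y_ctr - 0.5 * (h - 1),
        x_ctr + 0.5 * (w - 1),
        y_ctr + 0.5 * (h - 1),
    )


def make_anchors(stride, sizes, aspect_ratios):
    """Enumerate aspect ratios, then scales, around the base window."""
    w, h, x_ctr, y_ctr = whctrs((0, 0, stride - 1, stride - 1))
    anchors = []
    for ratio in aspect_ratios:
        ws = round(math.sqrt(w * h / ratio))  # width for this ratio
        hs = round(ws * ratio)                # height for this ratio
        for size in sizes:
            scale = size / stride
            anchors.append(mkanchor(ws * scale, hs * scale, x_ctr, y_ctr))
    return anchors


for box in make_anchors(stride=4, sizes=(32,), aspect_ratios=(0.5, 1, 2)):
    print(box)
```

The loop order (ratios outside, scales inside) matches the order in which the real code stacks its results.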

The code, with comments:

def generate_anchors(
    stride=16, sizes=(32, 64, 128, 256, 512), aspect_ratios=(0.5, 1, 2)
):
    """Generates a matrix of anchor boxes in (x1, y1, x2, y2) format. Anchors
    are centered on stride / 2, have (approximate) sqrt areas of the specified
    sizes, and aspect ratios as given.
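    The function returns a matrix of anchor boxes, one (x1, y1, x2, y2) row per box.
    The coordinates are local to the anchor rather than to the image, and each box
    has an area of roughly size ** 2 for the given sizes.
    The defaults correspond to a Faster R-CNN model with a ResNet-C4 backbone.
    With FPN, each size is assigned to a different feature map. The comments below
    use the parameters of the first FPN level (sizes must be a tuple or a list):
    stride = 4, sizes = (32,), aspect_ratios = (0.5, 1, 2)
    """
    # Note: the np.float alias was removed in NumPy 1.24; on newer NumPy versions
    # use np.float64 (or the builtin float) instead
    return _generate_anchors( # calls _generate_anchors()
        stride, # stride = 4
        np.array(sizes, dtype=np.float) / stride, # sizes / stride = 32 / 4 = 8
        np.array(aspect_ratios, dtype=np.float), # [0.5, 1, 2]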
    )


def _generate_anchors(base_size, scales, aspect_ratios):
    """Generate anchor (reference) windows by enumerating aspect ratios X
    scales wrt a reference (0, 0, base_size - 1, base_size - 1) window.
    """
    # From the call above, the arguments here are 4, [8.] and [0.5, 1, 2]
    # Base window of the anchor in local coordinates: [0, 0, 3, 3]
    anchor = np.array([1, 1, base_size, base_size], dtype=np.float) - 1
    # Enumerate the aspect ratios [0.5, 1, 2] for the base window, which gives:
    #[[-1.   0.5  4.   2.5]
    # [ 0.   0.   3.   3. ]
    # [ 0.5 -1.   2.5  4. ]]
    # With w = x2 - x1 + 1 these boxes are 6x3, 4x4 and 3x6, so h / w is
    # 0.5, 1 and 2; see the _ratio_enum() walkthrough for details
    anchors = _ratio_enum(anchor, aspect_ratios)
    # Scale each of those boxes to image size. There is a single scale here, 8,
    # so width and height each grow 8x (the area grows 64x), giving:
    #[[-22., -10.,  25.,  13.],
    # [-14., -14.,  17.,  17.],
    # [-10., -22.,  13.,  25.]]
    # np.vstack merges the three 1x4 arrays into one 3x4 array, as shown above.
    # anchors[i, :] holds the coordinates of one box, e.g. [-1.  0.5  4.  2.5]
    anchors = np.vstack(
        [_scale_enum(anchors[i, :], scales) for i in range(anchors.shape[0])]
    )
    # Convert the NumPy array to a tensor of shape (n, 4), where n is the number of anchors
    return torch.from_numpy(anchors)

The function above uses _ratio_enum() and _scale_enum() to combine aspect ratios and scales. We look at these two functions first:

def _ratio_enum(anchor, ratios):
    """Enumerate a set of anchors for each aspect ratio wrt an anchor."""
    # Turn the base anchor into several boxes with the given height/width ratios
    # Example:
    # anchor: [0. 0. 3. 3.]
    # ratios: [0.5, 1.0, 2.0]

    # Width, height and center of the anchor
    w, h, x_ctr, y_ctr = _whctrs(anchor)
    # Area of the anchor
    size = w * h
    # size / ratios is the squared width that keeps the area at each ratio
    size_ratios = size / ratios
    # ws = sqrt(size) / sqrt(ratios)
    # hs = ws * ratios = sqrt(size) * sqrt(ratios)
    # so hs / ws = ratios and ws * hs = size (before rounding)
    # np.round rounds to the nearest integer
    ws = np.round(np.sqrt(size_ratios))
    hs = np.round(ws * ratios)
    # Build the new boxes (x1, y1, x2, y2) from the new widths and heights and return them
    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
    return anchors


def _scale_enum(anchor, scales):
    """Enumerate a set of anchors for each scale wrt an anchor."""
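    # Enumerate the scales for one anchor
    # Example: anchor: [-1. 0.5 4. 2.5]
    # scales: 8
    # Width, height and center of the anchor
    w, h, x_ctr, y_ctr = _whctrs(anchor)
    # Enlarge width and height by the scale (8x here)
    ws = w * scales
    hs = h * scales
    # Convert the new widths, heights and the center back to (x1, y1, x2, y2)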
    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
    return anchors

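Both _ratio_enum() and _scale_enum() use _whctrs() and _mkanchors(). The former computes a box's width, height, and center from its coordinates; the latter turns widths, heights, and a center back into boxes in (x1, y1, x2, y2) form. The two functions are shown below: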

def _whctrs(anchor):
    """Return width, height, x center, and y center for an anchor (window)."""
    # Compute the width, height and center from the top-left and bottom-right corners
    w = anchor[2] - anchor[0] + 1
    h = anchor[3] - anchor[1] + 1
    x_ctr = anchor[0] + 0.5 * (w - 1)
    y_ctr = anchor[1] + 0.5 * (h - 1)
    return w, h, x_ctr, y_ctr


def _mkanchors(ws, hs, x_ctr, y_ctr):
    """Given a vector of widths (ws) and heights (hs) around a center
    (x_ctr, y_ctr), output a set of anchors (windows).
    The anchors are returned in (x1, y1, x2, y2) form.
    """
    # Add a trailing axis so that np.hstack can stack the columns
    ws = ws[:, np.newaxis]
    hs = hs[:, np.newaxis]
    # Stack the four columns and return them
    anchors = np.hstack(
        (
            x_ctr - 0.5 * (ws - 1),
            y_ctr - 0.5 * (hs - 1),
            x_ctr + 0.5 * (ws - 1),
            y_ctr + 0.5 * (hs - 1),
        )
    )
    return anchors

inference.py file walkthrough

# ./maskrcnn_benchmark/modeling/rpn/inference.py

# Imports
from maskrcnn_benchmark.modeling.box_coder import BoxCoder

class RPNPostProcessor(torch.nn.Module):
    # Post-processes the boxes output by the RPN before the proposals are fed to the heads

    def __init__(...):
        # Constructor
        # ...

    def add_gt_proposals(self, proposals, targets):
        # ...

    def forward_for_single_feature_map(self, anchors, objectness, box_regression):
        # ...

    def forward(self, anchors, objectness, box_regression, targets=None):
        # ...

    def select_over_all_levels(self, boxlists):
        # ...

def make_rpn_postprocessor(config, rpn_box_coder, is_train):
    # ...

The make_rpn_postprocessor() entry function:

def make_rpn_postprocessor(config, rpn_box_coder, is_train):
    # rpn_box_coder: a BoxCoder instance
    # e.g. 2000
    fpn_post_nms_top_n = config.MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN
    # e.g. 1000
    if not is_train:
        fpn_post_nms_top_n = config.MODEL.RPN.FPN_POST_NMS_TOP_N_TEST

    pre_nms_top_n = config.MODEL.RPN.PRE_NMS_TOP_N_TRAIN
    post_nms_top_n = config.MODEL.RPN.POST_NMS_TOP_N_TRAIN
    if not is_train:
        pre_nms_top_n = config.MODEL.RPN.PRE_NMS_TOP_N_TEST
        post_nms_top_n = config.MODEL.RPN.POST_NMS_TOP_N_TEST
    fpn_post_nms_per_batch = config.MODEL.RPN.FPN_POST_NMS_PER_BATCH
    nms_thresh = config.MODEL.RPN.NMS_THRESH
    min_size = config.MODEL.RPN.MIN_SIZE
    # Create an RPNPostProcessor instance from these settings
    box_selector = RPNPostProcessor(
        pre_nms_top_n=pre_nms_top_n,
        post_nms_top_n=post_nms_top_n,
        nms_thresh=nms_thresh,
        min_size=min_size,
        box_coder=rpn_box_coder,
        fpn_post_nms_top_n=fpn_post_nms_top_n,
        fpn_post_nms_per_batch=fpn_post_nms_per_batch,
    )
    return box_selector

The RPNPostProcessor class:
Constructor:

class RPNPostProcessor(torch.nn.Module):
    """
    Performs post-processing on the outputs of the RPN boxes, before feeding the
    proposals to the heads
    """

    def __init__(
        self,
        pre_nms_top_n,
        post_nms_top_n,
        nms_thresh,
        min_size,
        box_coder=None,
        fpn_post_nms_top_n=None,
        fpn_post_nms_per_batch=True,
    ):
        """
        Arguments:
            pre_nms_top_n (int)
            post_nms_top_n (int)
            nms_thresh (float)
            min_size (int)
            box_coder (BoxCoder)
            fpn_post_nms_top_n (int)
        """
        super(RPNPostProcessor, self).__init__()

        # Store the arguments as members
        self.pre_nms_top_n = pre_nms_top_n
        self.post_nms_top_n = post_nms_top_n
        self.nms_thresh = nms_thresh
        self.min_size = min_size
        # Create a default BoxCoder if none was passed in
        if box_coder is None:
            box_coder = BoxCoder(weights=(1.0, 1.0, 1.0, 1.0))
        self.box_coder = box_coder

        if fpn_post_nms_top_n is None:
            fpn_post_nms_top_n = post_nms_top_n
        self.fpn_post_nms_top_n = fpn_post_nms_top_n
        self.fpn_post_nms_per_batch = fpn_post_nms_per_batch

Adding the ground-truth boxes to the proposals:

    def add_gt_proposals(self, proposals, targets):
        """
        Append the ground-truth boxes in targets to the BoxList of each image
        Arguments:
            proposals: list[BoxList]
            targets: list[BoxList]
        """
        # Get the device we're operating on
        device = proposals[0].bbox.device
        # copy_with_fields([]) copies each BoxList without any extra fields;
        # gt_boxes is a list whose elements are BoxList objects
        gt_boxes = [target.copy_with_fields([]) for target in targets]

        # later cat of bbox requires all fields to be present for all bbox
        # so we need to add a dummy for objectness that's missing
        # (a 1-D tensor of ones with one entry per ground-truth box)
        for gt_box in gt_boxes:
            gt_box.add_field("objectness", torch.ones(len(gt_box), device=device))

        # cat_boxlist (from boxlist_ops.py) merges each proposal and gt_box into one BoxList
        proposals = [
            cat_boxlist((proposal, gt_box))
            for proposal, gt_box in zip(proposals, gt_boxes)
        ]

        return proposals
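
The effect of add_gt_proposals() is easy to see on plain Python data. In this sketch each proposal is a hypothetical (box, objectness) pair instead of a BoxList entry, and add_gt is an illustrative name; the point is only that every ground-truth box joins its image's proposals with a dummy objectness of 1.

```python
def add_gt(proposals, gt_boxes):
    """Append each image's ground-truth boxes to its proposals.

    proposals: per-image list of (box, objectness) pairs (hypothetical layout)
    gt_boxes:  per-image list of boxes
    """
    merged = []
    for props, gts in zip(proposals, gt_boxes):
        # Ground-truth boxes get a dummy objectness of 1.0 so that every
        # entry carries the same fields after concatenation.
        merged.append(list(props) + [(box, 1.0) for box in gts])
    return merged


proposals = [[((0, 0, 10, 10), 0.9)], [((5, 5, 20, 20), 0.4)]]
gt_boxes = [[(1, 1, 9, 9)], [(4, 4, 18, 18), (30, 30, 40, 40)]]
for per_image in add_gt(proposals, gt_boxes):
    print(per_image)
```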

Forward pass on a single feature map:

    def forward_for_single_feature_map(self, anchors, objectness, box_regression):
        """
        Arguments:
            anchors: list[BoxList]
            objectness: tensor of size N, A, H, W
                A is the number of anchors per location, N the batch size, and
                H, W the height and width of the feature map
            box_regression: tensor of size N, A * 4, H, W
        """
        # Device of the inputs
        device = objectness.device
        # Shape of objectness
        N, A, H, W = objectness.shape

        # put in the same format as anchors
        # (permute the dimensions first, then reshape)
        objectness = permute_and_flatten(objectness, N, A, 1, H, W).view(N, -1) # shape: (N, H*W*A)
        # Sigmoid turns the logits into scores in (0, 1)
        objectness = objectness.sigmoid()
        # Same layout change for box_regression
        box_regression = permute_and_flatten(box_regression, N, A, 4, H, W)
        # Total number of anchors on this feature map
        num_anchors = A * H * W

        # Make sure pre_nms_top_n does not exceed the number of anchors
        pre_nms_top_n = min(self.pre_nms_top_n, num_anchors)
        # topk returns two tensors: the top-k values and their indices
        objectness, topk_idx = objectness.topk(pre_nms_top_n, dim=1, sorted=True)

        # Batch indices of shape N x 1, in increasing order: [[0], [1], ..., [N-1]]
        batch_idx = torch.arange(N, device=device)[:, None]
        # Regression values of the top-k anchors of every image
        box_regression = box_regression[batch_idx, topk_idx]

        # Image sizes, one per image (taken from the anchor BoxLists)
        image_shapes = [box.size for box in anchors]
        # Concatenate the anchors of all images into one tensor
        concat_anchors = torch.cat([a.bbox for a in anchors], dim=0)
        # Split per image again and keep the top-k anchors of each image
        concat_anchors = concat_anchors.reshape(N, -1, 4)[batch_idx, topk_idx]

        # Decode the regression output (the training-friendly form) back into box coordinates
        proposals = self.box_coder.decode(
            box_regression.view(-1, 4), concat_anchors.view(-1, 4)
        )

        proposals = proposals.view(N, -1, 4)

        result = [] # collect the results
        for proposal, score, im_shape in zip(proposals, objectness, image_shapes):
            # Wrap the proposals of one image in a BoxList
            boxlist = BoxList(proposal, im_shape, mode="xyxy")
            # Attach the scores
            boxlist.add_field("objectness", score)
            # Clip the boxes to the image boundary
            boxlist = boxlist.clip_to_image(remove_empty=False)
            # Remove boxes that are too small
            boxlist = remove_small_boxes(boxlist, self.min_size)
            # Run NMS on the remaining boxes
            boxlist = boxlist_nms(
                boxlist,
                self.nms_thresh,
                max_proposals=self.post_nms_top_n,
                score_field="objectness",
            )
            result.append(boxlist)
        return result
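
The per-image tail of this function (top-k, clip, drop small boxes, NMS) can be sketched in plain Python. This is an illustration, not the library code: the helpers clip_to_image, remove_small_boxes, and boxlist_nms are not shown in this article, so the inclusive-pixel (+1) convention, the >= comparison against min_size, and the greedy NMS below are assumptions of the sketch, and box decoding is left out.

```python
def clip(box, width, height):
    """Clamp a box to the image, assuming inclusive pixel coordinates."""
    x1, y1, x2, y2 = box
    return (min(max(x1, 0), width - 1), min(max(y1, 0), height - 1),
            min(max(x2, 0), width - 1), min(max(y2, 0), height - 1))


def iou(a, b):
    """Intersection over union with the +1 pixel convention (an assumption)."""
    iw = max(min(a[2], b[2]) - max(a[0], b[0]) + 1, 0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]) + 1, 0)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def postprocess(boxes, scores, image_size, pre_nms_top_n, post_nms_top_n,
                nms_thresh, min_size):
    width, height = image_size
    # 1. Keep the pre_nms_top_n highest-scoring boxes.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    candidates = []
    for i in order[:pre_nms_top_n]:
        # 2. Clip to the image, then 3. drop boxes smaller than min_size.
        box = clip(boxes[i], width, height)
        if box[2] - box[0] + 1 >= min_size and box[3] - box[1] + 1 >= min_size:
            candidates.append((box, scores[i]))
    # 4. Greedy NMS: keep a box only if it does not overlap a kept box too much.
    keep = []
    for box, score in candidates:
        if all(iou(box, kept) <= nms_thresh for kept, _ in keep):
            keep.append((box, score))
        if len(keep) == post_nms_top_n:
            break
    return keep


boxes = [(0, 0, 9, 9), (1, 1, 10, 10), (20, 20, 29, 29), (5, 5, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.6]
print(postprocess(boxes, scores, (32, 32), 4, 2, nms_thresh=0.5, min_size=2))
```

In this example the second box is suppressed by the first (their IoU is 81/119), and the 1x1 box is dropped by the size filter.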

The forward function:

    def forward(self, anchors, objectness, box_regression, targets=None):
        """
        Arguments:
            anchors: list[list[BoxList]]
            objectness: list[tensor]
            box_regression: list[tensor]
        Returns:
            boxlists (list[BoxList]): the post-processed anchors, after
                applying box decoding and NMS
        """
        # Start with an empty list of boxes
        sampled_boxes = []
        num_levels = len(objectness)
        anchors = list(zip(*anchors))

        # Run forward_for_single_feature_map() on each feature level
        for a, o, b in zip(anchors, objectness, box_regression):
            sampled_boxes.append(self.forward_for_single_feature_map(a, o, b))

        boxlists = list(zip(*sampled_boxes))
        # cat_boxlist (from boxlist_ops.py) merges the levels of each image
        boxlists = [cat_boxlist(boxlist) for boxlist in boxlists]

        if num_levels > 1:
            # Select the top proposals across levels via select_over_all_levels()
            boxlists = self.select_over_all_levels(boxlists)

        # append ground-truth bboxes to proposals
        if self.training and targets is not None:
            # Add the ground-truth boxes via add_gt_proposals()
            boxlists = self.add_gt_proposals(boxlists, targets)

        return boxlists
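
The two zip(*...) calls in forward() are transpositions between "per image, then per level" and "per level, then per image". A tiny sketch with placeholder strings in place of BoxList objects (the strings are purely illustrative) shows the reshuffling:

```python
# anchors as produced by the anchor generator: one list per image,
# each holding one entry per feature level.
anchors = [
    ["img0_lvl0", "img0_lvl1", "img0_lvl2"],
    ["img1_lvl0", "img1_lvl1", "img1_lvl2"],
]

# zip(*anchors) regroups them per level, which is the layout
# forward_for_single_feature_map() expects.
per_level = list(zip(*anchors))
print(per_level)

# Applying zip(*...) again goes back to one group per image, which is
# what happens to sampled_boxes before cat_boxlist().
per_image = list(zip(*per_level))
print(per_image)
```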

Selecting across all levels:

    def select_over_all_levels(self, boxlists):
        num_images = len(boxlists)
        # different behavior during training and during testing:
        # during training, post_nms_top_n is over *all* the proposals combined, while
        # during testing, it is over the proposals for each image
        # NOTE: it should be per image, and not per batch. However, to be consistent 
        # with Detectron, the default is per batch (see Issue #672)
        if self.training and self.fpn_post_nms_per_batch:
            # Concatenate the objectness scores of all images
            objectness = torch.cat(
                [boxlist.get_field("objectness") for boxlist in boxlists], dim=0
            )
            # Number of boxes per image
            box_sizes = [len(boxlist) for boxlist in boxlists]
            # Make sure post_nms_top_n does not exceed the total number of boxes
            post_nms_top_n = min(self.fpn_post_nms_top_n, len(objectness))
            # Indices of the top-k scores over the whole batch
            _, inds_sorted = torch.topk(objectness, post_nms_top_n, dim=0, sorted=True)
            inds_mask = torch.zeros_like(objectness, dtype=torch.bool)
            inds_mask[inds_sorted] = 1
            inds_mask = inds_mask.split(box_sizes)
            # Keep the selected boxes of each image
            for i in range(num_images):
                boxlists[i] = boxlists[i][inds_mask[i]]
        else:
            for i in range(num_images):
                objectness = boxlists[i].get_field("objectness")
                post_nms_top_n = min(self.fpn_post_nms_top_n, len(objectness))
                _, inds_sorted = torch.topk(
                    objectness, post_nms_top_n, dim=0, sorted=True
                )
                boxlists[i] = boxlists[i][inds_sorted]
        return boxlists
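
The difference between the two branches is where the top-k budget applies: over the whole batch (the training default) or per image. A plain-Python sketch on lists of scores makes this visible; select_per_batch and select_per_image are illustrative names, and they return kept indices rather than BoxLists.

```python
def select_per_batch(scores_per_image, top_n):
    """Keep the top_n scores over the whole batch; return kept indices per image."""
    flat = [
        (score, img, idx)
        for img, scores in enumerate(scores_per_image)
        for idx, score in enumerate(scores)
    ]
    flat.sort(key=lambda t: t[0], reverse=True)
    kept = [[] for _ in scores_per_image]
    for _score, img, idx in flat[:top_n]:
        kept[img].append(idx)
    # Boolean-mask indexing keeps the original box order within each image.
    return [sorted(k) for k in kept]


def select_per_image(scores_per_image, top_n):
    """Keep the top_n scores of each image, in decreasing score order."""
    result = []
    for scores in scores_per_image:
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        result.append(order[:top_n])
    return result


scores = [[0.9, 0.2, 0.8], [0.7, 0.1]]
print(select_per_batch(scores, top_n=3))
print(select_per_image(scores, top_n=2))
```

With a per-batch budget, an image with many confident proposals can take slots away from the other images, which is the behaviour the NOTE in the code comments refers to.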

loss.py file walkthrough

The make_rpn_loss_evaluator() function creates the loss evaluator for the RPN:

def make_rpn_loss_evaluator(cfg, box_coder):
    # Create a Matcher instance from the config
    matcher = Matcher(
        cfg.MODEL.RPN.FG_IOU_THRESHOLD,
        cfg.MODEL.RPN.BG_IOU_THRESHOLD,
        allow_low_quality_matches=True,
    )
    # Create a BalancedPositiveNegativeSampler instance from the config
    fg_bg_sampler = BalancedPositiveNegativeSampler(
        cfg.MODEL.RPN.BATCH_SIZE_PER_IMAGE, cfg.MODEL.RPN.POSITIVE_FRACTION
    )
    # Use the objects above to create the RPNLossComputation instance
    loss_evaluator = RPNLossComputation(
        matcher,
        fg_bg_sampler,
        box_coder,
        generate_rpn_labels
    )
    return loss_evaluator
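
How the matcher thresholds turn IoU values into anchor labels can be sketched in plain Python. This is a simplified illustration, not the Matcher class itself: it ignores allow_low_quality_matches, the thresholds 0.7 and 0.3 are example values, and the sentinel values -1 and -2 are assumed to correspond to Matcher.BELOW_LOW_THRESHOLD and Matcher.BETWEEN_THRESHOLDS.

```python
BELOW_LOW_THRESHOLD = -1   # assumed sentinel: anchor is background
BETWEEN_THRESHOLDS = -2    # assumed sentinel: anchor is ignored


def match(iou_matrix, high, low):
    """iou_matrix[g][a] is the IoU between ground-truth g and anchor a."""
    matches = []
    for a in range(len(iou_matrix[0])):
        column = [row[a] for row in iou_matrix]
        best = max(column)
        if best >= high:
            matches.append(column.index(best))   # index of the matched ground truth
        elif best < low:
            matches.append(BELOW_LOW_THRESHOLD)
        else:
            matches.append(BETWEEN_THRESHOLDS)
    return matches


def rpn_labels(matches):
    """1 = foreground, 0 = background, -1 = not used when sampling."""
    labels = []
    for m in matches:
        if m >= 0:
            labels.append(1.0)
        elif m == BELOW_LOW_THRESHOLD:
            labels.append(0.0)
        else:
            labels.append(-1.0)
    return labels


ious = [[0.8, 0.5, 0.1, 0.4],
        [0.2, 0.75, 0.05, 0.35]]
matched = match(ious, high=0.7, low=0.3)
print(matched, rpn_labels(matched))
```

The sampler then draws a fixed-size batch of positives and negatives from the anchors labeled 1 and 0, skipping those labeled -1.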

The implementation of the RPNLossComputation class:

class RPNLossComputation(object):
    """
    This class computes the RPN loss.
    """

    def __init__(self, proposal_matcher, fg_bg_sampler, box_coder,
                 generate_labels_func):
        """
        Arguments:
            proposal_matcher (Matcher)
            fg_bg_sampler (BalancedPositiveNegativeSampler)
            box_coder (BoxCoder)
        """
        # self.target_preparator = target_preparator
        self.proposal_matcher = proposal_matcher
        self.fg_bg_sampler = fg_bg_sampler
        self.box_coder = box_coder
        self.copied_fields = []
        self.generate_labels_func = generate_labels_func
        self.discard_cases = ['not_visibility', 'between_thresholds']

    def match_targets_to_anchors(self, anchor, target, copied_fields=[]):
        match_quality_matrix = boxlist_iou(target, anchor)
        matched_idxs = self.proposal_matcher(match_quality_matrix)
        # RPN doesn't need any fields from target
        # for creating the labels, so clear them all
        target = target.copy_with_fields(copied_fields)
        # get the targets corresponding GT for each anchor
        # NB: need to clamp the indices because we can have a single
        # GT in the image, and matched_idxs can be -2, which goes
        # out of bounds
        matched_targets = target[matched_idxs.clamp(min=0)]
        matched_targets.add_field("matched_idxs", matched_idxs)
        return matched_targets

    def prepare_targets(self, anchors, targets):
        labels = []
        regression_targets = []
        for anchors_per_image, targets_per_image in zip(anchors, targets):
            matched_targets = self.match_targets_to_anchors(
                anchors_per_image, targets_per_image, self.copied_fields
            )

            matched_idxs = matched_targets.get_field("matched_idxs")
            labels_per_image = self.generate_labels_func(matched_targets)
            labels_per_image = labels_per_image.to(dtype=torch.float32)

            # Background (negative examples)
            bg_indices = matched_idxs == Matcher.BELOW_LOW_THRESHOLD
            labels_per_image[bg_indices] = 0

            # discard anchors that go out of the boundaries of the image
            if "not_visibility" in self.discard_cases:
                labels_per_image[~anchors_per_image.get_field("visibility")] = -1

            # discard indices that are between thresholds
            if "between_thresholds" in self.discard_cases:
                inds_to_discard = matched_idxs == Matcher.BETWEEN_THRESHOLDS
                labels_per_image[inds_to_discard] = -1

            # compute regression targets
            regression_targets_per_image = self.box_coder.encode(
                matched_targets.bbox, anchors_per_image.bbox
            )

            labels.append(labels_per_image)
            regression_targets.append(regression_targets_per_image)

        return labels, regression_targets


    def __call__(self, anchors, objectness, box_regression, targets):
        """
        Arguments:
            anchors (list[list[BoxList]])
            objectness (list[Tensor])
            box_regression (list[Tensor])
            targets (list[BoxList])
        Returns:
            objectness_loss (Tensor)
            box_loss (Tensor)
        """
        anchors = [cat_boxlist(anchors_per_image) for anchors_per_image in anchors]
        labels, regression_targets = self.prepare_targets(anchors, targets)
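        # sample a balanced set of positive and negative anchors (one binary mask per image)
        sampled_pos_inds, sampled_neg_inds = self.fg_bg_sampler(labels)
        # concatenate the per-image masks and convert them to flat index tensors
        sampled_pos_inds = torch.nonzero(torch.cat(sampled_pos_inds, dim=0)).squeeze(1)
        sampled_neg_inds = torch.nonzero(torch.cat(sampled_neg_inds, dim=0)).squeeze(1)

        # indices of every anchor that takes part in the loss
        sampled_inds = torch.cat([sampled_pos_inds, sampled_neg_inds], dim=0)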

        objectness, box_regression = concat_box_prediction_layers(
            objectness, box_regression
        )

        objectness = objectness.squeeze()

        labels = torch.cat(labels, dim=0)
        regression_targets = torch.cat(regression_targets, dim=0)

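        # regression loss: summed over the positive anchors only,
        # then normalized by the number of all sampled anchors
        box_loss = smooth_l1_loss(
            box_regression[sampled_pos_inds],
            regression_targets[sampled_pos_inds],
            beta=1.0 / 9,
            size_average=False,
        ) / (sampled_inds.numel())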

        objectness_loss = F.binary_cross_entropy_with_logits(
            objectness[sampled_inds], labels[sampled_inds]
        )

        return objectness_loss, box_loss
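
The labeling rules applied in `prepare_targets` can be summarized in a small standalone sketch. It uses plain Python lists instead of tensors so that it runs without PyTorch. It assumes that `Matcher.BELOW_LOW_THRESHOLD` is -1 and `Matcher.BETWEEN_THRESHOLDS` is -2 (the second value agrees with the "-2" mentioned in the comment of `match_targets_to_anchors`), and that `generate_labels_func` marks every anchor with a non-negative match index as positive. `assign_rpn_labels` is a hypothetical helper name, not part of the library.

```python
# Sentinel values assumed to mirror Matcher.BELOW_LOW_THRESHOLD
# and Matcher.BETWEEN_THRESHOLDS.
BELOW_LOW_THRESHOLD = -1
BETWEEN_THRESHOLDS = -2


def assign_rpn_labels(matched_idxs, visibility):
    """Return one label per anchor: 1.0 positive, 0.0 background, -1.0 ignored."""
    labels = []
    for idx, visible in zip(matched_idxs, visibility):
        if not visible or idx == BETWEEN_THRESHOLDS:
            # anchors crossing the image border, or with an IoU between
            # the two thresholds, are discarded
            labels.append(-1.0)
        elif idx == BELOW_LOW_THRESHOLD:
            # IoU below the low threshold: background
            labels.append(0.0)
        else:
            # matched to a ground-truth box: foreground
            labels.append(1.0)
    return labels


matched_idxs = [0, -1, -2, 3, 1]
visibility = [True, True, True, False, True]
labels = assign_rpn_labels(matched_idxs, visibility)
```

The order of the checks matters: in the original code the visibility and between-thresholds overrides are applied last, so an anchor that is matched to a ground-truth box but lies partly outside the image still ends up with label -1.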

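The box regression term in `__call__` can be illustrated in the same way. The sketch below is a simplification under stated assumptions: it takes `smooth_l1_loss` to be the usual piecewise function (quadratic for absolute errors below `beta`, linear above), works on plain Python lists, and uses the hypothetical helper names `smooth_l1` and `rpn_box_loss`. As in the code above, the loss is summed over positive anchors only and divided by the total number of sampled anchors.

```python
def smooth_l1(diff, beta=1.0 / 9):
    """Piecewise smooth L1 for one scalar difference."""
    n = abs(diff)
    if n < beta:
        return 0.5 * n * n / beta  # quadratic region near zero
    return n - 0.5 * beta          # linear region for large errors


def rpn_box_loss(pred, target, pos_inds, num_sampled, beta=1.0 / 9):
    """Sum smooth L1 over the positive anchors, normalize by all sampled anchors."""
    total = 0.0
    for i in pos_inds:
        for p, t in zip(pred[i], target[i]):
            total += smooth_l1(p - t, beta)
    return total / num_sampled


# three anchors with 4 regression values each; only anchors 0 and 1 are positive
pred = [[0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.05, 0.0], [9.0, 9.0, 9.0, 9.0]]
target = [[0.0] * 4, [0.0] * 4, [0.0] * 4]
loss = rpn_box_loss(pred, target, pos_inds=[0, 1], num_sampled=4)
```

Anchor 2 has a large error but does not contribute, because it is not among the positive indices. Dividing by the number of sampled anchors rather than the number of positives keeps the regression term on a scale comparable to the averaged objectness loss.
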
References:
MaskrcnnBenchmark source code analysis
