mmseg walkthrough

Official documentation

getting_started.md
How to test a model:
How to train a model: python tools/train.py ${CONFIG_FILE} [optional arguments]

train(tools/train.py)

1. Model construction

Configuration parameters
model = dict(
    type='ChangeExtraSegmentor',
    backbone=dict(
        type='ChangeExtraResNet',
        depth=50,
        use_IN1=True,
        pretrained='/mnt/lustre/wujiang/.cache/torch/checkpoints/' \
            + 'resnet50_v1c-2cccc1ad.pth',
        base_channels=64,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        dilations=(1, 1, 2, 4),
        strides=(1, 2, 1, 1),
        deep_stem=True,
        norm_cfg=norm_cfg,
        norm_eval=False,
        contract_dilation=True,
        style='pytorch',

model = build_segmentor( cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
build_from_cfg(cfg_, registry, default_args) for cfg_ in cfg
See mmcv.utils.registry

def build_from_cfg(cfg, registry, default_args=None):
    """Build a module from config dict.

    Args:
        cfg (dict): Config dict. It should at least contain the key "type".
        registry (:obj:`Registry`): The registry to search the type from.
        default_args (dict, optional): Default initialization arguments.

    Returns:
        object: The constructed object.
    """
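    args = cfg.copy()  # work on a copy so the caller's config is not mutated
    obj_type = args.pop('type')
    obj_cls = registry.get(obj_type)  # look up the actual network class by its name
    return obj_cls(**args)  # instantiate the network class with the remaining keys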
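
The lookup-then-instantiate mechanism can be tried without mmcv installed. The following is a minimal sketch, not mmcv's actual code: `Registry` here is a toy stand-in, `ToyResNet` is a hypothetical class, and the decorator is written without parentheses for brevity.

```python
class Registry:
    """Toy stand-in for mmcv's Registry: maps a string name to a class."""

    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, cls):
        # Used as a decorator: store the class under its own name.
        self._module_dict[cls.__name__] = cls
        return cls

    def get(self, key):
        return self._module_dict.get(key)


def build_from_cfg(cfg, registry, default_args=None):
    args = dict(cfg)  # copy so the caller's config is left untouched
    if default_args is not None:
        for key, value in default_args.items():
            args.setdefault(key, value)
    obj_type = args.pop('type')
    obj_cls = registry.get(obj_type)
    if obj_cls is None:
        raise KeyError(f'{obj_type} is not in the {registry.name} registry')
    return obj_cls(**args)


BACKBONES = Registry('backbone')


@BACKBONES.register_module
class ToyResNet:  # hypothetical class, for illustration only
    def __init__(self, depth, num_stages=4):
        self.depth = depth
        self.num_stages = num_stages


cfg = dict(type='ToyResNet', depth=50)
backbone = build_from_cfg(cfg, BACKBONES)
print(type(backbone).__name__, backbone.depth, backbone.num_stages)
```

The string in `type` is the only link between the config file and the Python class, which is why a custom network has to be registered before a config can refer to it.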

2. Training
train_segmentor(model,datasets,cfg)

runner = IterBasedRunner(model=model,batch_processor=None,optimizer=optimizer, work_dir=cfg.work_dir,  logger=logger,  meta=meta)
runner.run(data_loaders, cfg.workflow, cfg.total_iters)

IterBasedRunner is defined in mmcv/runner/iter_based_runner.py

while self.iter < self._max_iters:
    for i, flow in enumerate(workflow):
        self._inner_iter = 0
        mode, iters = flow
        iter_runner = getattr(self, mode)
        for _ in range(iters):
            iter_runner(iter_loaders[i], **kwargs)
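
To see how a workflow such as [('train', 2), ('val', 1)] interleaves modes, the loop above can be simulated in plain Python. This is a sketch under assumptions: `ToyIterRunner` is a hypothetical class, not mmcv's runner, and it assumes that only training steps advance the global iteration counter.

```python
from itertools import cycle


class ToyIterRunner:
    """Toy stand-in for an iteration-based runner (not mmcv's class)."""

    def __init__(self, max_iters):
        self.iter = 0
        self._max_iters = max_iters
        self.log = []

    def train(self, loader):
        self.log.append(('train', next(loader)))
        self.iter += 1  # assumed: only training steps advance the counter

    def val(self, loader):
        self.log.append(('val', next(loader)))

    def run(self, loaders, workflow):
        while self.iter < self._max_iters:
            for i, (mode, iters) in enumerate(workflow):
                iter_runner = getattr(self, mode)  # 'train' -> self.train
                for _ in range(iters):
                    # Stop training as soon as the iteration budget is used up.
                    if mode == 'train' and self.iter >= self._max_iters:
                        break
                    iter_runner(loaders[i])


runner = ToyIterRunner(max_iters=5)
workflow = [('train', 2), ('val', 1)]
loaders = [cycle(['train_batch']), cycle(['val_batch'])]
runner.run(loaders, workflow)
print([mode for mode, _ in runner.log])
```

Each workflow entry names a method on the runner, so `getattr(self, mode)` is what turns the string 'train' into a call to `self.train`.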

A concrete example

args: config='configs/_rs_waterRGB/fcn_hr48_896x896_4k__ShCHY_dgl__oeg_edr2.py'
train.py

Model initialization

cfg = Config.fromfile(args.config)
build_segmentor(  cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
Here cfg defines all the parameters needed to build the model:
builder.py: return build_from_cfg(cfg, registry, default_args)
mmcv/utils/registry.py: build_from_cfg(cfg, registry, default_args=None)
     obj_cls = registry.get(obj_type), where obj_type is EncoderDecoderMultiHead and obj_cls is mmseg.models.segmentors.encoder_decoder_multihead.EncoderDecoderMultiHead
     When the model is constructed, the backbone, neck, head, auxiliary_head, etc. are built in turn:
     self.backbone = builder.build_backbone(backbone)  # calls build_backbone in builder.py, which calls build in builder.py, which calls build_from_cfg in mmcv/utils/registry.py
     self.neck = builder.build_neck(neck)
     self._init_decode_head(decode_head)
     self._init_auxiliary_head(auxiliary_head)
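
The nested build order above can be sketched with a toy registry. Everything here is hypothetical (`REGISTRY`, `build`, `ToyBackbone`, `ToyHead`, `ToySegmentor`); the sketch only illustrates how a segmentor's constructor passes each sub-config back through the same build function, and how a list of head configs becomes a list of heads.

```python
REGISTRY = {}  # hypothetical flat registry: class name -> class


def register(cls):
    REGISTRY[cls.__name__] = cls
    return cls


def build(cfg):
    """Build one object, or a list of objects when cfg is a list."""
    if isinstance(cfg, list):
        return [build(c) for c in cfg]
    args = dict(cfg)
    return REGISTRY[args.pop('type')](**args)


@register
class ToyBackbone:
    def __init__(self, width):
        self.width = width


@register
class ToyHead:
    def __init__(self, num_classes):
        self.num_classes = num_classes


@register
class ToySegmentor:
    def __init__(self, backbone, decode_head, neck=None, auxiliary_head=None):
        # Each sub-config goes back through the same build function.
        self.backbone = build(backbone)
        self.neck = build(neck) if neck is not None else None
        self.decode_head = build(decode_head)
        self.auxiliary_head = (
            build(auxiliary_head) if auxiliary_head is not None else None)


model_cfg = dict(
    type='ToySegmentor',
    backbone=dict(type='ToyBackbone', width=48),
    decode_head=[dict(type='ToyHead', num_classes=2),
                 dict(type='ToyHead', num_classes=3)],
)
model = build(model_cfg)
print(model.backbone.width, [h.num_classes for h in model.decode_head])
```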

[Image 1: the cfg passed in when constructing the model]

When the model is constructed, the backbone, neck, head, auxiliary_head, etc. are built in turn
class EncoderDecoderMultiHead(EncoderDecoder)
[Image 2: the cfg of the backbone]

self.backbone = builder.build_backbone(backbone)
	build(cfg, BACKBONES)
	build_from_cfg(cfg, registry, default_args)
	This calls HRNet's initializer. The HRNet class is defined in mmseg/models/backbones/hrnet.py; it builds the 4 stages of HRNet and outputs a list of feature maps of different sizes

self._init_decode_head(decode_head)
	self.decode_head = builder.build_head(decode_head)
	build(cfg, HEADS)
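	Here cfg is a list of FCNHead configs; FCNHead is defined in mmseg/models/decode_heads/fcn_head.py
	FCNHead first upsamples the feature list to a common resolution and concatenates it (x = self._transform_inputs(inputs)), then convolves it to the specified number of channels (output = self.convs(x)), then applies 2 convolutions to get the output (output = self.cls_seg(output))
	The head's forward function first computes the prediction, then returns the loss according to the configured loss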
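
The three steps of the head can be followed by tracing tensor shapes alone. This is a sketch under assumptions: the four input shapes (HRNet-W48 style widths at strides 4/8/16/32 of an 896x896 input), channels=720 and num_classes=2 are illustrative values, and the helper names are hypothetical.

```python
def resize_concat_shape(feature_shapes):
    """Shape after resizing every (C, H, W) map to the first map's H x W
    and concatenating along the channel axis."""
    _, h, w = feature_shapes[0]
    total_channels = sum(c for c, _, _ in feature_shapes)
    return (total_channels, h, w)


def fcn_head_shapes(feature_shapes, channels, num_classes):
    """Trace the (C, H, W) shape through the three steps of the head."""
    x = resize_concat_shape(feature_shapes)  # step 1: _transform_inputs
    feat = (channels, x[1], x[2])            # step 2: convs, H x W unchanged
    logits = (num_classes, x[1], x[2])       # step 3: cls_seg maps to classes
    return x, feat, logits


# Assumed input: four feature maps from the backbone.
features = [(48, 224, 224), (96, 112, 112), (192, 56, 56), (384, 28, 28)]
x, feat, logits = fcn_head_shapes(features, channels=720, num_classes=2)
print(x, feat, logits)
```

Only the channel count changes after the concat; the spatial size stays at that of the highest-resolution feature map.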


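How is the head built, and how does it map to the cfg?

Heads differ a lot between tasks: their convolution layers may need to run in parallel, in series, be concatenated, and so on. The head's construction and forward function therefore have to be written and adapted by hand to match the config file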
for i in range(self.num_heads):
	self.decode_head.append(builder.build_head(decode_head[i]))
		build_from_cfg(cfg, HEADS, default_args)
		# calls FCNHead's constructor
		Build self.convs: self.convs = nn.Sequential(*convs), the num_convs convolutions that take the input channels to the output channels
		Build self.conv_seg: self.conv_seg = nn.Conv2d(channels, num_classes, kernel_size=1), the convolution that maps the output channels to the prediction

Forward pass
The backbone extracts features;
the head then calls self.decode_head[i].forward_train(x, img_metas, gt_seg, self.train_cfg) to process the features further and compute the loss

seg_logits = self.forward(inputs)
	x = self._transform_inputs(inputs)  # resize the feature list, then concat
		inputs = torch.cat(upsampled_inputs, dim=1)
	output = self.convs(x)  # conv layers built from in_channels and channels; they map the in_channels input to channels
	output = self.cls_seg(output)  # map to the specified number of output channels

losses = self.losses(seg_logits, gt_semantic_seg, seg_weight_map=kargs['seg_weight_map'])
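To compute the loss, self.loss_decode is initialized first: self.loss_decode = build_loss(loss_decode)  # builds the loss from the class in mmseg/models/losses/cross_entropy_loss.py. Then the losses() function in mmseg/models/decode_heads/decode_head.py is called, which uses loss_decode to compute the loss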
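
The post does not show how seg_weight_map enters the loss. A minimal sketch, assuming it is a per-pixel weight that multiplies each pixel's cross-entropy and that label 255 marks ignored pixels, could look like this in plain Python. The function `weighted_cross_entropy` is hypothetical and is not the code in cross_entropy_loss.py.

```python
import math


def weighted_cross_entropy(logits, labels, weight_map=None, ignore_index=255):
    """Mean per-pixel cross-entropy over a flat list of pixels.

    logits: list of per-pixel score lists, one score per class.
    labels: list of integer class ids, one per pixel.
    weight_map: optional list of per-pixel weights (assumed semantics).
    """
    total, count = 0.0, 0
    for i, (scores, label) in enumerate(zip(logits, labels)):
        if label == ignore_index:
            continue  # ignored pixels contribute nothing
        m = max(scores)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
        pixel_loss = log_sum_exp - scores[label]
        if weight_map is not None:
            pixel_loss *= weight_map[i]
        total += pixel_loss
        count += 1
    return total / count if count else 0.0


logits = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
labels = [0, 1, 255]  # the third pixel is ignored
base = weighted_cross_entropy(logits, labels)
weighted = weighted_cross_entropy(logits, labels, weight_map=[1.0, 3.0, 1.0])
print(base, weighted)
```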


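The main question is how the head is built and how it runs its forward pass.
The head turns the features extracted by the backbone into a prediction and then computes the loss.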


[Image 5: the model after initialization]

Dataset construction
datasets = [build_dataset(cfg.data.train)] # runner = IterBasedRunner(model=model,batch_processor=None,optimizer=optimizer,work_dir=cfg.work_dir, logger=logger,meta=meta)

Data

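The dataloader returns a dict containing img, img_metas, gt_semantic_seg, and other fields. img and gt_semantic_seg are tensors, while img_metas is a dict holding the textual metadata recorded during image processing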
base.py train_step
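
train_step is only named here, so the following is a sketch of what such a step typically does rather than the code in base.py: run the forward pass on the batch dict, reduce the returned loss dict to one scalar, and report it with the batch size. It uses plain floats and lists in place of tensors; `parse_losses`, `fake_forward` and the loss names are hypothetical.

```python
def parse_losses(losses):
    """Reduce a dict of named losses to one scalar plus loggable values.

    Values may be a float or a list of floats (for example one per head).
    """
    log_vars = {}
    for name, value in losses.items():
        if isinstance(value, (list, tuple)):
            log_vars[name] = sum(value)
        else:
            log_vars[name] = float(value)
    # Only entries whose name contains 'loss' are summed for optimization;
    # other entries (such as accuracy) are kept for logging only.
    loss = sum(v for k, v in log_vars.items() if 'loss' in k)
    log_vars['loss'] = loss
    return loss, log_vars


def train_step(forward_fn, data_batch):
    losses = forward_fn(**data_batch)
    loss, log_vars = parse_losses(losses)
    return dict(loss=loss, log_vars=log_vars,
                num_samples=len(data_batch['img']))


def fake_forward(img, img_metas, gt_semantic_seg):  # hypothetical model
    return {'decode.loss_seg': 0.5, 'aux.loss_seg': 0.25,
            'decode.acc_seg': 90.0}


batch = dict(img=[[0.0], [0.0]], img_metas=[{}, {}],
             gt_semantic_seg=[[0], [1]])
out = train_step(fake_forward, batch)
print(out['loss'], out['num_samples'])
```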
