[Python] [PyTorch] Study Diary 3: Reading UNet Implementation Code

Starting today, I am learning PyTorch.

[Python] [PyTorch] Study Diary 1: Tensors
[Python] [PyTorch] Study Diary 2: autograd and building neural networks with torch.nn


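  • [Python] [PyTorch] Study Diary 3: Reading UNet Implementation Code
  • Preface
  • I. UNet
  • II. Implementation 1
    • 1. Model structure
    • 2. Loading data
    • 3. Defining the loss and training
      • Weighted loss
      • Training procedure
      • Using the network
    • Actual notes
      • torch.nn.Upsample
      • torch.nn.functional.binary_cross_entropy_with_logits
      • torch.optim.Adam
      • torch.optim.lr_scheduler.StepLR
  • III. Implementation 2
    • 1. Model definition
    • 2. Usage example
    • 3. Parameter choices
    • Actual notes
      • torch.nn.ModuleList
      • torch.nn.ConvTranspose2d
      • nn.Sequential
      • torch.nn.BatchNorm2d
  • IV. Implementation 3
    • 1. Model definition
    • 2. Loss definition
  • Summary
  • Learning materials and references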


Preface

Writing (or heavily modifying) code starts with reading code.


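I. UNet

UNet is a semantic segmentation network. A detailed introduction is at https://blog.csdn.net/qq_44055705/article/details/115733245
The network architecture is as follows:
[Figure 1: UNet architecture diagram]

The paper's authors released a Caffe version. I picked several PyTorch implementations from GitHub to study.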

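II. Implementation 1

This one covers only the network architecture, and even the architecture deviates from the paper's design in several places.
Data augmentation, the paper's weighted loss, and seamless tiling are not implemented.
GitHub: https://github.com/usuyama/pytorch-unet

1. Model structure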

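import torch
import torch.nn as nn


def double_conv(in_channels, out_channels):
    # Two consecutive 3x3 convolutions with padding=1.
    # With this padding each convolution preserves the spatial size, unlike the
    # architecture diagram, where each unpadded convolution shrinks it by 2 pixels.
    # The first convolution changes the channel count (more channels on the
    # contracting/downsampling path, fewer on the expanding/upsampling path).
    # The second convolution keeps the channel count.
    # ReLU activations add non-linearity.
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, 3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_channels, out_channels, 3, padding=1),
        nn.ReLU(inplace=True)
    )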


class UNet(nn.Module):

    def __init__(self, n_class):
        super().__init__()
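        # The four double-conv blocks of the contracting path
        self.dconv_down1 = double_conv(3, 64)
        self.dconv_down2 = double_conv(64, 128)
        self.dconv_down3 = double_conv(128, 256)
        self.dconv_down4 = double_conv(256, 512)

        # 2x2 max pooling for downsampling
        self.maxpool = nn.MaxPool2d(2)
        # Upsampling layer: 2x enlargement with bilinear interpolation
        self.upsample = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)

        # Compared with the diagram, this network is one level shallower:
        # dconv_down4 (512 channels) acts as the bottom of the U, and the
        # paper's 1024-channel double convolution at the bottom is missing.

        # Three blocks on the expanding path, which also differs from the diagram.
        # Following the paper, the channel changes would be:
        # (512+512, 512), (256+256, 256), (128+128, 128), (64+64, 64).
        # Here nn.Upsample keeps the channel count, so the concatenated inputs
        # are 256+512, 128+256 and 128+64 instead.
        self.dconv_up3 = double_conv(256 + 512, 256)
        self.dconv_up2 = double_conv(128 + 256, 128)
        self.dconv_up1 = double_conv(128 + 64, 64)

        # A final 1x1 convolution replaces a fully connected layer,
        # mapping 64 channels to the number of classes
        self.conv_last = nn.Conv2d(64, n_class, 1)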

    def forward(self, x):
        conv1 = self.dconv_down1(x)
        x = self.maxpool(conv1)
        # On the contracting path, each double conv is followed by max pooling, which halves height and width

        conv2 = self.dconv_down2(x)
        x = self.maxpool(conv2)
        
        conv3 = self.dconv_down3(x)
        x = self.maxpool(conv3)   
        
        x = self.dconv_down4(x)
        x = self.upsample(x)

        x = torch.cat([x, conv3], dim=1)
        # torch.cat along the channel dimension implements the skip connection
        x = self.dconv_up3(x)
        x = self.upsample(x)
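        # On the expanding path, each upsampled map is concatenated with the feature map
        # from the matching level of the contracting path to bring back detail,
        # then goes through a double conv and is upsampled again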
        
        x = torch.cat([x, conv2], dim=1)
        x = self.dconv_up2(x)
        x = self.upsample(x)

        x = torch.cat([x, conv1], dim=1)
        x = self.dconv_up1(x)
        
        out = self.conv_last(x)
        
        return out
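
The size remark in the comments can be checked with plain arithmetic. The sketch below is not part of the repository; the helper names are made up for illustration. It assumes the paper's layout (unpadded 3x3 convolutions, 2x2 pooling, 2x upsampling, five levels) for the first function and the padded, four-level layout of the code above for the second.

```python
def valid_unet_output_size(size, depth=5):
    """Spatial size after an unpadded U-Net with `depth` levels (hypothetical helper)."""
    # Contracting path: two unpadded 3x3 convs (-2 pixels each), then 2x2 max pooling.
    for _ in range(depth - 1):
        size = (size - 4) // 2
    # Bottom of the U: two unpadded 3x3 convs, no pooling.
    size -= 4
    # Expanding path: 2x upsampling, then two unpadded 3x3 convs.
    for _ in range(depth - 1):
        size = size * 2 - 4
    return size


def padded_unet_output_size(size, depth=4):
    """Spatial size when every 3x3 conv uses padding=1 (hypothetical helper)."""
    for _ in range(depth - 1):
        size //= 2  # only pooling changes the size
    for _ in range(depth - 1):
        size *= 2   # only upsampling changes the size
    return size


print(valid_unet_output_size(572))   # 388, the input/output tile sizes quoted in the paper
print(padded_unet_output_size(192))  # 192, the size of the simulated images used below
```

With padding the output matches the input only when the input is divisible by 2 for every pooling step; an input of 190, for example, would come back as 184 and the concatenations would fail on mismatched shapes.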

2. Loading data

class SimDataset(Dataset):
    def __init__(self, count, transform=None):
        self.input_images, self.target_masks = simulation.generate_random_data(192, 192, count=count)
        # Randomly generated synthetic images and masks
        self.transform = transform

    def __len__(self):
        return len(self.input_images)

    def __getitem__(self, idx):
        image = self.input_images[idx]
        mask = self.target_masks[idx]
        if self.transform:
            image = self.transform(image)

        return [image, mask]

# use the same transformations for train/val in this example
trans = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) # imagenet
])

train_set = SimDataset(2000, transform = trans)
val_set = SimDataset(200, transform = trans)

image_datasets = {
    'train': train_set, 'val': val_set
}

batch_size = 25
dataloaders = {
    'train': DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=0),
    'val': DataLoader(val_set, batch_size=batch_size, shuffle=True, num_workers=0)
}

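3. Defining the loss and training

Weighted loss

This implementation uses a weighted sum of Dice loss and binary cross-entropy, which is not the same as the weighted loss in the paper.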

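def calc_loss(pred, target, metrics, bce_weight=0.5):
    # pred: predicted logit map; target: ground-truth map;
    # metrics: dict accumulating the loss terms; bce_weight: weight of the BCE term
    bce = F.binary_cross_entropy_with_logits(pred, target)
    # Binary cross-entropy loss (the sigmoid is applied internally)

    pred = torch.sigmoid(pred)  # F.sigmoid is deprecated in favor of torch.sigmoid
    dice = dice_loss(pred, target)
    # Dice loss

    loss = bce * bce_weight + dice * (1 - bce_weight)
    # Weighted sum of the two terms

    # Accumulate per-sample totals; they are divided by the sample count later
    metrics['bce'] += bce.detach().cpu().numpy() * target.size(0)
    metrics['dice'] += dice.detach().cpu().numpy() * target.size(0)
    metrics['loss'] += loss.detach().cpu().numpy() * target.size(0)

    return loss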

Here the Dice loss (with Laplace smoothing) is defined as: $L_s = 1 - \frac{2|X \cap Y| + 1}{|X| + |Y| + 1}$

https://zhuanlan.zhihu.com/p/86704421

Implementation:

def dice_loss(pred, target, smooth = 1.):
    # pred: predicted map; target: ground-truth map; smooth: Laplace smoothing constant

    # Make sure both tensors are contiguous in memory
    # https://zhuanlan.zhihu.com/p/64551412
    pred = pred.contiguous()
    target = target.contiguous()

    # Approximate |X∩Y| by the element-wise product of pred and target, summed over H and W
    intersection = (pred * target).sum(dim=2).sum(dim=2)

    # |X| and |Y| are the element-wise sums over H and W
    loss = (1 - ((2. * intersection + smooth) / (pred.sum(dim=2).sum(dim=2) + target.sum(dim=2).sum(dim=2) + smooth)))
    
    return loss.mean()
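
To make the formula concrete, here is a small worked example in plain Python. It is only a sketch on flat lists (the helper name `dice_loss_flat` is invented for illustration), not the tensor version used in the repository, but it applies the same smoothed formula.

```python
def dice_loss_flat(pred, target, smooth=1.0):
    """Dice loss with Laplace smoothing for two flat lists of values in [0, 1]."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return 1.0 - (2.0 * intersection + smooth) / (sum(pred) + sum(target) + smooth)


pred = [1.0, 1.0, 0.0, 0.0]
target = [1.0, 0.0, 0.0, 0.0]

print(dice_loss_flat(pred, target))          # 1 - (2*1 + 1) / (2 + 1 + 1) = 0.25
print(dice_loss_flat(target, target))        # perfect overlap gives 0.0
print(dice_loss_flat([0.0] * 4, [0.0] * 4))  # two empty masks give 0.0 instead of 0/0
```

The last call shows what the smoothing term buys: without the +1 in numerator and denominator, two empty masks would produce a division by zero.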

Training procedure

def train_model(model, optimizer, scheduler, num_epochs=25):
    # scheduler is a learning-rate scheduler (it is not a regularizer)

    best_model_wts = copy.deepcopy(model.state_dict())
    # https://zhuanlan.zhihu.com/p/270344655
    # Snapshot of the current model parameters

    best_loss = 1e10
    # Large initial value so that the first validation loss always improves on it

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        since = time.time()

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                for param_group in optimizer.param_groups:
                    print("LR", param_group['lr'])

                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            metrics = defaultdict(float)
            # defaultdict(float) makes missing keys start at 0.0
            # https://blog.csdn.net/real_ray/article/details/17919289

            epoch_samples = 0

            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    loss = calc_loss(outputs, labels, metrics)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                epoch_samples += inputs.size(0)

            if phase == 'train':
                # Since PyTorch 1.1.0, scheduler.step() belongs after optimizer.step()
                scheduler.step()

            print_metrics(metrics, epoch_samples, phase)
            epoch_loss = metrics['loss'] / epoch_samples

            # deep copy the model
            if phase == 'val' and epoch_loss < best_loss:
                print("saving best model")
                best_loss = epoch_loss
                best_model_wts = copy.deepcopy(model.state_dict())
                # In the validation phase, keep the parameters of the best model so far

        time_elapsed = time.time() - since
        print('{:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))

    print('Best val loss: {:4f}'.format(best_loss))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

Training:


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

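num_class = 6
# model = ResNetUNet(num_class).to(device)
# UNet.__init__ requires n_class, and the model must live on the same device as the data
model = pytorch_unet.UNet(num_class).to(device)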

# freeze backbone layers
#for l in model.base_layers:
#    for param in l.parameters():
#        param.requires_grad = False

optimizer_ft = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4)
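# filter() filters a sequence, dropping the elements that fail the condition, and returns an iterator.
# It takes two arguments: a function and a sequence.
# Each element is passed to the function, which returns True or False; only elements for which it returns True are kept.
# https://www.runoob.com/python3/python3-func-filter.html

exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=30, gamma=0.1)
# Learning-rate decay: the learning rate is multiplied by 0.1 every 30 epochs (this is not weight decay)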
model = train_model(model, optimizer_ft, exp_lr_scheduler, num_epochs=60)

Using the network

import math

model.eval()   # Set model to the evaluation mode

# Create another simulation dataset for test
test_dataset = SimDataset(3, transform = trans)
test_loader = DataLoader(test_dataset, batch_size=3, shuffle=False, num_workers=0)

# Get the first batch
inputs, labels = next(iter(test_loader))
inputs = inputs.to(device)
labels = labels.to(device)

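# Predict (no gradients are needed at inference time)
with torch.no_grad():
    pred = model(inputs)
# The loss function applies the sigmoid internally, so it has to be applied explicitly here.
pred = torch.sigmoid(pred)
pred = pred.cpu().numpy()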
print(pred.shape)

# Change channel-order and make 3 channels for matplot
input_images_rgb = [reverse_transform(x) for x in inputs.cpu()]

# Map each channel (i.e. class) to each color
target_masks_rgb = [helper.masks_to_colorimg(x) for x in labels.cpu().numpy()]
pred_rgb = [helper.masks_to_colorimg(x) for x in pred]

helper.plot_side_by_side([input_images_rgb, target_masks_rgb, pred_rgb])

Actual notes

torch.nn.Upsample

torch.nn.Upsample(size=None, scale_factor=None, mode='nearest', align_corners=None)
Upsamples given multi-channel 1D (temporal), 2D (spatial) or 3D (volumetric) data. The input is assumed to have the form minibatch x channels x [optional depth] x [optional height] x width. So a 4D tensor is expected for spatial inputs and a 5D tensor for volumetric inputs. The available algorithms are nearest neighbor for all inputs, plus linear (3D tensors only), bilinear and bicubic (4D tensors only), and trilinear (5D tensors only).

  • size (int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int], optional) – output spatial sizes, i.e. the exact size of the output.

  • scale_factor (float or Tuple[float] or Tuple[float, float] or Tuple[float, float, float], optional) – multiplier for spatial size. Has to match input size if it is a tuple.

  • mode (str, optional) – the upsampling algorithm: one of 'nearest', 'linear', 'bilinear', 'bicubic' and 'trilinear'. Default: 'nearest'

  • align_corners (bool, optional) – if True, the corner pixels of the input and output tensors are aligned, and thus preserving the values at those pixels. This only has effect when mode is 'linear', 'bilinear', or 'trilinear'. Default: False
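
A quick way to see what scale_factor and align_corners do is to upsample a tiny tensor. This is a minimal sketch with made-up input values, using the same arguments as the UNet above.

```python
import torch
from torch import nn

# One sample, one channel, 2x2 pixels with values 0..3
x = torch.arange(4.0).view(1, 1, 2, 2)

up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
y = up(x)

print(y.shape)                          # height and width double, channels stay the same
print(y[0, 0, 0, 0], y[0, 0, -1, -1])   # with align_corners=True the corner values of x are kept
```

Note that the channel count is untouched, which is why the first implementation has to concatenate 512 + 256 channels rather than 256 + 256.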

torch.nn.functional.binary_cross_entropy_with_logits

torch.nn.functional.binary_cross_entropy_with_logits(input, target, weight=None, size_average=None, reduce=None, reduction='mean', pos_weight=None)
Function that measures the binary cross-entropy between the target and the input logits.

  • input – Tensor of arbitrary shape (unnormalized logits)

  • target – Tensor of the same shape as input

  • weight (Tensor, optional) – a manual rescaling weight if provided it's repeated to match input tensor shape

  • size_average (bool, optional) – Deprecated (see reduction).

  • reduce (bool, optional) – Deprecated (see reduction).

  • reduction (string, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output, 'sum': the output will be summed. Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction. Default: 'mean'

  • pos_weight (Tensor, optional) – a weight of positive examples. Must be a vector with length equal to the number of classes.
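
The reason to feed logits directly, instead of calling sigmoid and then a plain BCE, is numerical stability. The sketch below is the standard log-sum-exp rewrite of the loss for a single logit; it is written from the math, not copied from PyTorch's source, and the function names are illustrative only.

```python
import math


def bce_with_logits(x, t):
    """Numerically stable binary cross-entropy for one logit x and target t."""
    # Algebraically equal to -[t*log(sigmoid(x)) + (1-t)*log(1-sigmoid(x))].
    return max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))


def bce_naive(x, t):
    """Sigmoid followed by logs; breaks down once sigmoid(x) rounds to 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-x))
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))


print(bce_with_logits(0.0, 1.0))                       # log(2): the model is undecided
print(bce_with_logits(2.0, 1.0), bce_naive(2.0, 1.0))  # both forms agree for moderate logits
print(bce_with_logits(1000.0, 0.0))                    # stays finite; the naive form would hit log(0)
```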

torch.optim.Adam

torch.optim.Adam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, amsgrad=False)
The Adam optimizer.

  • params (iterable) – iterable of parameters to optimize or dicts defining parameter groups

  • lr (float, optional) – learning rate (default: 1e-3)

  • betas (Tuple[float, float], optional) – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))

  • eps (float, optional) – term added to the denominator to improve numerical stability (default: 1e-8)

  • weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)

  • amsgrad (boolean, optional) – whether to use the AMSGrad variant of this algorithm from the paper On the Convergence of Adam and Beyond (default: False)
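
To show where lr, betas and eps enter, here is one Adam update for a single scalar parameter. It is a sketch of the textbook update rule, assuming weight_decay=0 and amsgrad=False; `adam_step` is a name invented for this note, not a PyTorch function.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    b1, b2 = betas
    m = b1 * m + (1 - b1) * grad          # running average of the gradient
    v = b2 * v + (1 - b2) * grad * grad   # running average of the squared gradient
    m_hat = m / (1 - b1 ** t)             # bias correction for the zero initialization
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


p, m, v = 1.0, 0.0, 0.0
p, m, v = adam_step(p, grad=0.5, m=m, v=v, t=1, lr=1e-4)
print(p)  # the first step moves the parameter by about lr, whatever the gradient's magnitude
```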

torch.optim.lr_scheduler.StepLR

torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma=0.1, last_epoch=-1, verbose=False)
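Decays the learning rate of each parameter group by gamma every step_size epochs. Note that this decay can happen simultaneously with other changes to the learning rate made outside this scheduler. When last_epoch=-1, the initial lr is set to lr.

  • optimizer (Optimizer) – Wrapped optimizer.

  • step_size (int) – Period of learning rate decay, in epochs.

  • gamma (float) – Multiplicative factor of learning rate decay. Default: 0.1.

  • last_epoch (int) – The index of last epoch, used when resuming training; it is not an upper limit on the decay. Default: -1.

  • verbose (bool) – If True, prints a message to stdout for each update. Default: False.

Example: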

scheduler = StepLR(optimizer, step_size=30, gamma=0.1)
for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()
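
For a schedule that is only ever touched by StepLR, the learning rate at a given epoch has a simple closed form. The helper below is a plain-Python sketch of that formula (the name `step_lr` is made up), evaluated with the settings used in the training script above: lr=1e-4, step_size=30, gamma=0.1.

```python
def step_lr(base_lr, epoch, step_size, gamma=0.1):
    """Closed form of the StepLR schedule: decay by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


for epoch in (0, 29, 30, 59):
    # 1e-4 for epochs 0..29, then 1e-5 for epochs 30..59 (up to float rounding)
    print(epoch, step_lr(1e-4, epoch, step_size=30))
```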

III. Implementation 2

GitHub: https://github.com/jvanvugt/pytorch-unet

1. Model definition

import torch
from torch import nn
import torch.nn.functional as F


class UNet(nn.Module):
    # Configurable through its arguments, e.g. the depth of the network.
    def __init__(
        self,
        in_channels=1,
        n_classes=2,
        depth=5,
        wf=6,
        padding=False,
        batch_norm=False,
        up_mode='upconv',
    ):
        """
        Implementation of
        U-Net: Convolutional Networks for Biomedical Image Segmentation
        (Ronneberger et al., 2015)
        https://arxiv.org/abs/1505.04597

        Using the default arguments will yield the exact version used
        in the original paper

        Args:
            in_channels (int): number of input channels
            n_classes (int): number of output channels
            depth (int): depth of the network
            wf (int): number of filters in the first layer is 2**wf
            padding (bool): if True, apply padding such that the input shape
                            is the same as the output.
                            This may introduce artifacts
            batch_norm (bool): Use BatchNorm after layers with an
                               activation function
            up_mode (str): one of 'upconv' or 'upsample'.
                           'upconv' will use transposed convolutions for
                           learned upsampling.
                           'upsample' will use bilinear upsampling.
        """
        super(UNet, self).__init__()
        assert up_mode in ('upconv', 'upsample')

        self.padding = padding
        self.depth = depth

        # Contracting (down) path
        prev_channels = in_channels
        self.down_path = nn.ModuleList()
        for i in range(depth):
            self.down_path.append(
                UNetConvBlock(prev_channels, 2 ** (wf + i), padding, batch_norm)
            )
            prev_channels = 2 ** (wf + i)
            # The channel count doubles with each block

        # Expanding (up) path
        self.up_path = nn.ModuleList()
        for i in reversed(range(depth - 1)):
            # The block at the bottom of the U is counted as a down block,
            # so there is one more down block than there are up blocks.
            # reversed() iterates in descending order, so the same wf formula applies.
            self.up_path.append(
                UNetUpBlock(prev_channels, 2 ** (wf + i), up_mode, padding, batch_norm)
            )
            prev_channels = 2 ** (wf + i)
        # Final 1x1 convolution producing the class scores
        self.last = nn.Conv2d(prev_channels, n_classes, kernel_size=1)

    def forward(self, x):
        blocks = []
        for i, down in enumerate(self.down_path):
            x = down(x)
            if i != len(self.down_path) - 1:
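                # The bottom block is not followed by pooling, so only the first n-1 down blocks are pooled
                blocks.append(x)
                # blocks holds the outputs of down blocks [0, 1, 2, ...] in order
                x = F.max_pool2d(x, 2)

        for i, up in enumerate(self.up_path):
            # With depth=4: blocks 0, 1, 2 are down blocks, 3 is the bottom, and 4, 5, 6 (up-path indices 0, 1, 2) are up blocks.
            # Block 4 (up index 0) pairs with the last pooled down block, block 2, whose feature map is the last entry of blocks, i.e. blocks[-1], and so on.
            x = up(x, blocks[-i - 1])
            # skip connection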

        return self.last(x)


class UNetConvBlock(nn.Module):
    # Down block: a double convolution
    def __init__(self, in_size, out_size, padding, batch_norm):
        super(UNetConvBlock, self).__init__()
        block = []

        block.append(nn.Conv2d(in_size, out_size, kernel_size=3, padding=int(padding)))
        block.append(nn.ReLU())
        if batch_norm:
            block.append(nn.BatchNorm2d(out_size))

        block.append(nn.Conv2d(out_size, out_size, kernel_size=3, padding=int(padding)))
        block.append(nn.ReLU())
        if batch_norm:
            block.append(nn.BatchNorm2d(out_size))

        self.block = nn.Sequential(*block)

    def forward(self, x):
        out = self.block(x)
        return out


class UNetUpBlock(nn.Module):
    # Up block
    # One Upsample or ConvTranspose2d followed by one double-conv block
    def __init__(self, in_size, out_size, up_mode, padding, batch_norm):
        super(UNetUpBlock, self).__init__()
        if up_mode == 'upconv':
            self.up = nn.ConvTranspose2d(in_size, out_size, kernel_size=2, stride=2)
        elif up_mode == 'upsample':
            self.up = nn.Sequential(
                nn.Upsample(mode='bilinear', scale_factor=2),
                nn.Conv2d(in_size, out_size, kernel_size=1),
            )

        self.conv_block = UNetConvBlock(in_size, out_size, padding, batch_norm)

    # Center-aligned crop of the skip feature map to the upsampled size
    def center_crop(self, layer, target_size):
        _, _, layer_height, layer_width = layer.size()
        diff_y = (layer_height - target_size[0]) // 2
        diff_x = (layer_width - target_size[1]) // 2
        return layer[
            :, :, diff_y: (diff_y + target_size[0]), diff_x: (diff_x + target_size[1])
        ]

    def forward(self, x, bridge):
        up = self.up(x)
        crop1 = self.center_crop(bridge, up.shape[2:])
        out = torch.cat([up, crop1], 1)
        out = self.conv_block(out)

        return out
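
With unpadded convolutions the skip feature map is larger than the upsampled one, which is why the block crops before concatenating. The sketch below repeats the cropping logic as a free function and runs it on dummy tensors; the 64x64 and 56x56 sizes are the ones that meet at the deepest skip connection in the paper's diagram, and the channel counts are arbitrary illustration values.

```python
import torch


def center_crop(layer, target_size):
    """Crop the spatial center of `layer` (N, C, H, W) to target_size (H, W)."""
    _, _, h, w = layer.size()
    dy = (h - target_size[0]) // 2
    dx = (w - target_size[1]) // 2
    return layer[:, :, dy:dy + target_size[0], dx:dx + target_size[1]]


bridge = torch.zeros(1, 64, 64, 64)  # feature map saved on the contracting path
up = torch.zeros(1, 64, 56, 56)      # upsampled feature map on the expanding path

cropped = center_crop(bridge, up.shape[2:])
out = torch.cat([up, cropped], dim=1)
print(cropped.shape, out.shape)      # spatial sizes now match; channels add up
```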

2. Usage example

import torch
import torch.nn.functional as F
from unet import UNet

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = UNet(n_classes=2, padding=True, up_mode='upsample').to(device)
optim = torch.optim.Adam(model.parameters())
dataloader = ...
epochs = 10

for _ in range(epochs):
    for X, y in dataloader:
        X = X.to(device)  # [N, 1, H, W]
        y = y.to(device)  # [N, H, W] with class indices (0, 1)
        prediction = model(X)  # [N, 2, H, W]
        loss = F.cross_entropy(prediction, y)

        optim.zero_grad()
        loss.backward()
        optim.step()

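3. Parameter choices

These recommendations come from the README of the GitHub project.
1. SAME vs VALID padding
When the "seamless tiling" (overlap-tile) strategy is not a concern, use VALID, i.e. no padding (the method used in the paper).
2. Upsampling vs Transposed convolutions
Upsampling is the recommended default, unless the problem needs very high spatial resolution.
3. Input size
Work out how the output size relates to the input size. If a dimension is odd, or more generally not divisible by 2^(depth-1), each max pooling halves it with rounding down, so upsampling back no longer recovers the original size and pixels are lost. In that case the image can be zero-padded to a suitable size beforehand. (This, too, applies when the seamless-tiling strategy is not used.)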
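
The padding advice for the input size comes down to rounding each dimension up to the next multiple of 2^(depth-1). A minimal sketch in plain Python, with hypothetical helper names and depth=5 as in the model's default:

```python
def padded_size(size, depth=5):
    """Smallest size >= `size` that is divisible by 2**(depth - 1)."""
    multiple = 2 ** (depth - 1)
    return -(-size // multiple) * multiple  # ceiling division, then scale back up


def pad_amounts(size, depth=5):
    """Zero padding to add before and after one dimension."""
    total = padded_size(size, depth) - size
    return total // 2, total - total // 2


print(padded_size(190), pad_amounts(190))  # 192 (1, 1)
print(padded_size(192), pad_amounts(192))  # 192 (0, 0): already divisible by 16
print(padded_size(101), pad_amounts(101))  # 112 (5, 6)
```

The two amounts could then be passed to a padding routine such as torch.nn.functional.pad, and the prediction cropped back to the original size afterwards.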

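Actual notes

torch.nn.ModuleList

torch.nn.ModuleList(modules=None)
Holds submodules in a list.
A ModuleList can be indexed like a regular Python list, but the modules it contains are properly registered and visible to all Module methods.
Parameters:

  • modules (list, optional) – list of modules to add to the ModuleList

Methods:

  • append(module): appends the given module to the end of the list.
  • extend(modules): appends the modules from a Python iterable to the end of the list.
  • insert(index, module): inserts the given module before the given index in the list.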
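
"Properly registered" is the part that matters in practice: parameters of modules kept in a plain Python list are invisible to model.parameters(), so an optimizer would never update them. A small sketch with two toy classes (their names are made up for this note):

```python
from torch import nn


class WithPlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain list: the two Linear layers are NOT registered as submodules
        self.layers = [nn.Linear(4, 4), nn.Linear(4, 4)]


class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers every layer it holds
        self.layers = nn.ModuleList([nn.Linear(4, 4), nn.Linear(4, 4)])


print(len(list(WithPlainList().parameters())))   # 0
print(len(list(WithModuleList().parameters())))  # 4 (a weight and a bias per layer)
```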

torch.nn.ConvTranspose2d

torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros')
Applies a 2D transposed convolution operator over an input image composed of several input planes. This module can be seen as the gradient of Conv2d with respect to its input. It is also known as a fractionally-strided convolution or a deconvolution (although it is not an actual deconvolution operation).
Output size as a function of input size (assuming square images, with H the side length): $H_{out} = (H_{in} - 1) \times stride - 2 \times padding + dilation \times (kernel\_size - 1) + output\_padding + 1$
Without padding or dilation (padding=0, output_padding=0, dilation=1): $H_{out} = (H_{in} - 1) \times stride + kernel\_size$
So, roughly speaking, stride is the factor by which H grows, and kernel_size takes care of the remaining pixel alignment.
Parameters:

  • in_channels (int) – number of channels in the input image
  • out_channels (int) – number of channels in the output image
  • kernel_size (int or tuple) – size of the convolution kernel
  • stride (int or tuple, optional) – stride of the convolution. Default: 1
  • padding (int or tuple, optional) – dilation * (kernel_size - 1) - padding zero-padding will be added to both sides of each dimension in the input. Default: 0
  • output_padding (int or tuple, optional) – Additional size added to one side of each dimension in the output shape. Default: 0
  • groups (int, optional) – number of blocked connections from input channels to output channels. Default: 1
  • bias (bool, optional) – If True, adds a learnable bias to the output. Default: True
  • dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1
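
The output-size formula is easy to evaluate by hand. The helper below (an illustrative name, not a PyTorch function) encodes it and checks the kernel_size=2, stride=2 setting used for 'upconv' in the model above.

```python
def conv_transpose2d_out(h_in, kernel_size, stride=1, padding=0,
                         output_padding=0, dilation=1):
    """Output height (or width) of a transposed convolution."""
    return ((h_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


print(conv_transpose2d_out(28, kernel_size=2, stride=2))  # 56: exactly twice the input
print(conv_transpose2d_out(28, kernel_size=3, stride=2))  # 57: one pixel more than 2x
print(conv_transpose2d_out(28, kernel_size=3, stride=2,
                           padding=1, output_padding=1))  # 56 again
```

With kernel_size equal to stride the output is an exact multiple of the input, which is why that pairing is a common choice for learned 2x upsampling.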

nn.Sequential

torch.nn.Sequential(*args)
A sequential container. Modules are added to it in the order they are passed to the constructor. Alternatively, an ordered dict of modules can be passed in. Example:

from collections import OrderedDict
from torch import nn

# Example of using Sequential
model = nn.Sequential(
          nn.Conv2d(1,20,5),
          nn.ReLU(),
          nn.Conv2d(20,64,5),
          nn.ReLU()
        )

# Example of using Sequential with OrderedDict
model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2d(1,20,5)),
          ('relu1', nn.ReLU()),
          ('conv2', nn.Conv2d(20,64,5)),
          ('relu2', nn.ReLU())
        ]))

torch.nn.BatchNorm2d

torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
Applies Batch Normalization over a 4D input (a mini-batch of 2D inputs with an additional channel dimension):
$y = \frac{x - E[x]}{\sqrt{Var[x] + \varepsilon}} \times \gamma + \beta$
The mean and standard deviation are computed per dimension over the mini-batch. $\gamma$ and $\beta$ are learnable parameter vectors of size C (where C is the input size). By default, the elements of $\gamma$ are set to 1 and the elements of $\beta$ to 0. The standard deviation is computed with the biased estimator, equivalent to torch.var(input, unbiased=False).
Also by default, during training this layer keeps running estimates of its computed mean and variance, which are then used for normalization during evaluation. The running estimates are kept with a default momentum of 0.1.
If track_running_stats is set to False, this layer does not keep running estimates, and batch statistics are used during evaluation as well.
Parameters:

  • num_features – C from an expected input of size (N, C, H, W)
  • eps – a value added to the denominator for numerical stability. Default: 1e-5
  • momentum – the value used for the running_mean and running_var computation. Can be set to None for cumulative moving average (i.e. simple average). Default: 0.1
  • affine – a boolean value that when set to True, this module has learnable affine parameters. Default: True
  • track_running_stats – a boolean value that when set to True, this module tracks the running mean and variance, and when set to False, this module does not track such statistics, and initializes statistics buffers running_mean and running_var as None. When these buffers are None, this module always uses batch statistics in both training and eval modes. Default: True
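
The normalization formula can be traced on a handful of numbers. The sketch below handles a single channel as a flat list in plain Python (the helper name is invented here); it uses the biased variance, as the note above describes, and the default gamma=1, beta=0.

```python
import math
from statistics import mean, pvariance


def batch_norm_1ch(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch-normalize the values of one channel, using the biased variance."""
    mu = mean(values)
    var = pvariance(values, mu)  # population variance, i.e. unbiased=False
    return [(x - mu) / math.sqrt(var + eps) * gamma + beta for x in values]


out = batch_norm_1ch([1.0, 2.0, 3.0, 4.0])
print(out)                        # symmetric around 0
print(mean(out), pvariance(out))  # mean about 0, variance just below 1 because of eps
```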

IV. Implementation 3

GitHub: https://github.com/milesial/Pytorch-UNet

1. Model definition

class UNet(nn.Module):
    def __init__(self, n_channels, n_classes, bilinear=True):
        super(UNet, self).__init__()
        self.n_channels = n_channels
        self.n_classes = n_classes
        self.bilinear = bilinear
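        # Arguments: number of input channels, number of classes, and whether to upsample bilinearly

        self.inc = DoubleConv(n_channels, 64)
        self.down1 = Down(64, 128)
        self.down2 = Down(128, 256)
        self.down3 = Down(256, 512)
        factor = 2 if bilinear else 1
        self.down4 = Down(512, 1024 // factor)
        # Four downsampling steps.
        # Each Down block pools first and then applies the double convolution,
        # so the two convolutions at the bottom of the U belong to down4,
        # while the two convolutions before the first max pool are split out as self.inc.

        self.up1 = Up(1024, 512 // factor, bilinear)
        self.up2 = Up(512, 256 // factor, bilinear)
        self.up3 = Up(256, 128 // factor, bilinear)
        self.up4 = Up(128, 64, bilinear)
        # Four upsampling steps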

        self.outc = OutConv(64, n_classes)

    def forward(self, x):
        x1 = self.inc(x)
        x2 = self.down1(x1)
        x3 = self.down2(x2)
        x4 = self.down3(x3)
        x5 = self.down4(x4)
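        # x1..x4 are kept for the skip connections
        x = self.up1(x5, x4)
        x = self.up2(x, x3)
        x = self.up3(x, x2)
        x = self.up4(x, x1)
        # skip connections
        logits = self.outc(x)
        return logits


class DoubleConv(nn.Module):
    """(convolution => [BN] => ReLU) * 2"""
    # A plain double-convolution block with configurable input and output channels;
    # mid_channels optionally sets the channel count between the two convolutions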
    def __init__(self, in_channels, out_channels, mid_channels=None):
        super().__init__()
        if not mid_channels:
            mid_channels = out_channels
            # Default: the second convolution keeps the channel count unchanged

        self.double_conv = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        return self.double_conv(x)


class Down(nn.Module):
    """Downscaling with maxpool then double conv"""
    # Down step: max pool followed by a double convolution
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.maxpool_conv = nn.Sequential(
            nn.MaxPool2d(2),
            DoubleConv(in_channels, out_channels)
        )

    def forward(self, x):
        return self.maxpool_conv(x)


class Up(nn.Module):
    """Upscaling then double conv"""
    # Up step: Upsample or ConvTranspose2d followed by a double convolution
    def __init__(self, in_channels, out_channels, bilinear=True):
        super().__init__()

        # if bilinear, use the normal convolutions to reduce the number of channels
        if bilinear:
            self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
            self.conv = DoubleConv(in_channels, out_channels, in_channels // 2)
        else:
            self.up = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)
            self.conv = DoubleConv(in_channels, out_channels)

    def forward(self, x1, x2):
        x1 = self.up(x1)
        # input is CHW
        diffY = x2.size()[2] - x1.size()[2]
        diffX = x2.size()[3] - x1.size()[3]

        x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,
                        diffY // 2, diffY - diffY // 2])
        # if you have padding issues, see
        # https://github.com/HaiyongJiang/U-Net-Pytorch-Unstructured-Buggy/commit/0e854509c2cea854e247a9c615f175f76fbb2e3a
        # https://github.com/xiaopeng-liao/Pytorch-UNet/commit/8ebac70e633bac59fc22bb5195e513d5832fb3bd
        x = torch.cat([x2, x1], dim=1)
        return self.conv(x)


class OutConv(nn.Module):
    # Final layer: a 1x1 convolution producing the class scores
    def __init__(self, in_channels, out_channels):
        super(OutConv, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.conv(x)
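
The `factor` bookkeeping is easier to follow when the channel counts are traced through the four Up blocks. The sketch below does that with plain integers, reading the numbers off the constructor above; `trace_up_channels` is a hypothetical helper written for this note.

```python
def trace_up_channels(bilinear=True):
    """Return (upsampled, skip, concatenated, output) channels for up1..up4."""
    factor = 2 if bilinear else 1
    skips = [512, 256, 128, 64]                 # channels of x4, x3, x2, x1
    ups = [(1024, 512 // factor), (512, 256 // factor),
           (256, 128 // factor), (128, 64)]     # (in_channels, out_channels) of up1..up4
    x = 1024 // factor                          # channels of x5, the output of down4
    trace = []
    for (in_ch, out_ch), skip in zip(ups, skips):
        # nn.Upsample keeps the channels; ConvTranspose2d maps in_ch -> in_ch // 2
        upsampled = x if bilinear else in_ch // 2
        concat = skip + upsampled
        assert concat == in_ch, "DoubleConv input must match the concatenation"
        trace.append((upsampled, skip, concat, out_ch))
        x = out_ch
    return trace


for row in trace_up_channels(bilinear=True):
    print(row)
```

In the bilinear case each block outputs half as many channels as in the transposed-convolution case, which compensates for nn.Upsample not reducing channels itself; the concatenated input is 1024, 512, 256, 128 in both modes.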

2. Loss definition

Dice:

class DiceCoeff(Function):
    """Dice coeff for individual examples"""

    def forward(self, input, target):
        self.save_for_backward(input, target)
        eps = 0.0001
        # This Dice uses no Laplace smoothing, only a small eps for numerical stability

        self.inter = torch.dot(input.view(-1), target.view(-1))
        self.union = torch.sum(input) + torch.sum(target) + eps

        t = (2 * self.inter.float() + eps) / self.union.float()
        return t

    # This function has only a single output, so it gets only one gradient
    def backward(self, grad_output):

        input, target = self.saved_variables
        grad_input = grad_target = None

        if self.needs_input_grad[0]:
            grad_input = grad_output * 2 * (target * self.union - self.inter) \
                         / (self.union * self.union)
        if self.needs_input_grad[1]:
            grad_target = None

        return grad_input, grad_target


def dice_coeff(input, target):
    """Dice coeff for batches"""
    if input.is_cuda:
        s = torch.FloatTensor(1).cuda().zero_()
    else:
        s = torch.FloatTensor(1).zero_()

    for i, c in enumerate(zip(input, target)):
        s = s + DiceCoeff().forward(c[0], c[1])

    return s / (i + 1)
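
The class above follows the legacy autograd.Function style, with non-static forward and backward methods and a hand-written gradient; recent PyTorch versions expect static methods with a ctx argument instead. As an alternative sketch (not the repository's code), the same coefficient can be written with ordinary tensor operations so that autograd derives the gradient. The eps value mirrors the one above; the function signature and the per-sample flattening are assumptions made for this example.

```python
import torch


def dice_coeff(pred, target, eps=1e-4):
    """Dice coefficient averaged over a batch, built from differentiable tensor ops."""
    pred = pred.reshape(pred.shape[0], -1)        # flatten each sample
    target = target.reshape(target.shape[0], -1)
    inter = (pred * target).sum(dim=1)
    union = pred.sum(dim=1) + target.sum(dim=1) + eps
    return ((2 * inter + eps) / union).mean()


pred = torch.tensor([[1.0, 1.0, 0.0, 0.0]], requires_grad=True)
target = torch.tensor([[1.0, 0.0, 0.0, 0.0]])

score = dice_coeff(pred, target)
score.backward()        # no hand-written backward pass is needed
print(score.item())     # close to 2/3: intersection 1, |X| + |Y| = 3
print(pred.grad)
```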

Note, however, that this model is trained with cross-entropy; Dice is used only during validation.


Summary

One model can be written in all sorts of ways.

Learning materials and references

1. https://blog.csdn.net/qq_44055705/article/details/115733245
2. https://github.com/usuyama/pytorch-unet
3. https://github.com/jvanvugt/pytorch-unet
4. https://github.com/milesial/Pytorch-UNet
5. https://zhuanlan.zhihu.com/p/86704421
6. https://zhuanlan.zhihu.com/p/64551412
7. https://zhuanlan.zhihu.com/p/270344655
8. https://blog.csdn.net/real_ray/article/details/17919289
9. https://www.runoob.com/python3/python3-func-filter.html
10. https://zhuanlan.zhihu.com/p/269592183
