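intro-to-seismic-salt-and-how-to-geophysics: this Notebook introduces the domain background and finds that 39% of the training images have empty masks.
basic-data-visualization-using-pytorch-dataset: uses torch's data utilities to briefly display the images and their corresponding masks.
train-dataset-visualization: overlays each image and its mask in a single picture.
In this competition the organizers also provided the depth associated with each seismic image. Participants discovered that images sharing the same depth were actually tiles cut from one larger image, so they reverse-engineered a way to stitch the tiles back together. However, the organizers removed this leak from the private dataset, so the gains were essentially limited to the public leaderboard. The 4th Place Solution used the leak for post-processing, correcting predictions with hand-written rules: +0.01 on public, but only +0.001 on private.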
The metric for this competition is IoU (Intersection over Union).
explanation-of-scoring-metric explains the metric in detail.
fast-iou-scoring-metric-in-pytorch-and-numpy and fast-iou-metric-in-numpy-and-tensorflow provide fast implementations of IoU.
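To make the metric concrete, here is a minimal NumPy sketch. It assumes the commonly described form of the competition score: for each image, the IoU between the predicted and true mask is compared against thresholds 0.50, 0.55, ..., 0.95, and the image score is the fraction of thresholds it exceeds; an empty prediction on an empty ground truth counts as a perfect score. The function names are illustrative and are not taken from the Notebooks above.

```python
import numpy as np

# Assumed thresholds: 0.50, 0.55, ..., 0.95 (10 values).
THRESHOLDS = np.linspace(0.5, 0.95, 10)

def image_score(pred, target):
    """Score one image: fraction of IoU thresholds that the prediction exceeds."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if not pred.any() and not target.any():
        return 1.0  # both empty: counted as a perfect match
    if not pred.any() or not target.any():
        return 0.0  # exactly one of them is empty: IoU is 0
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = intersection / union
    return float(np.mean(iou > THRESHOLDS))

def mean_score(preds, targets):
    """Average the per-image score over a batch of masks."""
    return float(np.mean([image_score(p, t) for p, t in zip(preds, targets)]))
```

Because 39% of the training masks are empty, the two early-return branches matter a lot: a single false-positive pixel on an empty image drops that image's score from 1 to 0.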
u-net-dropout-augmentation-stratification applies flip augmentation.
simple-tricks: flips can also be used for TTA (test-time augmentation).
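Flip TTA can be sketched as follows: predict on the image and on its left-right mirror, un-flip the second prediction, and average the two. `predict_fn` is a hypothetical callable that maps a 2-D image to a same-sized probability map; it stands in for whatever model is being used.

```python
import numpy as np

def predict_with_flip_tta(predict_fn, image):
    """Average predictions over the original and the horizontally flipped image.

    predict_fn is a hypothetical model wrapper: 2-D array in, 2-D probabilities out.
    """
    plain = predict_fn(image)
    flipped = predict_fn(np.fliplr(image))
    # Flip the second prediction back so both maps are aligned before averaging.
    return 0.5 * (plain + np.fliplr(flipped))
```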
Augmentation that works
蛙神 (Heng) demonstrates the correct way to do random crops.
1st Place Solution
part of 8th place (Private LB) solution
The images were judged to have inconsistent brightness, so each one was normalized with np.clip(img - np.median(img) + 127, 0, 255).
Augmentations used: random gamma, brightness, shift, scale, rotate, horizontal flip, contrast.
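The median-centering step quoted above can be wrapped in a small helper. The expression itself comes from the write-up; the cast to float is an addition made here, on the assumption that the images are loaded as uint8, where the subtraction would otherwise wrap around.

```python
import numpy as np

def center_brightness(img):
    """Shift the image so its median sits at 127, then clip to [0, 255]."""
    # Cast first: with uint8 input, img - median could wrap around silently.
    img = np.asarray(img, dtype=np.float32)
    return np.clip(img - np.median(img) + 127, 0, 255)
```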
11th place solution
14th place solution
First pad to 148×148 using skimage's biharmonic inpainting, then random crop to 128×128.
Augmentations used: left-right flip and random linear color.
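The pad-then-crop pipeline can be sketched as below. Note the substitution: reflect padding from NumPy is used here as a simple stand-in for the biharmonic inpainting that the 14th place solution actually used, so this is not their method. The helper names are illustrative. The key point is that the image and its mask must be cropped with the same offsets.

```python
import numpy as np

def pad_to(img, size=148, mode="reflect"):
    """Pad a 2-D array to size x size (reflect padding as a stand-in for inpainting)."""
    h, w = img.shape
    top = (size - h) // 2
    left = (size - w) // 2
    pads = ((top, size - h - top), (left, size - w - left))
    return np.pad(img, pads, mode=mode)

def random_crop_pair(image, mask, size=128, rng=None):
    """Crop image and mask at the same random location."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    y = int(rng.integers(0, h - size + 1))
    x = int(rng.integers(0, w - size + 1))
    return image[y:y + size, x:x + size], mask[y:y + size, x:x + size]
```

For a 101×101 input, `pad_to` adds 23 pixels on the top and left and 24 on the bottom and right, and each call to `random_crop_pair` then picks one of 21×21 possible 128×128 windows.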
30th: strong baseline
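At that time Kaggle was still largely dominated by Keras. What follows retraces how the shared Notebooks improved step by step, starting from a trained-from-scratch U-Net.
1. intro-to-seismic-salt-and-how-to-geophysics builds a trained-from-scratch U-Net, probably the most common model for semantic segmentation. (0.65535)
The U-Net basically looks like an auto-encoder with shortcuts. A depth feature was then added to the basic U-Net (the first kernel). (0.70118)
2. u-net-dropout-augmentation-stratification adds augmentation and dropout on top of the previous one and searches for the best threshold, which improves the score. (0.74744)
u-net-with-simple-resnet-blocks-forked adds a manual post-processing step: predictions whose mask coverage is low are replaced with an empty mask. (0.80988)
3. u-net-with-simple-resnet-blocks adds ResNet blocks on top of the previous one. (0.81394)
u-net-with-simple-resnet-blocks-v2-new-loss switches the loss to lovasz_hinge and removes the dropout in the last layer, which improves the score. (0.83410)
introduction-to-u-net-with-simple-resnet-blocks changes the activation inside the Lovasz loss from ReLU to ELU. (0.83434)
unet-with-simple-resnet-blocks further tunes the learning rate and increases the number of epochs and the batch size. (0.84898)
4. unet-resnet34-in-keras, pretrained-resnet34-in-keras
Going one step further: pretrained Xception model with ResNet decoder, Pseudo-Labelling, SWA (about 0.87)
unet-resnetblock-hypercolumn-deep-supervision-fold proposes building a binary classifier to predict which masks are empty.
deeplabv3: a DeepLabv3 baseline.
9. goto-pytorch-fix-for-v0-3 and tgs-fastai-resnet34-unet are PyTorch and fastai baselines, although a year and a half ago these two frameworks were not yet mainstream in Notebooks. In this post the author describes in detail how to improve the model step by step.
10. 蛙神 (Heng) suggested adding deep semi-supervised learning to classify whether an image is empty.
This post introduces snapshot ensembling and cyclic LR.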
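One step in the progression above replaces low-coverage predictions with an empty mask. A minimal sketch of that kind of post-processing is shown here; the probability threshold and the `min_pixels` cutoff are hypothetical values chosen for illustration and are not taken from the Notebook.

```python
import numpy as np

def drop_small_masks(prob, threshold=0.5, min_pixels=20):
    """Binarize a probability map and blank it out if too few pixels are on.

    threshold and min_pixels are illustrative defaults, not values from the source.
    """
    mask = np.asarray(prob) > threshold
    if mask.sum() < min_pixels:
        return np.zeros_like(mask)  # treat tiny predictions as "no salt"
    return mask
```

With the competition metric, a stray handful of positive pixels on a truly empty image scores 0 for that image, so blanking out tiny predictions is a cheap way to recover those cases.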
1st Place Solution
b.e.s. model
Input: 101 -> resize to 192 -> pad to 224
Encoder: ResNeXt50 pretrained on ImageNet
Decoder: conv3x3 + BN, Upsampling, scSE
Training overview:
Optimizer: RMSprop. Batch size: 24
Loss: BCE+Dice. Reduce LR on plateau starting from 0.0001
Loss: Lovasz. Reduce LR on plateau starting from 0.00005
Loss: Lovasz. 4 snapshots with cosine annealing LR, 80 epochs each, LR starting from 0.0001
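The scSE (spatial and channel squeeze & excitation) block used in this decoder, and in several solutions below, can be sketched as a forward pass in plain NumPy. This is a sketch under assumptions: the two gated branches are combined by addition (other implementations take the element-wise maximum), and the weight names and shapes are illustrative rather than taken from any team's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def scse_forward(x, w1, b1, w2, b2, ws, bs):
    """Forward pass of an scSE block on one feature map.

    x:  (C, H, W) feature map
    w1: (C // r, C), b1: (C // r,)  channel squeeze
    w2: (C, C // r), b2: (C,)       channel excitation
    ws: (C,), bs: scalar            1x1 conv for the spatial gate
    """
    # Channel SE: global average pool -> FC -> ReLU -> FC -> sigmoid.
    z = x.mean(axis=(1, 2))
    hidden = np.maximum(w1 @ z + b1, 0.0)
    channel_gate = sigmoid(w2 @ hidden + b2)            # shape (C,)
    # Spatial SE: 1x1 convolution across channels -> sigmoid.
    spatial_gate = sigmoid(np.tensordot(ws, x, axes=(0, 0)) + bs)  # shape (H, W)
    # Assumption: combine the two recalibrated maps by addition.
    return x * channel_gate[:, None, None] + x * spatial_gate[None, :, :]
```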
phalanx model
ResNet34 (architecture is similar to resnet_34_pad_128 described below) with input: 101 -> resize to 202 -> pad to 256
High-confidence pseudo labels were then added on top of the baseline, which further improved the score.
Input: 101 -> pad to 128
Encoder: ResNet34 + scSE (conv7x7 -> conv3x3 and remove first max pooling)
Center Block: Feature Pyramid Attention (remove 7x7)
Decoder: conv3x3, transposed convolution, scSE + hyper columns
Loss: Lovasz
Input: 101 -> resize to 128
Encoder: ResNet34 + scSE (remove first max pooling)
Center Block: conv3x3, Global Convolutional Network
Decoder: Global Attention Upsample (implemented like senet -> like scSE, conv3x3 -> GCN) + deep supervision
Loss: BCE for classification and Lovasz for segmentation
Optimizer: SGD. Batch size: 32.
Pretrain on pseudolabels for 150 epochs (50 epochs per cycle with cosine annealing, LR 0.01 -> 0.001)
Finetune on train data. 5 folds, 4 snapshots with cosine annealing LR, 50 epochs each, LR 0.01 -> 0.001
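The cyclic schedule described above (50 epochs per cycle, LR 0.01 -> 0.001, one snapshot per cycle) can be sketched in plain Python. The exact formula is an assumption: this version restarts at the maximum LR at the start of each cycle and reaches the minimum exactly on the cycle's last epoch, where a snapshot would be saved.

```python
import math

def cosine_annealing_lr(epoch, cycle_len=50, lr_max=0.01, lr_min=0.001):
    """Learning rate for a 0-based epoch under cosine annealing with restarts."""
    t = epoch % cycle_len  # position inside the current cycle
    progress = t / (cycle_len - 1)  # 0.0 at cycle start, 1.0 at cycle end
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

def snapshot_epochs(n_cycles, cycle_len=50):
    """Epochs (0-based) at which each cycle ends and a snapshot would be saved."""
    return [(k + 1) * cycle_len - 1 for k in range(n_cycles)]
```

For the fine-tuning stage above, `snapshot_epochs(4)` gives the four end-of-cycle epochs per fold; the saved snapshots are then ensembled at prediction time.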
4th Place Solution
input: 101 random pad to 128*128, random LRflip;
encoder: resnet34, se-resnext50, resnext101_ibna, se-resnet101, se-resnet152, se resnet154;
decoder: scse, hypercolumn (not used in network with resnext101ibna, seresnext101 backbone), ibn block, dropout;
Deep supervision structure with Lovasz softmax (a great idea from Heng);
SGD: momentum -- 0.9, weight decay -- 0.0002, lr -- from 0.01 to 0.001 (changed in each epoch);
LR schedule: cosine annealing with snapshot ensemble (shared by Peter), 50 epochs/cycle, 7 cycles/fold, 10 folds;
Pseudo labels were used as well.
5th place solution, 5th place solution (Zhihu)
Also used cosine annealing LR and the snapshot ensemble method.
part of 8th place (Private LB) solution
Best performing backbones (from top to bottom): SeNet154, SeResNext101, SeResNext50, DPN92
Decoder: U-Net like decoder with ScSe, CBAM and Hypercolumn
train method
Batch size: 8
Adam for 200 epochs.
LR schedule: 0.0001 -- 100 epochs, 0.00001 -- 100 epochs (cycle #1)
CycleLR for 40 epochs more with 10 epoch cycle and rmsprop optimizer (cycles #2, #3, #4, #5).
On each training cycle, one checkpoint was made.
Hard example mining was performed as well.
9th place solution(single model)
11th place solution
First team member's model: UNet-like architecture
Backbone: SE ResNeXt-50, pretrained on ImageNet
Decoder features (inspired by Heng’s helpful posts and discussions):
Spatial and Channel Squeeze Excitation gating
Deep supervision (zero/nonzero mask)
SGD: momentum 0.9, weight decay 0.0001
Batch size: 16
Starting LR determined using procedure similar to LR find from course - 5e-2
LR schedule - cosine annealing from maximum LR, cycle length - 50 epochs, 10 cycles per experiment
Best snapshots according to metric were saved independently for each cycle, final solution uses 2 best cycles per fold
Second team member's model: Modified Unet
Backbone: SE-ResNeXt-50, pretrained on ImageNet
Decoder features:
Dilated convolutions with dilation from 1 to 5
ASP OCModule before last Convolution layer
Deep supervision (zero/nonzero mask, nonzero mask segmentation)
SGD: momentum = 0.9, weight decay = 0.0001
Batch size: 16
Lr schedule. Pretrain for 32 epochs with lr = 0.01. Then SGDR was applied for 4 cycles with cosine annealing: lr from 0.01 to 0.0001. Each cycle lasts for 64 epochs.
14th place solution
SE-ResNeXt50 encoder. Standard decoder blocks enriched with custom-built FPN-style layers.
Loss: Lovasz hinge loss with elu + 1. See details here
Optimizer: SGD with LR 0.01, momentum 0.9, weight_decay 0.0001
Train stages:
EarlyStopping with patience 100; ReduceLROnPlateau with patience=30, factor=0.64, min_lr=1e-8; Lovasz * 0.75 + BCE empty * 0.25.
Cosine annealing learning rate 300 epochs, 50 per cycle; Lovasz * 0.5 + BCE empty * 0.5.
part of the 15th place solution
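The author reports that the gains came mainly from adding a Topology-aware loss, a Spatial and channel squeeze & excitation module, and a Guided upsampling module.
A small U-shape model (~8MB) with: introduces Dice loss, Lovasz Softmax loss, and Focal loss. The author also discussed related questions on Zhihu; see 有关语义分割的奇技淫巧有哪些? ("What are some clever tricks for semantic segmentation?")
9th place solution (single model): modifying Lovasz to be symmetric gave a good boost on the LB (+0.008 on public LB, +0.02 on private LB).
11th place solution: a combination of a classification loss (BCE) and a segmentation loss (BCE + Lovasz Hinge * 0.5).
30th: strong baseline: trains with BCE in the early epochs, then switches to Lovasz.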
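Since the Lovasz hinge loss appears in almost every solution above, here is a forward-only NumPy sketch for a single image, following the usual published formulation: compute hinge errors, sort them in decreasing order, and weight them by the discrete gradient of the Jaccard loss. The `use_elu` flag switches the ReLU for the "elu + 1" variant mentioned earlier. Function names are illustrative, and a real training setup would need a framework with automatic differentiation.

```python
import numpy as np

def lovasz_grad(gt_sorted):
    """Gradient of the Lovasz extension of the Jaccard loss w.r.t. sorted errors."""
    gt_sorted = gt_sorted.astype(np.float64)
    gts = gt_sorted.sum()
    intersection = gts - np.cumsum(gt_sorted)
    union = gts + np.cumsum(1.0 - gt_sorted)
    jaccard = 1.0 - intersection / union
    jaccard[1:] = jaccard[1:] - jaccard[:-1]  # discrete differences
    return jaccard

def lovasz_hinge_flat(logits, labels, use_elu=False):
    """Binary Lovasz hinge loss for one image (logits are raw scores, labels are 0/1)."""
    logits = np.asarray(logits, dtype=np.float64).ravel()
    labels = np.asarray(labels, dtype=np.float64).ravel()
    signs = 2.0 * labels - 1.0
    errors = 1.0 - logits * signs
    order = np.argsort(-errors, kind="stable")  # largest error first
    errors_sorted = errors[order]
    grad = lovasz_grad(labels[order])
    if use_elu:
        # elu(x) + 1: x + 1 for x > 0, exp(x) otherwise.
        act = np.where(errors_sorted > 0, errors_sorted + 1.0,
                       np.exp(np.minimum(errors_sorted, 0.0)))
    else:
        act = np.maximum(errors_sorted, 0.0)
    return float(np.dot(act, grad))
```

With the ReLU version, pixels classified correctly with a margin above 1 contribute nothing; the "elu + 1" version keeps a small positive contribution from them.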
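The predicted masks are submitted in RLE (run-length encoding) format, and a fast implementation of RLE is available. CRF (Conditional Random Fields) may improve the score.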
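A minimal RLE encoder is sketched below. It assumes the usual Kaggle convention for this kind of submission: pixels are numbered from 1, scanning top to bottom and then left to right (column-major order), and the output is a space-separated list of "start length" pairs. Check the competition's data description before relying on this ordering.

```python
import numpy as np

def rle_encode(mask):
    """Encode a binary 2-D mask as 'start length start length ...' (1-indexed, column-major)."""
    pixels = np.asarray(mask).astype(np.uint8).flatten(order="F")
    # Pad with zeros so that runs touching either end are detected.
    padded = np.concatenate([[0], pixels, [0]])
    # Positions where the value changes mark run starts and run ends.
    runs = np.where(padded[1:] != padded[:-1])[0] + 1
    runs[1::2] -= runs[::2]  # turn end positions into run lengths
    return " ".join(str(v) for v in runs)
```

An empty mask encodes to an empty string.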
5th place solution (Zhihu)
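This was the first Kaggle competition in which Notebooks came with GPUs. From this competition on, newcomers to CV competitions no longer needed a professional lab background or a strong teammate to carry them; by following the forum carefully, they could compete on their own and even place well. The first-place winner reportedly encountered image segmentation for the first time in this competition, and many top competitors first made their name here. Read in chronological order, the forum and Notebooks, led by 蛙神 (Heng), progressed from a trained-from-scratch U-Net (0.65) all the way to 0.89+ at the end, with every step of improvement visible along the way, which makes for an interesting read.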