timm is a deep learning library created by Ross Wightman. It is a collection of SOTA computer-vision models, layers, utilities, optimizers, schedulers, data loaders, and augmentations, together with training/validation scripts that can reproduce ImageNet training results.
Source code: https://github.com/rwightman/pytorch-image-models
Quick docs: https://rwightman.github.io/pytorch-image-models/
Detailed docs: https://fastai.github.io/timmdocs/
pip install timm
All development and testing was done in Conda Python 3 environments on Linux x86-64 systems, specifically with Python 3.6, 3.7, 3.8, and 3.9.
PyTorch versions 1.4, 1.5.x, 1.6, 1.7.x, and 1.8 have been tested with this code.
import timm
m = timm.create_model('mobilenetv3_large_100', pretrained=True)
m.eval()
MobileNetV3(
(conv_stem): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(blocks): Sequential(
(0): Sequential(
(0): DepthwiseSeparableConv(
(conv_dw): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=16, bias=False)
(bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(se): Identity()
(conv_pw): Conv2d(16, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Identity()
)
)
(1): Sequential(
(0): InvertedResidual(
(conv_pw): Conv2d(16, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv_dw): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=64, bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): Identity()
(conv_pwl): Conv2d(64, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): InvertedResidual(
(conv_pw): Conv2d(24, 72, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv_dw): Conv2d(72, 72, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=72, bias=False)
(bn2): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): Identity()
(conv_pwl): Conv2d(72, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(2): Sequential(
(0): InvertedResidual(
(conv_pw): Conv2d(24, 72, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv_dw): Conv2d(72, 72, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=72, bias=False)
(bn2): BatchNorm2d(72, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): SqueezeExcite(
(conv_reduce): Conv2d(72, 24, kernel_size=(1, 1), stride=(1, 1))
(act1): ReLU(inplace=True)
(conv_expand): Conv2d(24, 72, kernel_size=(1, 1), stride=(1, 1))
(gate): Hardsigmoid()
)
(conv_pwl): Conv2d(72, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): InvertedResidual(
(conv_pw): Conv2d(40, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv_dw): Conv2d(120, 120, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=120, bias=False)
(bn2): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): SqueezeExcite(
(conv_reduce): Conv2d(120, 32, kernel_size=(1, 1), stride=(1, 1))
(act1): ReLU(inplace=True)
(conv_expand): Conv2d(32, 120, kernel_size=(1, 1), stride=(1, 1))
(gate): Hardsigmoid()
)
(conv_pwl): Conv2d(120, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): InvertedResidual(
(conv_pw): Conv2d(40, 120, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): ReLU(inplace=True)
(conv_dw): Conv2d(120, 120, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=120, bias=False)
(bn2): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): ReLU(inplace=True)
(se): SqueezeExcite(
(conv_reduce): Conv2d(120, 32, kernel_size=(1, 1), stride=(1, 1))
(act1): ReLU(inplace=True)
(conv_expand): Conv2d(32, 120, kernel_size=(1, 1), stride=(1, 1))
(gate): Hardsigmoid()
)
(conv_pwl): Conv2d(120, 40, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(40, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(3): Sequential(
(0): InvertedResidual(
(conv_pw): Conv2d(40, 240, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv_dw): Conv2d(240, 240, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=240, bias=False)
(bn2): BatchNorm2d(240, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): Identity()
(conv_pwl): Conv2d(240, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): InvertedResidual(
(conv_pw): Conv2d(80, 200, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv_dw): Conv2d(200, 200, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=200, bias=False)
(bn2): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): Identity()
(conv_pwl): Conv2d(200, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): InvertedResidual(
(conv_pw): Conv2d(80, 184, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(184, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv_dw): Conv2d(184, 184, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=184, bias=False)
(bn2): BatchNorm2d(184, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): Identity()
(conv_pwl): Conv2d(184, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): InvertedResidual(
(conv_pw): Conv2d(80, 184, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(184, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv_dw): Conv2d(184, 184, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=184, bias=False)
(bn2): BatchNorm2d(184, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): Identity()
(conv_pwl): Conv2d(184, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(4): Sequential(
(0): InvertedResidual(
(conv_pw): Conv2d(80, 480, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv_dw): Conv2d(480, 480, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=480, bias=False)
(bn2): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SqueezeExcite(
(conv_reduce): Conv2d(480, 120, kernel_size=(1, 1), stride=(1, 1))
(act1): ReLU(inplace=True)
(conv_expand): Conv2d(120, 480, kernel_size=(1, 1), stride=(1, 1))
(gate): Hardsigmoid()
)
(conv_pwl): Conv2d(480, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): InvertedResidual(
(conv_pw): Conv2d(112, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv_dw): Conv2d(672, 672, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=672, bias=False)
(bn2): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SqueezeExcite(
(conv_reduce): Conv2d(672, 168, kernel_size=(1, 1), stride=(1, 1))
(act1): ReLU(inplace=True)
(conv_expand): Conv2d(168, 672, kernel_size=(1, 1), stride=(1, 1))
(gate): Hardsigmoid()
)
(conv_pwl): Conv2d(672, 112, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(5): Sequential(
(0): InvertedResidual(
(conv_pw): Conv2d(112, 672, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv_dw): Conv2d(672, 672, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=672, bias=False)
(bn2): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SqueezeExcite(
(conv_reduce): Conv2d(672, 168, kernel_size=(1, 1), stride=(1, 1))
(act1): ReLU(inplace=True)
(conv_expand): Conv2d(168, 672, kernel_size=(1, 1), stride=(1, 1))
(gate): Hardsigmoid()
)
(conv_pwl): Conv2d(672, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): InvertedResidual(
(conv_pw): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv_dw): Conv2d(960, 960, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=960, bias=False)
(bn2): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SqueezeExcite(
(conv_reduce): Conv2d(960, 240, kernel_size=(1, 1), stride=(1, 1))
(act1): ReLU(inplace=True)
(conv_expand): Conv2d(240, 960, kernel_size=(1, 1), stride=(1, 1))
(gate): Hardsigmoid()
)
(conv_pwl): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): InvertedResidual(
(conv_pw): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
(conv_dw): Conv2d(960, 960, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=960, bias=False)
(bn2): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act2): Hardswish()
(se): SqueezeExcite(
(conv_reduce): Conv2d(960, 240, kernel_size=(1, 1), stride=(1, 1))
(act1): ReLU(inplace=True)
(conv_expand): Conv2d(240, 960, kernel_size=(1, 1), stride=(1, 1))
(gate): Hardsigmoid()
)
(conv_pwl): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(6): Sequential(
(0): ConvBnAct(
(conv): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act1): Hardswish()
)
)
)
(global_pool): SelectAdaptivePool2d (pool_type=avg, flatten=Identity())
(conv_head): Conv2d(960, 1280, kernel_size=(1, 1), stride=(1, 1))
(act2): Hardswish()
(flatten): Flatten(start_dim=1, end_dim=-1)
(classifier): Linear(in_features=1280, out_features=1000, bias=True)
)
import timm
from pprint import pprint
model_names = timm.list_models(pretrained=True)
pprint(model_names)
['adv_inception_v3',
'cait_m36_384',
'cait_m48_448',
'cait_s24_224',
'cait_s24_384',
'cait_s36_384',
'cait_xs24_384',
'cait_xxs24_224',
'cait_xxs24_384',
'cait_xxs36_224',
'cait_xxs36_384',
'coat_lite_mini',
'coat_lite_small',
'coat_lite_tiny',
'coat_mini',
'coat_tiny',
'convit_base',
'convit_small',
'convit_tiny',
'cspdarknet53',
'cspresnet50',
'cspresnext50',
'deit_base_distilled_patch16_224',
'deit_base_distilled_patch16_384',
'deit_base_patch16_224',
'deit_base_patch16_384',
'deit_small_distilled_patch16_224',
'deit_small_patch16_224',
'deit_tiny_distilled_patch16_224',
'deit_tiny_patch16_224',
'densenet121',
'densenet161',
'densenet169',
'densenet201',
'densenetblur121d',
'dla34',
'dla46_c',
'dla46x_c',
'dla60',
'dla60_res2net',
'dla60_res2next',
'dla60x',
'dla60x_c',
'dla102',
'dla102x',
'dla102x2',
'dla169',
'dm_nfnet_f0',
'dm_nfnet_f1',
'dm_nfnet_f2',
'dm_nfnet_f3',
'dm_nfnet_f4',
'dm_nfnet_f5',
'dm_nfnet_f6',
'dpn68',
'dpn68b',
'dpn92',
'dpn98',
'dpn107',
'dpn131',
'eca_nfnet_l0',
'eca_nfnet_l1',
'eca_nfnet_l2',
'ecaresnet26t',
'ecaresnet50d',
'ecaresnet50d_pruned',
'ecaresnet50t',
'ecaresnet101d',
'ecaresnet101d_pruned',
'ecaresnet269d',
'ecaresnetlight',
'efficientnet_b0',
'efficientnet_b1',
'efficientnet_b1_pruned',
'efficientnet_b2',
'efficientnet_b2_pruned',
'efficientnet_b3',
'efficientnet_b3_pruned',
'efficientnet_b4',
'efficientnet_el',
'efficientnet_el_pruned',
'efficientnet_em',
'efficientnet_es',
'efficientnet_es_pruned',
'efficientnet_lite0',
'efficientnetv2_rw_m',
'efficientnetv2_rw_s',
'ens_adv_inception_resnet_v2',
'ese_vovnet19b_dw',
'ese_vovnet39b',
'fbnetc_100',
'gernet_l',
'gernet_m',
'gernet_s',
'ghostnet_100',
'gluon_inception_v3',
'gluon_resnet18_v1b',
'gluon_resnet34_v1b',
'gluon_resnet50_v1b',
'gluon_resnet50_v1c',
'gluon_resnet50_v1d',
'gluon_resnet50_v1s',
'gluon_resnet101_v1b',
'gluon_resnet101_v1c',
'gluon_resnet101_v1d',
'gluon_resnet101_v1s',
'gluon_resnet152_v1b',
'gluon_resnet152_v1c',
'gluon_resnet152_v1d',
'gluon_resnet152_v1s',
'gluon_resnext50_32x4d',
'gluon_resnext101_32x4d',
'gluon_resnext101_64x4d',
'gluon_senet154',
'gluon_seresnext50_32x4d',
'gluon_seresnext101_32x4d',
'gluon_seresnext101_64x4d',
'gluon_xception65',
'gmixer_24_224',
'gmlp_s16_224',
'hardcorenas_a',
'hardcorenas_b',
'hardcorenas_c',
'hardcorenas_d',
'hardcorenas_e',
'hardcorenas_f',
'hrnet_w18',
'hrnet_w18_small',
'hrnet_w18_small_v2',
'hrnet_w30',
'hrnet_w32',
'hrnet_w40',
'hrnet_w44',
'hrnet_w48',
'hrnet_w64',
'ig_resnext101_32x8d',
'ig_resnext101_32x16d',
'ig_resnext101_32x32d',
'ig_resnext101_32x48d',
'inception_resnet_v2',
'inception_v3',
'inception_v4',
'legacy_senet154',
'legacy_seresnet18',
'legacy_seresnet34',
'legacy_seresnet50',
'legacy_seresnet101',
'legacy_seresnet152',
'legacy_seresnext26_32x4d',
'legacy_seresnext50_32x4d',
'legacy_seresnext101_32x4d',
'levit_128',
'levit_128s',
'levit_192',
'levit_256',
'levit_384',
'mixer_b16_224',
'mixer_b16_224_in21k',
'mixer_b16_224_miil',
'mixer_b16_224_miil_in21k',
'mixer_l16_224',
'mixer_l16_224_in21k',
'mixnet_l',
'mixnet_m',
'mixnet_s',
'mixnet_xl',
'mnasnet_100',
'mobilenetv2_100',
'mobilenetv2_110d',
'mobilenetv2_120d',
'mobilenetv2_140',
'mobilenetv3_large_100',
'mobilenetv3_large_100_miil',
'mobilenetv3_large_100_miil_in21k',
'mobilenetv3_rw',
'nasnetalarge',
'nf_regnet_b1',
'nf_resnet50',
'nfnet_l0',
'pit_b_224',
'pit_b_distilled_224',
'pit_s_224',
'pit_s_distilled_224',
'pit_ti_224',
'pit_ti_distilled_224',
'pit_xs_224',
'pit_xs_distilled_224',
'pnasnet5large',
'regnetx_002',
'regnetx_004',
'regnetx_006',
'regnetx_008',
'regnetx_016',
'regnetx_032',
'regnetx_040',
'regnetx_064',
'regnetx_080',
'regnetx_120',
'regnetx_160',
'regnetx_320',
'regnety_002',
'regnety_004',
'regnety_006',
'regnety_008',
'regnety_016',
'regnety_032',
'regnety_040',
'regnety_064',
'regnety_080',
'regnety_120',
'regnety_160',
'regnety_320',
'repvgg_a2',
'repvgg_b0',
'repvgg_b1',
'repvgg_b1g4',
'repvgg_b2',
'repvgg_b2g4',
'repvgg_b3',
'repvgg_b3g4',
'res2net50_14w_8s',
'res2net50_26w_4s',
'res2net50_26w_6s',
'res2net50_26w_8s',
'res2net50_48w_2s',
'res2net101_26w_4s',
'res2next50',
'resmlp_12_224',
'resmlp_12_distilled_224',
'resmlp_24_224',
'resmlp_24_distilled_224',
'resmlp_36_224',
'resmlp_36_distilled_224',
'resmlp_big_24_224',
'resmlp_big_24_224_in22ft1k',
'resmlp_big_24_distilled_224',
'resnest14d',
'resnest26d',
'resnest50d',
'resnest50d_1s4x24d',
'resnest50d_4s2x40d',
'resnest101e',
'resnest200e',
'resnest269e',
'resnet18',
'resnet18d',
'resnet26',
'resnet26d',
'resnet34',
'resnet34d',
'resnet50',
'resnet50d',
'resnet51q',
'resnet101d',
'resnet152d',
'resnet200d',
'resnetblur50',
'resnetrs50',
'resnetrs101',
'resnetrs152',
'resnetrs200',
'resnetrs270',
'resnetrs350',
'resnetrs420',
'resnetv2_50x1_bit_distilled',
'resnetv2_50x1_bitm',
'resnetv2_50x1_bitm_in21k',
'resnetv2_50x3_bitm',
'resnetv2_50x3_bitm_in21k',
'resnetv2_101x1_bitm',
'resnetv2_101x1_bitm_in21k',
'resnetv2_101x3_bitm',
'resnetv2_101x3_bitm_in21k',
'resnetv2_152x2_bit_teacher',
'resnetv2_152x2_bit_teacher_384',
'resnetv2_152x2_bitm',
'resnetv2_152x2_bitm_in21k',
'resnetv2_152x4_bitm',
'resnetv2_152x4_bitm_in21k',
'resnext50_32x4d',
'resnext50d_32x4d',
'resnext101_32x8d',
'rexnet_100',
'rexnet_130',
'rexnet_150',
'rexnet_200',
'selecsls42b',
'selecsls60',
'selecsls60b',
'semnasnet_100',
'seresnet50',
'seresnet152d',
'seresnext26d_32x4d',
'seresnext26t_32x4d',
'seresnext50_32x4d',
'skresnet18',
'skresnet34',
'skresnext50_32x4d',
'spnasnet_100',
'ssl_resnet18',
'ssl_resnet50',
'ssl_resnext50_32x4d',
'ssl_resnext101_32x4d',
'ssl_resnext101_32x8d',
'ssl_resnext101_32x16d',
'swin_base_patch4_window7_224',
'swin_base_patch4_window7_224_in22k',
'swin_base_patch4_window12_384',
'swin_base_patch4_window12_384_in22k',
'swin_large_patch4_window7_224',
'swin_large_patch4_window7_224_in22k',
'swin_large_patch4_window12_384',
'swin_large_patch4_window12_384_in22k',
'swin_small_patch4_window7_224',
'swin_tiny_patch4_window7_224',
'swsl_resnet18',
'swsl_resnet50',
'swsl_resnext50_32x4d',
'swsl_resnext101_32x4d',
'swsl_resnext101_32x8d',
'swsl_resnext101_32x16d',
'tf_efficientnet_b0',
'tf_efficientnet_b0_ap',
'tf_efficientnet_b0_ns',
'tf_efficientnet_b1',
'tf_efficientnet_b1_ap',
'tf_efficientnet_b1_ns',
'tf_efficientnet_b2',
'tf_efficientnet_b2_ap',
'tf_efficientnet_b2_ns',
'tf_efficientnet_b3',
'tf_efficientnet_b3_ap',
'tf_efficientnet_b3_ns',
'tf_efficientnet_b4',
'tf_efficientnet_b4_ap',
'tf_efficientnet_b4_ns',
'tf_efficientnet_b5',
'tf_efficientnet_b5_ap',
'tf_efficientnet_b5_ns',
'tf_efficientnet_b6',
'tf_efficientnet_b6_ap',
'tf_efficientnet_b6_ns',
'tf_efficientnet_b7',
'tf_efficientnet_b7_ap',
'tf_efficientnet_b7_ns',
'tf_efficientnet_b8',
'tf_efficientnet_b8_ap',
'tf_efficientnet_cc_b0_4e',
'tf_efficientnet_cc_b0_8e',
'tf_efficientnet_cc_b1_8e',
'tf_efficientnet_el',
'tf_efficientnet_em',
'tf_efficientnet_es',
'tf_efficientnet_l2_ns',
'tf_efficientnet_l2_ns_475',
'tf_efficientnet_lite0',
'tf_efficientnet_lite1',
'tf_efficientnet_lite2',
'tf_efficientnet_lite3',
'tf_efficientnet_lite4',
'tf_efficientnetv2_b0',
'tf_efficientnetv2_b1',
'tf_efficientnetv2_b2',
'tf_efficientnetv2_b3',
'tf_efficientnetv2_l',
'tf_efficientnetv2_l_in21ft1k',
'tf_efficientnetv2_l_in21k',
'tf_efficientnetv2_m',
'tf_efficientnetv2_m_in21ft1k',
'tf_efficientnetv2_m_in21k',
'tf_efficientnetv2_s',
'tf_efficientnetv2_s_in21ft1k',
'tf_efficientnetv2_s_in21k',
'tf_inception_v3',
'tf_mixnet_l',
'tf_mixnet_m',
'tf_mixnet_s',
'tf_mobilenetv3_large_075',
'tf_mobilenetv3_large_100',
'tf_mobilenetv3_large_minimal_100',
'tf_mobilenetv3_small_075',
'tf_mobilenetv3_small_100',
'tf_mobilenetv3_small_minimal_100',
'tnt_s_patch16_224',
'tresnet_l',
'tresnet_l_448',
'tresnet_m',
'tresnet_m_448',
'tresnet_m_miil_in21k',
'tresnet_xl',
'tresnet_xl_448',
'tv_densenet121',
'tv_resnet34',
'tv_resnet50',
'tv_resnet101',
'tv_resnet152',
'tv_resnext50_32x4d',
'twins_pcpvt_base',
'twins_pcpvt_large',
'twins_pcpvt_small',
'twins_svt_base',
'twins_svt_large',
'twins_svt_small',
'vgg11',
'vgg11_bn',
'vgg13',
'vgg13_bn',
'vgg16',
'vgg16_bn',
'vgg19',
'vgg19_bn',
'visformer_small',
'vit_base_patch16_224',
'vit_base_patch16_224_in21k',
'vit_base_patch16_224_miil',
'vit_base_patch16_224_miil_in21k',
'vit_base_patch16_384',
'vit_base_patch32_224',
'vit_base_patch32_224_in21k',
'vit_base_patch32_384',
'vit_base_r50_s16_224_in21k',
'vit_base_r50_s16_384',
'vit_huge_patch14_224_in21k',
'vit_large_patch16_224',
'vit_large_patch16_224_in21k',
'vit_large_patch16_384',
'vit_large_patch32_224_in21k',
'vit_large_patch32_384',
'vit_large_r50_s32_224',
'vit_large_r50_s32_224_in21k',
'vit_large_r50_s32_384',
'vit_small_patch16_224',
'vit_small_patch16_224_in21k',
'vit_small_patch16_384',
'vit_small_patch32_224',
'vit_small_patch32_224_in21k',
'vit_small_patch32_384',
'vit_small_r26_s32_224',
'vit_small_r26_s32_224_in21k',
'vit_small_r26_s32_384',
'vit_tiny_patch16_224',
'vit_tiny_patch16_224_in21k',
'vit_tiny_patch16_384',
'vit_tiny_r_s16_p8_224',
'vit_tiny_r_s16_p8_224_in21k',
'vit_tiny_r_s16_p8_384',
'wide_resnet50_2',
'wide_resnet101_2',
'xception',
'xception41',
'xception65',
'xception71']
model_names = timm.list_models('*resne*t*')
pprint(model_names)
['bat_resnext26ts',
'cspresnet50',
'cspresnet50d',
'cspresnet50w',
'cspresnext50',
'cspresnext50_iabn',
'eca_lambda_resnext26ts',
'ecaresnet26t',
'ecaresnet50d',
'ecaresnet50d_pruned',
'ecaresnet50t',
'ecaresnet101d',
'ecaresnet101d_pruned',
'ecaresnet200d',
'ecaresnet269d',
'ecaresnetlight',
'ecaresnext26t_32x4d',
'ecaresnext50t_32x4d',
'ens_adv_inception_resnet_v2',
'gcresnet50t',
'gcresnext26ts',
'geresnet50t',
'gluon_resnet18_v1b',
'gluon_resnet34_v1b',
'gluon_resnet50_v1b',
'gluon_resnet50_v1c',
'gluon_resnet50_v1d',
'gluon_resnet50_v1s',
'gluon_resnet101_v1b',
'gluon_resnet101_v1c',
'gluon_resnet101_v1d',
'gluon_resnet101_v1s',
'gluon_resnet152_v1b',
'gluon_resnet152_v1c',
'gluon_resnet152_v1d',
'gluon_resnet152_v1s',
'gluon_resnext50_32x4d',
'gluon_resnext101_32x4d',
'gluon_resnext101_64x4d',
'gluon_seresnext50_32x4d',
'gluon_seresnext101_32x4d',
'gluon_seresnext101_64x4d',
'ig_resnext101_32x8d',
'ig_resnext101_32x16d',
'ig_resnext101_32x32d',
'ig_resnext101_32x48d',
'inception_resnet_v2',
'lambda_resnet26t',
'lambda_resnet50t',
'legacy_seresnet18',
'legacy_seresnet34',
'legacy_seresnet50',
'legacy_seresnet101',
'legacy_seresnet152',
'legacy_seresnext26_32x4d',
'legacy_seresnext50_32x4d',
'legacy_seresnext101_32x4d',
'nf_ecaresnet26',
'nf_ecaresnet50',
'nf_ecaresnet101',
'nf_resnet26',
'nf_resnet50',
'nf_resnet101',
'nf_seresnet26',
'nf_seresnet50',
'nf_seresnet101',
'resnest14d',
'resnest26d',
'resnest50d',
'resnest50d_1s4x24d',
'resnest50d_4s2x40d',
'resnest101e',
'resnest200e',
'resnest269e',
'resnet18',
'resnet18d',
'resnet26',
'resnet26d',
'resnet26t',
'resnet34',
'resnet34d',
'resnet50',
'resnet50d',
'resnet50t',
'resnet51q',
'resnet61q',
'resnet101',
'resnet101d',
'resnet152',
'resnet152d',
'resnet200',
'resnet200d',
'resnetblur18',
'resnetblur50',
'resnetrs50',
'resnetrs101',
'resnetrs152',
'resnetrs200',
'resnetrs270',
'resnetrs350',
'resnetrs420',
'resnetv2_50',
'resnetv2_50d',
'resnetv2_50t',
'resnetv2_50x1_bit_distilled',
'resnetv2_50x1_bitm',
'resnetv2_50x1_bitm_in21k',
'resnetv2_50x3_bitm',
'resnetv2_50x3_bitm_in21k',
'resnetv2_101',
'resnetv2_101d',
'resnetv2_101x1_bitm',
'resnetv2_101x1_bitm_in21k',
'resnetv2_101x3_bitm',
'resnetv2_101x3_bitm_in21k',
'resnetv2_152',
'resnetv2_152d',
'resnetv2_152x2_bit_teacher',
'resnetv2_152x2_bit_teacher_384',
'resnetv2_152x2_bitm',
'resnetv2_152x2_bitm_in21k',
'resnetv2_152x4_bitm',
'resnetv2_152x4_bitm_in21k',
'resnext50_32x4d',
'resnext50d_32x4d',
'resnext101_32x4d',
'resnext101_32x8d',
'resnext101_64x4d',
'seresnet18',
'seresnet34',
'seresnet50',
'seresnet50t',
'seresnet101',
'seresnet152',
'seresnet152d',
'seresnet200d',
'seresnet269d',
'seresnext26d_32x4d',
'seresnext26t_32x4d',
'seresnext26tn_32x4d',
'seresnext50_32x4d',
'seresnext101_32x4d',
'seresnext101_32x8d',
'skresnet18',
'skresnet34',
'skresnet50',
'skresnet50d',
'skresnext50_32x4d',
'ssl_resnet18',
'ssl_resnet50',
'ssl_resnext50_32x4d',
'ssl_resnext101_32x4d',
'ssl_resnext101_32x8d',
'ssl_resnext101_32x16d',
'swsl_resnet18',
'swsl_resnet50',
'swsl_resnext50_32x4d',
'swsl_resnext101_32x4d',
'swsl_resnext101_32x8d',
'swsl_resnext101_32x16d',
'tresnet_l',
'tresnet_l_448',
'tresnet_m',
'tresnet_m_448',
'tresnet_m_miil_in21k',
'tresnet_xl',
'tresnet_xl_448',
'tv_resnet34',
'tv_resnet50',
'tv_resnet101',
'tv_resnet152',
'tv_resnext50_32x4d',
'vit_base_resnet26d_224',
'vit_base_resnet50_224_in21k',
'vit_base_resnet50_384',
'vit_base_resnet50d_224',
'vit_small_resnet26d_224',
'vit_small_resnet50d_s16_224',
'wide_resnet50_2',
'wide_resnet101_2']
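The pattern passed to `timm.list_models` above is a shell-style wildcard; timm filters model names with Python's `fnmatch` module, so the matching behavior can be sketched with the standard library alone (the names below are a small illustrative subset):

```python
import fnmatch

names = ['resnet50', 'seresnext50_32x4d', 'vit_base_patch16_224', 'densenet121']

# '*resne*t*' matches any name containing 'resne' followed later by 't'
matches = [n for n in names if fnmatch.fnmatch(n, '*resne*t*')]
print(matches)  # ['resnet50', 'seresnext50_32x4d']
```

This is why the pattern also catches names like `seresnext50_32x4d` that merely contain `resne` somewhere in the middle.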
https://rwightman.github.io/pytorch-image-models/models/ describes the network architectures implemented in timm, with links to their papers and reference code.
https://paperswithcode.com/lib/timm also lists them.
Inception v3 is a convolutional neural network classification architecture in the Inception family. It introduces several improvements, including label smoothing, factorized 7x7 convolutions, and an auxiliary classifier that propagates label information to lower layers of the network (with batch normalization in the auxiliary head). Its key building block is the Inception module.
The weights for this model were ported from Tensorflow/models.
import timm
model = timm.create_model('adv_inception_v3', pretrained=True)
model.eval()
InceptionV3(
(Conv2d_1a_3x3): BasicConv2d(
(conv): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), bias=False)
(bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(Conv2d_2a_3x3): BasicConv2d(
(conv): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), bias=False)
(bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(Conv2d_2b_3x3): BasicConv2d(
(conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(Pool1): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(Conv2d_3b_1x1): BasicConv2d(
(conv): Conv2d(64, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(80, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(Conv2d_4a_3x3): BasicConv2d(
(conv): Conv2d(80, 192, kernel_size=(3, 3), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(Pool2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(Mixed_5b): InceptionA(
(branch1x1): BasicConv2d(
(conv): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch5x5_1): BasicConv2d(
(conv): Conv2d(192, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(48, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch5x5_2): BasicConv2d(
(conv): Conv2d(48, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_1): BasicConv2d(
(conv): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_2): BasicConv2d(
(conv): Conv2d(64, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_3): BasicConv2d(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch_pool): BasicConv2d(
(conv): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(Mixed_5c): InceptionA(
(branch1x1): BasicConv2d(
(conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch5x5_1): BasicConv2d(
(conv): Conv2d(256, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(48, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch5x5_2): BasicConv2d(
(conv): Conv2d(48, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_1): BasicConv2d(
(conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_2): BasicConv2d(
(conv): Conv2d(64, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_3): BasicConv2d(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch_pool): BasicConv2d(
(conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(Mixed_5d): InceptionA(
(branch1x1): BasicConv2d(
(conv): Conv2d(288, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch5x5_1): BasicConv2d(
(conv): Conv2d(288, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(48, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch5x5_2): BasicConv2d(
(conv): Conv2d(48, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_1): BasicConv2d(
(conv): Conv2d(288, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_2): BasicConv2d(
(conv): Conv2d(64, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_3): BasicConv2d(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch_pool): BasicConv2d(
(conv): Conv2d(288, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(Mixed_6a): InceptionB(
(branch3x3): BasicConv2d(
(conv): Conv2d(288, 384, kernel_size=(3, 3), stride=(2, 2), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_1): BasicConv2d(
(conv): Conv2d(288, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_2): BasicConv2d(
(conv): Conv2d(64, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_3): BasicConv2d(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(2, 2), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(Mixed_6b): InceptionC(
(branch1x1): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_1): BasicConv2d(
(conv): Conv2d(768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_2): BasicConv2d(
(conv): Conv2d(128, 128, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_3): BasicConv2d(
(conv): Conv2d(128, 192, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_1): BasicConv2d(
(conv): Conv2d(768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_2): BasicConv2d(
(conv): Conv2d(128, 128, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_3): BasicConv2d(
(conv): Conv2d(128, 128, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_4): BasicConv2d(
(conv): Conv2d(128, 128, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(128, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_5): BasicConv2d(
(conv): Conv2d(128, 192, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch_pool): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(Mixed_6c): InceptionC(
(branch1x1): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_1): BasicConv2d(
(conv): Conv2d(768, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_2): BasicConv2d(
(conv): Conv2d(160, 160, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_3): BasicConv2d(
(conv): Conv2d(160, 192, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_1): BasicConv2d(
(conv): Conv2d(768, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_2): BasicConv2d(
(conv): Conv2d(160, 160, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_3): BasicConv2d(
(conv): Conv2d(160, 160, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_4): BasicConv2d(
(conv): Conv2d(160, 160, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_5): BasicConv2d(
(conv): Conv2d(160, 192, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch_pool): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(Mixed_6d): InceptionC(
(branch1x1): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_1): BasicConv2d(
(conv): Conv2d(768, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_2): BasicConv2d(
(conv): Conv2d(160, 160, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_3): BasicConv2d(
(conv): Conv2d(160, 192, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_1): BasicConv2d(
(conv): Conv2d(768, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_2): BasicConv2d(
(conv): Conv2d(160, 160, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_3): BasicConv2d(
(conv): Conv2d(160, 160, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_4): BasicConv2d(
(conv): Conv2d(160, 160, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(160, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_5): BasicConv2d(
(conv): Conv2d(160, 192, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch_pool): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(Mixed_6e): InceptionC(
(branch1x1): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_1): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_2): BasicConv2d(
(conv): Conv2d(192, 192, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7_3): BasicConv2d(
(conv): Conv2d(192, 192, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_1): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_2): BasicConv2d(
(conv): Conv2d(192, 192, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_3): BasicConv2d(
(conv): Conv2d(192, 192, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_4): BasicConv2d(
(conv): Conv2d(192, 192, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7dbl_5): BasicConv2d(
(conv): Conv2d(192, 192, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch_pool): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(Mixed_7a): InceptionD(
(branch3x3_1): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3_2): BasicConv2d(
(conv): Conv2d(192, 320, kernel_size=(3, 3), stride=(2, 2), bias=False)
(bn): BatchNorm2d(320, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7x3_1): BasicConv2d(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7x3_2): BasicConv2d(
(conv): Conv2d(192, 192, kernel_size=(1, 7), stride=(1, 1), padding=(0, 3), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7x3_3): BasicConv2d(
(conv): Conv2d(192, 192, kernel_size=(7, 1), stride=(1, 1), padding=(3, 0), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch7x7x3_4): BasicConv2d(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(2, 2), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(Mixed_7b): InceptionE(
(branch1x1): BasicConv2d(
(conv): Conv2d(1280, 320, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(320, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3_1): BasicConv2d(
(conv): Conv2d(1280, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3_2a): BasicConv2d(
(conv): Conv2d(384, 384, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3_2b): BasicConv2d(
(conv): Conv2d(384, 384, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_1): BasicConv2d(
(conv): Conv2d(1280, 448, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(448, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_2): BasicConv2d(
(conv): Conv2d(448, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_3a): BasicConv2d(
(conv): Conv2d(384, 384, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_3b): BasicConv2d(
(conv): Conv2d(384, 384, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch_pool): BasicConv2d(
(conv): Conv2d(1280, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(Mixed_7c): InceptionE(
(branch1x1): BasicConv2d(
(conv): Conv2d(2048, 320, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(320, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3_1): BasicConv2d(
(conv): Conv2d(2048, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3_2a): BasicConv2d(
(conv): Conv2d(384, 384, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3_2b): BasicConv2d(
(conv): Conv2d(384, 384, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_1): BasicConv2d(
(conv): Conv2d(2048, 448, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(448, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_2): BasicConv2d(
(conv): Conv2d(448, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_3a): BasicConv2d(
(conv): Conv2d(384, 384, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch3x3dbl_3b): BasicConv2d(
(conv): Conv2d(384, 384, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
(branch_pool): BasicConv2d(
(conv): Conv2d(2048, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
)
)
(global_pool): SelectAdaptivePool2d (pool_type=avg, flatten=Flatten(start_dim=1, end_dim=-1))
(fc): Linear(in_features=2048, out_features=1000, bias=True)
)
import urllib.request
from PIL import Image
import timm
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
model = timm.create_model('mobilenetv3_large_100', pretrained=True)
model.eval()
config = resolve_data_config({}, model=model)
transform = create_transform(**config)
url, filename = ("https://www.zhifure.com/upload/images/2018/7/2314416884.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
img = Image.open(filename).convert('RGB')
tensor = transform(img).unsqueeze(0)  # transform and add batch dimension
import torch
with torch.no_grad():
out = model(tensor)
probabilities = torch.nn.functional.softmax(out[0], dim=0)
print(probabilities.shape)
# torch.Size([1000])
print(probabilities.max())
# tensor(0.5639)
# Get imagenet class mappings
url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt", "imagenet_classes.txt")
urllib.request.urlretrieve(url, filename)
with open("imagenet_classes.txt", "r") as f:
categories = [s.strip() for s in f.readlines()]
# Print top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
print(categories[top5_catid[i]], top5_prob[i].item())
# prints class names and probabilities; for the corgi image downloaded above:
Pembroke 0.5638615489006042
Cardigan 0.29800280928611755
basenji 0.0031976643949747086
dingo 0.002436365932226181
red fox 0.0011594058014452457
CSV files containing validation results for ImageNet-1K and out-of-distribution (OOD) test sets are located in the repository's results folder.
Training, validation, inference, and checkpoint-cleaning scripts are included in the GitHub root folder. The scripts are not currently packaged in the pip release.
The training and validation scripts evolved from early versions of the PyTorch ImageNet examples. Significant functionality has been added over time, including CUDA-specific performance enhancements based on NVIDIA's APEX examples.
There are many training arguments, and not every combination (or even every individual argument) has been fully tested. For the training dataset folder, specify the base folder that contains both the train and validation folders.
To train an SE-ResNet34 on ImageNet, locally distributed, 4 GPUs, one process per GPU w/ cosine schedule, random-erasing prob of 50% and per-pixel random value:
./distributed_train.sh 4 /data/imagenet --model seresnet34 --sched cosine --epochs 150 --warmup-epochs 5 --lr 0.4 --reprob 0.5 --remode pixel --batch-size 256 --amp -j 4
NOTE: It is recommended to use PyTorch 1.7+ w/ PyTorch native AMP and DDP instead of APEX AMP. --amp defaults to native AMP as of timm ver 0.4.3. --apex-amp will force use of APEX components if they are installed.
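The native-AMP path that --amp selects corresponds roughly to the following pattern in a hand-written training loop. This is a minimal sketch, not the actual script's code: the tiny linear model and random data are stand-ins, and autocast/GradScaler are simply disabled when no GPU is present.

```python
import torch

model = torch.nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
use_cuda = torch.cuda.is_available()
# GradScaler and autocast become no-ops when enabled=False, so this also runs on CPU
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
for _ in range(2):
    opt.zero_grad()
    with torch.cuda.amp.autocast(enabled=use_cuda):
        loss = torch.nn.functional.cross_entropy(model(x), y)
    scaler.scale(loss).backward()   # backward on the (possibly scaled) loss
    scaler.step(opt)                # unscales gradients, then optimizer step
    scaler.update()                 # adjusts the loss scale for the next step
print(bool(torch.isfinite(loss)))  # True
```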
The validation and inference scripts are similar in usage. One outputs metrics on a validation set, the other outputs top-k class IDs to a CSV. Specify the folder containing the validation images, not the base folder as with the training script.
To validate a model with pretrained weights (if they exist):
python validate.py /imagenet/validation/ --model seresnext26_32x4d --pretrained
To run inference from a training checkpoint:
python inference.py /imagenet/validation/ --model mobilenetv3_large_100 --checkpoint ./output/train/model_best.pth.tar
Example training hyperparameters: https://rwightman.github.io/pytorch-image-models/training_hparam_examples/
All models in timm expose a consistent mechanism for obtaining various types of features for tasks beyond classification.
Features from the penultimate model layer can be obtained in several ways without model surgery (although you are free to perform surgery). You must first decide whether you want pooled or unpooled features.
Without modifying the network, you can call model.forward_features(input) on any model instead of the usual model(input). This bypasses the classifier head and the network's global pooling.
import torch
import timm
m = timm.create_model('xception41', pretrained=True)
o = m(torch.randn(2, 3, 299, 299))
print(f'Original shape: {o.shape}')
o = m.forward_features(torch.randn(2, 3, 299, 299))
print(f'Unpooled shape: {o.shape}')
Original shape: torch.Size([2, 1000])
Unpooled shape: torch.Size([2, 2048, 10, 10])
import torch
import timm
m = timm.create_model('resnet50', pretrained=True, num_classes=0, global_pool='')
o = m(torch.randn(2, 3, 224, 224))
print(f'Unpooled shape: {o.shape}')
Unpooled shape: torch.Size([2, 2048, 7, 7])
import torch
import timm
m = timm.create_model('densenet121', pretrained=True)
o = m(torch.randn(2, 3, 224, 224))
print(f'Original shape: {o.shape}')
m.reset_classifier(0, '')
o = m(torch.randn(2, 3, 224, 224))
print(f'Unpooled shape: {o.shape}')
Original shape: torch.Size([2, 1000])
Unpooled shape: torch.Size([2, 1024, 7, 7])
To obtain pooled features, you can either call forward_features() and pool/flatten the result yourself, or modify the network as above (num_classes=0) while leaving the pooling intact.
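The first option, pooling the unpooled features yourself, can be sketched with plain PyTorch ops. The feature map here is a dummy tensor shaped like a forward_features() output, not a real model result:

```python
import torch
import torch.nn.functional as F

# Dummy unpooled feature map, shaped like forward_features() output: (N, C, H, W)
feats = torch.randn(2, 2048, 7, 7)

# Global average pool to (N, C, 1, 1), then flatten to (N, C)
pooled = F.adaptive_avg_pool2d(feats, 1).flatten(1)
print(pooled.shape)  # torch.Size([2, 2048])
```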
import torch
import timm
m = timm.create_model('resnet50', pretrained=True, num_classes=0)
o = m(torch.randn(2, 3, 224, 224))
print(f'Pooled shape: {o.shape}')
Pooled shape: torch.Size([2, 2048])
import torch
import timm
m = timm.create_model('ese_vovnet19b_dw', pretrained=True)
o = m(torch.randn(2, 3, 224, 224))
print(f'Original shape: {o.shape}')
m.reset_classifier(0)
o = m(torch.randn(2, 3, 224, 224))
print(f'Pooled shape: {o.shape}')
Original shape: torch.Size([2, 1000])
Pooled shape: torch.Size([2, 1024])
Object detection, segmentation, keypoint estimation, and various other dense-pixel tasks require feature maps at multiple scales from a backbone network. This is usually done by modifying an original classification network. Since every network differs considerably in structure, it is not uncommon for any given detection or segmentation library to support only a handful of backbones.
timm provides a consistent interface for creating any of its models as a feature backbone that outputs feature maps at selected levels.
A feature backbone can be created by adding the argument features_only=True to any create_model call. By default most models output 5 strides (not all models have that many), with the first feature map at stride 2 (some start at 1 or 4).
import torch
import timm
m = timm.create_model('resnest26d', features_only=True, pretrained=True)
o = m(torch.randn(2, 3, 224, 224))
for x in o:
print(x.shape)
Once a feature backbone has been created, it can be queried to provide channel or reduction (downsampling) information to a downstream head, without static configs or hardcoded constants. The .feature_info attribute is a class that encapsulates information about the feature extraction points.
import torch
import timm
m = timm.create_model('regnety_032', features_only=True, pretrained=True)
print(f'Feature channels: {m.feature_info.channels()}')
o = m(torch.randn(2, 3, 224, 224))
for x in o:
print(x.shape)
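This is what makes feature_info useful downstream: a head can size its lateral convolutions from feature_info.channels() instead of hardcoding them. The channel list below is a hypothetical stand-in for what m.feature_info.channels() returns for some backbone:

```python
import torch

# Hypothetical per-level channel counts, standing in for m.feature_info.channels()
channels = [32, 72, 216, 576, 1512]

# One 1x1 lateral conv per feature level, projecting each to a common width (256)
laterals = torch.nn.ModuleList(
    torch.nn.Conv2d(c, 256, kernel_size=1) for c in channels
)
print(len(laterals))  # 5
```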
import torch
import timm
m = timm.create_model('ecaresnet101d', features_only=True, output_stride=8, out_indices=(2, 4), pretrained=True)
print(f'Feature channels: {m.feature_info.channels()}')
print(f'Feature reduction: {m.feature_info.reduction()}')
o = m(torch.randn(2, 3, 320, 320))
for x in o:
print(x.shape)