文章目录
- 概述
- 一、利用torchstat
-
- 二、利用ptflops
-
- 三、利用thop
-
概述
Params
:是指网络模型中需要训练的参数总数,理解为参数量。
FLOPs
:是指浮点运算次数,s表示复数,理解为计算量,用于衡量模型的复杂度。(注意与FLOPS区别,FLOPS是每秒浮点运算次数,用来衡量硬件的性能。)
一、利用torchstat
1.1 方法
pip install torchstat
1.2 代码
from torchstat import stat
import torchvision.models as models
net = models.resnet18()
stat(net, (3, 224, 224))
1.3 输出
[MAdd]: AdaptiveAvgPool2d is not supported!
[Flops]: AdaptiveAvgPool2d is not supported!
[Memory]: AdaptiveAvgPool2d is not supported!
module name input shape output shape params memory(MB) MAdd Flops MemRead(B) MemWrite(B) duration[%] MemR+W(B)
0 conv1 3 224 224 64 112 112 9408.0 3.06 235,225,088.0 118,013,952.0 639744.0 3211264.0 6.88% 3851008.0
1 bn1 64 112 112 64 112 112 128.0 3.06 3,211,264.0 1,605,632.0 3211776.0 3211264.0 1.24% 6423040.0
2 relu 64 112 112 64 112 112 0.0 3.06 802,816.0 802,816.0 3211264.0 3211264.0 2.48% 6422528.0
3 maxpool 64 112 112 64 56 56 0.0 0.77 1,605,632.0 802,816.0 3211264.0 802816.0 14.10% 4014080.0
4 layer1.0.conv1 64 56 56 64 56 56 36864.0 0.77 231,010,304.0 115,605,504.0 950272.0 802816.0 4.95% 1753088.0
5 layer1.0.bn1 64 56 56 64 56 56 128.0 0.77 802,816.0 401,408.0 803328.0 802816.0 0.00% 1606144.0
6 layer1.0.relu 64 56 56 64 56 56 0.0 0.77 200,704.0 200,704.0 802816.0 802816.0 1.24% 1605632.0
7 layer1.0.conv2 64 56 56 64 56 56 36864.0 0.77 231,010,304.0 115,605,504.0 950272.0 802816.0 4.95% 1753088.0
8 layer1.0.bn2 64 56 56 64 56 56 128.0 0.77 802,816.0 401,408.0 803328.0 802816.0 0.00% 1606144.0
9 layer1.1.conv1 64 56 56 64 56 56 36864.0 0.77 231,010,304.0 115,605,504.0 950272.0 802816.0 4.40% 1753088.0
10 layer1.1.bn1 64 56 56 64 56 56 128.0 0.77 802,816.0 401,408.0 803328.0 802816.0 1.24% 1606144.0
11 layer1.1.relu 64 56 56 64 56 56 0.0 0.77 200,704.0 200,704.0 802816.0 802816.0 0.00% 1605632.0
12 layer1.1.conv2 64 56 56 64 56 56 36864.0 0.77 231,010,304.0 115,605,504.0 950272.0 802816.0 4.95% 1753088.0
13 layer1.1.bn2 64 56 56 64 56 56 128.0 0.77 802,816.0 401,408.0 803328.0 802816.0 1.24% 1606144.0
14 layer2.0.conv1 64 56 56 128 28 28 73728.0 0.38 115,505,152.0 57,802,752.0 1097728.0 401408.0 2.47% 1499136.0
15 layer2.0.bn1 128 28 28 128 28 28 256.0 0.38 401,408.0 200,704.0 402432.0 401408.0 1.24% 803840.0
16 layer2.0.relu 128 28 28 128 28 28 0.0 0.38 100,352.0 100,352.0 401408.0 401408.0 1.29% 802816.0
17 layer2.0.conv2 128 28 28 128 28 28 147456.0 0.38 231,110,656.0 115,605,504.0 991232.0 401408.0 3.71% 1392640.0
18 layer2.0.bn2 128 28 28 128 28 28 256.0 0.38 401,408.0 200,704.0 402432.0 401408.0 0.00% 803840.0
19 layer2.0.downsample.0 64 56 56 128 28 28 8192.0 0.38 12,744,704.0 6,422,528.0 835584.0 401408.0 0.00% 1236992.0
20 layer2.0.downsample.1 128 28 28 128 28 28 256.0 0.38 401,408.0 200,704.0 402432.0 401408.0 0.00% 803840.0
21 layer2.1.conv1 128 28 28 128 28 28 147456.0 0.38 231,110,656.0 115,605,504.0 991232.0 401408.0 4.19% 1392640.0
22 layer2.1.bn1 128 28 28 128 28 28 256.0 0.38 401,408.0 200,704.0 402432.0 401408.0 0.00% 803840.0
23 layer2.1.relu 128 28 28 128 28 28 0.0 0.38 100,352.0 100,352.0 401408.0 401408.0 0.00% 802816.0
24 layer2.1.conv2 128 28 28 128 28 28 147456.0 0.38 231,110,656.0 115,605,504.0 991232.0 401408.0 3.71% 1392640.0
25 layer2.1.bn2 128 28 28 128 28 28 256.0 0.38 401,408.0 200,704.0 402432.0 401408.0 0.00% 803840.0
26 layer3.0.conv1 128 28 28 256 14 14 294912.0 0.19 115,555,328.0 57,802,752.0 1581056.0 200704.0 1.24% 1781760.0
27 layer3.0.bn1 256 14 14 256 14 14 512.0 0.19 200,704.0 100,352.0 202752.0 200704.0 1.24% 403456.0
28 layer3.0.relu 256 14 14 256 14 14 0.0 0.19 50,176.0 50,176.0 200704.0 200704.0 0.00% 401408.0
29 layer3.0.conv2 256 14 14 256 14 14 589824.0 0.19 231,160,832.0 115,605,504.0 2560000.0 200704.0 3.71% 2760704.0
30 layer3.0.bn2 256 14 14 256 14 14 512.0 0.19 200,704.0 100,352.0 202752.0 200704.0 0.00% 403456.0
31 layer3.0.downsample.0 128 28 28 256 14 14 32768.0 0.19 12,794,880.0 6,422,528.0 532480.0 200704.0 1.24% 733184.0
32 layer3.0.downsample.1 256 14 14 256 14 14 512.0 0.19 200,704.0 100,352.0 202752.0 200704.0 0.00% 403456.0
33 layer3.1.conv1 256 14 14 256 14 14 589824.0 0.19 231,160,832.0 115,605,504.0 2560000.0 200704.0 4.39% 2760704.0
34 layer3.1.bn1 256 14 14 256 14 14 512.0 0.19 200,704.0 100,352.0 202752.0 200704.0 0.00% 403456.0
35 layer3.1.relu 256 14 14 256 14 14 0.0 0.19 50,176.0 50,176.0 200704.0 200704.0 0.00% 401408.0
36 layer3.1.conv2 256 14 14 256 14 14 589824.0 0.19 231,160,832.0 115,605,504.0 2560000.0 200704.0 3.71% 2760704.0
37 layer3.1.bn2 256 14 14 256 14 14 512.0 0.19 200,704.0 100,352.0 202752.0 200704.0 0.00% 403456.0
38 layer4.0.conv1 256 14 14 512 7 7 1179648.0 0.10 115,580,416.0 57,802,752.0 4919296.0 100352.0 2.48% 5019648.0
39 layer4.0.bn1 512 7 7 512 7 7 1024.0 0.10 100,352.0 50,176.0 104448.0 100352.0 0.00% 204800.0
40 layer4.0.relu 512 7 7 512 7 7 0.0 0.10 25,088.0 25,088.0 100352.0 100352.0 0.00% 200704.0
41 layer4.0.conv2 512 7 7 512 7 7 2359296.0 0.10 231,185,920.0 115,605,504.0 9537536.0 100352.0 4.95% 9637888.0
42 layer4.0.bn2 512 7 7 512 7 7 1024.0 0.10 100,352.0 50,176.0 104448.0 100352.0 0.00% 204800.0
43 layer4.0.downsample.0 256 14 14 512 7 7 131072.0 0.10 12,819,968.0 6,422,528.0 724992.0 100352.0 1.24% 825344.0
44 layer4.0.downsample.1 512 7 7 512 7 7 1024.0 0.10 100,352.0 50,176.0 104448.0 100352.0 0.00% 204800.0
45 layer4.1.conv1 512 7 7 512 7 7 2359296.0 0.10 231,185,920.0 115,605,504.0 9537536.0 100352.0 5.36% 9637888.0
46 layer4.1.bn1 512 7 7 512 7 7 1024.0 0.10 100,352.0 50,176.0 104448.0 100352.0 0.00% 204800.0
47 layer4.1.relu 512 7 7 512 7 7 0.0 0.10 25,088.0 25,088.0 100352.0 100352.0 0.00% 200704.0
48 layer4.1.conv2 512 7 7 512 7 7 2359296.0 0.10 231,185,920.0 115,605,504.0 9537536.0 100352.0 4.95% 9637888.0
49 layer4.1.bn2 512 7 7 512 7 7 1024.0 0.10 100,352.0 50,176.0 104448.0 100352.0 0.00% 204800.0
50 avgpool 512 7 7 512 1 1 0.0 0.00 0.0 0.0 0.0 0.0 1.24% 0.0
51 fc 512 1000 513000.0 0.00 1,023,000.0 512,000.0 2054048.0 4000.0 0.00% 2058048.0
total 11689512.0 25.65 3,638,757,912.0 1,821,399,040.0 2054048.0 4000.0 100.00% 101756992.0
=================================================================================================================================================================
Total params: 11,689,512
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Total memory: 25.65MB
Total MAdd: 3.64GMAdd
Total Flops: 1.82GFlops
Total MemR+W: 97.04MB
Process finished with exit code 0
二、利用ptflops
2.1 方法
pip install ptflops
2.2 代码
import torchvision.models as models
import torch
from ptflops import get_model_complexity_info
with torch.cuda.device(0):
net = models.resnet18()
flops, params = get_model_complexity_info(net, (3, 224, 224), as_strings=True, print_per_layer_stat=True, verbose=True)
print('Flops: ', flops)
print('Params: ', params)
2.3 输出
Warning: module BasicBlock is treated as a zero-op.
Warning: module ResNet is treated as a zero-op.
ResNet(
11.69 M, 100.000% Params, 1.822 GMac, 100.000% MACs,
(conv1): Conv2d(0.009 M, 0.080% Params, 0.118 GMac, 6.477% MACs, 3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(bn1): BatchNorm2d(0.0 M, 0.001% Params, 0.002 GMac, 0.088% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(0.0 M, 0.000% Params, 0.001 GMac, 0.044% MACs, inplace)
(maxpool): MaxPool2d(0.0 M, 0.000% Params, 0.001 GMac, 0.044% MACs, kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(layer1): Sequential(
0.148 M, 1.266% Params, 0.465 GMac, 25.510% MACs,
(0): BasicBlock(
0.074 M, 0.633% Params, 0.232 GMac, 12.755% MACs,
(conv1): Conv2d(0.037 M, 0.315% Params, 0.116 GMac, 6.344% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(0.0 M, 0.001% Params, 0.0 GMac, 0.022% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(0.0 M, 0.000% Params, 0.0 GMac, 0.022% MACs, inplace)
(conv2): Conv2d(0.037 M, 0.315% Params, 0.116 GMac, 6.344% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(0.0 M, 0.001% Params, 0.0 GMac, 0.022% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): BasicBlock(
0.074 M, 0.633% Params, 0.232 GMac, 12.755% MACs,
(conv1): Conv2d(0.037 M, 0.315% Params, 0.116 GMac, 6.344% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(0.0 M, 0.001% Params, 0.0 GMac, 0.022% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(0.0 M, 0.000% Params, 0.0 GMac, 0.022% MACs, inplace)
(conv2): Conv2d(0.037 M, 0.315% Params, 0.116 GMac, 6.344% MACs, 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(0.0 M, 0.001% Params, 0.0 GMac, 0.022% MACs, 64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(layer2): Sequential(
0.526 M, 4.496% Params, 0.412 GMac, 22.635% MACs,
(0): BasicBlock(
0.23 M, 1.969% Params, 0.181 GMac, 9.913% MACs,
(conv1): Conv2d(0.074 M, 0.631% Params, 0.058 GMac, 3.172% MACs, 64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(0.0 M, 0.002% Params, 0.0 GMac, 0.011% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(0.0 M, 0.000% Params, 0.0 GMac, 0.011% MACs, inplace)
(conv2): Conv2d(0.147 M, 1.261% Params, 0.116 GMac, 6.344% MACs, 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(0.0 M, 0.002% Params, 0.0 GMac, 0.011% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
0.008 M, 0.072% Params, 0.007 GMac, 0.363% MACs,
(0): Conv2d(0.008 M, 0.070% Params, 0.006 GMac, 0.352% MACs, 64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(0.0 M, 0.002% Params, 0.0 GMac, 0.011% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
0.295 M, 2.527% Params, 0.232 GMac, 12.722% MACs,
(conv1): Conv2d(0.147 M, 1.261% Params, 0.116 GMac, 6.344% MACs, 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(0.0 M, 0.002% Params, 0.0 GMac, 0.011% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(0.0 M, 0.000% Params, 0.0 GMac, 0.011% MACs, inplace)
(conv2): Conv2d(0.147 M, 1.261% Params, 0.116 GMac, 6.344% MACs, 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(0.0 M, 0.002% Params, 0.0 GMac, 0.011% MACs, 128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(layer3): Sequential(
2.1 M, 17.962% Params, 0.412 GMac, 22.596% MACs,
(0): BasicBlock(
0.919 M, 7.862% Params, 0.18 GMac, 9.891% MACs,
(conv1): Conv2d(0.295 M, 2.523% Params, 0.058 GMac, 3.172% MACs, 128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(0.001 M, 0.004% Params, 0.0 GMac, 0.006% MACs, 256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(0.0 M, 0.000% Params, 0.0 GMac, 0.006% MACs, inplace)
(conv2): Conv2d(0.59 M, 5.046% Params, 0.116 GMac, 6.344% MACs, 256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(0.001 M, 0.004% Params, 0.0 GMac, 0.006% MACs, 256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
0.033 M, 0.285% Params, 0.007 GMac, 0.358% MACs,
(0): Conv2d(0.033 M, 0.280% Params, 0.006 GMac, 0.352% MACs, 128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(0.001 M, 0.004% Params, 0.0 GMac, 0.006% MACs, 256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
1.181 M, 10.100% Params, 0.232 GMac, 12.705% MACs,
(conv1): Conv2d(0.59 M, 5.046% Params, 0.116 GMac, 6.344% MACs, 256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(0.001 M, 0.004% Params, 0.0 GMac, 0.006% MACs, 256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(0.0 M, 0.000% Params, 0.0 GMac, 0.006% MACs, inplace)
(conv2): Conv2d(0.59 M, 5.046% Params, 0.116 GMac, 6.344% MACs, 256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(0.001 M, 0.004% Params, 0.0 GMac, 0.006% MACs, 256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(layer4): Sequential(
8.394 M, 71.806% Params, 0.411 GMac, 22.577% MACs,
(0): BasicBlock(
3.673 M, 31.422% Params, 0.18 GMac, 9.880% MACs,
(conv1): Conv2d(1.18 M, 10.092% Params, 0.058 GMac, 3.172% MACs, 256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(0.001 M, 0.009% Params, 0.0 GMac, 0.003% MACs, 512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(0.0 M, 0.000% Params, 0.0 GMac, 0.003% MACs, inplace)
(conv2): Conv2d(2.359 M, 20.183% Params, 0.116 GMac, 6.344% MACs, 512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(0.001 M, 0.009% Params, 0.0 GMac, 0.003% MACs, 512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
0.132 M, 1.130% Params, 0.006 GMac, 0.355% MACs,
(0): Conv2d(0.131 M, 1.121% Params, 0.006 GMac, 0.352% MACs, 256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(0.001 M, 0.009% Params, 0.0 GMac, 0.003% MACs, 512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
4.721 M, 40.384% Params, 0.231 GMac, 12.697% MACs,
(conv1): Conv2d(2.359 M, 20.183% Params, 0.116 GMac, 6.344% MACs, 512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(0.001 M, 0.009% Params, 0.0 GMac, 0.003% MACs, 512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(0.0 M, 0.000% Params, 0.0 GMac, 0.003% MACs, inplace)
(conv2): Conv2d(2.359 M, 20.183% Params, 0.116 GMac, 6.344% MACs, 512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(0.001 M, 0.009% Params, 0.0 GMac, 0.003% MACs, 512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(avgpool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.0 GMac, 0.001% MACs, output_size=(1, 1))
(fc): Linear(0.513 M, 4.389% Params, 0.001 GMac, 0.028% MACs, in_features=512, out_features=1000, bias=True)
)
Flops: 1.82 GMac
Params: 11.69 M
Process finished with exit code 0
三、利用thop
3.1 方法
pip install thop
3.2 代码
import torchvision.models as models
import torch
from thop import profile
net = models.resnet18()
inputs = torch.randn(1, 3, 224, 224)
flops, params = profile(net, (inputs,))
print('flops: ', flops)
print('params: ', params)
3.3 输出
[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[INFO] Register count_bn() for <class 'torch.nn.modules.batchnorm.BatchNorm2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.activation.ReLU'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.pooling.MaxPool2d'>.
[WARN] Cannot find rule for <class 'torchvision.models.resnet.BasicBlock'>. Treat it as zero Macs and zero Params.
[WARN] Cannot find rule for <class 'torch.nn.modules.container.Sequential'>. Treat it as zero Macs and zero Params.
[INFO] Register count_adap_avgpool() for <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>.
[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.
[WARN] Cannot find rule for <class 'torchvision.models.resnet.ResNet'>. Treat it as zero Macs and zero Params.
flops: 1819066368.0
params: 11689512.0
Process finished with exit code 0