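If you already understand the theory, jump straight to the ResNet50 implementation: Convolutional Neural Networks, Week 3 Assignment: Residual Networks - v1
Suppose the mapping implied by a few stacked layers of the network is $H(X)$, where $X$ is the input to the first of those layers. If several nonlinear layers can approximate a sufficiently complex hypothesis, then they can equally approximate the residual function $F(X) = H(X) - X$, and the original mapping can be written as $F(X) + X$. Both $H(X)$ and $F(X)$ are, in essence, approximations of the same hypothesis.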
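The relation $H(X) = F(X) + X$ can be illustrated with a minimal pure-Python sketch. The names `residual_block`, `zero_residual` and `small_residual` are hypothetical and stand in for real layers; the point is only that when the residual $F$ outputs zeros, the block reduces to the identity.

```python
def residual_block(x, residual_fn):
    """Return H(x) = F(x) + x element-wise, where F is residual_fn."""
    fx = residual_fn(x)
    return [f + xi for f, xi in zip(fx, x)]


def zero_residual(x):
    """A residual function that has learned nothing: F(x) = 0."""
    return [0.0 for _ in x]


def small_residual(x):
    """A toy residual function: F(x) = 0.1 * x."""
    return [0.1 * xi for xi in x]


x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, zero_residual)    # H(x) equals x
perturbed_out = residual_block(x, small_residual)  # H(x) = 1.1 * x
print(identity_out, perturbed_out)
```

With a zero residual the output equals the input, which is why learning "do nothing" is cheap for such a block.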
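Based on this idea, ResNet uses two kinds of mapping: the identity mapping and the residual mapping. The identity mapping is the path in the figure above that skips two weight layers and feeds $X$ directly to the ReLU two layers later; the residual mapping is the original part of the plain network. It is called an identity mapping because the input skips the weight layers and undergoes no computation, i.e. $G(X) = X$.
Andrew Ng's explanation in the lecture video is that we take the output $X$ of some layer, skip a few consecutive layers, and feed it in just before the ReLU of a later layer. This makes the identity function easy for a residual block to learn: if the skipped weights shrink to zero, the block reduces to a single ReLU applied to $X$.
This requires the dimensions of $X$ to match the dimensions at that ReLU. The figure above shows how to handle a mismatch with $W_s \times a^{[l]}$, where $W_s$ can be either a fixed matrix that pads with zeros or a parameter matrix to be learned.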
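As a rough illustration of the $W_s \times a^{[l]}$ fix, the sketch below uses the zero-padding variant of $W_s$ to lift a 2-dimensional shortcut to 4 dimensions before the addition. The helper names and the toy numbers are assumptions made for this example, not values from the course.

```python
def matvec(W, a):
    """Multiply matrix W (list of rows) by vector a."""
    return [sum(w * x for w, x in zip(row, a)) for row in W]


def zero_padding_matrix(out_dim, in_dim):
    """W_s that copies the input and pads the extra rows with zeros."""
    return [[1.0 if i == j else 0.0 for j in range(in_dim)]
            for i in range(out_dim)]


a_l = [0.5, -1.0]               # shortcut input a^[l], 2-dimensional
main_path = [0.1, 0.2, 0.3, 0.4]  # main-path output before the ReLU, 4-dimensional

W_s = zero_padding_matrix(4, 2)
shortcut = matvec(W_s, a_l)     # now 4-dimensional: [0.5, -1.0, 0.0, 0.0]

pre_activation = [m + s for m, s in zip(main_path, shortcut)]
a_next = [max(0.0, z) for z in pre_activation]  # ReLU after the addition
print(shortcut, a_next)
```

A learned $W_s$ would have the same shape but trainable entries; in the Keras code of this post that role is played by the 1x1 convolution on the shortcut path of the convolutional block.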
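In the figure, the curve is the shortcut connection: it skips two weight layers. A few layers of the plain network together with a shortcut connection form a residual block. The shortcut makes it easy for each residual block to learn the identity function, and during backpropagation it lets the gradient flow directly to shallower layers.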
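The claim about gradients can be made concrete with a short derivation. It is a sketch under a simplifying assumption: the ReLU applied after the addition is ignored, and $\mathcal{L}$ (the loss) and $Y$ (the block output) are symbols introduced here for illustration.

```latex
Y = F(X) + X
\quad\Longrightarrow\quad
\frac{\partial \mathcal{L}}{\partial X}
= \frac{\partial \mathcal{L}}{\partial Y}
  \left( \frac{\partial F}{\partial X} + I \right)
= \frac{\partial \mathcal{L}}{\partial Y}\,\frac{\partial F}{\partial X}
  + \frac{\partial \mathcal{L}}{\partial Y}
```

The second term does not pass through any weight layer, so even when $\partial F / \partial X$ is small, the gradient from the deeper layer still reaches $X$.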
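Using residual blocks repeatedly on a plain network gives a residual network. How many blocks to use, and where in which network to place them, takes real insight and a solid understanding of deep neural networks. For now most people follow the original paper and related papers; changing these choices yields new, similar networks.
The residual block on the left preserves the input dimension, while the one on the right is a "bottleneck design": the first layer reduces the 256-dimensional input to 64 dimensions, the third layer restores it to 256 dimensions, and the shortcut/skip connection bypasses these three layers and feeds the input directly to the ReLU.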
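To see why the bottleneck helps, one can count convolution parameters. The sketch below is only arithmetic; `conv_params` is a hypothetical helper, and the comparison block (two 3x3 convolutions kept at 256 channels) is an assumed baseline that does not appear in the text.

```python
def conv_params(kernel, c_in, c_out, bias=True):
    """Number of parameters of a kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


# Bottleneck: 1x1 (256 -> 64), 3x3 (64 -> 64), 1x1 (64 -> 256)
bottleneck = [
    conv_params(1, 256, 64),
    conv_params(3, 64, 64),
    conv_params(1, 64, 256),
]
bottleneck_total = sum(bottleneck)

# Assumed baseline: two 3x3 convolutions that stay at 256 channels
plain_total = 2 * conv_params(3, 256, 256)

print(bottleneck, bottleneck_total, plain_total)
```

The three bottleneck numbers match the `res2b_branch2a`, `res2b_branch2b` and `res2b_branch2c` rows of the model summary at the end of this post.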
ResNet50 is a residual network with 50 layers in total; layers without trainable parameters, such as pooling layers, are not counted.
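The figure of 50 can be checked with a quick count. This sketch assumes the usual convention that the convolutions on the shortcut paths are not counted, and takes the blocks per stage (3, 4, 6, 3) from the ResNet50 code in this post, where each stage is one convolutional block followed by 2, 3, 5 and 2 identity blocks.

```python
blocks_per_stage = [3, 4, 6, 3]  # stages 2 to 5
convs_per_block = 3              # 1x1, 3x3, 1x1 on the main path

stem_conv = 1                    # the 7x7 convolution of stage 1
fc_layer = 1                     # the final fully connected layer

total_layers = stem_conv + convs_per_block * sum(blocks_per_stage) + fc_layer
print(total_layers)
```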
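The 50-layer ResNet contains two kinds of structure, the identity block and the convolutional block, as shown below.
The main difference between the two is whether the shortcut connection applies a convolution.
(Note: the original paper treats max pooling as the start of Stage 2.)
In the code comments, "filters of size" refers to the number of filters in the three convolutional layers of a block, not the kernel size f. The value of f is listed in the 50-layer ResNet column of the table above and is also explained below.
Convolutional Neural Networks, Week 3 Assignment: Residual Networks - v1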
import numpy as np
import tensorflow as tf
from keras import layers
from keras.layers import Input, Add, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D, AveragePooling2D, MaxPooling2D, GlobalMaxPooling2D
from keras.models import Model, load_model
from keras.preprocessing import image
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras.applications.imagenet_utils import preprocess_input
# import pydot
# from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
from keras.utils import plot_model
from resnets_utils import *
from keras.initializers import glorot_uniform
import scipy.misc
from matplotlib.pyplot import imshow
%matplotlib inline
import keras.backend as K
# GRADED FUNCTION: identity_block
def identity_block(X, f, filters, stage, block):
    """
    Implementation of the identity block as defined in Figure 4

    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network

    Returns:
    X -- output of the identity block, tensor of shape (n_H, n_W, n_C)
    """
    # defining name basis
    conv_name_base = "res" + str(stage) + block + "_branch"
    bn_name_base = "bn" + str(stage) + block + "_branch"

    # Retrieve Filters
    F1, F2, F3 = filters

    # Save the input value. You'll need this later to add back to the main path.
    X_shortcut = X

    # First component of main path
    # "valid" means no padding; glorot_uniform is Xavier uniform initialization - Steve
    X = Conv2D(filters=F1, kernel_size=(1, 1), strides=(1, 1), padding="valid",
               name=conv_name_base+"2a", kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + "2a")(X)
    X = Activation("relu")(X)

    # Second component of main path (≈3 lines)
    X = Conv2D(filters=F2, kernel_size=(f, f), strides=(1, 1), padding="same",
               name=conv_name_base+"2b", kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base+"2b")(X)
    X = Activation("relu")(X)

    # Third component of main path (≈2 lines)
    X = Conv2D(filters=F3, kernel_size=(1, 1), strides=(1, 1), padding="valid",
               name=conv_name_base+"2c", kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base+"2c")(X)

    # Final step: Add shortcut value to main path, and pass it through a RELU activation (≈2 lines)
    X = Add()([X, X_shortcut])
    X = Activation("relu")(X)

    return X
# GRADED FUNCTION: convolutional_block
def convolutional_block(X, f, filters, stage, block, s=2):
    """
    Implementation of the convolutional block as defined in Figure 4

    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network
    s -- Integer, specifying the stride to be used

    Returns:
    X -- output of the convolutional block, tensor of shape (n_H, n_W, n_C)
    """
    # defining name basis
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    # Retrieve Filters
    F1, F2, F3 = filters

    # Save the input value
    X_shortcut = X

    ##### MAIN PATH #####
    # First component of main path
    X = Conv2D(F1, (1, 1), strides=(s, s), name=conv_name_base + '2a', padding='valid',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    # Second component of main path (≈3 lines)
    X = Conv2D(F2, (f, f), strides=(1, 1), name=conv_name_base + '2b', padding='same',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # Third component of main path (≈2 lines)
    X = Conv2D(F3, (1, 1), strides=(1, 1), name=conv_name_base + '2c', padding='valid',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)

    ##### SHORTCUT PATH #### (≈2 lines)
    X_shortcut = Conv2D(F3, (1, 1), strides=(s, s), name=conv_name_base + '1', padding='valid',
                        kernel_initializer=glorot_uniform(seed=0))(X_shortcut)
    X_shortcut = BatchNormalization(axis=3, name=bn_name_base + '1')(X_shortcut)

    # Final step: Add shortcut value to main path, and pass it through a RELU activation (≈2 lines)
    X = layers.add([X, X_shortcut])
    X = Activation('relu')(X)

    return X
def ResNet50(input_shape=(64, 64, 3), classes=6):
    """
    Implementation of the popular ResNet50 with the following architecture:
    CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK*2 -> CONVBLOCK -> IDBLOCK*3
    -> CONVBLOCK -> IDBLOCK*5 -> CONVBLOCK -> IDBLOCK*2 -> AVGPOOL -> TOPLAYER

    Arguments:
    input_shape -- shape of the images of the dataset
    classes -- integer, number of classes

    Returns:
    model -- a Model() instance in Keras
    """
    # Define the input as a tensor with shape input_shape
    X_input = Input(input_shape)

    # Zero-Padding
    X = ZeroPadding2D((3, 3))(X_input)

    # Stage 1
    X = Conv2D(filters=64, kernel_size=(7, 7), strides=(2, 2), name="conv",
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name="bn_conv1")(X)
    X = Activation("relu")(X)
    X = MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(X)

    # Stage 2
    X = convolutional_block(X, f=3, filters=[64, 64, 256], stage=2, block="a", s=1)
    X = identity_block(X, f=3, filters=[64, 64, 256], stage=2, block="b")
    X = identity_block(X, f=3, filters=[64, 64, 256], stage=2, block="c")

    # Stage 3 (≈4 lines)
    # The convolutional block uses three sets of filters of size [128,128,512], "f" is 3, "s" is 2 and the block is "a".
    # The 3 identity blocks use three sets of filters of size [128,128,512], "f" is 3 and the blocks are "b", "c" and "d".
    # Note: the assignment text above specifies s=2, but s=1 is used here; it is what
    # produced the model summary shown later, where stage 3 stays at 15x15.
    X = convolutional_block(X, f=3, filters=[128, 128, 512], stage=3, block="a", s=1)
    X = identity_block(X, f=3, filters=[128, 128, 512], stage=3, block="b")
    X = identity_block(X, f=3, filters=[128, 128, 512], stage=3, block="c")
    X = identity_block(X, f=3, filters=[128, 128, 512], stage=3, block="d")

    # Stage 4 (≈6 lines)
    # The convolutional block uses three sets of filters of size [256, 256, 1024], "f" is 3, "s" is 2 and the block is "a".
    # The 5 identity blocks use three sets of filters of size [256, 256, 1024], "f" is 3 and the blocks are "b", "c", "d", "e" and "f".
    X = convolutional_block(X, f=3, filters=[256, 256, 1024], stage=4, block="a", s=2)
    X = identity_block(X, f=3, filters=[256, 256, 1024], stage=4, block="b")
    X = identity_block(X, f=3, filters=[256, 256, 1024], stage=4, block="c")
    X = identity_block(X, f=3, filters=[256, 256, 1024], stage=4, block="d")
    X = identity_block(X, f=3, filters=[256, 256, 1024], stage=4, block="e")
    X = identity_block(X, f=3, filters=[256, 256, 1024], stage=4, block="f")

    # Stage 5 (≈3 lines)
    # The convolutional block uses three sets of filters of size [512, 512, 2048], "f" is 3, "s" is 2 and the block is "a".
    # The 2 identity blocks use three sets of filters of size [256, 256, 2048], "f" is 3 and the blocks are "b" and "c".
    # The assignment text lists [256, 256, 2048] for the identity blocks, but that fails
    # the grader; [512, 512, 2048] passes.
    X = convolutional_block(X, f=3, filters=[512, 512, 2048], stage=5, block="a", s=2)
    X = identity_block(X, f=3, filters=[512, 512, 2048], stage=5, block="b")
    X = identity_block(X, f=3, filters=[512, 512, 2048], stage=5, block="c")

    # AVGPOOL (≈1 line). Use "X = AveragePooling2D(...)(X)"
    # The 2D Average Pooling uses a window of shape (2,2) and its name is "avg_pool".
    X = AveragePooling2D(pool_size=(2, 2), padding="same")(X)

    # output layer
    X = Flatten()(X)
    X = Dense(classes, activation="softmax", name="fc"+str(classes),
              kernel_initializer=glorot_uniform(seed=0))(X)

    # Create model
    model = Model(inputs=X_input, outputs=X, name="ResNet50")

    return model
model = ResNet50(input_shape=(64, 64, 3), classes=6)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()
# Normalize image vectors
X_train = X_train_orig/255.
X_test = X_test_orig/255.
"""
def convert_to_one_hot(Y, C):
    Y = np.eye(C)[Y.reshape(-1)].T
    return Y
"""
# Convert training and test labels to one hot matrices
Y_train = convert_to_one_hot(Y_train_orig, 6).T
Y_test = convert_to_one_hot(Y_test_orig, 6).T
print ("number of training examples = " + str(X_train.shape[0]))
print ("number of test examples = " + str(X_test.shape[0]))
print ("X_train shape: " + str(X_train.shape))
print ("Y_train shape: " + str(Y_train.shape))
print ("X_test shape: " + str(X_test.shape))
print ("Y_test shape: " + str(Y_test.shape))
number of training examples = 1080
number of test examples = 120
X_train shape: (1080, 64, 64, 3)
Y_train shape: (1080, 6)
X_test shape: (120, 64, 64, 3)
Y_test shape: (120, 6)
model.fit(X_train, Y_train, epochs = 20, batch_size = 32)
Epoch 1/20
1080/1080 [==============================] - 268s 248ms/step - loss: 2.9721 - acc: 0.2898
Epoch 2/20
1080/1080 [==============================] - 270s 250ms/step - loss: 1.8968 - acc: 0.3639
Epoch 3/20
1080/1080 [==============================] - 268s 248ms/step - loss: 1.5796 - acc: 0.4463
Epoch 4/20
1080/1080 [==============================] - 251s 233ms/step - loss: 1.2796 - acc: 0.5213
Epoch 5/20
1080/1080 [==============================] - 260s 241ms/step - loss: 0.9278 - acc: 0.6722
Epoch 6/20
1080/1080 [==============================] - 261s 242ms/step - loss: 0.7286 - acc: 0.7315
Epoch 7/20
1080/1080 [==============================] - 258s 239ms/step - loss: 0.4950 - acc: 0.8324
Epoch 8/20
1080/1080 [==============================] - 261s 241ms/step - loss: 0.3646 - acc: 0.8889
Epoch 9/20
1080/1080 [==============================] - 258s 238ms/step - loss: 0.3135 - acc: 0.9019
Epoch 10/20
1080/1080 [==============================] - 255s 237ms/step - loss: 0.1291 - acc: 0.9639
Epoch 11/20
1080/1080 [==============================] - 253s 235ms/step - loss: 0.0814 - acc: 0.9704
Epoch 12/20
1080/1080 [==============================] - 260s 240ms/step - loss: 0.0901 - acc: 0.9685
Epoch 13/20
1080/1080 [==============================] - 260s 240ms/step - loss: 0.0848 - acc: 0.9694
Epoch 14/20
1080/1080 [==============================] - 261s 242ms/step - loss: 0.0740 - acc: 0.9741
Epoch 15/20
1080/1080 [==============================] - 258s 239ms/step - loss: 0.0488 - acc: 0.9833
Epoch 16/20
1080/1080 [==============================] - 260s 241ms/step - loss: 0.0257 - acc: 0.9981
Epoch 17/20
1080/1080 [==============================] - 259s 240ms/step - loss: 0.0029 - acc: 1.0000
Epoch 18/20
1080/1080 [==============================] - 260s 241ms/step - loss: 0.0014 - acc: 1.0000
Epoch 19/20
1080/1080 [==============================] - 257s 238ms/step - loss: 8.9325e-04 - acc: 1.0000
Epoch 20/20
1080/1080 [==============================] - 255s 236ms/step - loss: 6.9667e-04 - acc: 1.0000
preds = model.evaluate(X_test, Y_test)
print ("Loss = " + str(preds[0]))
print ("Test Accuracy = " + str(preds[1]))
120/120 [==============================] - 6s 49ms/step
Loss = 0.11732131155828635
Test Accuracy = 0.9666666666666667
model.summary()
Layer (type) Output Shape Param # Connected to
input_3 (InputLayer) (None, 64, 64, 3) 0
zero_padding2d_3 (ZeroPadding2D (None, 70, 70, 3) 0 input_3[0][0]
conv (Conv2D) (None, 32, 32, 64) 9472 zero_padding2d_3[0][0]
bn_conv1 (BatchNormalization) (None, 32, 32, 64) 256 conv[0][0]
activation_96 (Activation) (None, 32, 32, 64) 0 bn_conv1[0][0]
max_pooling2d_3 (MaxPooling2D) (None, 15, 15, 64) 0 activation_96[0][0]
res2a_branch2a (Conv2D) (None, 15, 15, 64) 4160 max_pooling2d_3[0][0]
bn2a_branch2a (BatchNormalizati (None, 15, 15, 64) 256 res2a_branch2a[0][0]
activation_97 (Activation) (None, 15, 15, 64) 0 bn2a_branch2a[0][0]
res2a_branch2b (Conv2D) (None, 15, 15, 64) 36928 activation_97[0][0]
bn2a_branch2b (BatchNormalizati (None, 15, 15, 64) 256 res2a_branch2b[0][0]
activation_98 (Activation) (None, 15, 15, 64) 0 bn2a_branch2b[0][0]
res2a_branch2c (Conv2D) (None, 15, 15, 256) 16640 activation_98[0][0]
res2a_branch1 (Conv2D) (None, 15, 15, 256) 16640 max_pooling2d_3[0][0]
bn2a_branch2c (BatchNormalizati (None, 15, 15, 256) 1024 res2a_branch2c[0][0]
bn2a_branch1 (BatchNormalizatio (None, 15, 15, 256) 1024 res2a_branch1[0][0]
add_32 (Add) (None, 15, 15, 256) 0 bn2a_branch2c[0][0]
activation_99 (Activation) (None, 15, 15, 256) 0 add_32[0][0]
res2b_branch2a (Conv2D) (None, 15, 15, 64) 16448 activation_99[0][0]
bn2b_branch2a (BatchNormalizati (None, 15, 15, 64) 256 res2b_branch2a[0][0]
activation_100 (Activation) (None, 15, 15, 64) 0 bn2b_branch2a[0][0]
res2b_branch2b (Conv2D) (None, 15, 15, 64) 36928 activation_100[0][0]
bn2b_branch2b (BatchNormalizati (None, 15, 15, 64) 256 res2b_branch2b[0][0]
activation_101 (Activation) (None, 15, 15, 64) 0 bn2b_branch2b[0][0]
res2b_branch2c (Conv2D) (None, 15, 15, 256) 16640 activation_101[0][0]
bn2b_branch2c (BatchNormalizati (None, 15, 15, 256) 1024 res2b_branch2c[0][0]
add_33 (Add) (None, 15, 15, 256) 0 bn2b_branch2c[0][0]
activation_102 (Activation) (None, 15, 15, 256) 0 add_33[0][0]
res2c_branch2a (Conv2D) (None, 15, 15, 64) 16448 activation_102[0][0]
bn2c_branch2a (BatchNormalizati (None, 15, 15, 64) 256 res2c_branch2a[0][0]
activation_103 (Activation) (None, 15, 15, 64) 0 bn2c_branch2a[0][0]
res2c_branch2b (Conv2D) (None, 15, 15, 64) 36928 activation_103[0][0]
bn2c_branch2b (BatchNormalizati (None, 15, 15, 64) 256 res2c_branch2b[0][0]
activation_104 (Activation) (None, 15, 15, 64) 0 bn2c_branch2b[0][0]
res2c_branch2c (Conv2D) (None, 15, 15, 256) 16640 activation_104[0][0]
bn2c_branch2c (BatchNormalizati (None, 15, 15, 256) 1024 res2c_branch2c[0][0]
add_34 (Add) (None, 15, 15, 256) 0 bn2c_branch2c[0][0]
activation_105 (Activation) (None, 15, 15, 256) 0 add_34[0][0]
res3a_branch2a (Conv2D) (None, 15, 15, 128) 32896 activation_105[0][0]
bn3a_branch2a (BatchNormalizati (None, 15, 15, 128) 512 res3a_branch2a[0][0]
activation_106 (Activation) (None, 15, 15, 128) 0 bn3a_branch2a[0][0]
res3a_branch2b (Conv2D) (None, 15, 15, 128) 147584 activation_106[0][0]
bn3a_branch2b (BatchNormalizati (None, 15, 15, 128) 512 res3a_branch2b[0][0]
activation_107 (Activation) (None, 15, 15, 128) 0 bn3a_branch2b[0][0]
res3a_branch2c (Conv2D) (None, 15, 15, 512) 66048 activation_107[0][0]
res3a_branch1 (Conv2D) (None, 15, 15, 512) 131584 activation_105[0][0]
bn3a_branch2c (BatchNormalizati (None, 15, 15, 512) 2048 res3a_branch2c[0][0]
bn3a_branch1 (BatchNormalizatio (None, 15, 15, 512) 2048 res3a_branch1[0][0]
add_35 (Add) (None, 15, 15, 512) 0 bn3a_branch2c[0][0]
activation_108 (Activation) (None, 15, 15, 512) 0 add_35[0][0]
res3b_branch2a (Conv2D) (None, 15, 15, 128) 65664 activation_108[0][0]
bn3b_branch2a (BatchNormalizati (None, 15, 15, 128) 512 res3b_branch2a[0][0]
activation_109 (Activation) (None, 15, 15, 128) 0 bn3b_branch2a[0][0]
res3b_branch2b (Conv2D) (None, 15, 15, 128) 147584 activation_109[0][0]
bn3b_branch2b (BatchNormalizati (None, 15, 15, 128) 512 res3b_branch2b[0][0]
activation_110 (Activation) (None, 15, 15, 128) 0 bn3b_branch2b[0][0]
res3b_branch2c (Conv2D) (None, 15, 15, 512) 66048 activation_110[0][0]
bn3b_branch2c (BatchNormalizati (None, 15, 15, 512) 2048 res3b_branch2c[0][0]
add_36 (Add) (None, 15, 15, 512) 0 bn3b_branch2c[0][0]
activation_111 (Activation) (None, 15, 15, 512) 0 add_36[0][0]
res3c_branch2a (Conv2D) (None, 15, 15, 128) 65664 activation_111[0][0]
bn3c_branch2a (BatchNormalizati (None, 15, 15, 128) 512 res3c_branch2a[0][0]
activation_112 (Activation) (None, 15, 15, 128) 0 bn3c_branch2a[0][0]
res3c_branch2b (Conv2D) (None, 15, 15, 128) 147584 activation_112[0][0]
bn3c_branch2b (BatchNormalizati (None, 15, 15, 128) 512 res3c_branch2b[0][0]
activation_113 (Activation) (None, 15, 15, 128) 0 bn3c_branch2b[0][0]
res3c_branch2c (Conv2D) (None, 15, 15, 512) 66048 activation_113[0][0]
bn3c_branch2c (BatchNormalizati (None, 15, 15, 512) 2048 res3c_branch2c[0][0]
add_37 (Add) (None, 15, 15, 512) 0 bn3c_branch2c[0][0]
activation_114 (Activation) (None, 15, 15, 512) 0 add_37[0][0]
res3d_branch2a (Conv2D) (None, 15, 15, 128) 65664 activation_114[0][0]
bn3d_branch2a (BatchNormalizati (None, 15, 15, 128) 512 res3d_branch2a[0][0]
activation_115 (Activation) (None, 15, 15, 128) 0 bn3d_branch2a[0][0]
res3d_branch2b (Conv2D) (None, 15, 15, 128) 147584 activation_115[0][0]
bn3d_branch2b (BatchNormalizati (None, 15, 15, 128) 512 res3d_branch2b[0][0]
activation_116 (Activation) (None, 15, 15, 128) 0 bn3d_branch2b[0][0]
res3d_branch2c (Conv2D) (None, 15, 15, 512) 66048 activation_116[0][0]
bn3d_branch2c (BatchNormalizati (None, 15, 15, 512) 2048 res3d_branch2c[0][0]
add_38 (Add) (None, 15, 15, 512) 0 bn3d_branch2c[0][0]
activation_117 (Activation) (None, 15, 15, 512) 0 add_38[0][0]
res4a_branch2a (Conv2D) (None, 8, 8, 256) 131328 activation_117[0][0]
bn4a_branch2a (BatchNormalizati (None, 8, 8, 256) 1024 res4a_branch2a[0][0]
activation_118 (Activation) (None, 8, 8, 256) 0 bn4a_branch2a[0][0]
res4a_branch2b (Conv2D) (None, 8, 8, 256) 590080 activation_118[0][0]
bn4a_branch2b (BatchNormalizati (None, 8, 8, 256) 1024 res4a_branch2b[0][0]
activation_119 (Activation) (None, 8, 8, 256) 0 bn4a_branch2b[0][0]
res4a_branch2c (Conv2D) (None, 8, 8, 1024) 263168 activation_119[0][0]
res4a_branch1 (Conv2D) (None, 8, 8, 1024) 525312 activation_117[0][0]
bn4a_branch2c (BatchNormalizati (None, 8, 8, 1024) 4096 res4a_branch2c[0][0]
bn4a_branch1 (BatchNormalizatio (None, 8, 8, 1024) 4096 res4a_branch1[0][0]
add_39 (Add) (None, 8, 8, 1024) 0 bn4a_branch2c[0][0]
activation_120 (Activation) (None, 8, 8, 1024) 0 add_39[0][0]
res4b_branch2a (Conv2D) (None, 8, 8, 256) 262400 activation_120[0][0]
bn4b_branch2a (BatchNormalizati (None, 8, 8, 256) 1024 res4b_branch2a[0][0]
activation_121 (Activation) (None, 8, 8, 256) 0 bn4b_branch2a[0][0]
res4b_branch2b (Conv2D) (None, 8, 8, 256) 590080 activation_121[0][0]
bn4b_branch2b (BatchNormalizati (None, 8, 8, 256) 1024 res4b_branch2b[0][0]
activation_122 (Activation) (None, 8, 8, 256) 0 bn4b_branch2b[0][0]
res4b_branch2c (Conv2D) (None, 8, 8, 1024) 263168 activation_122[0][0]
bn4b_branch2c (BatchNormalizati (None, 8, 8, 1024) 4096 res4b_branch2c[0][0]
add_40 (Add) (None, 8, 8, 1024) 0 bn4b_branch2c[0][0]
activation_123 (Activation) (None, 8, 8, 1024) 0 add_40[0][0]
res4c_branch2a (Conv2D) (None, 8, 8, 256) 262400 activation_123[0][0]
bn4c_branch2a (BatchNormalizati (None, 8, 8, 256) 1024 res4c_branch2a[0][0]
activation_124 (Activation) (None, 8, 8, 256) 0 bn4c_branch2a[0][0]
res4c_branch2b (Conv2D) (None, 8, 8, 256) 590080 activation_124[0][0]
bn4c_branch2b (BatchNormalizati (None, 8, 8, 256) 1024 res4c_branch2b[0][0]
activation_125 (Activation) (None, 8, 8, 256) 0 bn4c_branch2b[0][0]
res4c_branch2c (Conv2D) (None, 8, 8, 1024) 263168 activation_125[0][0]
bn4c_branch2c (BatchNormalizati (None, 8, 8, 1024) 4096 res4c_branch2c[0][0]
add_41 (Add) (None, 8, 8, 1024) 0 bn4c_branch2c[0][0]
activation_126 (Activation) (None, 8, 8, 1024) 0 add_41[0][0]
res4d_branch2a (Conv2D) (None, 8, 8, 256) 262400 activation_126[0][0]
bn4d_branch2a (BatchNormalizati (None, 8, 8, 256) 1024 res4d_branch2a[0][0]
activation_127 (Activation) (None, 8, 8, 256) 0 bn4d_branch2a[0][0]
res4d_branch2b (Conv2D) (None, 8, 8, 256) 590080 activation_127[0][0]
bn4d_branch2b (BatchNormalizati (None, 8, 8, 256) 1024 res4d_branch2b[0][0]
activation_128 (Activation) (None, 8, 8, 256) 0 bn4d_branch2b[0][0]
res4d_branch2c (Conv2D) (None, 8, 8, 1024) 263168 activation_128[0][0]
bn4d_branch2c (BatchNormalizati (None, 8, 8, 1024) 4096 res4d_branch2c[0][0]
add_42 (Add) (None, 8, 8, 1024) 0 bn4d_branch2c[0][0]
activation_129 (Activation) (None, 8, 8, 1024) 0 add_42[0][0]
res4e_branch2a (Conv2D) (None, 8, 8, 256) 262400 activation_129[0][0]
bn4e_branch2a (BatchNormalizati (None, 8, 8, 256) 1024 res4e_branch2a[0][0]
activation_130 (Activation) (None, 8, 8, 256) 0 bn4e_branch2a[0][0]
res4e_branch2b (Conv2D) (None, 8, 8, 256) 590080 activation_130[0][0]
bn4e_branch2b (BatchNormalizati (None, 8, 8, 256) 1024 res4e_branch2b[0][0]
activation_131 (Activation) (None, 8, 8, 256) 0 bn4e_branch2b[0][0]
res4e_branch2c (Conv2D) (None, 8, 8, 1024) 263168 activation_131[0][0]
bn4e_branch2c (BatchNormalizati (None, 8, 8, 1024) 4096 res4e_branch2c[0][0]
add_43 (Add) (None, 8, 8, 1024) 0 bn4e_branch2c[0][0]
activation_132 (Activation) (None, 8, 8, 1024) 0 add_43[0][0]
res5a_branch2a (Conv2D) (None, 4, 4, 512) 524800 activation_132[0][0]
bn5a_branch2a (BatchNormalizati (None, 4, 4, 512) 2048 res5a_branch2a[0][0]
activation_133 (Activation) (None, 4, 4, 512) 0 bn5a_branch2a[0][0]
res5a_branch2b (Conv2D) (None, 4, 4, 512) 2359808 activation_133[0][0]
bn5a_branch2b (BatchNormalizati (None, 4, 4, 512) 2048 res5a_branch2b[0][0]
activation_134 (Activation) (None, 4, 4, 512) 0 bn5a_branch2b[0][0]
res5a_branch2c (Conv2D) (None, 4, 4, 2048) 1050624 activation_134[0][0]
res5a_branch1 (Conv2D) (None, 4, 4, 2048) 2099200 activation_132[0][0]
bn5a_branch2c (BatchNormalizati (None, 4, 4, 2048) 8192 res5a_branch2c[0][0]
bn5a_branch1 (BatchNormalizatio (None, 4, 4, 2048) 8192 res5a_branch1[0][0]
add_44 (Add) (None, 4, 4, 2048) 0 bn5a_branch2c[0][0]
activation_135 (Activation) (None, 4, 4, 2048) 0 add_44[0][0]
res5b_branch2a (Conv2D) (None, 4, 4, 512) 1049088 activation_135[0][0]
bn5b_branch2a (BatchNormalizati (None, 4, 4, 512) 2048 res5b_branch2a[0][0]
activation_136 (Activation) (None, 4, 4, 512) 0 bn5b_branch2a[0][0]
res5b_branch2b (Conv2D) (None, 4, 4, 512) 2359808 activation_136[0][0]
bn5b_branch2b (BatchNormalizati (None, 4, 4, 512) 2048 res5b_branch2b[0][0]
activation_137 (Activation) (None, 4, 4, 512) 0 bn5b_branch2b[0][0]
res5b_branch2c (Conv2D) (None, 4, 4, 2048) 1050624 activation_137[0][0]
bn5b_branch2c (BatchNormalizati (None, 4, 4, 2048) 8192 res5b_branch2c[0][0]
add_45 (Add) (None, 4, 4, 2048) 0 bn5b_branch2c[0][0]
activation_138 (Activation) (None, 4, 4, 2048) 0 add_45[0][0]
res5c_branch2a (Conv2D) (None, 4, 4, 512) 1049088 activation_138[0][0]
bn5c_branch2a (BatchNormalizati (None, 4, 4, 512) 2048 res5c_branch2a[0][0]
activation_139 (Activation) (None, 4, 4, 512) 0 bn5c_branch2a[0][0]
res5c_branch2b (Conv2D) (None, 4, 4, 512) 2359808 activation_139[0][0]
bn5c_branch2b (BatchNormalizati (None, 4, 4, 512) 2048 res5c_branch2b[0][0]
activation_140 (Activation) (None, 4, 4, 512) 0 bn5c_branch2b[0][0]
res5c_branch2c (Conv2D) (None, 4, 4, 2048) 1050624 activation_140[0][0]
bn5c_branch2c (BatchNormalizati (None, 4, 4, 2048) 8192 res5c_branch2c[0][0]
add_46 (Add) (None, 4, 4, 2048) 0 bn5c_branch2c[0][0]
activation_141 (Activation) (None, 4, 4, 2048) 0 add_46[0][0]
average_pooling2d_3 (AveragePoo (None, 2, 2, 2048) 0 activation_141[0][0]
flatten_3 (Flatten) (None, 8192) 0 average_pooling2d_3[0][0]
fc6 (Dense) (None, 6) 49158 flatten_3[0][0]
Total params: 22,515,078
Trainable params: 22,465,030
Non-trainable params: 50,048
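The spatial sizes in the summary above can be re-derived by hand. The sketch below is plain arithmetic, not Keras: `valid_out` and `same_out` are hypothetical helpers that apply the standard output-size formulas for "valid" and "same" padding, and the strides are the ones used in the code of this post (stages 2 and 3 with s=1, stages 4 and 5 with s=2).

```python
import math


def valid_out(n, f, s):
    """Output size for window f, stride s, no padding."""
    return (n - f) // s + 1


def same_out(n, s):
    """Output size for "same" padding with stride s."""
    return math.ceil(n / s)


padded = 64 + 2 * 3                # ZeroPadding2D((3, 3)): 64 -> 70
conv1 = valid_out(padded, 7, 2)    # 7x7 conv, stride 2
pool1 = valid_out(conv1, 3, 2)     # 3x3 max pool, stride 2
stage3 = pool1                     # stages 2 and 3 use s=1, size unchanged
stage4 = valid_out(stage3, 1, 2)   # 1x1 conv, stride 2
stage5 = valid_out(stage4, 1, 2)   # 1x1 conv, stride 2
avg = same_out(stage5, 2)          # 2x2 average pool, default stride 2

flat = avg * avg * 2048            # Flatten
dense_params = flat * 6 + 6        # weights + biases of the 6-class Dense layer
print(conv1, pool1, stage4, stage5, avg, flat, dense_params)
```

These values agree with the `conv`, `max_pooling2d_3`, `res4a_branch2a`, `res5a_branch2a`, `average_pooling2d_3`, `flatten_3` and `fc6` rows of the summary.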