This is Chapter 4 of the 《pytorch-tensorflow-Comparative study》 column, a side-by-side study of PyTorch and TensorFlow: model-building paradigms (how to construct models, and how to structure training loops).
Although the two frameworks differ in many API names and syntax details, the deep-learning modeling workflow is essentially the same in both.
So these notes are organized as follows: for each shared piece of functionality, record and compare the PyTorch and TensorFlow APIs and syntax side by side. This:
1. Helps build a deeper understanding of the deep-learning modeling workflow.
2. Highlights the design differences between PyTorch and TensorFlow, so you can use either more flexibly in your own projects.
3. Deepens your understanding of how each functional module is used.
This chapter compares the PyTorch and TensorFlow APIs and syntax for model-building paradigms (model-construction methods and training-loop styles).
There are roughly these ways to construct a model:
1. Subclass the nn.Module base class (Model subclassing in TensorFlow) to build a custom model.
2. Use nn.Sequential (Sequential in TensorFlow) to build a model layer by layer.
3. Subclass nn.Module (Model subclassing in TensorFlow) and use model containers (nn.Sequential, nn.ModuleList, nn.ModuleDict; Sequential in TensorFlow) to encapsulate parts of the model, i.e. a combination of methods 1 and 2.
4. TensorFlow only: build the model with the functional API.
In PyTorch: method 1 is the most common, method 2 the simplest, and method 3 the most flexible but also the most complex. Method 1 is recommended.
**In TensorFlow:** for purely sequential architectures, prefer the Sequential method.
If the model has multiple inputs or outputs, shares weights, or contains non-sequential structure such as residual connections, use the functional API instead.
Unless you really need it, avoid Model subclassing: it offers the greatest flexibility, but also the greatest chance of making mistakes.
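For example, sharing weights in the functional API is just a matter of calling the same layer object on two inputs. A minimal sketch (the shapes and names here are arbitrary, for illustration only):
from tensorflow.keras import layers, models

# one layer object = one set of weights, applied to two inputs
inp_a = layers.Input(shape=(16,))
inp_b = layers.Input(shape=(16,))
shared = layers.Dense(8, activation="relu")
concat = layers.Concatenate()([shared(inp_a), shared(inp_b)])
model = models.Model(inputs=[inp_a, inp_b], outputs=layers.Dense(1)(concat))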
PyTorch
Building a model with nn.Sequential in layer order requires no forward method definition, but it only suits simple models.
Below are several equivalent ways to build a model with nn.Sequential.
Using the add_module method
import torch
from torch import nn

net = nn.Sequential()
net.add_module("conv1",nn.Conv2d(in_channels=3,out_channels=32,kernel_size = 3))
net.add_module("pool1",nn.MaxPool2d(kernel_size = 2,stride = 2))
net.add_module("conv2",nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5))
net.add_module("pool2",nn.MaxPool2d(kernel_size = 2,stride = 2))
net.add_module("dropout",nn.Dropout2d(p = 0.1))
net.add_module("adaptive_pool",nn.AdaptiveMaxPool2d((1,1)))
net.add_module("flatten",nn.Flatten())
net.add_module("linear1",nn.Linear(64,32))
net.add_module("relu",nn.ReLU())
net.add_module("linear2",nn.Linear(32,1))
net.add_module("sigmoid",nn.Sigmoid())
print(net)
# Sequential(
# (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
# (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
# (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (dropout): Dropout2d(p=0.1, inplace=False)
# (adaptive_pool): AdaptiveMaxPool2d(output_size=(1, 1))
# (flatten): Flatten()
# (linear1): Linear(in_features=64, out_features=32, bias=True)
# (relu): ReLU()
# (linear2): Linear(in_features=32, out_features=1, bias=True)
# (sigmoid): Sigmoid()
# )
Passing the layers as arguments to Sequential
net = nn.Sequential(
nn.Conv2d(in_channels=3,out_channels=32,kernel_size = 3),
nn.MaxPool2d(kernel_size = 2,stride = 2),
nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5),
nn.MaxPool2d(kernel_size = 2,stride = 2),
nn.Dropout2d(p = 0.1),
nn.AdaptiveMaxPool2d((1,1)),
nn.Flatten(),
nn.Linear(64,32),
nn.ReLU(),
nn.Linear(32,1),
nn.Sigmoid()
)
print(net)
# Sequential(
# (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
# (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
# (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (4): Dropout2d(p=0.1, inplace=False)
# (5): AdaptiveMaxPool2d(output_size=(1, 1))
# (6): Flatten()
# (7): Linear(in_features=64, out_features=32, bias=True)
# (8): ReLU()
# (9): Linear(in_features=32, out_features=1, bias=True)
# (10): Sigmoid()
# )
Using an OrderedDict argument
from collections import OrderedDict
net = nn.Sequential(OrderedDict(
[("conv1",nn.Conv2d(in_channels=3,out_channels=32,kernel_size = 3)),
("pool1",nn.MaxPool2d(kernel_size = 2,stride = 2)),
("conv2",nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5)),
("pool2",nn.MaxPool2d(kernel_size = 2,stride = 2)),
("dropout",nn.Dropout2d(p = 0.1)),
("adaptive_pool",nn.AdaptiveMaxPool2d((1,1))),
("flatten",nn.Flatten()),
("linear1",nn.Linear(64,32)),
("relu",nn.ReLU()),
("linear2",nn.Linear(32,1)),
("sigmoid",nn.Sigmoid())
])
)
print(net)
# Sequential(
# (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
# (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
# (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (dropout): Dropout2d(p=0.1, inplace=False)
# (adaptive_pool): AdaptiveMaxPool2d(output_size=(1, 1))
# (flatten): Flatten()
# (linear1): Linear(in_features=64, out_features=32, bias=True)
# (relu): ReLU()
# (linear2): Linear(in_features=32, out_features=1, bias=True)
# (sigmoid): Sigmoid()
# )
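As a quick sanity check, any of the three equivalent constructions above can be called directly on a dummy batch. A minimal sketch (the 32×32 input size is an arbitrary assumption for illustration):
x = torch.randn(4, 3, 32, 32)  # a dummy batch of 4 RGB images
print(net(x).shape)
# torch.Size([4, 1])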
TensorFlow
Using the add method
import tensorflow as tf
from tensorflow.keras import models, layers

# MAX_WORDS and MAX_LEN are defined in the dataset-construction section further below
tf.keras.backend.clear_session()

model = models.Sequential()
model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
model.add(layers.MaxPool1D(2))
model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
model.add(layers.MaxPool1D(2))
model.add(layers.Flatten())
model.add(layers.Dense(1,activation = "sigmoid"))

model.compile(optimizer='Nadam',
              loss='binary_crossentropy',
              metrics=['accuracy',"AUC"])

model.summary()
PyTorch
Below is an example of building a custom model by subclassing the nn.Module base class. The layers used by the model are usually defined in the __init__ method, and the forward pass is then defined in the forward method.
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3,out_channels=32,kernel_size = 3)
        self.pool1 = nn.MaxPool2d(kernel_size = 2,stride = 2)
        self.conv2 = nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5)
        self.pool2 = nn.MaxPool2d(kernel_size = 2,stride = 2)
        self.dropout = nn.Dropout2d(p = 0.1)
        self.adaptive_pool = nn.AdaptiveMaxPool2d((1,1))
        self.flatten = nn.Flatten()
        self.linear1 = nn.Linear(64,32)
        self.relu = nn.ReLU()
        self.linear2 = nn.Linear(32,1)
        self.sigmoid = nn.Sigmoid()

    def forward(self,x):
        x = self.conv1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.pool2(x)
        x = self.dropout(x)
        x = self.adaptive_pool(x)
        x = self.flatten(x)
        x = self.linear1(x)
        x = self.relu(x)
        x = self.linear2(x)
        y = self.sigmoid(x)
        return y

net = Net()
print(net)
# Net(
# (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
# (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
# (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (dropout): Dropout2d(p=0.1, inplace=False)
# (adaptive_pool): AdaptiveMaxPool2d(output_size=(1, 1))
# (flatten): Flatten()
# (linear1): Linear(in_features=64, out_features=32, bias=True)
# (relu): ReLU()
# (linear2): Linear(in_features=32, out_features=1, bias=True)
# (sigmoid): Sigmoid()
# )
TensorFlow
Below is an example of building a custom model by subclassing (layers.Layer itself inherits from the Module class). The variables used by the model are defined in the __init__ method (in PyTorch, ordinary Python attributes suffice), the layers are defined in the build method, and the forward pass is defined in the call method. If a custom Layer must remain serializable when composed into a model via the functional API, it also needs a custom get_config method.
# First define a residual block as a custom Layer
class ResBlock(layers.Layer):
    def __init__(self, kernel_size, **kwargs):
        super(ResBlock, self).__init__(**kwargs)
        self.kernel_size = kernel_size

    def build(self,input_shape):
        self.conv1 = layers.Conv1D(filters=64,kernel_size=self.kernel_size,
                                   activation = "relu",padding="same")
        self.conv2 = layers.Conv1D(filters=32,kernel_size=self.kernel_size,
                                   activation = "relu",padding="same")
        self.conv3 = layers.Conv1D(filters=input_shape[-1],
                                   kernel_size=self.kernel_size,activation = "relu",padding="same")
        self.maxpool = layers.MaxPool1D(2)
        super(ResBlock,self).build(input_shape) # equivalent to setting self.built = True

    def call(self, inputs):
        x = self.conv1(inputs)
        x = self.conv2(x)
        x = self.conv3(x)
        x = layers.Add()([inputs,x])
        x = self.maxpool(x)
        return x

    # get_config is needed so a custom Layer can be serialized
    # when composed into a model via the functional API
    def get_config(self):
        config = super(ResBlock, self).get_config()
        config.update({'kernel_size': self.kernel_size})
        return config

# Test ResBlock
resblock = ResBlock(kernel_size = 3)
resblock.build(input_shape = (None,200,7))
resblock.compute_output_shape(input_shape=(None,200,7))
# TensorShape([None, 100, 7])
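The payoff of get_config is that the layer can be rebuilt from its config dictionary, which is what Keras does when saving and loading a model containing custom Layers. A minimal round-trip sketch (the variable names are just for illustration):
config = resblock.get_config()
resblock_clone = ResBlock.from_config(config)  # default from_config calls ResBlock(**config)
print(config["kernel_size"])
# 3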
PyTorch
When the model structure is relatively complex, we can use the model containers (nn.Sequential, nn.ModuleList, nn.ModuleDict) to encapsulate parts of it.
This gives the model a clearer hierarchy and can sometimes reduce the amount of code.
Note that each of the following examples uses only one kind of container, but in practice these containers are very flexible and can be combined and nested arbitrarily within one model.
nn.Sequential as a model container
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels=3,out_channels=32,kernel_size = 3),
            nn.MaxPool2d(kernel_size = 2,stride = 2),
            nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5),
            nn.MaxPool2d(kernel_size = 2,stride = 2),
            nn.Dropout2d(p = 0.1),
            nn.AdaptiveMaxPool2d((1,1))
        )
        self.dense = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64,32),
            nn.ReLU(),
            nn.Linear(32,1),
            nn.Sigmoid()
        )

    def forward(self,x):
        x = self.conv(x)
        y = self.dense(x)
        return y

net = Net()
print(net)
# Net(
# (conv): Sequential(
# (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
# (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
# (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (4): Dropout2d(p=0.1, inplace=False)
# (5): AdaptiveMaxPool2d(output_size=(1, 1))
# )
# (dense): Sequential(
# (0): Flatten()
# (1): Linear(in_features=64, out_features=32, bias=True)
# (2): ReLU()
# (3): Linear(in_features=32, out_features=1, bias=True)
# (4): Sigmoid()
# )
# )
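A side benefit of this encapsulation: the named Sequential blocks remain addressable by index, which is handy when inspecting or modifying parts of the model. A small sketch:
print(net.conv[0])    # Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
print(net.dense[-1])  # Sigmoid()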
nn.ModuleList as a model container
Note that the ModuleList below cannot be replaced by a plain Python list; a short demonstration of why follows this example.
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.layers = nn.ModuleList([
            nn.Conv2d(in_channels=3,out_channels=32,kernel_size = 3),
            nn.MaxPool2d(kernel_size = 2,stride = 2),
            nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5),
            nn.MaxPool2d(kernel_size = 2,stride = 2),
            nn.Dropout2d(p = 0.1),
            nn.AdaptiveMaxPool2d((1,1)),
            nn.Flatten(),
            nn.Linear(64,32),
            nn.ReLU(),
            nn.Linear(32,1),
            nn.Sigmoid()]
        )

    def forward(self,x):
        for layer in self.layers:
            x = layer(x)
        return x

net = Net()
print(net)
# Net(
# (layers): ModuleList(
# (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
# (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
# (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (4): Dropout2d(p=0.1, inplace=False)
# (5): AdaptiveMaxPool2d(output_size=(1, 1))
# (6): Flatten()
# (7): Linear(in_features=64, out_features=32, bias=True)
# (8): ReLU()
# (9): Linear(in_features=32, out_features=1, bias=True)
# (10): Sigmoid()
# )
# )
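Why a plain Python list will not do: modules stored in an ordinary list are not registered as submodules, so their parameters are invisible to net.parameters() and never reach the optimizer. A minimal sketch of the difference (the class name is just for illustration):
class BadNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(64,32), nn.Linear(32,1)]  # plain list: not registered

print(len(list(BadNet().parameters())))
# 0 -- nothing for the optimizer to update
print(len(list(nn.ModuleList([nn.Linear(64,32), nn.Linear(32,1)]).parameters())))
# 4 -- two weights and two biases, properly registered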
nn.ModuleDict as a model container
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.layers_dict = nn.ModuleDict({
            "conv1": nn.Conv2d(in_channels=3,out_channels=32,kernel_size = 3),
            "pool": nn.MaxPool2d(kernel_size = 2,stride = 2),
            "conv2": nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5),
            "dropout": nn.Dropout2d(p = 0.1),
            "adaptive": nn.AdaptiveMaxPool2d((1,1)),
            "flatten": nn.Flatten(),
            "linear1": nn.Linear(64,32),
            "relu": nn.ReLU(),
            "linear2": nn.Linear(32,1),
            "sigmoid": nn.Sigmoid()
        })

    def forward(self,x):
        layers = ["conv1","pool","conv2","pool","dropout","adaptive",
                  "flatten","linear1","relu","linear2","sigmoid"]
        for layer in layers:
            x = self.layers_dict[layer](x)
        return x

net = Net()
print(net)
# Net(
# (layers_dict): ModuleDict(
# (adaptive): AdaptiveMaxPool2d(output_size=(1, 1))
# (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
# (conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
# (dropout): Dropout2d(p=0.1, inplace=False)
# (flatten): Flatten()
# (linear1): Linear(in_features=64, out_features=32, bias=True)
# (linear2): Linear(in_features=32, out_features=1, bias=True)
# (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (relu): ReLU()
# (sigmoid): Sigmoid()
# )
# )
TensorFlow
A subclassed model can, in fact, also use Sequential or the functional API internally:
class ImdbModel(models.Model):
    def __init__(self):
        super(ImdbModel, self).__init__()

    def build(self,input_shape):
        # using the functional API
        self.embedding = layers.Embedding(MAX_WORDS,7)
        self.block1 = ResBlock(7)
        self.block2 = ResBlock(5)
        # using Sequential
        self.dense = models.Sequential([layers.Dense(1,activation = "sigmoid")])
        """
        # using the functional API
        self.dense = layers.Dense(1,activation = "sigmoid")
        """
        super(ImdbModel,self).build(input_shape)

    def call(self, x):
        x = self.embedding(x)
        x = self.block1(x)
        x = self.block2(x)
        x = layers.Flatten()(x)
        x = self.dense(x)
        return(x)

tf.keras.backend.clear_session()
model = ImdbModel()
model.build(input_shape =(None,200))
model.summary()
"""
Model: "imdb_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) multiple 70000
_________________________________________________________________
res_block (ResBlock) multiple 19143
_________________________________________________________________
res_block_1 (ResBlock) multiple 13703
_________________________________________________________________
dense (Dense) multiple 351
=================================================================
Total params: 103,197
Trainable params: 103,197
Non-trainable params: 0
_________________________________________________________________
"""
The following approach exists only in TensorFlow: building the model with the functional API.
TensorFlow
tf.keras.backend.clear_session()
inputs = layers.Input(shape=[MAX_LEN])
x = layers.Embedding(MAX_WORDS,7)(inputs)
branch1 = layers.SeparableConv1D(64,3,activation="relu")(x)
branch1 = layers.MaxPool1D(3)(branch1)
branch1 = layers.SeparableConv1D(32,3,activation="relu")(branch1)
branch1 = layers.GlobalMaxPool1D()(branch1)
branch2 = layers.SeparableConv1D(64,5,activation="relu")(x)
branch2 = layers.MaxPool1D(5)(branch2)
branch2 = layers.SeparableConv1D(32,5,activation="relu")(branch2)
branch2 = layers.GlobalMaxPool1D()(branch2)
branch3 = layers.SeparableConv1D(64,7,activation="relu")(x)
branch3 = layers.MaxPool1D(7)(branch3)
branch3 = layers.SeparableConv1D(32,7,activation="relu")(branch3)
branch3 = layers.GlobalMaxPool1D()(branch3)
concat = layers.Concatenate()([branch1,branch2,branch3])
outputs = layers.Dense(1,activation = "sigmoid")(concat)
model = models.Model(inputs = inputs,outputs = outputs)
model.compile(optimizer='Nadam',
loss='binary_crossentropy',
metrics=['accuracy',"AUC"])
model.summary()
"""
Model: "model"
____________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
============================================================================================
input_1 (InputLayer) [(None, 200)] 0
____________________________________________________________________________________________
embedding (Embedding) (None, 200, 7) 70000 input_1[0][0]
____________________________________________________________________________________________
separable_conv1d (SeparableConv (None, 198, 64) 533 embedding[0][0]
____________________________________________________________________________________________
separable_conv1d_2 (SeparableCo (None, 196, 64) 547 embedding[0][0]
____________________________________________________________________________________________
separable_conv1d_4 (SeparableCo (None, 194, 64) 561 embedding[0][0]
____________________________________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 66, 64) 0 separable_conv1d[0][0]
____________________________________________________________________________________________
max_pooling1d_1 (MaxPooling1D) (None, 39, 64) 0 separable_conv1d_2[0][0]
____________________________________________________________________________________________
max_pooling1d_2 (MaxPooling1D) (None, 27, 64) 0 separable_conv1d_4[0][0]
____________________________________________________________________________________________
separable_conv1d_1 (SeparableCo (None, 64, 32) 2272 max_pooling1d[0][0]
____________________________________________________________________________________________
separable_conv1d_3 (SeparableCo (None, 35, 32) 2400 max_pooling1d_1[0][0]
____________________________________________________________________________________________
separable_conv1d_5 (SeparableCo (None, 21, 32) 2528 max_pooling1d_2[0][0]
____________________________________________________________________________________________
global_max_pooling1d (GlobalMax (None, 32) 0 separable_conv1d_1[0][0]
____________________________________________________________________________________________
global_max_pooling1d_1 (GlobalM (None, 32) 0 separable_conv1d_3[0][0]
____________________________________________________________________________________________
global_max_pooling1d_2 (GlobalM (None, 32) 0 separable_conv1d_5[0][0]
____________________________________________________________________________________________
concatenate (Concatenate) (None, 96) 0 global_max_pooling1d[0][0]
global_max_pooling1d_1[0][0
global_max_pooling1d_2[0][0
____________________________________________________________________________________________
dense (Dense) (None, 1) 97 concatenate[0][0]
============================================================================================
Total params: 78,938
Trainable params: 78,938
Non-trainable params: 0
____________________________________________________________________________________________
"""
**In PyTorch:** users typically write their own training loop, and the coding style of that loop varies from person to person.
There are three typical styles: script-style, function-style, and class-style training loops.
Below, training a classification model on the MNIST dataset demonstrates these three styles.
For the class style we will use torchkeras.Model and torchkeras.LightModel (torchkeras wraps PyTorch's nn.Module after the fashion of tf.keras.Model; the torchkeras.Model class implements fit, validate, predict and summary methods, essentially a user-defined high-level API).
**In TensorFlow:** models are trained mainly with a custom training loop (the same idea as PyTorch's function style), the built-in train_on_batch method, or the built-in fit method.
Below, training a classification model on the Reuters newswire dataset demonstrates these three styles.
Building the datasets
PyTorch
import torch
from torch import nn
import torchvision
from torchvision import transforms
transform = transforms.Compose([transforms.ToTensor()])
# MNIST dataset
ds_train = torchvision.datasets.MNIST(root="./data/minist/",train=True,download=True,transform=transform)
ds_valid = torchvision.datasets.MNIST(root="./data/minist/",train=False,download=True,transform=transform)
dl_train = torch.utils.data.DataLoader(ds_train, batch_size=128, shuffle=True, num_workers=4)
dl_valid = torch.utils.data.DataLoader(ds_valid, batch_size=128, shuffle=False, num_workers=4)
print(len(ds_train))
print(len(ds_valid))
# 60000
# 10000
TensorFlow
from tensorflow.keras import datasets, preprocessing

MAX_LEN = 300
BATCH_SIZE = 32

# Reuters newswire dataset
(x_train,y_train),(x_test,y_test) = datasets.reuters.load_data()
x_train = preprocessing.sequence.pad_sequences(x_train,maxlen=MAX_LEN)
x_test = preprocessing.sequence.pad_sequences(x_test,maxlen=MAX_LEN)
MAX_WORDS = x_train.max()+1
CAT_NUM = y_train.max()+1
ds_train = tf.data.Dataset.from_tensor_slices((x_train,y_train)) \
.shuffle(buffer_size = 1000).batch(BATCH_SIZE) \
.prefetch(tf.data.experimental.AUTOTUNE).cache()
ds_test = tf.data.Dataset.from_tensor_slices((x_test,y_test)) \
.shuffle(buffer_size = 1000).batch(BATCH_SIZE) \
.prefetch(tf.data.experimental.AUTOTUNE).cache()
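To verify the input pipeline before training, you can peek at a single batch. A quick sketch:
for x, y in ds_train.take(1):
    print(x.shape, y.shape)
# (32, 300) (32,)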
Generally speaking, the script-style training loop is the most common. TensorFlow does not support this style: its training steps are functions decorated with @tf.function, which get traced and compiled, so they cannot be executed line by line like native Python script statements.
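The difference is easy to demonstrate: a @tf.function-decorated step is traced into a graph on its first call, and Python-level statements inside it do not re-execute on later calls the way script-style code would. A minimal sketch:
import tensorflow as tf

@tf.function
def step(x):
    print("tracing")  # Python side effect: runs only during tracing
    return x * 2

step(tf.constant(1))  # prints "tracing" once, then runs the traced graph
step(tf.constant(3))  # same input signature: no new trace, nothing printed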
PyTorch
# Define the model
net = nn.Sequential()
net.add_module("conv1",nn.Conv2d(in_channels=1,out_channels=32,kernel_size = 3))
net.add_module("pool1",nn.MaxPool2d(kernel_size = 2,stride = 2))
net.add_module("conv2",nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5))
net.add_module("pool2",nn.MaxPool2d(kernel_size = 2,stride = 2))
net.add_module("dropout",nn.Dropout2d(p = 0.1))
net.add_module("adaptive_pool",nn.AdaptiveMaxPool2d((1,1)))
net.add_module("flatten",nn.Flatten())
net.add_module("linear1",nn.Linear(64,32))
net.add_module("relu",nn.ReLU())
net.add_module("linear2",nn.Linear(32,10))
print(net)
Sequential(
(conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1))
(pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
(pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(dropout): Dropout2d(p=0.1, inplace=False)
(adaptive_pool): AdaptiveMaxPool2d(output_size=(1, 1))
(flatten): Flatten()
(linear1): Linear(in_features=64, out_features=32, bias=True)
(relu): ReLU()
(linear2): Linear(in_features=32, out_features=10, bias=True)
)
# Training script
import datetime
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score

def accuracy(y_pred,y_true):
    y_pred_cls = torch.argmax(nn.Softmax(dim=1)(y_pred),dim=1).data
    return accuracy_score(y_true,y_pred_cls)
loss_func = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(params=net.parameters(),lr = 0.01)
metric_func = accuracy
metric_name = "accuracy"
epochs = 3
log_step_freq = 100
dfhistory = pd.DataFrame(columns = ["epoch","loss",metric_name,"val_loss","val_"+metric_name])
print("Start Training...")
nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
print("=========="*8 + "%s"%nowtime)
for epoch in range(1,epochs+1):

    # 1. training loop -------------------------------------------------
    net.train()
    loss_sum = 0.0
    metric_sum = 0.0
    step = 1

    for step, (features,labels) in enumerate(dl_train, 1):

        # zero the gradients
        optimizer.zero_grad()

        # forward pass to compute the loss
        predictions = net(features)
        loss = loss_func(predictions,labels)
        metric = metric_func(predictions,labels)

        # backward pass to compute gradients
        loss.backward()
        optimizer.step()

        # print batch-level logs
        loss_sum += loss.item()
        metric_sum += metric.item()
        if step%log_step_freq == 0:
            print(("[step = %d] loss: %.3f, "+metric_name+": %.3f") %
                  (step, loss_sum/step, metric_sum/step))

    # 2. validation loop -------------------------------------------------
    net.eval()
    val_loss_sum = 0.0
    val_metric_sum = 0.0
    val_step = 1

    for val_step, (features,labels) in enumerate(dl_valid, 1):
        with torch.no_grad():
            predictions = net(features)
            val_loss = loss_func(predictions,labels)
            val_metric = metric_func(predictions,labels)
        val_loss_sum += val_loss.item()
        val_metric_sum += val_metric.item()

    # 3. record logs -------------------------------------------------
    info = (epoch, loss_sum/step, metric_sum/step,
            val_loss_sum/val_step, val_metric_sum/val_step)
    dfhistory.loc[epoch-1] = info

    # print epoch-level logs
    print(("\nEPOCH = %d, loss = %.3f,"+ metric_name + \
          " = %.3f, val_loss = %.3f, "+"val_"+ metric_name+" = %.3f")
          %info)
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("\n"+"=========="*8 + "%s"%nowtime)

print('Finished Training...')
Start Training...
================================================================================2020-06-26 12:49:16
[step = 100] loss: 0.742, accuracy: 0.745
[step = 200] loss: 0.466, accuracy: 0.843
[step = 300] loss: 0.363, accuracy: 0.880
[step = 400] loss: 0.310, accuracy: 0.898
EPOCH = 1, loss = 0.281,accuracy = 0.908, val_loss = 0.087, val_accuracy = 0.972
================================================================================2020-06-26 12:50:32
[step = 100] loss: 0.103, accuracy: 0.970
[step = 200] loss: 0.114, accuracy: 0.966
[step = 300] loss: 0.112, accuracy: 0.967
[step = 400] loss: 0.108, accuracy: 0.968
EPOCH = 2, loss = 0.111,accuracy = 0.967, val_loss = 0.082, val_accuracy = 0.976
...
In PyTorch: this style simply wraps the script-style loop in functions.
In TensorFlow: a custom training loop needs no compile step; the optimizer applies gradients from the loss function directly, which gives the greatest flexibility.
PyTorch
# Define the model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layers = nn.ModuleList([
            nn.Conv2d(in_channels=1,out_channels=32,kernel_size = 3),
            nn.MaxPool2d(kernel_size = 2,stride = 2),
            nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5),
            nn.MaxPool2d(kernel_size = 2,stride = 2),
            nn.Dropout2d(p = 0.1),
            nn.AdaptiveMaxPool2d((1,1)),
            nn.Flatten(),
            nn.Linear(64,32),
            nn.ReLU(),
            nn.Linear(32,10)]
        )

    def forward(self,x):
        for layer in self.layers:
            x = layer(x)
        return x

net = Net()
print(net)
Net(
(layers): ModuleList(
(0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1))
(1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
(3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(4): Dropout2d(p=0.1, inplace=False)
(5): AdaptiveMaxPool2d(output_size=(1, 1))
(6): Flatten()
(7): Linear(in_features=64, out_features=32, bias=True)
(8): ReLU()
(9): Linear(in_features=32, out_features=10, bias=True)
)
)
# Function-style training script
import datetime
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score

def accuracy(y_pred,y_true):
    y_pred_cls = torch.argmax(nn.Softmax(dim=1)(y_pred),dim=1).data
    return accuracy_score(y_true,y_pred_cls)

model = net
model.optimizer = torch.optim.SGD(model.parameters(),lr = 0.01)
model.loss_func = nn.CrossEntropyLoss()
model.metric_func = accuracy
model.metric_name = "accuracy"

def train_step(model,features,labels):

    # training mode: dropout layers are active
    model.train()

    # zero the gradients
    model.optimizer.zero_grad()

    # forward pass to compute the loss
    predictions = model(features)
    loss = model.loss_func(predictions,labels)
    metric = model.metric_func(predictions,labels)

    # backward pass to compute gradients
    loss.backward()
    model.optimizer.step()

    return loss.item(),metric.item()

@torch.no_grad()
def valid_step(model,features,labels):

    # evaluation mode: dropout layers are inactive
    model.eval()

    predictions = model(features)
    loss = model.loss_func(predictions,labels)
    metric = model.metric_func(predictions,labels)

    return loss.item(), metric.item()

# test the effect of train_step
features,labels = next(iter(dl_train))
train_step(model,features,labels)
# (2.32741117477417, 0.1015625)
# note: the initial loss is close to ln(10) ≈ 2.303, as expected for 10 untrained classes
# Start training
def train_model(model,epochs,dl_train,dl_valid,log_step_freq):

    metric_name = model.metric_name
    dfhistory = pd.DataFrame(columns = ["epoch","loss",metric_name,"val_loss","val_"+metric_name])
    print("Start Training...")
    nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print("=========="*8 + "%s"%nowtime)

    for epoch in range(1,epochs+1):

        # 1. training loop -------------------------------------------------
        loss_sum = 0.0
        metric_sum = 0.0
        step = 1

        for step, (features,labels) in enumerate(dl_train, 1):
            loss,metric = train_step(model,features,labels)

            # print batch-level logs
            loss_sum += loss
            metric_sum += metric
            if step%log_step_freq == 0:
                print(("[step = %d] loss: %.3f, "+metric_name+": %.3f") %
                      (step, loss_sum/step, metric_sum/step))

        # 2. validation loop -------------------------------------------------
        val_loss_sum = 0.0
        val_metric_sum = 0.0
        val_step = 1

        for val_step, (features,labels) in enumerate(dl_valid, 1):
            val_loss,val_metric = valid_step(model,features,labels)
            val_loss_sum += val_loss
            val_metric_sum += val_metric

        # 3. record logs -------------------------------------------------
        info = (epoch, loss_sum/step, metric_sum/step,
                val_loss_sum/val_step, val_metric_sum/val_step)
        dfhistory.loc[epoch-1] = info

        # print epoch-level logs
        print(("\nEPOCH = %d, loss = %.3f,"+ metric_name + \
              " = %.3f, val_loss = %.3f, "+"val_"+ metric_name+" = %.3f")
              %info)
        nowtime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        print("\n"+"=========="*8 + "%s"%nowtime)

    print('Finished Training...')
    return dfhistory

epochs = 3
dfhistory = train_model(model,epochs,dl_train,dl_valid,log_step_freq = 100)
Start Training...
================================================================================2020-06-26 13:10:00
[step = 100] loss: 2.298, accuracy: 0.137
[step = 200] loss: 2.288, accuracy: 0.145
[step = 300] loss: 2.278, accuracy: 0.165
[step = 400] loss: 2.265, accuracy: 0.183
EPOCH = 1, loss = 2.254,accuracy = 0.195, val_loss = 2.158, val_accuracy = 0.301
================================================================================2020-06-26 13:11:23
[step = 100] loss: 2.127, accuracy: 0.302
[step = 200] loss: 2.080, accuracy: 0.338
[step = 300] loss: 2.025, accuracy: 0.374
[step = 400] loss: 1.957, accuracy: 0.411
EPOCH = 2, loss = 1.905,accuracy = 0.435, val_loss = 1.469, val_accuracy = 0.710
TensorFlow
# Define the model
tf.keras.backend.clear_session()

def create_model():
    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
    model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM,activation = "softmax"))
    return(model)

model = create_model()
model.summary()
# Custom training loop
from tensorflow.keras import optimizers, losses, metrics

optimizer = optimizers.Nadam()
loss_func = losses.SparseCategoricalCrossentropy()

train_loss = metrics.Mean(name='train_loss')
train_metric = metrics.SparseCategoricalAccuracy(name='train_accuracy')
valid_loss = metrics.Mean(name='valid_loss')
valid_metric = metrics.SparseCategoricalAccuracy(name='valid_accuracy')

@tf.function
def train_step(model, features, labels):
    with tf.GradientTape() as tape:
        predictions = model(features,training = True)
        loss = loss_func(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_loss.update_state(loss)
    train_metric.update_state(labels, predictions)

@tf.function
def valid_step(model, features, labels):
    predictions = model(features)
    batch_loss = loss_func(labels, predictions)
    valid_loss.update_state(batch_loss)
    valid_metric.update_state(labels, predictions)

def train_model(model,ds_train,ds_valid,epochs):
    for epoch in tf.range(1,epochs+1):

        for features, labels in ds_train:
            train_step(model,features,labels)

        for features, labels in ds_valid:
            valid_step(model,features,labels)

        logs = 'Epoch={},Loss:{},Accuracy:{},Valid Loss:{},Valid Accuracy:{}'

        if epoch%1 ==0:
            printbar()  # printbar() is a timestamp-separator helper defined in the source book 《30天吃掉那只TensorFlow2》
            tf.print(tf.strings.format(logs,
            (epoch,train_loss.result(),train_metric.result(),valid_loss.result(),valid_metric.result())))
            tf.print("")

        train_loss.reset_states()
        valid_loss.reset_states()
        train_metric.reset_states()
        valid_metric.reset_states()

train_model(model,ds_train,ds_test,10)
================================================================================13:12:03
Epoch=1,Loss:2.02051544,Accuracy:0.460253835,Valid Loss:1.75700927,Valid Accuracy:0.536954582
================================================================================13:12:09
Epoch=2,Loss:1.510795,Accuracy:0.610665798,Valid Loss:1.55349839,Valid Accuracy:0.616206586
================================================================================13:12:17
Epoch=3,Loss:1.19221532,Accuracy:0.696170092,Valid Loss:1.52315605,Valid Accuracy:0.651380241
TensorFlow
train_on_batch is a TensorFlow-specific built-in API that wraps the usual training-step code for direct use; think of it as a further encapsulation of the function-style training loop. Compared with the fit method it is more flexible: you can control training at the batch level directly, without going through callbacks.
# Create the model
tf.keras.backend.clear_session()

def create_model():
    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
    model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM,activation = "softmax"))
    return(model)

def compile_model(model):
    model.compile(optimizer=optimizers.Nadam(),
                  loss=losses.SparseCategoricalCrossentropy(),
                  metrics=[metrics.SparseCategoricalAccuracy(),metrics.SparseTopKCategoricalAccuracy(5)])
    return(model)

model = create_model()
model.summary()
model = compile_model(model)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 300, 7) 216874
_________________________________________________________________
conv1d (Conv1D) (None, 296, 64) 2304
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64) 0
_________________________________________________________________
conv1d_1 (Conv1D) (None, 146, 32) 6176
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 2336) 0
_________________________________________________________________
dense (Dense) (None, 46) 107502
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________
# Training loop
def train_model(model,ds_train,ds_valid,epochs):
    for epoch in tf.range(1,epochs+1):
        model.reset_metrics()

        # lower the learning rate in later epochs
        if epoch == 5:
            model.optimizer.lr.assign(model.optimizer.lr/2.0)
            tf.print("Lowering optimizer Learning Rate...\n\n")

        for x, y in ds_train:
            train_result = model.train_on_batch(x, y)

        for x, y in ds_valid:
            valid_result = model.test_on_batch(x, y,reset_metrics=False)

        if epoch%1 ==0:
            printbar()  # timestamp-separator helper from the source book
            tf.print("epoch = ",epoch)
            print("train:",dict(zip(model.metrics_names,train_result)))
            print("valid:",dict(zip(model.metrics_names,valid_result)))
            print("")

train_model(model,ds_train,ds_test,10)
================================================================================13:09:19
epoch = 1
train: {'loss': 0.82411176, 'sparse_categorical_accuracy': 0.77272725, 'sparse_top_k_categorical_accuracy': 0.8636364}
valid: {'loss': 1.9265995, 'sparse_categorical_accuracy': 0.5743544, 'sparse_top_k_categorical_accuracy': 0.75779164}
================================================================================13:09:27
epoch = 2
train: {'loss': 0.6006621, 'sparse_categorical_accuracy': 0.90909094, 'sparse_top_k_categorical_accuracy': 0.95454544}
valid: {'loss': 1.844159, 'sparse_categorical_accuracy': 0.6126447, 'sparse_top_k_categorical_accuracy': 0.7920748}
**In TensorFlow:** the built-in fit method trains the model. It is very powerful, supporting training on numpy arrays, tf.data.Dataset objects, and Python generators,
and callbacks can be set to implement complex control logic over the training process.
**In PyTorch:** there is no such high-level API and hence no built-in fit method, but for class-style training loops we can use the torchkeras.Model and torchkeras.LightModel classes (torchkeras wraps PyTorch's nn.Module after the fashion of tf.keras.Model; the torchkeras.Model class implements fit, validate, predict and summary methods, essentially a user-defined high-level API).
PyTorch
Class style: torchkeras.Model
Here we build the model with torchkeras.Model, then train it by calling its compile and fit methods.
Training a model this way is very concise and clear.
import torchkeras

# Create the model
class CnnModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Conv2d(in_channels=1,out_channels=32,kernel_size = 3),
            nn.MaxPool2d(kernel_size = 2,stride = 2),
            nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5),
            nn.MaxPool2d(kernel_size = 2,stride = 2),
            nn.Dropout2d(p = 0.1),
            nn.AdaptiveMaxPool2d((1,1)),
            nn.Flatten(),
            nn.Linear(64,32),
            nn.ReLU(),
            nn.Linear(32,10)]
        )

    def forward(self,x):
        for layer in self.layers:
            x = layer(x)
        return x

model = torchkeras.Model(CnnModel())
print(model)
# CnnModel(
# (layers): ModuleList(
# (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1))
# (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
# (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (4): Dropout2d(p=0.1, inplace=False)
# (5): AdaptiveMaxPool2d(output_size=(1, 1))
# (6): Flatten()
# (7): Linear(in_features=64, out_features=32, bias=True)
# (8): ReLU()
# (9): Linear(in_features=32, out_features=10, bias=True)
# )
# )
from sklearn.metrics import accuracy_score

def accuracy(y_pred,y_true):
    y_pred_cls = torch.argmax(nn.Softmax(dim=1)(y_pred),dim=1).data
    return accuracy_score(y_true.numpy(),y_pred_cls.numpy())

# Train the model by calling the compile and fit methods
model.compile(loss_func = nn.CrossEntropyLoss(),
              optimizer= torch.optim.Adam(model.parameters(),lr = 0.02),
              metrics_dict={"accuracy":accuracy})

dfhistory = model.fit(3,dl_train = dl_train, dl_val=dl_valid, log_step_freq=100)
Start Training ...
================================================================================2020-06-26 13:22:39
{'step': 100, 'loss': 0.976, 'accuracy': 0.664}
{'step': 200, 'loss': 0.611, 'accuracy': 0.795}
{'step': 300, 'loss': 0.478, 'accuracy': 0.841}
{'step': 400, 'loss': 0.403, 'accuracy': 0.868}
+-------+-------+----------+----------+--------------+
| epoch | loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
| 1 | 0.371 | 0.879 | 0.087 | 0.972 |
+-------+-------+----------+----------+--------------+
================================================================================2020-06-26 13:23:59
{'step': 100, 'loss': 0.182, 'accuracy': 0.948}
{'step': 200, 'loss': 0.176, 'accuracy': 0.949}
{'step': 300, 'loss': 0.173, 'accuracy': 0.95}
{'step': 400, 'loss': 0.174, 'accuracy': 0.951}
+-------+-------+----------+----------+--------------+
| epoch | loss | accuracy | val_loss | val_accuracy |
+-------+-------+----------+----------+--------------+
| 2 | 0.175 | 0.951 | 0.152 | 0.958 |
+-------+-------+----------+----------+--------------+
Class style: torchkeras.LightModel
Below is a usage example of torchkeras.LightModel; for detailed usage, refer to the torchkeras documentation.
import torchkeras
import pytorch_lightning as pl

# Define the model
class CnnNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Conv2d(in_channels=1,out_channels=32,kernel_size = 3),
            nn.MaxPool2d(kernel_size = 2,stride = 2),
            nn.Conv2d(in_channels=32,out_channels=64,kernel_size = 5),
            nn.MaxPool2d(kernel_size = 2,stride = 2),
            nn.Dropout2d(p = 0.1),
            nn.AdaptiveMaxPool2d((1,1)),
            nn.Flatten(),
            nn.Linear(64,32),
            nn.ReLU(),
            nn.Linear(32,10)]
        )

    def forward(self,x):
        for layer in self.layers:
            x = layer(x)
        return x

class Model(torchkeras.LightModel):

    # loss, and optional metrics
    def shared_step(self,batch)->dict:
        x, y = batch
        prediction = self(x)
        loss = nn.CrossEntropyLoss()(prediction,y)
        preds = torch.argmax(nn.Softmax(dim=1)(prediction),dim=1).data
        acc = pl.metrics.functional.accuracy(preds, y)
        dic = {"loss":loss,"acc":acc}
        return dic

    # optimizer, and optional lr_scheduler
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-2)
        lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.0001)
        return {"optimizer":optimizer,"lr_scheduler":lr_scheduler}

pl.seed_everything(1234)
net = CnnNet()
model = Model(net)

ckpt_cb = pl.callbacks.ModelCheckpoint(monitor='val_loss')

# set gpus=0 will use cpu,
# set gpus=1 will use 1 gpu
# set gpus=2 will use 2 gpus
# set gpus=-1 will use all gpus
# you can also set gpus=[0,1] to use the given gpus
# you can even set tpu_cores=2 to use two tpus
trainer = pl.Trainer(max_epochs=10,gpus = 0, callbacks=[ckpt_cb])

# train the model with the fit method
trainer.fit(model,dl_train,dl_valid)
================================================================================2021-01-16 22:32:45
epoch = 0
{'val_loss': 0.0954340249300003, 'val_acc': 0.9727057218551636}
{'acc': 0.910403311252594, 'loss': 0.27809813618659973}
================================================================================2021-01-16 22:34:12
epoch = 1
{'val_loss': 0.06748798489570618, 'val_acc': 0.9809137582778931}
{'acc': 0.9663013219833374, 'loss': 0.10915637016296387}
TensorFlow
TensorFlow's high-level API provides built-in compile and fit methods for training, and callbacks can be set to implement complex control logic over the training process.
tf.keras.backend.clear_session()

# Define the model
def create_model():
    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
    model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM,activation = "softmax"))
    return(model)

# Compile the model
def compile_model(model):
    model.compile(optimizer=optimizers.Nadam(),
                  loss=losses.SparseCategoricalCrossentropy(),
                  metrics=[metrics.SparseCategoricalAccuracy(),metrics.SparseTopKCategoricalAccuracy(5)])
    return(model)

model = create_model()
model.summary()
model = compile_model(model)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 300, 7) 216874
_________________________________________________________________
conv1d (Conv1D) (None, 296, 64) 2304
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64) 0
_________________________________________________________________
conv1d_1 (Conv1D) (None, 146, 32) 6176
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 2336) 0
_________________________________________________________________
dense (Dense) (None, 46) 107502
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________
# Train the model with the fit method
history = model.fit(ds_train,validation_data = ds_test,epochs = 10)
Train for 281 steps, validate for 71 steps
Epoch 1/10
281/281 [==============================] - 11s 37ms/step - loss: 2.0231 - sparse_categorical_accuracy: 0.4636 - sparse_top_k_categorical_accuracy: 0.7450 - val_loss: 1.7346 - val_sparse_categorical_accuracy: 0.5534 - val_sparse_top_k_categorical_accuracy: 0.7560
Epoch 2/10
281/281 [==============================] - 9s 31ms/step - loss: 1.5079 - sparse_categorical_accuracy: 0.6091 - sparse_top_k_categorical_accuracy: 0.7901 - val_loss: 1.5475 - val_sparse_categorical_accuracy: 0.6109 - val_sparse_top_k_categorical_accuracy: 0.7792
Epoch 3/10
281/281 [==============================] - 9s 33ms/step - loss: 1.2204 - sparse_categorical_accuracy: 0.6823 - sparse_top_k_categorical_accuracy: 0.8448 - val_loss: 1.5455 - val_sparse_categorical_accuracy: 0.6367 - val_sparse_top_k_categorical_accuracy: 0.8001
Epoch 4/10
Many of the code examples in these notes come from:
《20天吃掉那只Pytorch》
《30天吃掉那只TensorFlow2》
Interested readers are encouraged to study those projects.
===========================================================================
Part of my notes reorganizes and condenses the content of those two projects; the rest organizes material I collected myself on the corresponding features.
The notes are stored in MD format in my git repository, along with the dataset files needed by the code examples; feel free to clone it for study.
《pytorch-tensorflow对比学习笔记》
GitHub project: https://github.com/Boris-2021/pytorch-tensorflow-Comparative-study
===========================================================================
The notes also include plenty of light-hearted images to make them more fun to read.
===========================================================================