4. Building the Model


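The dataset and code have been uploaded to GitHub; feel free to download and use them.

GitHub: https://github.com/JasonZhang156/Sound-Recognition-Tutorial

If this tutorial helps you, please consider giving the repository a star Q^Q.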


Building the Model

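This section uses Keras to build a simple CNN model. The model has 3 convolutional layers, 3 max-pooling layers, a global average pooling layer, and 2 fully connected layers. The intermediate layers use the ReLU activation and the last layer uses softmax. Each convolutional layer is followed by Batch Normalization to speed up training, and dropout with a rate of 0.5 is applied after the first fully connected layer. The optimizer is SGD and the loss function is cross entropy. The figure below shows the detailed configuration.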
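To make the softmax output and the cross-entropy loss concrete, here is a minimal pure-Python sketch. It is only an illustration of the two formulas, not the Keras implementation; the function names and the three example scores are made up for this example.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot_target, probs, eps=1e-7):
    """Cross entropy between a one-hot label and predicted probabilities."""
    # Clamp probabilities at eps so log(0) never occurs.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot_target, probs))


logits = [2.0, 1.0, 0.1]   # hypothetical scores for 3 classes
probs = softmax(logits)
loss = categorical_cross_entropy([1, 0, 0], probs)  # true class is index 0
print(probs, loss)
```

With a one-hot label the loss reduces to the negative log of the probability assigned to the true class, so it shrinks toward 0 as that probability approaches 1.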

[Figure 1: detailed model configuration]
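The layer arithmetic behind this configuration can be checked by hand. The sketch below traces output shapes and parameter counts in plain Python, assuming the settings used in this section's code: 3x3 convolutions with 'same' padding, 2x2 max pooling with stride 2 and no padding, and 64/128/256 filters. The helper name `trace_cnn` is hypothetical and is not part of the repository.

```python
def trace_cnn(input_shape=(60, 65, 1), nclass=10,
              filters=(64, 128, 256), dense_units=256):
    """Return (block name, output shape, parameter count) for each block."""
    h, w, c = input_shape
    rows = []
    for i, f in enumerate(filters, start=1):
        conv_params = 3 * 3 * c * f + f  # 3x3 kernel weights plus one bias per filter
        bn_params = 4 * f                # gamma, beta, moving mean, moving variance
        # 'same' convolution keeps h and w; 2x2 pooling with stride 2 halves them (floor)
        h, w, c = h // 2, w // 2, f
        rows.append((f"Conv{i}+BN+Pool", (h, w, c), conv_params + bn_params))
    rows.append(("GlobalAvgPool", (c,), 0))  # averages each channel, no weights
    rows.append(("Dense_1", (dense_units,), c * dense_units + dense_units))
    rows.append(("Dense_2", (nclass,), dense_units * nclass + nclass))
    return rows


rows = trace_cnn()
for name, shape, params in rows:
    print(f"{name:16s} {str(shape):16s} {params:>8d}")
total_params = sum(params for _, _, params in rows)
print("total parameters:", total_params)
```

For the default 60x65x1 input, the three blocks produce feature maps of 30x32x64, 15x16x128, and 7x8x256. Global average pooling then collapses the last map into a 256-dimensional vector, which is why the first fully connected layer stays small.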

The Keras implementation is as follows:

# -*- coding: utf-8 -*-
"""
@author: Jason Zhang
@github: https://github.com/JasonZhang156/Sound-Recognition-Tutorial
"""

from keras.layers import Input
from keras.layers import Conv2D, MaxPool2D, Dense, Dropout, BatchNormalization, Activation, GlobalAvgPool2D
from keras.models import Model
from keras import optimizers
from keras.utils import plot_model


def CNN(input_shape=(60,65,1), nclass=10):
    """
    build a simple cnn model using keras with TensorFlow backend.
    :param input_shape: input shape of network, default as (60,65,1)
    :param nclass: numbers of class(output shape of network), default as 10
    :return: cnn model
    """
    input_ = Input(shape=input_shape)

    # Conv1
    x = Conv2D(64, kernel_size=(3, 3), strides=(1, 1), padding='same')(input_)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(x)

    # Conv2
    x = Conv2D(128, kernel_size=(3, 3), strides=(1, 1), padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(x)

    # Conv3
    x = Conv2D(256, kernel_size=(3, 3), strides=(1, 1), padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(x)

    # GAP
    x = GlobalAvgPool2D()(x)
    # Dense_1
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.5)(x)
    # Dense_2
    output_ = Dense(nclass, activation='softmax')(x)

    model = Model(inputs=input_, outputs=output_)
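    # Print a summary of the layers and their parameter counts
    model.summary()
    # Configure training: SGD optimizer with Nesterov momentum.
    # Keras releases before 2.3 name the learning-rate argument `lr`.
    sgd = optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])  # cross-entropy loss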

    return model

if __name__ == '__main__':
    model = CNN()
    # Save a diagram of the model (needs pydot and Graphviz; ./image must already exist)
    plot_model(model, './image/cnn.png')

The code above is in the models.py file of the GitHub repository.
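The compile step uses SGD with a learning rate of 0.01, momentum 0.9, and Nesterov momentum enabled. As a rough sketch of what one such update does, the plain-Python function below follows the update rule described in the Keras SGD documentation and applies it to a toy one-dimensional loss f(w) = w^2. The function name and the toy problem are assumptions made for illustration only.

```python
def nesterov_sgd_step(w, v, grad, lr=0.01, momentum=0.9):
    """One SGD update with Nesterov momentum for a single scalar weight."""
    v = momentum * v - lr * grad      # update the velocity
    w = w + momentum * v - lr * grad  # look-ahead step using the new velocity
    return w, v


# Toy problem: minimize f(w) = w**2, whose gradient is 2*w.
w, v = 5.0, 0.0
for _ in range(500):
    grad = 2.0 * w
    w, v = nesterov_sgd_step(w, v, grad)
print(w)  # very close to the minimum at 0
```

In the real model the same update is applied to every weight tensor, with the gradient coming from backpropagation of the cross-entropy loss rather than from a closed-form expression.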
