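Dogs vs. Cats: Keras code that fuses three models and pushes accuracy to 99%

This uses Keras's ResNet50, InceptionV3 and Xception models. First load each model's pretrained weights, then use the pretrained network to generate feature vectors for the cat and dog training and test images.
Pretrained model download link: http://pan.baidu.com/s/1geHmOpH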

from keras.models import *
from keras.layers import *
from keras.applications import *
from keras.preprocessing.image import *

import h5py

def write_gap(MODEL, image_size, lambda_func=None):
    width = image_size[0]
    height = image_size[1]
    input_tensor = Input((height, width, 3))
    x = input_tensor
    if lambda_func:
        x = Lambda(lambda_func)(x)

    base_model = MODEL(input_tensor=x, weights='imagenet', include_top=False)
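    # This step downloads the ImageNet weights from the Keras site. If your
    # connection is slow or unavailable, download them from the link above and
    # put them in C:\Users\<username>\.keras\models
    # (those files may not be the latest, in which case this step can still fail),
    # or download them directly from the GitHub link shown in the error message.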
    model = Model(base_model.input, GlobalAveragePooling2D()(base_model.output))

    gen = ImageDataGenerator()
    train_generator = gen.flow_from_directory("train", image_size, shuffle=False,
                                              batch_size=1)
    test_generator = gen.flow_from_directory("test", image_size, shuffle=False,
                                             batch_size=1, class_mode=None)

    train = model.predict_generator(train_generator, train_generator.samples)
    test = model.predict_generator(test_generator, test_generator.samples)
    # Open in write mode explicitly; recent h5py versions default to read-only
    with h5py.File("gap_%s.h5" % MODEL.__name__, "w") as h:
        h.create_dataset("train", data=train)
        h.create_dataset("test", data=test)
        h.create_dataset("label", data=train_generator.classes)

Next, load the three models one by one, extract the features for the training and test sets with each, and save them to h5 files.

write_gap(ResNet50, (224, 224))
write_gap(InceptionV3, (299, 299), inception_v3.preprocess_input)
write_gap(Xception, (299, 299), xception.preprocess_input)
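
The `GlobalAveragePooling2D` layer in `write_gap` is what turns each model's final convolutional feature map into a flat vector. As a minimal NumPy sketch (the array sizes below are illustrative assumptions, not values read from the models), it averages over the two spatial axes and keeps the channel axis:

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average a (batch, height, width, channels) array over height and width."""
    return feature_maps.mean(axis=(1, 2))

# Illustrative shapes only: 4 images, a 7x7 spatial grid, 2048 channels.
rng = np.random.default_rng(0)
feature_maps = rng.random((4, 7, 7, 2048))
pooled = global_average_pool(feature_maps)
print(pooled.shape)  # (4, 2048)
```

Each row of `pooled` plays the role of one image's feature vector in the h5 files.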

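Each image's feature vector is 2048-dimensional, so every feature file stores:
train: (25000, 2048)
label: (25000,)
test: (12500, 2048)
Next, merge the outputs of the three models, so that each image has 2048*3 feature values.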

import h5py
import numpy as np
from sklearn.utils import shuffle
from keras.layers import Dense,Input,Dropout
from keras.models import Model
import get_csv  # apparently a local helper module; not used in the code shown here
np.random.seed(2017)

X_train = []
X_test = []
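for filename in ["gap_ResNet50.h5", "gap_Xception.h5", "gap_InceptionV3.h5"]:
    with h5py.File(filename, 'r') as h:
        X_train.append(np.array(h['train']))
        X_test.append(np.array(h['test']))
        y_train = np.array(h['label'])
X_train = np.concatenate(X_train, axis=1)
X_test = np.concatenate(X_test, axis=1)

# The features are stored class by class, and validation_split takes the last
# samples, so shuffle first to get both classes into the validation set.
X_train, y_train = shuffle(X_train, y_train)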
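
The effect of concatenating along `axis=1` can be checked on small stand-in arrays. This is only a sketch: the row count and fill values are made up so that it runs instantly and the three blocks are easy to tell apart.

```python
import numpy as np

n_images = 5  # stand-in for 25000, kept small so the sketch runs quickly
# One (n_images, 2048) array per model, filled with 1.0, 2.0 and 3.0.
features = [np.full((n_images, 2048), fill) for fill in (1.0, 2.0, 3.0)]
merged = np.concatenate(features, axis=1)
print(merged.shape)  # (5, 6144)
```

Columns 0 to 2047 come from the first array, 2048 to 4095 from the second, and 4096 to 6143 from the third, so the row order of the images is unchanged.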

Then we build a single fully connected layer on top of these features.

inputs = Input(X_train.shape[1:])#shape=(2048*3,)
x = Dropout(0.5)(inputs)
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs, x)
model.compile(optimizer='adadelta',
              loss='binary_crossentropy',
              metrics=['accuracy'])
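
A `Dropout` layer followed by `Dense(1, activation='sigmoid')` amounts to logistic regression on the 6144 features, with dropout active only during training. A rough NumPy sketch of the forward pass, using hypothetical random weights rather than trained ones:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w, b, drop_rate=0.5, training=False, rng=None):
    """Dropout followed by a single sigmoid unit."""
    if training:
        if rng is None:
            rng = np.random.default_rng()
        keep = rng.random(x.shape) >= drop_rate
        # Inverted dropout: rescale kept inputs so the expected value is unchanged.
        x = x * keep / (1.0 - drop_rate)
    return sigmoid(x @ w + b)

rng = np.random.default_rng(2017)
x = rng.random((3, 6144))            # 3 stand-in feature vectors
w = rng.normal(0, 0.01, (6144, 1))   # hypothetical weights, not trained values
b = 0.0
probs = forward(x, w, b)             # inference: dropout disabled
print(probs.shape)  # (3, 1)
```

With only 6144 weights and one bias to fit, it is plausible that training on precomputed features finishes in seconds.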

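Start training. You will find that accuracy already reaches 99% after the first epoch, and the whole run takes less than half a minute. Once training is done, use the trained weights to predict the test set directly.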

model.fit(X_train, y_train, batch_size=128, epochs=8, validation_split=0.2, verbose=2)

y_pred = model.predict(X_test, verbose=1)
y_pred = y_pred.clip(min=0.005, max=0.995)
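
The `clip` call is there because this competition is scored with log loss, which punishes confident wrong answers very heavily. A small pure-Python sketch of the effect, where the probabilities are made-up examples:

```python
import math

def log_loss_single(y_true, p):
    """Binary cross-entropy for one prediction p of the positive class."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def clip(p, low=0.005, high=0.995):
    return min(max(p, low), high)

# A confidently wrong prediction: true label 1, predicted probability near 0.
raw = 1e-7
print(log_loss_single(1, raw))           # about 16.1
print(log_loss_single(1, clip(raw)))     # about 5.3
# A confidently right prediction pays only a small price for clipping.
print(log_loss_single(1, clip(0.9999)))  # about 0.005
```

Clipping caps the penalty for the few images the model gets badly wrong, at the cost of a small fixed loss on the many it gets right.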

Write the test-set predictions to a csv file.

import pandas as pd
from keras.preprocessing.image import *

# Raw strings keep the backslashes in Windows paths from being read as escapes
df = pd.read_csv(r"D:\C_V_D\data\Keras_sever\sample_submission.csv")

gen = ImageDataGenerator()
test_generator = gen.flow_from_directory(r"D:\C_V_D\data\Keras_test", (224, 224), shuffle=False,
                                         batch_size=1, class_mode=None)

for i, fname in enumerate(test_generator.filenames):
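    # Take the numeric id from the file name, accepting either path separator
    index = int(fname.replace('\\', '/').rsplit('/', 1)[-1].rsplit('.', 1)[0])
    # DataFrame.set_value was removed from pandas; .at is the replacement
    df.at[index-1, 'label'] = y_pred[i, 0]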

df.to_csv('pred.csv', index=None)
df.head(10)
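
The loop looks each row up by image id because the generator lists files in its own order (typically sorted as strings, so `10.jpg` comes before `2.jpg`), which does not match the numeric order of the submission file. A standalone sketch of the id extraction, where the `test` folder name and the helper name `image_id` are hypothetical:

```python
import os

def image_id(fname):
    """Extract the numeric id from a path such as 'test/123.jpg' or 'test\\123.jpg'."""
    base = os.path.basename(fname.replace("\\", "/"))
    return int(os.path.splitext(base)[0])

print(image_id("test\\123.jpg"))  # 123
print(image_id("test/7.jpg"))     # 7
```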

Submitted to Kaggle, the loss is only 0.038.
The training and test sets can be downloaded here: https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition
