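Video enhancement and super-resolution are core computer vision tasks, and they matter a great deal for restoring the quality and sharpness of early film footage. In this task we are given a set of videos (paired low-resolution and high-resolution) and train a model that predicts the high-resolution video from the low-resolution one.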
import cv2
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Conv2DTranspose
from tensorflow.keras.layers import InputLayer
from tensorflow.keras.models import Sequential
path = "./h_GT/Youku_00000_h_GT/001.bmp"
img_GT = cv2.imread(path)/255.0
img_GT.shape
path = "./l/Youku_00000_l/001.bmp"
img_l = cv2.imread(path)/255.0
img_l.shape
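Both models in this post hard-code a 270×480 input and a 4× upscale, so it can help to check the loaded frames against that before training. The following is a minimal sketch, assuming frames are H×W×3 arrays; `check_scale` is a hypothetical helper and not part of the original code.

```python
def check_scale(lr_shape, hr_shape, scale=4):
    """Return True if hr_shape is exactly `scale` times lr_shape in height and width."""
    lr_h, lr_w, lr_c = lr_shape
    hr_h, hr_w, hr_c = hr_shape
    return hr_h == lr_h * scale and hr_w == lr_w * scale and hr_c == lr_c


# Shapes assumed from the model definitions in this post (270x480 input, 4x upscale).
ok = check_scale((270, 480, 3), (1080, 1920, 3))
print(ok)
```

With the frames loaded above, the call would be `check_scale(img_l.shape, img_GT.shape)`.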
def fsrcnn():
    model = Sequential()
    model.add(InputLayer(input_shape=(270, 480, 3)))
    # first_part: feature extraction on the low-resolution input
    model.add(Conv2D(56, 5, padding='same', activation='relu'))
    # mid_part: 1x1 shrinking layer followed by four 3x3 mapping layers
    model.add(Conv2D(12, 1, padding='same', activation='relu'))
    for _ in range(4):
        model.add(Conv2D(12, 3, padding='same', activation='relu'))
    # last_part: transposed convolution (deconvolution) that upsamples by 4x
    model.add(Conv2DTranspose(3, 9, strides=4, padding='same'))
    # Adam with an initial learning rate of 1e-1 (the Keras default is 1e-3)
    model.compile(optimizer=tf.optimizers.Adam(1e-1), loss=tf.losses.mse, metrics=['mse'])
    return model
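FSRCNN (Fast Super-Resolution Convolutional Neural Network) addresses the slow inference of SRCNN. Instead of upsampling the input by interpolation as a preprocessing step, FSRCNN upsamples with a deconvolution (transposed convolution) at the end of the network, so the model can be trained end to end directly on the low-resolution input. To improve accuracy it uses a bottleneck-style structure similar to the ResNet bottleneck, and it replaces large convolution kernels with smaller kernels and more layers. As a result, producing a different output resolution only requires retraining the deconvolution layer used for upsampling while the other convolution layers stay unchanged, which greatly speeds up training; inference can even run in real time.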
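To make the point about retraining only the deconvolution layer concrete, here is a rough parameter count for the layer sizes used in the `fsrcnn()` function in this post. It is a sketch only: `conv_params` is a hypothetical helper, and it assumes square kernels with one bias per output channel.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a 2-D (transposed) convolution with a square kernel."""
    return kernel * kernel * in_ch * out_ch + out_ch


# Layer sizes copied from fsrcnn(): (description, parameter count)
layers = [
    ("feature extraction, 5x5, 3 -> 56", conv_params(5, 3, 56)),
    ("shrinking, 1x1, 56 -> 12", conv_params(1, 56, 12)),
    ("mapping, 4 layers of 3x3, 12 -> 12", 4 * conv_params(3, 12, 12)),
    ("deconvolution, 9x9, 12 -> 3", conv_params(9, 12, 3)),
]

total = sum(count for _, count in layers)
for name, count in layers:
    print(f"{name}: {count}")
print(f"total: {total}")
```

Under these assumptions the deconvolution holds 2,919 of the 13,091 parameters, so switching the upscaling factor touches only a small part of the network.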
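# Build the model
model = fsrcnn()
# Monitoring: reduce the learning rate automatically when val_loss plateaus
plateau = keras.callbacks.ReduceLROnPlateau(monitor='val_loss', verbose=0, mode='min', factor=0.10, patience=6)
# Stop training once val_loss has stopped improving
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', verbose=0, mode='min', patience=25)
# Save the model at its best point (lowest val_loss)
checkpoint = keras.callbacks.ModelCheckpoint('fsrcnn.h5', monitor='val_loss', verbose=0, mode='min', save_best_only=True)
# Training data (the same frame twice, for demonstration)
x = np.array([img_l, img_l])
y = np.array([img_GT, img_GT])
# Train the model (the training data doubles as validation data here)
model.fit(x, y, epochs=10, batch_size=2, verbose=1, shuffle=True, validation_data=(x, y), callbacks=[plateau, early_stopping, checkpoint])
model.evaluate(x, y, verbose=0)
pic_super = model.predict(x, verbose=0, batch_size=1)
# Inputs were scaled to [0, 1], so rescale to [0, 255] and cast to uint8 before saving
cv2.imwrite("./fsrcnn_00.bmp", np.clip(pic_super[0] * 255.0, 0, 255).astype(np.uint8))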
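The training code reports MSE; super-resolution results are also commonly reported as PSNR, which is a direct function of MSE. The following is a minimal NumPy sketch, assuming pixel values in [0, 1] as in the code above; `psnr` is a hypothetical helper and the arrays are toy data, not results from the model.

```python
import numpy as np


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images with values in [0, max_val]."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)


# Toy data: a constant error of 0.1 gives an MSE of 0.01.
target = np.zeros((4, 4, 3))
pred = target + 0.1
score = psnr(pred, target)
print(round(score, 2))
```

With the variables from the training code, `psnr(pic_super[0], img_GT)` would score the first predicted frame against its ground truth.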
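ESPCN (Efficient Sub-Pixel Convolutional Neural Network) builds on the same idea as FSRCNN: it works in low-resolution space and upsamples only at the very end of the model, here with a sub-pixel convolution. This preserves more texture detail in low-resolution space and also makes real-time video super-resolution possible.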
def espcn():
    inputs = keras.layers.Input(shape=(270, 480, 3))
    cnn = keras.layers.Conv2D(64, 5, padding='same', activation='relu')(inputs)
    cnn = keras.layers.Conv2D(32, 3, padding='same', activation='relu')(cnn)
    # Emit 3 * 4**2 channels: one 4x4 block of 3-channel sub-pixels per low-resolution pixel
    cnn = keras.layers.Conv2D(3 * 4 ** 2, 3, padding='same')(cnn)
    # Sub-pixel rearrangement: move the two 4x factors from the channel axis into height and width
    cnn = tf.reshape(cnn, [-1, 270, 480, 4, 4, 3])
    cnn = tf.transpose(cnn, perm=[0, 1, 3, 2, 4, 5])
    outputs = tf.reshape(cnn, [-1, 270 * 4, 480 * 4, 3])
    model = keras.models.Model(inputs=[inputs], outputs=[outputs])
    # Adam with an initial learning rate of 1e-1 (the Keras default is 1e-3)
    model.compile(optimizer=tf.optimizers.Adam(1e-1), loss=tf.losses.mse, metrics=['mse'])
    return model
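The reshape, transpose, reshape sequence in `espcn()` is the sub-pixel rearrangement itself. The NumPy sketch below repeats the same three steps on a tiny array so the index mapping can be inspected by hand. `pixel_shuffle` is a hypothetical name, and the channel ordering is assumed to match the reshape used in this post.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange (B, H, W, r*r*C) into (B, H*r, W*r, C), mirroring the steps in espcn()."""
    b, h, w, ch = x.shape
    c = ch // (r * r)
    x = x.reshape(b, h, w, r, r, c)      # split channels into (row offset, col offset, colour)
    x = x.transpose(0, 1, 3, 2, 4, 5)    # (B, H, r, W, r, C)
    return x.reshape(b, h * r, w * r, c)


r, c = 4, 3
x = np.arange(1 * 2 * 3 * r * r * c).reshape(1, 2, 3, r * r * c)
y = pixel_shuffle(x, r)
print(y.shape)
```

Under this ordering, output pixel `(h*r + i, w*r + j)` of colour channel `k` comes from input pixel `(h, w)` at channel `(i*r + j)*C + k`, so each low-resolution pixel expands into its own 4×4 block.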
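# Build the model
model = espcn()
# Monitoring: reduce the learning rate automatically when val_loss plateaus
plateau = keras.callbacks.ReduceLROnPlateau(monitor='val_loss', verbose=0, mode='min', factor=0.10, patience=6)
# Stop training once val_loss has stopped improving
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', verbose=0, mode='min', patience=25)
# Save the model at its best point (lowest val_loss)
checkpoint = keras.callbacks.ModelCheckpoint('espcn.h5', monitor='val_loss', verbose=0, mode='min', save_best_only=True)
# Training data (the same frame twice, for demonstration)
x = np.array([img_l, img_l])
y = np.array([img_GT, img_GT])
# Train the model (the training data doubles as validation data here)
model.fit(x, y, epochs=10, batch_size=2, verbose=1, shuffle=True, validation_data=(x, y), callbacks=[plateau, early_stopping, checkpoint])
model.evaluate(x, y, verbose=0)
pic_super = model.predict(x, verbose=0, batch_size=1)
# Inputs were scaled to [0, 1], so rescale to [0, 255] and cast to uint8 before saving
cv2.imwrite("./espcn_00.bmp", np.clip(pic_super[0] * 255.0, 0, 255).astype(np.uint8))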
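All of the content and code above comes from the book 《阿里云天池大赛赛题解析(深度学习篇)》 (Alibaba Cloud Tianchi competition problem analysis, deep learning volume). It is an excellent book, and I highly recommend reading the original!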