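While doing transfer learning with the VGG16 model in Keras, I found that the model places some requirements on its input: images must be at least 48*48, and they must have 3 channels. My data, however, had already been preprocessed (following a tutorial) into an array of shape (16500, 28, 28), so it could not be fed in directly. Fortunately, I came across blog posts showing that OpenCV can do the conversion.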
Reference link 1, reference link 2
train_images = [cv2.cvtColor(cv2.resize(i, (48, 48)), cv2.COLOR_GRAY2BGR) for i in train_images]
train_images = np.concatenate([arr[np.newaxis] for arr in train_images]).astype('float32')
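However, this code kept raising an error for me, and the message did not make the cause clear. I then ran the tutorial's version on the MNIST dataset, which worked without problems, and inspected the data format: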
from keras.datasets import mnist
import cv2
import numpy as np
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape)
print(type(x_train))
print(x_train.dtype)
# converting it to RGB
x_train = [cv2.cvtColor(cv2.resize(i, (48, 48)), cv2.COLOR_GRAY2BGR) for i in x_train]
x_train = np.concatenate([arr[np.newaxis] for arr in x_train]).astype('float32')
print(x_train.shape)
print(type(x_train))
print(x_train.dtype)
The output is:
(60000, 28, 28)
<class 'numpy.ndarray'>
uint8
(60000, 48, 48, 3)
<class 'numpy.ndarray'>
float32
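Comparing the two revealed the cause: the MNIST training data is a uint8 array, whereas mine was float64, which OpenCV could not process here (cv2.cvtColor does not accept float64 input). After changing the dtype of my array, the code worked.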
Now let's test the effect on a single image:
import numpy as np
from PIL import Image
import cv2
img_path = 'dog.jpg'
x = Image.open(img_path).convert('L')
x = np.array(x)
print(type(x))
print(x.shape)
print(x.dtype)
# convert to a three-channel image and double its size
x = cv2.cvtColor(cv2.resize(x, (1000, 632)), cv2.COLOR_GRAY2BGR)
print(x.shape)
x = Image.fromarray(x)
x.show()
The output is:
<class 'numpy.ndarray'>
(316, 500)
uint8
(632, 1000, 3)
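Before finding the OpenCV approach, I had first come across scipy.ndimage.zoom, which changes an image's size by scaling and interpolation. It also operates on NumPy arrays, so it fits my situation.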
import numpy as np
from PIL import Image
import scipy.ndimage
img_path = 'dog.jpg'
x = Image.open(img_path).convert('L')
x = np.array(x)
print(type(x))
print(x.shape)
print(x.dtype)
x = np.expand_dims(x, axis=2)
print(x.shape)
x = scipy.ndimage.zoom(x, [2,2,3])
print(x.shape)
x = Image.fromarray(x)
x.show()
The output is:
<class 'numpy.ndarray'>
(316, 500)
uint8
(316, 500, 1)
(632, 1000, 3)
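For a whole batch such as my (16500, 28, 28) array, the same idea might look like the sketch below. The helper name zoom_gray_batch and the random test data are hypothetical. Two choices here are my own assumptions and differ from the single-image example: it uses order=1 (bilinear) instead of the default cubic spline, and it copies the gray channel with np.repeat instead of zooming the channel axis by 3.

```python
import numpy as np
import scipy.ndimage


def zoom_gray_batch(images, size=48):
    """Resize a (N, H, H) grayscale batch to (N, size, size, 3)."""
    images = np.asarray(images)
    # Assumes square images, so one zoom factor serves both axes.
    factor = size / images.shape[1]
    # order=1 (bilinear) is an assumption; it avoids spline overshoot
    # that could wrap around in a uint8 output.
    resized = [scipy.ndimage.zoom(img, factor, order=1) for img in images]
    batch = np.stack(resized)
    # Copy the single gray channel three times along a new last axis.
    return np.repeat(batch[..., np.newaxis], 3, axis=-1)


# Hypothetical stand-in data: five random 28x28 uint8 images.
fake = np.random.randint(0, 256, size=(5, 28, 28), dtype=np.uint8)
out = zoom_gray_batch(fake)
print(out.shape, out.dtype)
```

Zooming image by image keeps the batch axis out of the interpolation entirely, at the cost of a Python-level loop.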