Pitfalls of reading JPEG images with OpenCV and Pillow

Comparing the differences in how JPEG images are read

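When training deep learning models, I care about how pixel-level differences between images affect training and testing.
Most people probably never worry about differences this small, which is exactly why it is worth walking through the pitfalls of JPEG images.
Lossless formats such as PNG do not show these discrepancies, so if storage space is not a concern, just use PNG.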

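Conclusions first:
Different versions of OpenCV decode the same JPEG differently (OpenCV 3.1.0 vs. OpenCV 4.2.0).
The same version of OpenCV decodes the same JPEG identically (OpenCV 4.2.0 vs. OpenCV 4.2.0).
In-memory data saved as a JPEG and read back no longer matches the original data, because JPEG compression is lossy.
Pillow 5.0.0 and OpenCV 4.2.0 decode the image differently, but Pillow differs from OpenCV 3.1.0 in only a handful of pixels. My guess is that Pillow's underlying JPEG decoder (its libjpeg build) behaves much like the one used by OpenCV 3.1.0.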

The code is as follows.
Current environment: opencv-4.2.0.32, pillow-5.0.0

import cv2
import numpy as np
from PIL import Image

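# test 1
# Load the array saved from OpenCV 3.1.0: the image was read with
# opencv 3.1.0.4 in a separate environment and stored as .npy for comparison
data_opencv_310 = np.load('data310.npy')
# Read the same JPEG from file with the current OpenCV (4.2.0)
data_opencv_420 = cv2.imread('a.jpg')
# Differs when the OpenCV versions differ
compare = (data_opencv_420 == data_opencv_310).all()
print(compare)  # False: not bit-identical, so different OpenCV versions decode the image differently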

# test 2
# Read the same JPEG from file a second time
data_opencv_420_copy = cv2.imread('a.jpg')
# Identical when the OpenCV version is the same
compare = (data_opencv_420 == data_opencv_420_copy).all()
print(compare)  # True: bit-identical, so the same OpenCV version decodes the image consistently

# test 3
# Write the decoded image back to a local file
cv2.imwrite('a_rewrite.jpg', data_opencv_420)
# Read the rewritten file back in
data_reread = cv2.imread('a_rewrite.jpg')
# Differs, because imwrite() applies lossy JPEG compression
compare = (data_opencv_420 == data_reread).all()
print(compare)  # False: not bit-identical, the image was lossily re-compressed when written

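# test 4
# Read the same JPEG with Pillow
img_pil = Image.open('a.jpg').convert('RGB')  # force 3 channels so the shape matches cv2.imread
img_pil = np.array(img_pil)
img_pil = img_pil[..., ::-1]  # RGB -> BGR, the channel order OpenCV uses
# Pillow vs. OpenCV 4.2.0
compare = (data_opencv_420 == img_pil).all()
print(compare)   # False: not bit-identical, so Pillow and OpenCV 4.2.0 decode the image differently

# Pillow vs. OpenCV 3.1.0
compare = (data_opencv_310 == img_pil).all()
# Not identical, but only 5 pixels differ, which suggests that Pillow's
# JPEG decoder behaves much like the one used by opencv 3.1.0.4
print(compare)  # False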
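
The boolean comparisons above only say whether two arrays match, not by how much. A figure such as "only 5 pixels differ" needs a count. The following is a minimal sketch of one way to get it; the helper name `diff_summary` and the toy arrays are illustrative assumptions, not the code the original measurements came from.

```python
import numpy as np


def diff_summary(a, b):
    """Return (number of differing pixels, largest per-channel absolute difference)."""
    if a.shape != b.shape:
        raise ValueError("shape mismatch: %s vs %s" % (a.shape, b.shape))
    # Widen to int16 first so that uint8 subtraction cannot wrap around
    delta = np.abs(a.astype(np.int16) - b.astype(np.int16))
    # A pixel counts as changed if any of its channels differs
    pixel_changed = delta.any(axis=-1) if delta.ndim == 3 else delta > 0
    return int(pixel_changed.sum()), int(delta.max())


# Toy example: two 4x4 BGR images that differ in exactly two pixels
a = np.zeros((4, 4, 3), dtype=np.uint8)
b = a.copy()
b[1, 2, 0] = 3
b[3, 3] = (1, 2, 255)
print(diff_summary(a, b))  # → (2, 255)
```

Applied to the arrays from the tests, a call like `diff_summary(data_opencv_310, img_pil)` would report how many pixels differ and by how much, which says more than a bare `False`.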
