先上基础,后面有许多的程序代码
滤波是指通过、修改或抑制图像的规定频率分量
通过低频的滤波器称为低通滤波器
通过模糊图像来平滑图像
通过高频的滤波器称为高通滤波器
线性空间滤波器
空间滤波通过把每个像素的值替换为该像素及其邻域的函数值来修改图像,如果 对图像像素执行的运算是线性的,则称为线性空间滤波器
空间域是通过卷积运算而达到滤波的效果。 而频率域滤波,是通过傅里叶变换到频率域,利用乘法运行达到滤波的效果
线性空间滤波器在图像 f f f和滤波器核 w w w之间执行乘积和运算。
滤波器的响应 g ( x , y ) g(x,y) g(x,y)是核系数和和核所覆盖图像像素的乘积之和:
g ( x , y ) = w ( − 1 , − 1 ) f ( x − 1 , y − 1 ) + w ( − 1 , 0 ) f ( x − 1 , y ) + ⋯ + w ( 0 , 0 ) f ( x , y ) + ⋯ + w ( 1 , 1 ) f ( x + 1 , y + 1 ) (3.30) g(x, y) = w(-1, -1)f(x-1, y-1) + w(-1, 0)f(x-1, y) + \dots + w(0,0)f(x,y)+ \dots + w(1, 1)f(x+1, y+1)\tag{3.30} g(x,y)=w(−1,−1)f(x−1,y−1)+w(−1,0)f(x−1,y)+⋯+w(0,0)f(x,y)+⋯+w(1,1)f(x+1,y+1)(3.30)
坐标 x x x和 y y y变化时,核的中心逐个像素地移动,并在移动过程中生产滤波后的图像 g g g
m × n m \times{n} m×n的核对大小为 M × N M \times{N} M×N的图像的线性空间滤波可以表示为:
g ( x , y ) = ∑ s = − a a ∑ t = − b b w ( s , t ) f ( x + s , y + t ) (3.31) g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s,t)f(x+s, y+t) \tag{3.31} g(x,y)=s=−a∑at=−b∑bw(s,t)f(x+s,y+t)(3.31)
m = 2 a + 1 , n = 2 b + 1 m = 2a + 1, n = 2b + 1 m=2a+1,n=2b+1
离散冲激强度(振幅)
δ ( x − x 0 , y − y 0 ) = { A , x 0 = x a n d y 0 = y 0 , others (3.33) \delta(x - x_0, y - y_0) = \begin{cases} A, & x_0 = x and y_0 = y \\0, & \text{others} \end{cases} \tag{3.33} δ(x−x0,y−y0)={A,0,x0=xandy0=yothers(3.33)
式(3.31)(相关)重写如下:
( w ⋆ f ) ( x , y ) = ∑ s = − a a ∑ t = − b b w ( s , t ) f ( x + s , y + t ) (3.34) (w\star{f})(x,y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s,t)f(x+s, y+t) \tag{3.34} (w⋆f)(x,y)=s=−a∑at=−b∑bw(s,t)f(x+s,y+t)(3.34)
注:式中的黑色星,在书上是空星星,LaTex找了半天也没有找到这个字符对应的。
线性空间滤波和空间卷积是同义的
性质 | 卷积 | 相关 |
---|---|---|
交换律 | f ⋆ g = g ⋆ f f \star{g} = g \star{f} f⋆g=g⋆f | - |
结合律 | f ⋆ ( g ⋆ h ) = ( f ⋆ g ) ⋆ h f \star{(g \star{h})} = (f \star{g}) \star{h} f⋆(g⋆h)=(f⋆g)⋆h | - |
分配律 | f ⋆ ( g + h ) = ( f ⋆ g ) + ( f ⋆ h ) f \star{(g + h)} = (f \star{g}) + (f \star{h}) f⋆(g+h)=(f⋆g)+(f⋆h) | f ⋆ ( g + h ) = ( f ⋆ g ) + ( f ⋆ h ) f \star{(g + h)} = (f \star{g}) + (f \star{h}) f⋆(g+h)=(f⋆g)+(f⋆h) |
图像有时是被顺序地、分段地滤波(即卷积)的,每个阶段会使用不同的核,对于 Q Q Q个阶段同的滤波,图像 f f f首先用核 w 1 w_1 w1滤波,结果再用核 w 2 w_2 w2滤波,以此类推,由于总面积满足交换律,因此这多阶段的滤波可在音效滤波运算 w ⋆ f w \star{f} w⋆f中完成。
w = w 1 ⋆ w 2 ⋆ w 3 ⋆ … ⋆ w Q (3.38) w = w_1 \star{w_2} \star{w_3} \star{\ldots} \star{w_{Q}} \tag{3.38} w=w1⋆w2⋆w3⋆…⋆wQ(3.38)
若二维函数 G ( x , y ) G(x, y) G(x,y)可写成一维函数 G 1 ( x ) G_{1}(x) G1(x)和 G 2 ( x ) G_{2}(x) G2(x)的乘积,即 G ( x , y ) = G 1 ( x ) G 2 ( y ) G(x, y) = G_{1}(x) G_{2}(y) G(x,y)=G1(x)G2(y),则它是可分离的。
空间滤波器核是一个矩阵,而可分离核是一个能够表示为两个向量的外积的矩阵。
w = [ 1 1 1 1 1 1 ] = c r T = [ 1 1 ] [ 1 1 1 ] w = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = cr^{T} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} w=[111111]=crT=[11][111]
大小为 m × n m \times{n} m×n的可分离核表示为两个向量 v v v和 w w w的外积;
w = v w T (3.41) w = vw^T \tag{3.41} w=vwT(3.41)
v v v和 w w w分别是是大小为 m × 1 m\times{1} m×1和 n × 1 n\times{1} n×1的向量
m × m m\times{m} m×m方形核可以写成
w = v v T (3.42) w = vv^T \tag{3.42} w=vvT(3.42)
如果一个核 w w w,可分解为两个更简单的核并且满足 w = w 1 ⋆ w 2 w = w_1 \star{w_2} w=w1⋆w2,则有
w ⋆ f = ( w 1 ⋆ w 2 ) ⋆ f = ( w 2 ⋆ w 1 ) ⋆ f = w 2 ⋆ ( w 1 ⋆ f ) = w 1 ⋆ ( w 2 ⋆ f ) (3.43) w \star{f}= (w_1 \star{w_2}) \star{f} = (w_2 \star{w_1}) \star{f} = w_2 \star{(w_1} \star{f}) = w_1 \star{(w_2} \star{f}) \tag{3.43} w⋆f=(w1⋆w2)⋆f=(w2⋆w1)⋆f=w2⋆(w1⋆f)=w1⋆(w2⋆f)(3.43)
对于大小为 M × N M \times{N} M×N的图像和大小为 m × n m \times{n} m×n的核,需要 M N m n MNmn MNmn将乘法和加法运算。如果是可分离核,分别运算,则需要 M N ( m + n ) MN(m+n) MN(m+n)次乘法和加法运算。因此使用可分离核执行卷积运算相对于不可分离核执行卷积的计算优势定义为:
C = M N m n M N ( m + n ) = m n m + n (3.44) C = \frac{MNmn}{MN(m + n)} = \frac{mn}{m+n} \tag{3.44} C=MN(m+n)MNmn=m+nmn(3.44)
怎样确定核是否可分离
确定某个核的矩阵秩为1后,就很容易求出两个向量 v v v和 w w w,因为它们的外积 v w T vw^T vwT等于这个核,求可分离核的步骤:
我们的目的是找到两个一维核 w 1 w_1 w1和 w 2 w_2 w2,以便实现一维卷积。根据前面的表示方法,有 w 1 = c = v , w 2 = r / E = w T w_1 = c = v, w_2 = r/E = w^T w1=c=v,w2=r/E=wT。对于圆对称的核,通过核的中心的列描述整个核,即 w = v v T / c w = vv^T /c w=vvT/c,其中 c c c是中心系数的值,则一维分量是 w 1 = v w_1 = v w1=v和 w 2 = v T / c w_2 = v^T /c w2=vT/c。
根据其数学性质
计算邻域像素平均值的滤波器会模糊图像,类似于积分
计算图像局部导数滤波器会锐化图像。
对形状具有所需性质的二维空间函数进行取样
高斯函数 的样本可以构建加权平均(低通)滤波器
设计具有规定频率响应的空间滤波器
一维滤波器值可以表示为向量 v v v,而二维可分离核可用式(3.42)得到
近似于圆对称函数的二维核
程序部分
程序里也许多注释,如有不明,也可以给我留言或者发信息
def kernel_seperate(kernel):
"""
get kernel seperate vector if separable
param: kernel: input kerel of the filter
return two separable vector c, r
"""
rank = np.linalg.matrix_rank(kernel)
if rank == 1:
#-----------------Numpy------------------
c = np.min(kernel, axis=1)
r = np.min(kernel, axis=0)
#------------------Loop------------------
# r = w[0, :]
# c = []
# for i in range(w.shape[0]):
# temp = int((w[i, :] / r)[0])
# c.append(temp)
else:
print('Not separable')
# c = np.array(c).reshape(-1, 1)
# r = np.array(r).reshape(-1, 1)
return c, r
根据公式用循环写的,方法的命名为算术平均,实际上是基础的一个离散卷积运算。
这种方法就是效率比较低,但是大核也可以运算,不会爆内存
def arithmentic_mean(image, kernel):
"""
:param image: input image
:param kernel: input kernel
:return: image after convolution
"""
img_h = image.shape[0]
img_w = image.shape[1]
m = kernel.shape[0]
n = kernel.shape[1]
# padding
padding_h = int((m - 1)/2)
padding_w = int((n - 1)/2)
image_pad = np.pad(image.copy(), [(padding_h, padding_h), (padding_w, padding_w)], mode="constant", constant_values=0)
image_convol = image.copy().astype(np.float)
for i in range(padding_h, img_h + padding_h):
for j in range(padding_w, img_w + padding_w):
temp = np.sum(image_pad[i-padding_h:i+padding_h+1, j-padding_w:j+padding_w+1] * kernel)
image_convol[i - padding_h][j - padding_w] = temp # 1/(m * n) * temp
return image_convol, image_pad
如果核不可分离的话,不可以用此方法
此种方法效率很高,而且大的核也不会爆内存,就是需要核可分离
def separate_kernel_conv2D(img, kernel):
"""
separable kernel convolution 2D, if 2D kernel not separable, then canot use this fuction. in the fuction. first need to
seperate the 2D kernel into two 1D vectors.
param: img: input image you want to implement 2d convolution with 2d kernel
param: kernel: 2D separable kernel, such as Gaussian kernel, Box Kernel
return image after 2D convolution
"""
m, n = kernel.shape[:2]
pad_h = int((m - 1) / 2)
pad_w = int((n - 1) / 2)
w_1, w_2 = kernel_seperate(kernel)
convovle = np.vectorize(np.convolve, signature='(x),(y)->(k)')
r_2 = convovle(img, w_2)
r_r = np.rot90(r_2)
r_1 = convovle(r_r, w_1)
r_1 = r_1.T
r_1 = r_1[:, ::-1]
img_dst = img.copy().astype(np.float)
img_dst = r_1[pad_h:-pad_h, pad_w:-pad_w]
return img_dst
此方效率虽然比第一种高,但比第二种低,但是需要大量的内存才能实现,尤其是大核,很容易爆内存错误
def img2col(image, kernel_shape, padding='same', mode='constant', strides=1):
"""
image to col fuction
param: image: input image you want to change to col
param: kernel_shape: input shape of the kernel, normally is a tuple, like (3, 3)
param: padding: string, valid values 'valid' and 'same', 'valid' then not padding, output shape will be smaller than input
image, 'same' padding the input image, output image will be same shape as input image
param: mode: padding mode, please refere to numpy.pad. valid string values are 'constant', 'edge', 'linear_ramp', 'maximum',
'mean', 'median', 'minimum', 'reflect', 'symmetric', 'wrap'
param: strides: strides of the slidding window, default is 1
"""
height, width = image.shape[:2]
kernel_h, kernel_w = kernel_shape[:2]
# padding
if padding == 'same':
'''padding here only consider convolution, not strides '''
result_shape = np.array([height , width])
pad_h = (kernel_h - 1) // 2
pad_w = (kernel_w - 1) // 2
image_new = np.pad(image, [(pad_h, pad_h), (pad_w, pad_w)], mode=mode)
elif padding == 'valid':
result_shape = np.array([(height - kernel_h + 1), (width - kernel_w + 1)])
image_new = image
image2col = np.zeros((kernel_h*kernel_w, (result_shape[0]//strides) * (result_shape[1]//strides)))
count = 0
for h in range(0, result_shape[0], strides):
for w in range(0, result_shape[1], strides):
cur_item = image_new[h:h+kernel_h, w:w+kernel_w].reshape(-1)
image2col[:, count] = cur_item
count += 1
return image2col, result_shape // strides
def my_conv2d(image, kernel, padding="same", mode='constant', strides=1):
"""
convolution implement using img2col
param: image: input image you want to change to implement convolution
param: kernel: kernel
param: padding: string, valid values 'valid' and 'same', 'valid' then not padding, output shape will be smaller than input
image, 'same' padding the input image, output image will be same shape as input image
param: mode: padding mode, please refere to numpy.pad. valid string values are 'constant', 'edge', 'linear_ramp', 'maximum',
'mean', 'median', 'minimum', 'reflect', 'symmetric', 'wrap'
param: strides: strides of the slidding window, default is 1
"""
image2col, result_shape = img2col(image, kernel.shape, padding=padding, mode=mode, strides=strides)
kernel_flatten = kernel.reshape(1, -1)
convolve = np.matmul(kernel_flatten, image2col)
return convolve.reshape(result_shape[0], result_shape[1])
这是为了实现效果而写的方法
def visualize_input(img, ax):
"""
add annotation to the image, values of each pixel
param: img: input image
param: ax: axes of the matplotlib
"""
ax.imshow(img, cmap='gray', vmin=0, vmax=255)
width, height = img.shape
thresh = 100#img.max()/2.5
for x in range(width):
for y in range(height):
ax.annotate(str(round(img[x][y],2)), xy=(y,x),
horizontalalignment='center',
verticalalignment='center',
color='white' if img[x][y]<thresh else 'black')
def visualize_kernel(img, ax):
"""
add annotation to kernel w(0, 0)
param: img: input image
param: ax: axes of the matplotlib
"""
ax.imshow(img, cmap='gray', vmin=0, vmax=255), ax.set_xticks([]), ax.set_yticks([])
width, height = img.shape
thresh = img.max()/2.5
for x in range(width):
for y in range(height):
ax.annotate('w'+ str((x - 1, y - 1)), xy=(y,x),
horizontalalignment='center',
verticalalignment='center',
color='white' if img[x][y]<thresh else 'black')
def visualize_image(img, ax):
"""
add annotation to kernel with f(x, y)
param: img: input image
param: ax: axes of the matplotlib
"""
ax.imshow(img, cmap='gray', vmin=0, vmax=255), ax.set_xticks([]), ax.set_yticks([])
width, height = img.shape
thresh = img.max()/2.5
for x in range(width):
for y in range(height):
ax.annotate('f'+ str(('x' + str(x - 1), 'y' + str(y - 1))), xy=(y,x),
horizontalalignment='center',
verticalalignment='center',
color='white' if img[x][y]<thresh else 'black')
# 线性空间滤波的原理图
height, width = 10, 10
img_ori = get_check(height, width, lower=230, upper=255)
grid_size = (3, 3)
m = grid_size[0]
n = grid_size[1]
img_ori[height//2-m:height//2, width//2-n:width//2] = img_ori[height//2-m:height//2, width//2-n:width//2] - 100
img_dst = img_ori[height//2-m:height//2, width//2-n:width//2]
fig = plt.figure(figsize=(15, 6))
plt.subplot(1, 3, 1), plt.imshow(img_ori, cmap='gray', vmin=0, vmax=255), plt.title(f'Convolution')
plt.xticks([]), plt.yticks([])
ax2 = fig.add_subplot(1, 3, 2)
ax2.set_title('Kernel'), visualize_kernel(img_dst, ax2)
ax3 = fig.add_subplot(1, 3, 3)
ax3.set_title('Image'), visualize_image(img_dst, ax3)
plt.tight_layout()
plt.show()
def visualize_show_annot(img_show, img_annot, ax):
"""
add annotation to the image, values of each pixel
param: img: input image
param: ax: axes of the matplotlib
"""
height, width = img_annot.shape
img_show = img_show[:height, :width]
ax.imshow(img_show, cmap='gray', vmin=0, vmax=255)
thresh = 100 #img_show.max()/2.5
for x in range(height):
for y in range(width):
ax.annotate(str(round(img_annot[x][y],2)), xy=(y,x),
horizontalalignment='center',
verticalalignment='center',
color='white' if img_annot[x][y]<thresh else 'black')
# 线性空间滤波的原理图,为了好看,原图像显示为灰色
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[2, 2] = 1
kernel = np.arange(1, 10).reshape(3, 3)
img_dst, img_pad = arithmentic_mean(img_ori, kernel)
fig = plt.figure(figsize=(10, 10))
# fig.subplots_adjust(left=None, bottom=None, right=None, top=None,wspace=0.1, hspace=0.1)
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])
height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_title('Padded f'), visualize_show_annot(img_show, img_pad, ax2), plt.xticks([]), plt.yticks([])
ax3 = fig.add_subplot(2, 2, 3)
ax3.set_title('Kernel'), visualize_show_annot(img_show, kernel, ax3), plt.xticks([]), plt.yticks([])
ax4 = fig.add_subplot(2, 2, 4)
ax4.set_title('Result'), visualize_show_annot(img_show, img_dst, ax4), plt.xticks([]), plt.yticks([])
plt.tight_layout()
plt.show()
# 旋转kernel得到的结果
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 190
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[2, 2] = 1
kernel = np.arange(1, 10)[::-1].reshape(3, 3)
img_dst, img_pad = arithmentic_mean(img_ori, kernel)
fig = plt.figure(figsize=(10, 10))
# fig.subplots_adjust(left=None, bottom=None, right=None, top=None,wspace=0.1, hspace=0.1)
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])
height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_title('Padded f'), visualize_show_annot(img_show, img_pad, ax2), plt.xticks([]), plt.yticks([])
ax3 = fig.add_subplot(2, 2, 3)
ax3.set_title('Kernel'), visualize_show_annot(img_show, kernel, ax3), plt.xticks([]), plt.yticks([])
ax4 = fig.add_subplot(2, 2, 4)
ax4.set_title('Result'), visualize_show_annot(img_show, img_dst, ax4), plt.xticks([]), plt.yticks([])
plt.tight_layout()
plt.show()
# 可分离卷积核与全核卷积结果一样(上面的结果)
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[1:4, 2] = 1
# Separate kernel
w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
img_w_1, img_pad_1 = arithmentic_mean(img_ori, w_1)
img_w_2, img_pad_2 = arithmentic_mean(img_w_1, w_2.T)
fig = plt.figure(figsize=(15, 10))
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])
height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_title('img_pad_1'), visualize_show_annot(img_show, img_pad_1, ax2), plt.xticks([]), plt.yticks([])
ax3 = fig.add_subplot(2, 3, 3)
ax3.set_title('img_pad_2'), visualize_show_annot(img_show, img_pad_2, ax3), plt.xticks([]), plt.yticks([])
ax4 = fig.add_subplot(2, 3, 4)
ax4.set_title('img_w_1'), visualize_show_annot(img_show, img_w_1, ax4), plt.xticks([]), plt.yticks([])
ax5 = fig.add_subplot(2, 3, 5)
ax5.set_title('img_w_2'), visualize_show_annot(img_show, img_w_2, ax5), plt.xticks([]), plt.yticks([])
plt.tight_layout()
plt.show()
# 可分离卷积核与全核卷积结果一样(上面的结果)
# import time
height, width = 5, 5
# img_show = np.ones([height, width], dtype=np.uint8) * 180
# img_ori = np.zeros([height, width], dtype=np.uint8)
# img_ori[1:4, 2] = 1
img_ori = np.arange(1, height*width+1).reshape(height, width)
# Separate kernel
w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
img_w_1, img_pad_1 = arithmentic_mean(img_ori, w_1)
img_w_2, img_pad_2 = arithmentic_mean(img_w_1, w_2.T)
fig = plt.figure(figsize=(15, 10))
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])
height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_title('img_pad_1'), visualize_show_annot(img_show, img_pad_1, ax2), plt.xticks([]), plt.yticks([])
ax3 = fig.add_subplot(2, 3, 3)
ax3.set_title('img_pad_2'), visualize_show_annot(img_show, img_pad_2, ax3), plt.xticks([]), plt.yticks([])
ax4 = fig.add_subplot(2, 3, 4)
ax4.set_title('img_w_1'), visualize_show_annot(img_show, img_w_1, ax4), plt.xticks([]), plt.yticks([])
ax5 = fig.add_subplot(2, 3, 5)
ax5.set_title('img_w_2'), visualize_show_annot(img_show, img_w_2, ax5), plt.xticks([]), plt.yticks([])
img_img2col = my_conv2d(img_ori, w, padding='same', mode='constant').astype(int)
ax6 = fig.add_subplot(2, 3, 6)
ax6.set_title('img_img2col'), visualize_show_annot(img_show, img_img2col, ax6), plt.xticks([]), plt.yticks([])
plt.tight_layout()
plt.show()
可以看出可分离核,11x11大小的核,numpy的运算效率也是很高的。
# 全核卷积与分离核单独做卷积,没有优化,所以分离核时间比是全核卷积的一倍
# 这里只做展示,结果不重要
import time
height, width = 512, 512
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[1:4, 2] = 1
# Kernel
# w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
w = np.ones([11, 11])
# Loop convolution comsum more time, but not need very big memory
start_time = time.time()
img_dst = arithmentic_mean(img_ori, w)
elapse = time.time() - start_time
print(f'Full kernel covolution elapse -> {elapse}s')
# img2col, bit faster than loop, but need very more memory, if kernel is too big, cannot carry on
start_time = time.time()
img_dst = my_conv2d(img_ori, w)
elapse = time.time() - start_time
print(f'Img2col covolution elapse -> {elapse}s')
# Separable Kernel convolution optimized by Numpy, run very fast, but the kernel has to be separable
start_time = time.time()
img_dst = separate_kernel_conv2D(img_ori, w)
elapse = time.time() - start_time
print(f'Seperate kernel covolution optimized elapse -> {elapse}s')
# Separable kernel not optimized
start_time = time.time()
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
img_w_1, img_pad_1 = arithmentic_mean(img_ori, w_1)
img_w_2, img_pad_2 = arithmentic_mean(img_w_1, w_2.T)
elapse = time.time() - start_time
print(f'separate kernel covolution elapse -> {elapse}s')
# opencv run very fast, but not actual convolution
start_time = time.time()
img_dst = cv2.filter2D(img_ori, -1, w)
elapse = time.time() - start_time
print(f'Opencv elapse -> {elapse}s')
Full kernel covolution elapse -> 1.2127759456634521s
Img2col covolution elapse -> 0.584395170211792s
Seperate kernel covolution optimized elapse -> 0.009973526000976562s
separate kernel covolution elapse -> 2.176215887069702s
Opencv elapse -> 0.004025936126708984s
在接下来的文章也会有这几种卷积方法效果的对比
# 习题3.20
w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
print(w)
print()
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}\n")
c, r = kernel_seperate(w)
print('Two vector are below:')
print(f'vector c -> {c}, vector r -> {r}\n')
print(f'cv.T -> \n{c.reshape(-1, 1) * r.reshape(-1, 1).T}')
[[1 2 1]
[2 4 2]
[1 2 1]]
Kernel's rank is -> 1
Two vector are below:
vector c -> [1 2 1], vector r -> [1 2 1]
cv.T ->
[[1 2 1]
[2 4 2]
[1 2 1]]
# 习题3.20
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[1:4, 2] = 1
w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
kernel = w
img_dst, img_pad = arithmentic_mean(img_ori, kernel)
fig = plt.figure(figsize=(10, 10))
# fig.subplots_adjust(left=None, bottom=None, right=None, top=None,wspace=0.1, hspace=0.1)
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])
height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_title('Padded f'), visualize_show_annot(img_show, img_pad, ax2), plt.xticks([]), plt.yticks([])
ax3 = fig.add_subplot(2, 2, 3)
ax3.set_title('Kernel'), visualize_show_annot(img_show, kernel, ax3), plt.xticks([]), plt.yticks([])
ax4 = fig.add_subplot(2, 2, 4)
ax4.set_title('Result'), visualize_show_annot(img_show, img_dst, ax4), plt.xticks([]), plt.yticks([])
plt.tight_layout()
plt.show()
# 习题3.22
# (a)
v = np.array([[1], [2], [1]])
w_T = np.array([2, 1, 1, 3])
w = v * w_T.T
print(w)
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}")
# rank = 1,所以是可分离的
[[2 1 1 3]
[4 2 2 6]
[2 1 1 3]]
Kernel's rank is -> 1
# 习题3.22
# (b)
w = np.array([[1, 3, 1], [2, 6, 2]])
print(w)
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}")
w_1, w_2 = kernel_seperate(w)
w_r = w_1.reshape(-1, 1) * w_2.reshape(-1, 1).T
print(w == w_r)
[[1 3 1]
[2 6 2]]
Kernel's rank is -> 1
[[ True True True]
[ True True True]]
# 习题3.22
# (c)
w = np.array([[2, 1, 1, 3],
[4, 2, 2, 6],
[2, 1, 1, 3]])
print(w)
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}")
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
w_r = w_1 * w_2.T
print(w == w_r)
[[2 1 1 3]
[4 2 2 6]
[2 1 1 3]]
Kernel's rank is -> 1
[[ True True True True]
[ True True True True]
[ True True True True]]