第3章 Python 数字图像处理(DIP) - 灰度变换与空间滤波12 - 空间域滤波基础 - 卷积运算(numpy 实现的三种卷积运算)

这篇文章比较长,请耐心看

  • 空间域滤波基础
      • 线性滤波
      • 可分离滤波器核
      • 空间域滤波和频率域滤波的一些重要比较
      • 如何构建空间滤波器
      • 第一种卷积方法(公式法)
      • 第二种卷积的方法(可分离核)
      • 第三种方法(img2col)
      • 这是分离核与公式法的结果对比
      • 下面是各种卷积的运算时间的比较
      • 书上的一些习题

先上基础,后面有许多的程序代码

空间域滤波基础

滤波是指通过、修改或抑制图像的规定频率分量

  • 通过低频的滤波器称为低通滤波器

  • 通过模糊图像来平滑图像

  • 通过高频的滤波器称为高通滤波器

  • 线性空间滤波器

  • 空间滤波通过把每个像素的值替换为该像素及其邻域的函数值来修改图像,如果 对图像像素执行的运算是线性的,则称为线性空间滤波器

空间域是通过卷积运算而达到滤波的效果。 而频率域滤波,是通过傅里叶变换到频率域,利用乘法运行达到滤波的效果

  1. 卷积是空间域滤波的基础,它等效于频率域中的乘法,反之亦然;
  2. 空间域中振幅为A的冲激,是频率域中值为A的一个常数,反之亦然

线性滤波

线性空间滤波器在图像 f f f和滤波器核 w w w之间执行乘积和运算。

    • 是一个阵列,其大小定义了运算的邻域
    • 也称为滤波器核,核,模板,窗口

滤波器的响应 g ( x , y ) g(x,y) g(x,y)是核系数和和核所覆盖图像像素的乘积之和:
g ( x , y ) = w ( − 1 , − 1 ) f ( x − 1 , y − 1 ) + w ( − 1 , 0 ) f ( x − 1 , y ) + ⋯ + w ( 0 , 0 ) f ( x , y ) + ⋯ + w ( 1 , 1 ) f ( x + 1 , y + 1 ) (3.30) g(x, y) = w(-1, -1)f(x-1, y-1) + w(-1, 0)f(x-1, y) + \dots + w(0,0)f(x,y)+ \dots + w(1, 1)f(x+1, y+1)\tag{3.30} g(x,y)=w(1,1)f(x1,y1)+w(1,0)f(x1,y)++w(0,0)f(x,y)++w(1,1)f(x+1,y+1)(3.30)
坐标 x x x y y y变化时,核的中心逐个像素地移动,并在移动过程中生产滤波后的图像 g g g

m × n m \times{n} m×n的核对大小为 M × N M \times{N} M×N的图像的线性空间滤波可以表示为:
g ( x , y ) = ∑ s = − a a ∑ t = − b b w ( s , t ) f ( x + s , y + t ) (3.31) g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s,t)f(x+s, y+t) \tag{3.31} g(x,y)=s=aat=bbw(s,t)f(x+s,y+t)(3.31)
m = 2 a + 1 , n = 2 b + 1 m = 2a + 1, n = 2b + 1 m=2a+1,n=2b+1

离散冲激强度(振幅)
δ ( x − x 0 , y − y 0 ) = { A , x 0 = x a n d y 0 = y 0 , others (3.33) \delta(x - x_0, y - y_0) = \begin{cases} A, & x_0 = x and y_0 = y \\0, & \text{others} \end{cases} \tag{3.33} δ(xx0,yy0)={A,0,x0=xandy0=yothers(3.33)

式(3.31)(相关)重写如下:
( w ⋆ f ) ( x , y ) = ∑ s = − a a ∑ t = − b b w ( s , t ) f ( x + s , y + t ) (3.34) (w\star{f})(x,y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s,t)f(x+s, y+t) \tag{3.34} (wf)(x,y)=s=aat=bbw(s,t)f(x+s,y+t)(3.34)
:式中的黑色星,在书上是空星星,LaTex找了半天也没有找到这个字符对应的。

  • 卷积的定义:
    ( w ⋆ f ) ( x , y ) = ∑ s = − a a ∑ t = − b b w ( s , t ) f ( x − s , y − t ) (3.35) (w\star{f})(x,y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s,t)f(x - s, y - t) \tag{3.35} (wf)(x,y)=s=aat=bbw(s,t)f(xs,yt)(3.35)

线性空间滤波和空间卷积是同义的

  • 卷积和相关的一些基本性质
性质 卷积 相关
交换律 f ⋆ g = g ⋆ f f \star{g} = g \star{f} fg=gf -
结合律 f ⋆ ( g ⋆ h ) = ( f ⋆ g ) ⋆ h f \star{(g \star{h})} = (f \star{g}) \star{h} f(gh)=(fg)h -
分配律 f ⋆ ( g + h ) = ( f ⋆ g ) + ( f ⋆ h ) f \star{(g + h)} = (f \star{g}) + (f \star{h}) f(g+h)=(fg)+(fh) f ⋆ ( g + h ) = ( f ⋆ g ) + ( f ⋆ h ) f \star{(g + h)} = (f \star{g}) + (f \star{h}) f(g+h)=(fg)+(fh)

图像有时是被顺序地、分段地滤波(即卷积)的,每个阶段会使用不同的核,对于 Q Q Q个阶段同的滤波,图像 f f f首先用核 w 1 w_1 w1滤波,结果再用核 w 2 w_2 w2滤波,以此类推,由于总面积满足交换律,因此这多阶段的滤波可在音效滤波运算 w ⋆ f w \star{f} wf中完成。
w = w 1 ⋆ w 2 ⋆ w 3 ⋆ … ⋆ w Q (3.38) w = w_1 \star{w_2} \star{w_3} \star{\ldots} \star{w_{Q}} \tag{3.38} w=w1w2w3wQ(3.38)

可分离滤波器核

若二维函数 G ( x , y ) G(x, y) G(x,y)可写成一维函数 G 1 ( x ) G_{1}(x) G1(x) G 2 ( x ) G_{2}(x) G2(x)的乘积,即 G ( x , y ) = G 1 ( x ) G 2 ( y ) G(x, y) = G_{1}(x) G_{2}(y) G(x,y)=G1(x)G2(y),则它是可分离的。

空间滤波器核是一个矩阵,而可分离核是一个能够表示为两个向量的外积的矩阵。

w = [ 1 1 1 1 1 1 ] = c r T = [ 1 1 ] [ 1 1 1 ] w = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = cr^{T} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} w=[111111]=crT=[11][111]
大小为 m × n m \times{n} m×n的可分离核表示为两个向量 v v v w w w的外积;
w = v w T (3.41) w = vw^T \tag{3.41} w=vwT(3.41)
v v v w w w分别是是大小为 m × 1 m\times{1} m×1 n × 1 n\times{1} n×1的向量

m × m m\times{m} m×m方形核可以写成
w = v v T (3.42) w = vv^T \tag{3.42} w=vvT(3.42)

如果一个核 w w w,可分解为两个更简单的核并且满足 w = w 1 ⋆ w 2 w = w_1 \star{w_2} w=w1w2,则有
w ⋆ f = ( w 1 ⋆ w 2 ) ⋆ f = ( w 2 ⋆ w 1 ) ⋆ f = w 2 ⋆ ( w 1 ⋆ f ) = w 1 ⋆ ( w 2 ⋆ f ) (3.43) w \star{f}= (w_1 \star{w_2}) \star{f} = (w_2 \star{w_1}) \star{f} = w_2 \star{(w_1} \star{f}) = w_1 \star{(w_2} \star{f}) \tag{3.43} wf=(w1w2)f=(w2w1)f=w2(w1f)=w1(w2f)(3.43)

对于大小为 M × N M \times{N} M×N的图像和大小为 m × n m \times{n} m×n的核,需要 M N m n MNmn MNmn将乘法和加法运算。如果是可分离核,分别运算,则需要 M N ( m + n ) MN(m+n) MN(m+n)次乘法和加法运算。因此使用可分离核执行卷积运算相对于不可分离核执行卷积的计算优势定义为:
C = M N m n M N ( m + n ) = m n m + n (3.44) C = \frac{MNmn}{MN(m + n)} = \frac{mn}{m+n} \tag{3.44} C=MN(m+n)MNmn=m+nmn(3.44)

怎样确定核是否可分离

  • 只需要确定其秩是否为1
  • 因为分离的向量的秩都是为1的

确定某个核的矩阵秩为1后,就很容易求出两个向量 v v v w w w,因为它们的外积 v w T vw^T vwT等于这个核,求可分离核的步骤:

  1. 在核中找到任何一个非零元素,并将其值表示为 E E E
  2. 形成向量 c c c r r r,它们分别等于核中包含步骤1中找到的元素的那一行和那一列。
  3. 参考式(3.41),令 v = c v = c v=c w T = r / E w^T = r/E wT=r/E

我们的目的是找到两个一维核 w 1 w_1 w1 w 2 w_2 w2,以便实现一维卷积。根据前面的表示方法,有 w 1 = c = v , w 2 = r / E = w T w_1 = c = v, w_2 = r/E = w^T w1=c=v,w2=r/E=wT。对于圆对称的核,通过核的中心的列描述整个核,即 w = v v T / c w = vv^T /c w=vvT/c,其中 c c c是中心系数的值,则一维分量是 w 1 = v w_1 = v w1=v w 2 = v T / c w_2 = v^T /c w2=vT/c

空间域滤波和频率域滤波的一些重要比较

  1. 卷积是空间域滤波的基础,它等效于频度域的乘法,反之亦然。
  2. 空间域中振幅为 A A A的冲激,是频率域中值为 A A A的一个常数,反之亦然。

如何构建空间滤波器

  • 根据其数学性质

  • 计算邻域像素平均值的滤波器会模糊图像,类似于积分

  • 计算图像局部导数滤波器会锐化图像。

  • 对形状具有所需性质的二维空间函数进行取样

  • 高斯函数 的样本可以构建加权平均(低通)滤波器

  • 设计具有规定频率响应的空间滤波器

  • 一维滤波器值可以表示为向量 v v v,而二维可分离核可用式(3.42)得到

  • 近似于圆对称函数的二维核

程序部分
程序里也许多注释,如有不明,也可以给我留言或者发信息

def kernel_seperate(kernel):
    """
    get kernel seperate vector if separable
    param: kernel: input kerel of the filter
    return two separable vector c, r
    """
    rank = np.linalg.matrix_rank(kernel)

    if rank == 1:

    #-----------------Numpy------------------
        c = np.min(kernel, axis=1)
        r = np.min(kernel, axis=0)

    #------------------Loop------------------
    #     r = w[0, :]
    #     c = []
    #     for i in range(w.shape[0]):
    #         temp = int((w[i, :] / r)[0])
    #         c.append(temp)

    else:
        print('Not separable')

#     c = np.array(c).reshape(-1, 1)
#     r = np.array(r).reshape(-1, 1)
    
    return c, r

第一种卷积方法(公式法)

根据公式用循环写的,方法的命名为算术平均,实际上是基础的一个离散卷积运算。
这种方法就是效率比较低,但是大核也可以运算,不会爆内存

def arithmentic_mean(image, kernel):
    """
    :param image: input image
    :param kernel: input kernel
    :return: image after convolution
    """
    
    img_h = image.shape[0]
    img_w = image.shape[1]

    m = kernel.shape[0]
    n = kernel.shape[1]

    # padding
    padding_h = int((m - 1)/2)
    padding_w = int((n - 1)/2)

    image_pad = np.pad(image.copy(), [(padding_h, padding_h), (padding_w, padding_w)], mode="constant", constant_values=0)

    image_convol = image.copy().astype(np.float)
    for i in range(padding_h, img_h + padding_h):
        for j in range(padding_w, img_w + padding_w):
            temp = np.sum(image_pad[i-padding_h:i+padding_h+1, j-padding_w:j+padding_w+1] * kernel)
            image_convol[i - padding_h][j - padding_w] = temp # 1/(m * n) * temp

    return image_convol, image_pad

第二种卷积的方法(可分离核)

如果核不可分离的话,不可以用此方法
此种方法效率很高,而且大的核也不会爆内存,就是需要核可分离

def separate_kernel_conv2D(img, kernel):
    """
    separable kernel convolution 2D, if 2D kernel not separable, then canot use this fuction. in the fuction. first need to
    seperate the 2D kernel into two 1D vectors.
    param: img: input image you want to implement 2d convolution with 2d kernel
    param: kernel: 2D separable kernel, such as Gaussian kernel, Box Kernel
    return image after 2D convolution
    """
    m, n = kernel.shape[:2]
    pad_h = int((m - 1) / 2)
    pad_w = int((n - 1) / 2)
    w_1, w_2 = kernel_seperate(kernel)
    convovle = np.vectorize(np.convolve, signature='(x),(y)->(k)')
    r_2 = convovle(img, w_2)
    r_r = np.rot90(r_2)
    r_1 = convovle(r_r, w_1)
    r_1 = r_1.T
    r_1 = r_1[:, ::-1]
    img_dst = img.copy().astype(np.float)
    img_dst = r_1[pad_h:-pad_h, pad_w:-pad_w]
    return img_dst

第三种方法(img2col)

此方效率虽然比第一种高,但比第二种低,但是需要大量的内存才能实现,尤其是大核,很容易爆内存错误

def img2col(image, kernel_shape, padding='same', mode='constant', strides=1):
    """
    image to col fuction
    param: image: input image you want to change to col
    param: kernel_shape: input shape of the kernel, normally is a tuple, like (3, 3)
    param: padding: string, valid values 'valid' and 'same', 'valid' then not padding, output shape will be smaller than input
           image, 'same' padding the input image, output image will be same shape as input image
    param: mode: padding mode, please refere to numpy.pad. valid string values are 'constant', 'edge', 'linear_ramp', 'maximum', 
            'mean', 'median', 'minimum', 'reflect', 'symmetric', 'wrap'
    param: strides: strides of the slidding window, default is 1
    """
    height, width = image.shape[:2]
    kernel_h, kernel_w = kernel_shape[:2]
    
    # padding
    if padding == 'same':
        '''padding here only consider convolution, not strides '''
        result_shape = np.array([height , width])
        pad_h = (kernel_h - 1) // 2
        pad_w = (kernel_w - 1) // 2
        image_new = np.pad(image, [(pad_h, pad_h), (pad_w, pad_w)], mode=mode)
    elif padding == 'valid':
        result_shape = np.array([(height - kernel_h + 1), (width - kernel_w + 1)])
        image_new = image 
        
    image2col = np.zeros((kernel_h*kernel_w, (result_shape[0]//strides) * (result_shape[1]//strides)))
    count = 0
    for h in range(0, result_shape[0], strides):
        for w in range(0, result_shape[1], strides):
            cur_item = image_new[h:h+kernel_h, w:w+kernel_w].reshape(-1)
            image2col[:, count] = cur_item 
            count += 1
    return image2col, result_shape // strides

def my_conv2d(image, kernel, padding="same", mode='constant', strides=1):
    """
    convolution implement using img2col
    param: image: input image you want to change to implement convolution
    param: kernel: kernel
    param: padding: string, valid values 'valid' and 'same', 'valid' then not padding, output shape will be smaller than input
            image, 'same' padding the input image, output image will be same shape as input image
    param: mode: padding mode, please refere to numpy.pad. valid string values are 'constant', 'edge', 'linear_ramp', 'maximum', 
            'mean', 'median', 'minimum', 'reflect', 'symmetric', 'wrap'
    param: strides: strides of the slidding window, default is 1
    """
    image2col, result_shape = img2col(image, kernel.shape, padding=padding, mode=mode, strides=strides)
    kernel_flatten = kernel.reshape(1, -1)
    convolve = np.matmul(kernel_flatten, image2col)
    return convolve.reshape(result_shape[0], result_shape[1])

这是为了实现效果而写的方法

def visualize_input(img, ax):
    """
    add annotation to the image, values of each pixel
    param: img: input image
    param: ax: axes of the matplotlib
    """
    ax.imshow(img, cmap='gray', vmin=0, vmax=255)
    width, height = img.shape
    thresh = 100#img.max()/2.5
    for x in range(width):
        for y in range(height):
            ax.annotate(str(round(img[x][y],2)), xy=(y,x),
                        horizontalalignment='center',
                        verticalalignment='center',
                        color='white' if img[x][y]<thresh else 'black')
            
def visualize_kernel(img, ax):
    """
    add annotation to kernel w(0, 0)
    param: img: input image
    param: ax: axes of the matplotlib
    """
    ax.imshow(img, cmap='gray', vmin=0, vmax=255), ax.set_xticks([]), ax.set_yticks([])
    width, height = img.shape
    thresh = img.max()/2.5
    for x in range(width):
        for y in range(height):
            ax.annotate('w'+ str((x - 1, y - 1)), xy=(y,x),
                        horizontalalignment='center',
                        verticalalignment='center',
                        color='white' if img[x][y]<thresh else 'black')
            
def visualize_image(img, ax):
    """
    add annotation to kernel with f(x, y)
    param: img: input image
    param: ax: axes of the matplotlib
    """
    ax.imshow(img, cmap='gray', vmin=0, vmax=255), ax.set_xticks([]), ax.set_yticks([])
    width, height = img.shape
    thresh = img.max()/2.5
    for x in range(width):
        for y in range(height):
            ax.annotate('f'+ str(('x' + str(x - 1), 'y' + str(y - 1))), xy=(y,x),
                        horizontalalignment='center',
                        verticalalignment='center',
                        color='white' if img[x][y]<thresh else 'black')
# 线性空间滤波的原理图
height, width = 10, 10
img_ori = get_check(height, width, lower=230, upper=255)

grid_size = (3, 3)
m = grid_size[0]
n = grid_size[1]

img_ori[height//2-m:height//2, width//2-n:width//2] = img_ori[height//2-m:height//2, width//2-n:width//2] - 100
img_dst = img_ori[height//2-m:height//2, width//2-n:width//2]

fig = plt.figure(figsize=(15, 6))

plt.subplot(1, 3, 1), plt.imshow(img_ori, cmap='gray', vmin=0, vmax=255), plt.title(f'Convolution')
plt.xticks([]), plt.yticks([])

ax2 = fig.add_subplot(1, 3, 2)
ax2.set_title('Kernel'), visualize_kernel(img_dst, ax2)

ax3 = fig.add_subplot(1, 3, 3)
ax3.set_title('Image'), visualize_image(img_dst, ax3)

plt.tight_layout()
plt.show()

第3章 Python 数字图像处理(DIP) - 灰度变换与空间滤波12 - 空间域滤波基础 - 卷积运算(numpy 实现的三种卷积运算)_第1张图片

def visualize_show_annot(img_show, img_annot, ax):
    """
    add annotation to the image, values of each pixel
    param: img: input image
    param: ax: axes of the matplotlib
    """
    height, width = img_annot.shape
    img_show = img_show[:height, :width]
    ax.imshow(img_show, cmap='gray', vmin=0, vmax=255)
    thresh = 100 #img_show.max()/2.5
    for x in range(height):
        for y in range(width):
            ax.annotate(str(round(img_annot[x][y],2)), xy=(y,x),
                        horizontalalignment='center',
                        verticalalignment='center',
                        color='white' if img_annot[x][y]<thresh else 'black')
# 线性空间滤波的原理图,为了好看,原图像显示为灰色
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[2, 2] = 1

kernel = np.arange(1, 10).reshape(3, 3)
img_dst, img_pad = arithmentic_mean(img_ori, kernel)

fig = plt.figure(figsize=(10, 10))
# fig.subplots_adjust(left=None, bottom=None, right=None, top=None,wspace=0.1, hspace=0.1)
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])

height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_title('Padded f'), visualize_show_annot(img_show, img_pad, ax2), plt.xticks([]), plt.yticks([])

ax3 = fig.add_subplot(2, 2, 3)
ax3.set_title('Kernel'), visualize_show_annot(img_show, kernel, ax3), plt.xticks([]), plt.yticks([])

ax4 = fig.add_subplot(2, 2, 4)
ax4.set_title('Result'), visualize_show_annot(img_show, img_dst, ax4), plt.xticks([]), plt.yticks([])

plt.tight_layout()
plt.show()

第3章 Python 数字图像处理(DIP) - 灰度变换与空间滤波12 - 空间域滤波基础 - 卷积运算(numpy 实现的三种卷积运算)_第2张图片

# 旋转kernel得到的结果
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 190
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[2, 2] = 1

kernel = np.arange(1, 10)[::-1].reshape(3, 3)
img_dst, img_pad = arithmentic_mean(img_ori, kernel)

fig = plt.figure(figsize=(10, 10))
# fig.subplots_adjust(left=None, bottom=None, right=None, top=None,wspace=0.1, hspace=0.1)
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])

height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_title('Padded f'), visualize_show_annot(img_show, img_pad, ax2), plt.xticks([]), plt.yticks([])

ax3 = fig.add_subplot(2, 2, 3)
ax3.set_title('Kernel'), visualize_show_annot(img_show, kernel, ax3), plt.xticks([]), plt.yticks([])

ax4 = fig.add_subplot(2, 2, 4)
ax4.set_title('Result'), visualize_show_annot(img_show, img_dst, ax4), plt.xticks([]), plt.yticks([])

plt.tight_layout()
plt.show()

第3章 Python 数字图像处理(DIP) - 灰度变换与空间滤波12 - 空间域滤波基础 - 卷积运算(numpy 实现的三种卷积运算)_第3张图片

这是分离核与公式法的结果对比

# 可分离卷积核与全核卷积结果一样(上面的结果)
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[1:4, 2] = 1

# Separate kernel
w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
img_w_1, img_pad_1 = arithmentic_mean(img_ori, w_1)
img_w_2, img_pad_2 = arithmentic_mean(img_w_1, w_2.T)

fig = plt.figure(figsize=(15, 10))
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])

height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_title('img_pad_1'), visualize_show_annot(img_show, img_pad_1, ax2), plt.xticks([]), plt.yticks([])

ax3 = fig.add_subplot(2, 3, 3)
ax3.set_title('img_pad_2'), visualize_show_annot(img_show, img_pad_2, ax3), plt.xticks([]), plt.yticks([])

ax4 = fig.add_subplot(2, 3, 4)
ax4.set_title('img_w_1'), visualize_show_annot(img_show, img_w_1, ax4), plt.xticks([]), plt.yticks([])

ax5 = fig.add_subplot(2, 3, 5)
ax5.set_title('img_w_2'), visualize_show_annot(img_show, img_w_2, ax5), plt.xticks([]), plt.yticks([])

plt.tight_layout()
plt.show()

第3章 Python 数字图像处理(DIP) - 灰度变换与空间滤波12 - 空间域滤波基础 - 卷积运算(numpy 实现的三种卷积运算)_第4张图片

# 可分离卷积核与全核卷积结果一样(上面的结果)
# import time
height, width = 5, 5
# img_show = np.ones([height, width], dtype=np.uint8) * 180
# img_ori = np.zeros([height, width], dtype=np.uint8)
# img_ori[1:4, 2] = 1
img_ori = np.arange(1, height*width+1).reshape(height, width)

# Separate kernel
w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
img_w_1, img_pad_1 = arithmentic_mean(img_ori, w_1)
img_w_2, img_pad_2 = arithmentic_mean(img_w_1, w_2.T)

fig = plt.figure(figsize=(15, 10))
ax1 = fig.add_subplot(2, 3, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])

height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 3, 2)
ax2.set_title('img_pad_1'), visualize_show_annot(img_show, img_pad_1, ax2), plt.xticks([]), plt.yticks([])

ax3 = fig.add_subplot(2, 3, 3)
ax3.set_title('img_pad_2'), visualize_show_annot(img_show, img_pad_2, ax3), plt.xticks([]), plt.yticks([])

ax4 = fig.add_subplot(2, 3, 4)
ax4.set_title('img_w_1'), visualize_show_annot(img_show, img_w_1, ax4), plt.xticks([]), plt.yticks([])

ax5 = fig.add_subplot(2, 3, 5)
ax5.set_title('img_w_2'), visualize_show_annot(img_show, img_w_2, ax5), plt.xticks([]), plt.yticks([])

img_img2col = my_conv2d(img_ori, w, padding='same', mode='constant').astype(int)
ax6 = fig.add_subplot(2, 3, 6)
ax6.set_title('img_img2col'), visualize_show_annot(img_show, img_img2col, ax6), plt.xticks([]), plt.yticks([])

plt.tight_layout()
plt.show()

第3章 Python 数字图像处理(DIP) - 灰度变换与空间滤波12 - 空间域滤波基础 - 卷积运算(numpy 实现的三种卷积运算)_第5张图片

下面是各种卷积的运算时间的比较

可以看出可分离核,11x11大小的核,numpy的运算效率也是很高的。

# 全核卷积与分离核单独做卷积,没有优化,所以分离核时间比是全核卷积的一倍
# 这里只做展示,结果不重要
import time

height, width = 512, 512
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[1:4, 2] = 1

# Kernel
# w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
w = np.ones([11, 11])

# Loop convolution comsum more time, but not need very big memory
start_time = time.time()
img_dst = arithmentic_mean(img_ori, w)
elapse = time.time() - start_time
print(f'Full kernel covolution elapse -> {elapse}s')

# img2col, bit faster than loop, but need very more memory, if kernel is too big, cannot carry on
start_time = time.time()
img_dst = my_conv2d(img_ori, w)
elapse = time.time() - start_time
print(f'Img2col covolution elapse -> {elapse}s')

# Separable Kernel convolution optimized by Numpy, run very fast, but the kernel has to be separable
start_time = time.time()
img_dst = separate_kernel_conv2D(img_ori, w)
elapse = time.time() - start_time
print(f'Seperate kernel covolution optimized elapse -> {elapse}s')

# Separable kernel not optimized
start_time = time.time()
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
img_w_1, img_pad_1 = arithmentic_mean(img_ori, w_1)
img_w_2, img_pad_2 = arithmentic_mean(img_w_1, w_2.T)
elapse = time.time() - start_time
print(f'separate kernel covolution elapse -> {elapse}s')

# opencv run very fast, but not actual convolution
start_time = time.time()
img_dst = cv2.filter2D(img_ori, -1, w)
elapse = time.time() - start_time
print(f'Opencv elapse -> {elapse}s')
Full kernel covolution elapse -> 1.2127759456634521s
Img2col covolution elapse -> 0.584395170211792s
Seperate kernel covolution optimized elapse -> 0.009973526000976562s
separate kernel covolution elapse -> 2.176215887069702s
Opencv elapse -> 0.004025936126708984s

在接下来的文章也会有这几种卷积方法效果的对比

书上的一些习题

# 习题3.20
w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
print(w)
print()
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}\n")
c, r = kernel_seperate(w)
print('Two vector are below:')
print(f'vector c -> {c}, vector r -> {r}\n')
print(f'cv.T -> \n{c.reshape(-1, 1) * r.reshape(-1, 1).T}')
[[1 2 1]
 [2 4 2]
 [1 2 1]]

Kernel's rank is -> 1

Two vector are below:
vector c -> [1 2 1], vector r -> [1 2 1]

cv.T -> 
[[1 2 1]
 [2 4 2]
 [1 2 1]]
# 习题3.20
height, width = 5, 5
img_show = np.ones([height, width], dtype=np.uint8) * 180
img_ori = np.zeros([height, width], dtype=np.uint8)
img_ori[1:4, 2] = 1

w = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])
kernel = w
img_dst, img_pad = arithmentic_mean(img_ori, kernel)

fig = plt.figure(figsize=(10, 10))
# fig.subplots_adjust(left=None, bottom=None, right=None, top=None,wspace=0.1, hspace=0.1)
ax1 = fig.add_subplot(2, 2, 1)
ax1.set_title('Original f'), visualize_show_annot(img_show, img_ori, ax1), plt.xticks([]), plt.yticks([])

height, width = 7, 7
img_show = np.ones([height, width], dtype=np.uint8) * 180
ax2 = fig.add_subplot(2, 2, 2)
ax2.set_title('Padded f'), visualize_show_annot(img_show, img_pad, ax2), plt.xticks([]), plt.yticks([])

ax3 = fig.add_subplot(2, 2, 3)
ax3.set_title('Kernel'), visualize_show_annot(img_show, kernel, ax3), plt.xticks([]), plt.yticks([])

ax4 = fig.add_subplot(2, 2, 4)
ax4.set_title('Result'), visualize_show_annot(img_show, img_dst, ax4), plt.xticks([]), plt.yticks([])

plt.tight_layout()
plt.show()

第3章 Python 数字图像处理(DIP) - 灰度变换与空间滤波12 - 空间域滤波基础 - 卷积运算(numpy 实现的三种卷积运算)_第6张图片

# 习题3.22
# (a)
v = np.array([[1], [2], [1]])
w_T = np.array([2, 1, 1, 3])
w = v * w_T.T
print(w)
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}")
# rank = 1,所以是可分离的
[[2 1 1 3]
 [4 2 2 6]
 [2 1 1 3]]
Kernel's rank is -> 1
# 习题3.22
# (b)
w = np.array([[1, 3, 1], [2, 6, 2]])
print(w)
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}")
w_1, w_2 = kernel_seperate(w)
w_r = w_1.reshape(-1, 1) * w_2.reshape(-1, 1).T
print(w == w_r)
[[1 3 1]
 [2 6 2]]
Kernel's rank is -> 1
[[ True  True  True]
 [ True  True  True]]
# 习题3.22
# (c)
w = np.array([[2, 1, 1, 3],
              [4, 2, 2, 6],
              [2, 1, 1, 3]])
print(w)
rank = np.linalg.matrix_rank(w)
print(f"Kernel's rank is -> {rank}")
w_1, w_2 = kernel_seperate(w)
w_1 = w_1.reshape(-1, 1)
w_2 = w_2.reshape(-1, 1)
w_r = w_1 * w_2.T
print(w == w_r)
[[2 1 1 3]
 [4 2 2 6]
 [2 1 1 3]]
Kernel's rank is -> 1
[[ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]]

你可能感兴趣的:(#,第3章,-,灰度变换与空间滤波,python,卷积,numpy,算法,图像处理)