[CS131 Computer Vision] Understanding Convolution in Image Processing, with a Python Implementation

Author: Chris_yg


1. The 2D convolution formula and its properties


$$g(m,n) = f * h = \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} f(k,l)\, h(m-k,\, n-l)$$


  1. Commutativity: $f * h = h * f$
  2. Associativity: $f * (g * h) = (f * g) * h$
  3. Distributivity: $f * (g + h) = f * g + f * h$
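
The three properties can be checked numerically. The following is a minimal sketch, not part of the original post: `conv2d_full` is a hypothetical helper that evaluates the definition above for finite arrays (values outside the arrays are treated as zero), and the array sizes are arbitrary.

```python
import numpy as np

def conv2d_full(f, h):
    """Full 2D convolution: each f[k, l] adds a shifted, scaled copy of h.

    Hypothetical helper for illustration; output shape is (M1+M2-1, N1+N2-1).
    """
    M1, N1 = f.shape
    M2, N2 = h.shape
    g = np.zeros((M1 + M2 - 1, N1 + N2 - 1))
    for k in range(M1):
        for l in range(N1):
            # f(k, l) * h(m-k, n-l) contributes to every (m, n) = (k+a, l+b)
            g[k:k + M2, l:l + N2] += f[k, l] * h
    return g

rng = np.random.default_rng(0)
f = rng.random((5, 6))
g = rng.random((3, 3))
h = rng.random((3, 3))

commutative = np.allclose(conv2d_full(f, h), conv2d_full(h, f))
associative = np.allclose(conv2d_full(f, conv2d_full(g, h)),
                          conv2d_full(conv2d_full(f, g), h))
distributive = np.allclose(conv2d_full(f, g + h),
                           conv2d_full(f, g) + conv2d_full(f, h))
print(commutative, associative, distributive)
```

All three checks should hold up to floating-point tolerance, which is why `np.allclose` is used rather than exact equality.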


(1) Computing directly from the definition, which takes 4 nested loops:

Let f have size $(M_1, N_1)$ and h have size $(M_2, N_2)$. The convolution can then be written as:

$$g(m,n) = f * h = h * f = \sum_{k=0}^{M_2-1} \sum_{l=0}^{N_2-1} h(k,l)\, f(m-k,\, n-l)$$

where $0 \le m < M_1 + M_2 - 1$ and $0 \le n < N_1 + N_2 - 1$, and $f(m-k,\, n-l)$ is taken to be 0 whenever its indices fall outside the image.

import numpy as np

def conv_nested(image, kernel):
    """A naive implementation of convolution filter.

    This is a naive implementation of convolution using 4 nested for-loops.
    This function computes convolution of an image with a kernel and outputs
    the result that has the same shape as the input image.

    Args:
        image: numpy array of shape (Hi, Wi)
        kernel: numpy array of shape (Hk, Wk)

    Returns:
        out: numpy array of shape (Hi, Wi)
    """
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    out = np.zeros((Hi, Wi))

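    # Compute the "full" convolution, of shape (Hi+Hk-1, Wi+Wk-1).
    temp_m = np.zeros((Hi+Hk-1, Wi+Wk-1))
    for i in range(Hi+Hk-1):
        for j in range(Wi+Wk-1):
            temp = 0
            # The kernel is usually much smaller than the image, and convolution
            # is commutative, so sum over the kernel indices (h*f instead of f*h).
            for m in range(Hk):
                for n in range(Wk):
                    # Skip terms whose image index falls outside the image.
                    if (i-m) >= 0 and (i-m) < Hi and (j-n) >= 0 and (j-n) < Wi:
                        temp += image[i-m][j-n] * kernel[m][n]
            temp_m[i][j] = temp

    # Crop the "same" result (output size equals input size).
    for i in range(Hi):
        for j in range(Wi):
            out[i][j] = temp_m[i + (Hk-1)//2][j + (Wk-1)//2]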

    return out

(2) Rotate the kernel by 180°, zero-pad the original image, then slide the kernel over the image and take a weighted sum at each position:


def zero_pad(image, pad_height, pad_width):
    """ Zero-pad an image.

    Ex: a 1x1 image [[1]] with pad_height = 1, pad_width = 2 becomes:

        [[0, 0, 0, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]         of shape (3, 5)

    Args:
        image: numpy array of shape (H, W)
        pad_width: width of the zero padding (left and right padding)
        pad_height: height of the zero padding (bottom and top padding)

    Returns:
        out: numpy array of shape (H+2*pad_height, W+2*pad_width)
    """

    H, W = image.shape

    out = np.zeros((H+2*pad_height, W+2*pad_width))
    out[pad_height:pad_height+H, pad_width:pad_width+W] = image

    return out
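
As a cross-check on zero_pad, the same padding can be produced with NumPy's built-in `np.pad`. This is a small sketch rather than part of the original post; `zero_pad_np` is a hypothetical name. One difference worth noting: `np.pad` keeps the input dtype, while zero_pad above always returns a float array because it starts from `np.zeros`.

```python
import numpy as np

def zero_pad_np(image, pad_height, pad_width):
    """Zero-pad a 2D array with np.pad (hypothetical helper for illustration)."""
    # One (before, after) pair per axis: rows first, then columns.
    return np.pad(image,
                  ((pad_height, pad_height), (pad_width, pad_width)),
                  mode="constant", constant_values=0)

# The docstring example: a 1x1 image with pad_height = 1, pad_width = 2.
padded = zero_pad_np(np.array([[1]]), 1, 2)
print(padded.shape)  # (3, 5)
print(padded)
```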

def conv_fast(image, kernel):
    """ An efficient implementation of convolution filter.

    This function uses element-wise multiplication and np.sum()
    to efficiently compute the weighted sum of the neighborhood at each
    pixel.

    Hints:
        - Use the zero_pad function you implemented above
        - There should be two nested for-loops
        - You may find np.flip() and np.sum() useful

    Args:
        image: numpy array of shape (Hi, Wi)
        kernel: numpy array of shape (Hk, Wk)

    Returns:
        out: numpy array of shape (Hi, Wi)
    """
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    out = np.zeros((Hi, Wi))

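    # Pad so that a kernel-sized window can be centered on every image pixel.
    pad_height = Hk // 2
    pad_width = Wk // 2
    image_padding = zero_pad(image, pad_height, pad_width)
    # Flip along both axes, i.e. rotate the kernel by 180 degrees.
    kernel_flip = np.flip(np.flip(kernel, 0), 1)

    for i in range(Hi):
        for j in range(Wi):
            # Weighted sum of the window whose top-left corner is (i, j) in the padded image.
            out[i][j] = np.sum(np.multiply(kernel_flip, image_padding[i:(i+Hk), j:(j+Wk)]))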

    return out
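
The two remaining Python loops in conv_fast can also be removed. The sketch below is one possible variant, not the author's method: it assumes NumPy 1.20 or later for `sliding_window_view`, and `conv_same_vectorized` is a hypothetical name. It follows the same steps as conv_fast (zero-pad, rotate the kernel by 180°, weighted sum per window) but gathers all windows at once.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_same_vectorized(image, kernel):
    """'Same'-size 2D convolution without explicit Python loops (illustrative sketch)."""
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    ph, pw = Hk // 2, Wk // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    # windows[i, j] is the (Hk, Wk) patch whose top-left corner is (i, j).
    windows = sliding_window_view(padded, (Hk, Wk))
    kernel_flip = kernel[::-1, ::-1]  # rotate by 180 degrees
    out = np.einsum("ijkl,kl->ij", windows, kernel_flip)
    # For even kernel sizes there is one extra row/column of windows; drop it.
    return out[:Hi, :Wi]

# Convolving a unit impulse reproduces the kernel around the impulse.
image = np.zeros((5, 5))
image[2, 2] = 1
kernel = np.arange(9).reshape(3, 3)
out = conv_same_vectorized(image, kernel)
print(out[1:4, 1:4])
```

The impulse test is a handy sanity check for any convolution routine: if the kernel comes back flipped, the 180° rotation was skipped and the code is computing cross-correlation instead.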

(3) Using the Fourier transform


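$$\mathcal{F}(f * h) = \mathcal{F}(f) \cdot \mathcal{F}(h)$$

$$f * h = \mathcal{F}^{-1}\big(\mathcal{F}(f) \cdot \mathcal{F}(h)\big)$$

where $\mathcal{F}$ denotes the Fourier transform and $\mathcal{F}^{-1}$ the inverse Fourier transform.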
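
The post gives no code for this third approach, so the following is only a minimal sketch under stated assumptions: it uses NumPy's discrete FFT (`np.fft.fft2` / `np.fft.ifft2`), `conv_fft` is a hypothetical name, and the "same" crop uses the offset (Hk-1)//2 as in conv_nested. Because the DFT computes a circular convolution, both inputs are zero-padded to the full output size (Hi+Hk-1, Wi+Wk-1) so that the result matches the linear convolution.

```python
import numpy as np

def conv_fft(image, kernel):
    """'Same'-size 2D convolution via the convolution theorem (illustrative sketch)."""
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    full_shape = (Hi + Hk - 1, Wi + Wk - 1)
    # s=full_shape zero-pads each input before transforming.
    F_image = np.fft.fft2(image, s=full_shape)
    F_kernel = np.fft.fft2(kernel, s=full_shape)
    # Pointwise product in the frequency domain, then back to the spatial domain.
    full = np.real(np.fft.ifft2(F_image * F_kernel))
    # Crop the central region so the output has the same shape as the image.
    top, left = (Hk - 1) // 2, (Wk - 1) // 2
    return full[top:top + Hi, left:left + Wi]

# Convolving a unit impulse should reproduce the kernel around the impulse.
image = np.zeros((5, 5))
image[2, 2] = 1
kernel = np.arange(9, dtype=float).reshape(3, 3)
out = conv_fft(image, kernel)
print(np.allclose(out[1:4, 1:4], kernel))
```

The result agrees with the spatial-domain versions only up to floating-point error, so comparisons should use `np.allclose`. For small kernels the loop or window-based versions are typically competitive; the FFT route tends to pay off as the kernel grows.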
