TensorFlow Deep Learning, Part 26: Atrous Convolution

  Computes a 2-D atrous convolution, also known as convolution with holes or dilated convolution, given 4-D `value` and `filters` tensors. If the `rate` parameter is equal to one, it performs regular 2-D convolution. If the `rate` parameter is greater than one, it performs convolution with holes, sampling the input values every `rate` pixels in the `height` and `width` dimensions. This is equivalent to convolving the input with a set of upsampled filters, produced by inserting `rate - 1` zeros between two consecutive values of the filters along the `height` and `width` dimensions, hence the name atrous convolution or convolution with holes (the French word trous means holes in English).
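
The zero insertion described above can be sketched in NumPy. The helper name `dilate_kernel` is hypothetical (it is not a TensorFlow function), and the sketch only handles a single 2-D filter:

```python
import numpy as np

def dilate_kernel(kernel, rate):
    """Insert `rate - 1` zeros between neighboring values of a 2-D kernel."""
    kernel = np.asarray(kernel, dtype=np.float32)
    h, w = kernel.shape
    dilated = np.zeros((rate * (h - 1) + 1, rate * (w - 1) + 1), dtype=np.float32)
    # The original filter values land on every `rate`-th row and column.
    dilated[::rate, ::rate] = kernel
    return dilated

k = np.arange(1, 10).reshape(3, 3)
print(dilate_kernel(k, 2))  # a 5 x 5 array with zeros between the original values
```

With `rate=1` the helper returns the kernel unchanged, which matches the statement that a rate of one gives a regular 2-D convolution.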


  Atrous convolution allows us to explicitly control how densely to compute feature responses in fully convolutional networks. Used in conjunction with bilinear interpolation, it offers an alternative to conv2d_transpose in dense prediction tasks such as semantic image segmentation, optical flow computation, or depth estimation. It also allows us to effectively enlarge the field of view of filters without increasing the number of parameters or the amount of computation.


  For a description of atrous convolution and how it can be used for dense feature extraction, please see: [Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs](http://arxiv.org/abs/1412.7062). The same operation is investigated further in [Multi-Scale Context Aggregation by Dilated Convolutions](http://arxiv.org/abs/1511.07122). Previous works that effectively use atrous convolution in different ways are, among others, [OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks](http://arxiv.org/abs/1312.6229) and [Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks](http://arxiv.org/abs/1302.1700). Atrous convolution is also closely related to the so-called noble identities in multi-rate signal processing.

[Figure 1: Standard convolution with a 3 x 3 kernel (and padding)]

[Figure 2: Dilated convolution with a 3 x 3 kernel and dilation rate 2]


After the holes are inserted, a kernel of size height x width grows to:

$$\text{new\_height} = \text{rate} \times (\text{height} - 1) + 1$$

$$\text{new\_width} = \text{rate} \times (\text{width} - 1) + 1$$
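
As a quick arithmetic check of the size formula `rate * (size - 1) + 1`, the sketch below evaluates it for a 3 x 3 kernel. The function name `dilated_size` is made up for this example:

```python
def dilated_size(size, rate):
    """Size of one kernel dimension after inserting rate - 1 zeros between values."""
    return rate * (size - 1) + 1

for rate in (1, 2, 3):
    print(rate, dilated_size(3, rate))
# 1 3
# 2 5
# 3 7
```

The same numbers follow from the equivalent form `size + (size - 1) * (rate - 1)` used in the TensorFlow parameter description, for example 3 + 2 * 1 = 5 for rate 2.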



conv2d(input, atrous_kernel, [1, 1, 1, 1], 'VALID')
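# Notes:
# input is the tensor or feature map to be convolved.
# atrous_kernel is the dilated kernel described above, with the holes already inserted.
# strides must be [1, 1, 1, 1] here.
# padding must also be 'VALID' here.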
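
A NumPy-only sketch of this 'VALID' case, for a single-channel 2-D input, is given below. `conv2d_valid` and `atrous_valid` are hypothetical helpers written for illustration; like `tf.nn.conv2d`, they compute a cross-correlation (the kernel is not flipped). The sketch checks that sampling the input every `rate` pixels gives the same result as an ordinary convolution with the dilated kernel:

```python
import numpy as np

def conv2d_valid(x, k):
    """Stride-1 'VALID' cross-correlation of 2-D arrays x and k."""
    H, W = x.shape
    h, w = k.shape
    out = np.zeros((H - h + 1, W - w + 1), dtype=np.float64)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + h, j:j + w] * k)
    return out

def atrous_valid(x, k, rate):
    """Atrous 'VALID' convolution that samples the input every `rate` pixels."""
    H, W = x.shape
    h, w = k.shape
    eh, ew = rate * (h - 1) + 1, rate * (w - 1) + 1  # dilated kernel size
    out = np.zeros((H - eh + 1, W - ew + 1), dtype=np.float64)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + eh:rate, j:j + ew:rate] * k)
    return out

x = np.arange(81, dtype=np.float64).reshape(9, 9)
k = np.arange(9, dtype=np.float64).reshape(3, 3)
rate = 3

# Build the dilated kernel by placing the original values every `rate` pixels.
dilated = np.zeros((7, 7))
dilated[::rate, ::rate] = k

print(atrous_valid(x, k, rate).shape)  # (3, 3)
print(np.allclose(atrous_valid(x, k, rate), conv2d_valid(x, dilated)))  # True
```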



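With 'SAME' padding, the output must keep the input height $H$. If $x$ rows of zeros are added in total and the original kernel height is $h$, then:

$$H + x - (\text{rate} \times (h - 1) + 1) + 1 = H$$

$$x = \text{rate} \times (h - 1)$$

   Here $x$ is the total number of zero rows to add above and below the input along the height dimension. When $\text{rate} \times (h - 1)$ is even, the same number of rows is added above and below; when it is odd, one fewer row is added above than below. The width dimension is handled the same way.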
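
The split of the padding between the two sides can be written as a short sketch. The helper name `same_padding` is hypothetical, and stride 1 is assumed:

```python
def same_padding(kernel_size, rate):
    """Return (before, after) zero counts for one spatial dimension at stride 1."""
    total = rate * (kernel_size - 1)  # x in the derivation above
    before = total // 2               # rows above (or columns to the left)
    after = total - before            # rows below (or columns to the right)
    return before, after

print(same_padding(3, 3))  # (3, 3): total 6, split evenly
print(same_padding(2, 3))  # (1, 2): total 3, one fewer row on top
```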



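conv2d(input_after_padding_zero, atrous_kernel, [1, 1, 1, 1], 'VALID')
# Notes:
# input_after_padding_zero is the tensor or feature map after the zeros described above have been added.
# atrous_kernel is the dilated kernel described above, with the holes already inserted.
# strides must be [1, 1, 1, 1] here.
# padding must also be 'VALID' here.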
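
The zero-padding step itself can be sketched with `np.pad`. The helper `pad_for_same` is a hypothetical name, and the sketch assumes a single-channel 2-D input and stride 1:

```python
import numpy as np

def pad_for_same(x, kernel_shape, rate):
    """Zero-pad a 2-D array so that a stride-1 'VALID' convolution with the
    dilated kernel returns an output of the original size."""
    pads = []
    for size in kernel_shape:
        total = rate * (size - 1)
        pads.append((total // 2, total - total // 2))
    return np.pad(x, pads, mode='constant', constant_values=0)

x = np.ones((9, 9), dtype=np.float32)
padded = pad_for_same(x, (3, 3), rate=3)
print(padded.shape)  # (15, 15)
# 'VALID' output size with the 7 x 7 dilated kernel: 15 - 7 + 1 = 9
```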


def atrous_conv2d(value, filters, rate, padding, name=None)
| Parameter | Description |
| --- | --- |
| value | A 4-D `Tensor` of type `float`. It needs to be in the default "NHWC" format. Its shape is `[batch, in_height, in_width, in_channels]`. This is the input feature map. |
| filters | A 4-D `Tensor` with the same type as `value` and shape `[filter_height, filter_width, in_channels, out_channels]`. The `in_channels` dimension of `filters` must match that of `value`. Atrous convolution is equivalent to standard convolution with upsampled filters with effective height `filter_height + (filter_height - 1) * (rate - 1)` and effective width `filter_width + (filter_width - 1) * (rate - 1)`, produced by inserting `rate - 1` zeros along consecutive elements across the filters' spatial dimensions. This is the convolution kernel. |
| rate | A positive `int32`. The stride with which we sample input values across the `height` and `width` dimensions. Equivalently, the rate by which we upsample the filter values by inserting zeros across the `height` and `width` dimensions. In the literature, the same parameter is sometimes called input stride or dilation. It controls how many holes are inserted (`rate - 1` between adjacent filter values). |
| padding | A string, either `'VALID'` or `'SAME'`. The padding algorithm. |
| name | Optional name for the returned tensor. |
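
To make the shape conventions above concrete, here is a small pure-Python sketch that computes the expected output shape from the argument shapes. `atrous_output_shape` is a hypothetical helper, not part of TensorFlow; it assumes the "NHWC" layout and the stride-1 behavior described in this post:

```python
def atrous_output_shape(value_shape, filter_shape, rate, padding):
    """Expected output shape for NHWC input and HWIO filters at stride 1."""
    batch, in_h, in_w, in_c = value_shape
    f_h, f_w, f_in_c, out_c = filter_shape
    if f_in_c != in_c:
        raise ValueError("filters' in_channels must match value's in_channels")
    if padding.upper() == 'SAME':
        return (batch, in_h, in_w, out_c)
    if padding.upper() == 'VALID':
        eff_h = f_h + (f_h - 1) * (rate - 1)  # effective filter height
        eff_w = f_w + (f_w - 1) * (rate - 1)  # effective filter width
        return (batch, in_h - eff_h + 1, in_w - eff_w + 1, out_c)
    raise ValueError("padding must be 'VALID' or 'SAME'")

print(atrous_output_shape((1, 9, 9, 1), (3, 3, 1, 1), 3, 'VALID'))  # (1, 3, 3, 1)
print(atrous_output_shape((1, 9, 9, 1), (3, 3, 1, 1), 3, 'SAME'))   # (1, 9, 9, 1)
```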



import tensorflow as tf
import numpy as np

# Both `tensor` and `kernel` are expected to be 2-D:
# len(tensor.shape) == 2
# len(kernel.shape) == 2

def my_atrous_conv2d(tensor, kernel, rate, padding):
    tensor = np.array(tensor, dtype=np.float32)
    kernel = np.array(kernel, dtype=np.float32)

    shape = kernel.shape

    # Allocate a matrix to hold the kernel after the holes are inserted
    atrous_kernel = np.zeros(shape=[rate * (shape[0] - 1) + 1, rate * (shape[1] - 1) + 1], dtype=np.float32)

    # Copy each element of the original kernel into its dilated position
    for i, line in enumerate(kernel):
        for j, number in enumerate(line):
            atrous_kernel[i * rate, j * rate] = number

    # When padding='VALID':
    if padding.lower() == 'valid':
        tensor = np.expand_dims(tensor, axis=-1)
        tensor = np.expand_dims(tensor, axis=0)

        atrous_kernel = np.expand_dims(atrous_kernel, axis=-1)
        atrous_kernel = np.expand_dims(atrous_kernel, axis=-1)

        # Apply an ordinary convolution directly
        conv = tf.nn.conv2d(tensor, atrous_kernel, [1, 1, 1, 1], 'VALID')
        return conv

    # When padding='SAME':
    else:
        atrous_kernel = np.expand_dims(atrous_kernel, axis=-1)
        atrous_kernel = np.expand_dims(atrous_kernel, axis=-1)

        # Zero-pad the top and bottom (rows); the amount depends on the kernel height
        pad_h = rate * (kernel.shape[0] - 1)
        up = np.zeros(shape=[pad_h // 2, tensor.shape[1]], dtype=np.float32)
        bottom = np.zeros(shape=[pad_h - pad_h // 2, tensor.shape[1]], dtype=np.float32)

        tensor = np.concatenate((up, tensor, bottom), axis=0)

        # Zero-pad the left and right (columns); the amount depends on the kernel width
        pad_w = rate * (kernel.shape[1] - 1)
        left = np.zeros(shape=[tensor.shape[0], pad_w // 2], dtype=np.float32)
        right = np.zeros(shape=[tensor.shape[0], pad_w - pad_w // 2], dtype=np.float32)

        tensor = np.concatenate((left, tensor, right), axis=1)

        tensor = np.expand_dims(tensor, axis=-1)
        tensor = np.expand_dims(tensor, axis=0)

        # Finally, return the convolution result
        return tf.nn.conv2d(tensor, atrous_kernel, [1, 1, 1, 1], 'VALID')

def tf_atrous_conv2d(tensor, kernel, rate, padding):
    tensor = np.array(tensor, dtype=np.float32)
    kernel = np.array(kernel, dtype=np.float32)

    tensor = np.expand_dims(tensor, 0)
    tensor = np.expand_dims(tensor, -1)

    kernel = np.expand_dims(kernel, -1)
    kernel = np.expand_dims(kernel, -1)

    return tf.nn.atrous_conv2d(tensor, kernel, rate, padding)

if __name__ == '__main__':
    a = np.arange(81)
    a = np.reshape(a, [9, 9])
    k = np.arange(99, 99+9)
    k = np.reshape(k, [3, 3])

    atrous_conv1 = my_atrous_conv2d(a, k, 3, 'SAME')
    atrous_conv2 = tf_atrous_conv2d(a, k, 3, 'SAME')

    # tf.Session and Tensor.eval() are TensorFlow 1.x graph-mode APIs.
    with tf.Session() as sess1:
        a1 = atrous_conv1.eval().reshape([9, 9])
        a2 = atrous_conv2.eval().reshape([9, 9])
        print(a1)
        print(a2)
        print(a1 == a2)


    a = np.random.random([9, 9]).astype(np.float32)
    k = np.random.random([3, 3]).astype(np.float32)

    atrous_conv1 = my_atrous_conv2d(a, k, 3, 'SAME')
    atrous_conv2 = tf_atrous_conv2d(a, k, 3, 'SAME')

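    with tf.Session() as sess2:
        a1 = atrous_conv1.eval().reshape([9, 9])
        a2 = atrous_conv2.eval().reshape([9, 9])
        print(a1)
        print(a2)
        # Compare with a symmetric tolerance; floating-point results differ in the last digits
        print(np.abs(a1 - a2) < 1e-6)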


[[16774. 15349. 17182. 18997. 22688. 25712. 12038. 16006. 15595.]
 [10372. 23398. 22395. 14290. 32495. 24145.  9781. 19842. 16505.]
 [15025. 23846.  5930. 26586. 25956. 15106. 14836. 16754. 11041.]
 [24527. 30590. 28034. 33654. 44940. 37008. 20302. 30919. 20154.]
 [17262. 29459. 30172. 26122. 44556. 35347. 18303. 30932. 27821.]
 [22972. 36041. 16298. 38027. 41745. 31230. 22169. 28649. 22251.]
 [13632. 18299. 16718. 22239. 28881. 24208. 14944. 19558. 12480.]
 [12129. 17634. 17297. 18678. 26119. 22487. 12831. 17893. 19671.]
 [13000. 25086. 15062. 22432. 29108. 28309. 14388. 19006. 20159.]]
[[16774. 15349. 17182. 18997. 22688. 25712. 12038. 16006. 15595.]
 [10372. 23398. 22395. 14290. 32495. 24145.  9781. 19842. 16505.]
 [15025. 23846.  5930. 26586. 25956. 15106. 14836. 16754. 11041.]
 [24527. 30590. 28034. 33654. 44940. 37008. 20302. 30919. 20154.]
 [17262. 29459. 30172. 26122. 44556. 35347. 18303. 30932. 27821.]
 [22972. 36041. 16298. 38027. 41745. 31230. 22169. 28649. 22251.]
 [13632. 18299. 16718. 22239. 28881. 24208. 14944. 19558. 12480.]
 [12129. 17634. 17297. 18678. 26119. 22487. 12831. 17893. 19671.]
 [13000. 25086. 15062. 22432. 29108. 28309. 14388. 19006. 20159.]]
[[ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]]
[[0.85756415 0.97809374 0.6221388  0.47452182 1.1971059  1.4956675
  0.6211113  0.6013928  1.2532485 ]
 [1.3485698  0.9727119  1.4426544  1.2280183  1.2285496  1.2395221
  0.69691694 0.9009933  0.9230705 ]
 [1.0359509  0.3736195  1.0340105  0.7417851  0.95010257 0.73275506
  0.8190989  0.3943194  0.7381625 ]
 [1.4999393  1.8689545  2.3845592  1.604199   2.3503845  2.3006964
  0.33571613 1.4319618  1.351671  ]
 [2.6279428  2.2722342  1.7000405  1.9155477  1.7234559  2.4140751
  0.7227813  0.7500137  1.5047629 ]
 [1.5855988  1.7331333  1.0298394  1.8303206  1.6153543  1.8850565
  1.0260942  1.2247385  0.9588959 ]
 [0.5739771  1.4931052  0.47283304 0.4867544  1.1949276  1.5942652
  0.550773   0.6214407  1.1143594 ]
 [0.9355327  1.0806943  1.419316   1.126513   1.2977234  1.4669281
  0.5325237  0.8319793  0.85846186]
 [0.86645645 0.5750332  0.7152909  1.0724365  0.5485303  0.91686004
  0.77844644 0.42520177 0.64212894]]
[[0.85756415 0.97809374 0.6221388  0.4745218  1.1971059  1.4956675
  0.6211113  0.6013928  1.2532485 ]
 [1.3485698  0.9727119  1.4426544  1.2280183  1.2285496  1.2395222
  0.69691694 0.9009933  0.9230705 ]
 [1.0359509  0.3736195  1.0340106  0.7417851  0.9501025  0.73275506
  0.8190989  0.3943194  0.7381625 ]
 [1.4999394  1.8689545  2.3845594  1.604199   2.3503845  2.3006964
  0.33571613 1.4319618  1.351671  ]
 [2.6279428  2.2722342  1.7000405  1.9155477  1.7234559  2.4140751
  0.7227813  0.7500137  1.5047629 ]
 [1.585599   1.7331333  1.0298394  1.8303206  1.6153543  1.8850565
  1.0260942  1.2247385  0.9588959 ]
 [0.5739771  1.4931052  0.47283304 0.48675442 1.1949276  1.5942653
  0.550773   0.6214407  1.1143594 ]
 [0.9355327  1.0806943  1.419316   1.126513   1.2977233  1.4669281
  0.53252375 0.8319793  0.85846186]
 [0.86645645 0.5750332  0.7152909  1.0724365  0.54853034 0.9168601
  0.7784464  0.42520177 0.64212894]]
[[ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True  True  True]]
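
The two floating-point matrices above agree only up to rounding in the last digit, which is why a tolerance is used instead of `==`. The sketch below takes two entries copied from those printed matrices (stored here as float64, an assumption made for the illustration) and compares them in several ways:

```python
import numpy as np

# Two entries copied from the printed matrices above, where the last digits differ.
a1 = np.array([0.47452182, 1.2395221])
a2 = np.array([0.4745218, 1.2395222])

print(a1 == a2)                        # exact comparison is False for these pairs
print(a1 - a2 < 1e-6)                  # one-sided: would also pass if a1 were far below a2
print(np.abs(a1 - a2) < 1e-6)          # symmetric elementwise tolerance
print(np.allclose(a1, a2, atol=1e-6))  # single boolean summary
```

Using the absolute difference (or `np.allclose`) avoids the one-sided test passing by accident when one result is much smaller than the other.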

