Paper Reading Notes: Classic Papers - Visualizing and Understanding Convolutional Networks

Paper link
Implementation code
Related commentary (in Chinese)

Reading goals:

How does the paper visualize the network?
What did the authors learn by visualizing the network?

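I. Introduction

The related work section introduces two concepts: Visualization and Feature Generalization.
I previously wrote a post on feature visualization, and that approach is mentioned here. Its drawback is that it "requires a careful initialization and does not give any information about the unit's invariances."
Note the phrase "unit's invariances": taken literally, it means the invariances of a unit, i.e. the changes to the input that leave the unit's response unchanged.
The authors summarize what distinguishes their own visualizations as follows: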

they are not just crops of input images, but rather top-down projections that reveal structures within each patch that stimulate a particular feature map.

Feature generalization refers to:
the generalization ability of convnet features

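II. Approach

(1) Model description

The paper uses standard fully supervised convnet models.
Color 2D input image -> probabilities over C classes
Each layer consists of:
(i) a convolution layer
(ii) a relu layer: a rectified linear function (relu(x) = max(x, 0))
(iii) [optionally] a max pooling layer
(iv) [optionally] local normalization: a local contrast operation that normalizes the responses across feature maps
The top few layers (those closest to the output) are conventional fully connected layers, and the final layer is a softmax classifier.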
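
To make the layer structure above concrete, here is a minimal NumPy sketch of one convolution -> relu -> max pooling stage. It is only an illustration under simplifying assumptions: a single input channel, a single filter, no bias, no local contrast normalization, and non-overlapping 2x2 pooling. The function names (conv2d_valid, max_pool, layer_forward) are my own and do not come from the paper or its code.

```python
import numpy as np

def conv2d_valid(x, w):
    """Slide filter w over 2D map x (cross-correlation, 'valid' region only)."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def relu(x):
    """Rectified linear function: relu(x) = max(x, 0)."""
    return np.maximum(x, 0)

def max_pool(x, size=2):
    """Non-overlapping max pooling; assumes both dimensions divide by size."""
    h, w = x.shape
    return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))

def layer_forward(x, w):
    """One simplified layer: convolution, then relu, then max pooling."""
    return max_pool(relu(conv2d_valid(x, w)))
```

A real convnet layer would have many filters and sum over many input channels; the sketch keeps one of each so that the order of operations stays visible.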

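(2) Training procedure

Training set: {x, y} (images x with their class labels y)
Loss function: cross-entropy loss function
It compares the predicted ŷ_i with the true label y_i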
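
As a small illustration of how the cross-entropy loss compares ŷ_i with y_i, the sketch below computes it for one example with NumPy. It assumes the network outputs a vector of raw class scores (logits) and that y_i is an integer class index; the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Turn raw class scores into probabilities (shifted for stability)."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the true class."""
    return -np.log(probs[true_class])

# Example: three classes, the true class is index 2.
logits = np.array([1.0, 0.5, 3.0])
y_hat = softmax(logits)
loss = cross_entropy(y_hat, 2)
```

The loss is small when ŷ_i puts most of its probability on the true class and grows as that probability shrinks.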
Description of the training process:
(This sentence is so well written that I simply copied it over.)

The parameters of the network (filters in the convolutional layers, weight matrices in the fully-connected layers and biases) are trained by back-propagating the derivative of the loss with respect to the parameters throughout the network, and updating the parameters via stochastic gradient descent.
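
The quoted sentence can be sketched as a single stochastic gradient descent step. To keep it short, the example below assumes the "network" is just one fully connected layer followed by softmax, so the gradient can be written by hand; the learning rate and all names are assumptions for illustration, not values from the paper.

```python
import numpy as np

def softmax(logits):
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

def loss_and_grads(W, b, x, y):
    """Cross-entropy loss of softmax(W @ x + b) and its gradients."""
    probs = softmax(W @ x + b)
    loss = -np.log(probs[y])
    dlogits = probs.copy()
    dlogits[y] -= 1.0              # derivative of the loss w.r.t. the logits
    dW = np.outer(dlogits, x)      # back-propagated to the weight matrix
    db = dlogits                   # back-propagated to the biases
    return loss, dW, db

def sgd_step(W, b, x, y, lr=0.1):
    """Update the parameters using the gradient from a single example."""
    loss, dW, db = loss_and_grads(W, b, x, y)
    return W - lr * dW, b - lr * db, loss
```

In a deep convnet the same idea applies layer by layer: the derivative is propagated backwards through each layer, and every filter, weight matrix and bias is moved a small step against its gradient.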

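(3) Visualization method

Goal: understand the feature activity in the intermediate layers
Approach in brief: map these activities back to the input pixel space, using a deconvolutional network (deconvnet) to perform the mapping
A deconvnet can be thought of as a convnet that uses the same components (filtering, pooling) but in reverse, so it maps features to pixels instead of pixels to features
Details:
A deconvnet is attached to every layer of the convnet.
To examine a given convnet activation, we set all other activations in that layer to 0, then pass the feature maps as input to the attached deconvnet layer.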
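
The "set all other activations to 0" step can be sketched as follows. This assumes the layer's output is stored as a NumPy array of shape (channels, height, width), and that we want to examine the strongest activation in one chosen feature map; both the layout and the function names are assumptions made for the example.

```python
import numpy as np

def isolate_activation(feature_maps, channel, row, col):
    """Keep a single activation and zero out everything else in the layer."""
    isolated = np.zeros_like(feature_maps)
    isolated[channel, row, col] = feature_maps[channel, row, col]
    return isolated

def isolate_strongest(feature_maps, channel):
    """Keep only the largest activation of the chosen feature map."""
    fmap = feature_maps[channel]
    row, col = np.unravel_index(np.argmax(fmap), fmap.shape)
    return isolate_activation(feature_maps, channel, row, col)
```

The resulting array has the same shape as the original layer output, so it can be fed to the attached deconvnet layer unchanged.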
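We then successively (i) unpool:
Max pooling is not actually invertible, but we record the location of the maximum within each pooling region in a set of "switch" variables. In the deconvnet, the unpooling operation uses these switches to place the reconstructions from the layer above at the appropriate locations.
(ii) rectify: relu
This ensures that the feature maps are always positive (more precisely, non-negative).
(iii) filter:
This step is the reverse of the convolution, and it only approximately inverts it.
It uses transposed versions of the same filters, applied to the rectified maps rather than to the output of the layer beneath.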
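
The switch mechanism in step (i) is easiest to see in code. The sketch below assumes a single 2D feature map and non-overlapping pooling regions whose size divides the map evenly; the names max_pool_with_switches and unpool are hypothetical, not taken from the paper's implementation.

```python
import numpy as np

def max_pool_with_switches(x, size=2):
    """Max pool a 2D map and record where each region's maximum was."""
    h, w = x.shape
    # Gather the size*size values of every pooling region along the last axis.
    regions = (x.reshape(h // size, size, w // size, size)
                .transpose(0, 2, 1, 3)
                .reshape(h // size, w // size, size * size))
    switches = regions.argmax(axis=-1)   # flat position inside each region
    pooled = regions.max(axis=-1)
    return pooled, switches

def unpool(pooled, switches, size=2):
    """Put each pooled value back at its recorded location; zeros elsewhere."""
    ph, pw = pooled.shape
    out = np.zeros((ph * size, pw * size), dtype=pooled.dtype)
    for i in range(ph):
        for j in range(pw):
            di, dj = divmod(switches[i, j], size)
            out[i * size + di, j * size + dj] = pooled[i, j]
    return out

x = np.array([[1, 2, 0, 0],
              [3, 4, 0, 5],
              [9, 0, 1, 1],
              [0, 0, 1, 7]])
pooled, switches = max_pool_with_switches(x)
recon = unpool(pooled, switches)
```

Only the four maxima survive in recon, each at its original position; the non-maximal values are lost, which is why unpooling is an approximate inverse rather than an exact one.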

In practice this means flipping each filter vertically and horizontally.
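
Why does applying the transposed filter amount to flipping it? A small numerical check helps. The sketch below assumes a single channel and a forward pass that is a 'valid' cross-correlation; under that assumption, the transpose of the forward operation is obtained by zero-padding the map and sliding the vertically and horizontally flipped filter over it. The function names are my own.

```python
import numpy as np

def correlate_valid(x, w):
    """Forward filtering: slide w over x without padding."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def transpose_filter(y, w):
    """Transposed filtering: zero-pad y, then slide the flipped filter."""
    kh, kw = w.shape
    padded = np.pad(y, ((kh - 1, kh - 1), (kw - 1, kw - 1)), mode="constant")
    flipped = w[::-1, ::-1]        # flip vertically and horizontally
    return correlate_valid(padded, flipped)

# A transpose must satisfy <A x, y> == <x, A^T y> for any x and y.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 5))
w = rng.normal(size=(3, 3))
y = rng.normal(size=(3, 3))
lhs = np.sum(correlate_valid(x, w) * y)
rhs = np.sum(x * transpose_filter(y, w))
```

The two inner products lhs and rhs agree up to floating-point error, which is the defining property of a transpose. With several feature maps, the transposed step would also sum the contributions of all maps into each input channel; that part is left out here.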

These three steps are repeated until input pixel space is reached.
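
Putting the three steps together, the repetition down to pixel space might look like the sketch below. It is a toy version under strong assumptions: one channel per layer, no biases, no normalization, non-overlapping 2x2 pooling, and random filters standing in for trained ones. None of the function names come from the paper.

```python
import numpy as np

def correlate_valid(x, w):
    kh, kw = w.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def pool_with_switches(x, size=2):
    h, w = x.shape
    regions = (x.reshape(h // size, size, w // size, size)
                .transpose(0, 2, 1, 3)
                .reshape(h // size, w // size, size * size))
    return regions.max(axis=-1), regions.argmax(axis=-1)

def unpool(pooled, switches, size=2):
    out = np.zeros((pooled.shape[0] * size, pooled.shape[1] * size))
    for i in range(pooled.shape[0]):
        for j in range(pooled.shape[1]):
            di, dj = divmod(switches[i, j], size)
            out[i * size + di, j * size + dj] = pooled[i, j]
    return out

def transpose_filter(y, w):
    kh, kw = w.shape
    padded = np.pad(y, ((kh - 1, kh - 1), (kw - 1, kw - 1)), mode="constant")
    return correlate_valid(padded, w[::-1, ::-1])

def forward(image, filters):
    """Convnet pass: filter -> relu -> pool per layer, keeping the switches."""
    x, all_switches = image, []
    for w in filters:
        x, s = pool_with_switches(np.maximum(correlate_valid(x, w), 0))
        all_switches.append(s)
    return x, all_switches

def project_to_pixels(top_maps, filters, all_switches):
    """Deconvnet pass: unpool -> rectify -> filter, from the top layer down."""
    x = top_maps
    for w, s in zip(reversed(filters), reversed(all_switches)):
        x = unpool(x, s)               # (i) unpool using the recorded switches
        x = np.maximum(x, 0)           # (ii) rectify
        x = transpose_filter(x, w)     # (iii) filter with the flipped filter
    return x
```

For a 10x10 input and two 3x3 filters, forward produces a 1x1 top map, and project_to_pixels maps it back to a 10x10 array that can be displayed as an image.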

Schematic diagram