Existing methods obtain measurements from only part of the features in the network, and use them only once for image reconstruction.
【Low-level features: small details in an image, such as edges, corners, colors, pixels, and gradients; they can be extracted with filters, SIFT, or HOG.
High-level features: built on top of low-level features; they can be used to recognize and detect the shapes of targets or objects in an image, and carry richer semantic information.】
So MR-CCSNet (Measurements Reuse Convolutional Compressed Sensing Network) is proposed, in which GSM (Global Sensing Module) extracts features at all levels, and MRB (Measurements Reuse Block) reuses the measurements for multiple reconstructions.
GSM:
To match dimensions, a pooling layer is added to the shortcut connection.
MRB:
Experimental datasets: BSDS500 [2], Set5 [4], Set14 [39]
Evaluation metrics: PSNR, SSIM
Ablation study: both GSM and MRB are effective
Contributions:
That is, solving the sparsity-regularized optimization problem
$$\min_x\frac{1}{2}\Vert\Phi x-y\Vert^2_2 +\lambda\Vert\Psi x\Vert_1$$
where $\Psi x$ are the transform coefficients of $x$ in the domain $\Psi$, and the sparsity of $\Psi x$ is measured by the $\ell_1$ norm.
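As a concrete illustration of solving this problem, the classical ISTA iteration alternates a gradient step on the data term with soft-thresholding (a minimal NumPy sketch taking $\Psi$ to be the identity for simplicity; the function names, step size, and iteration count are illustrative choices, not the paper's method):

```python
import numpy as np

def soft_threshold(v, tau):
    # proximal operator of tau * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(Phi, y, lam, step, n_iter=200):
    # minimize 0.5 * ||Phi x - y||_2^2 + lam * ||x||_1   (Psi = identity)
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)                 # gradient of the data term
        x = soft_threshold(x - step * grad, step * lam)
    return x
```

With $\Phi = I$ the iteration reduces to soft-thresholding $y$ directly, e.g. `ista(np.eye(3), np.array([2.0, 0.5, -3.0]), lam=1.0, step=1.0)` returns `[1.0, 0.0, -2.0]`.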
That is, using a neural network to solve the inverse problem, with the loss function
$$\min_{\theta}\frac{1}{2}\sum^k_{i=1}\Vert x_i-F(y_i,\theta)\Vert^2_2$$
where $x_i$ is the original image, $y_i$ is the measurement, $F$ is the neural network, and $\theta$ are its parameters.
[37] proposes LAPRAN, which reconstructs the original image through multiple stages at different resolutions simultaneously.
Sampling ratio: 6.25%
MR-CCSNet:【a sensing network GSM】+【an initial reconstruction network】+【a deep reconstruction network】
Sensing network $S(\cdot)$ (i.e., GSM):
Initial reconstruction network $I(\cdot)$: the first time the measurements are utilized.
【Pixel shuffle layer: a way to upsample a low-resolution feature map. To enlarge an $H\times W\times C$ feature map by a factor of $r$ in height and width to obtain $rH\times rW\times C$, first convolve the $H\times W\times C$ feature map with $r^2C$ output channels to get an $H\times W\times r^2C$ feature map, then apply a "periodic shuffling" operation to rearrange it into $rH\times rW\times C$.】
【In short, the pixel shuffle layer takes an $H\times W$ low-resolution image as input and outputs an $rH\times rW$ high-resolution image.】
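The periodic-shuffling step described above can be sketched in NumPy (a minimal sketch matching the dimension bookkeeping above; the function name and memory layout are my own illustrative choices, not code from the paper):

```python
import numpy as np

def pixel_shuffle(x, r):
    # Rearrange (r*r*C, H, W) -> (C, r*H, r*W) via "periodic shuffling".
    rrC, H, W = x.shape
    C = rrC // (r * r)
    x = x.reshape(C, r, r, H, W)     # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)   # interleave: (C, H, r, W, r)
    return x.reshape(C, r * H, r * W)

out = pixel_shuffle(np.arange(16).reshape(4, 2, 2), r=2)
print(out.shape)  # (1, 4, 4)
```

Each output pixel at position `(h*r + a, w*r + b)` comes from input channel `a*r + b` at `(h, w)`, which is exactly how the $r^2C$ convolution channels are folded back into space.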
Deep reconstruction network $D(\cdot)$: the second time the measurements are utilized.
Finally:
The final reconstructed image $\hat{x}$:
$$\hat{x}=D(I(y))+I(y)$$
【Convolutional neural networks extract features hierarchically: layers close to the input learn low-level features such as lines and simple textures, while deeper layers learn high-level features such as shapes.】
【To collect all level features for sampling, we use a shortcut connection to pass the features of different layers to the end, and the pooling layer is added for matching the dimensions.】
【When the sampling ratio changes, GSM cannot adapt well, so GSM+ is proposed.】
Different from GSM:
Add a shortcut connection between two successive layers, rather than adding it directly from different layers to the end.
The building block of GSM+:
$$y_{t+1}=\mathrm{Conv}(y_t)+P(y_t)$$
where $\mathrm{Conv}$ and $P$ denote a convolution layer and a mean-pooling layer, respectively.
The sampling ratio is controlled by the number of building blocks and the blue block.
When the sampling ratio is $50\%$, there is only one building block in GSM+, so GSM+ degenerates into GSM.
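The GSM+ building block above can be sketched in NumPy (a minimal sketch assuming $2\times 2$ mean pooling and treating the convolution as any stride-2 map that halves the spatial size; `mean_pool2` and the toy `conv` below are illustrative stand-ins, not the paper's layers):

```python
import numpy as np

def mean_pool2(x):
    # 2x2 mean pooling with stride 2; x has shape (C, H, W), H and W even
    C, H, W = x.shape
    return x.reshape(C, H // 2, 2, W // 2, 2).mean(axis=(2, 4))

def gsm_plus_block(y_t, conv):
    # one GSM+ building block: y_{t+1} = Conv(y_t) + P(y_t)
    return conv(y_t) + mean_pool2(y_t)

# Toy example: using mean pooling itself as the "convolution", so the
# shortcut simply doubles the pooled values.
y0 = np.arange(16, dtype=float).reshape(1, 4, 4)
y1 = gsm_plus_block(y0, conv=mean_pool2)
print(y1.shape)  # (1, 2, 2)
```

Stacking $t$ such blocks halves each spatial dimension $t$ times, which is how the block count controls the sampling ratio.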
The phased reconstruction result $f_t\in\mathbb{R}^{C\times H\times W}$ and the measurements $y\in\mathbb{R}^{C\times\frac{H}{4}\times\frac{W}{4}}$ are fed into the MRB.
Two convolutional layers, denoted $Conv_1$ and $Conv_2$, are used to obtain the compacted feature maps $f^{\downarrow}\in\mathbb{R}^{C\times\frac{H}{2}\times\frac{W}{2}}$ and $f^{\downarrow\downarrow}\in\mathbb{R}^{C\times\frac{H}{4}\times\frac{W}{4}}$:
$$f^{\downarrow}=Conv_1(f_t),\quad f^{\downarrow\downarrow}=Conv_2(f^{\downarrow}).$$
Fig. 5: matching information is extracted from the measurements, and three feature maps $y_1\in\mathbb{R}^{C\times\frac{H}{4}\times\frac{W}{4}}$, $y_2\in\mathbb{R}^{C\times\frac{H}{2}\times\frac{W}{2}}$, and $y_3\in\mathbb{R}^{C\times H\times W}$ are obtained by Multi-Scale Reusing.
I do not fully understand the channel fusion in this part; if anyone understands it, I would appreciate an explanation.
For the initial reconstruction network:
$$l_{int}=\sum^n_{k=1}\Vert I(S(y_k;\theta);\phi_{int})-x_k\Vert^2_F.$$
For the deep reconstruction network:
$$l_{deep}=\sum^n_{k=1}\Vert D(I(S(y_k;\theta);\phi_{int});\phi_{deep})-x_k\Vert^2_F.$$
Here $\theta$, $\phi_{int}$, and $\phi_{deep}$ denote the parameters of $S(\cdot)$, $I(\cdot)$, and $D(\cdot)$, respectively.
The loss of MR-CCSNet is $l=l_{deep}+l_{int}$.
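The two-term loss can be written directly (a minimal sketch using plain NumPy arrays in place of the network outputs; `sq_frobenius` and `mr_ccsnet_loss` are my own helper names):

```python
import numpy as np

def sq_frobenius(a, b):
    # squared Frobenius norm ||a - b||_F^2 used by both loss terms
    return float(np.sum((a - b) ** 2))

def mr_ccsnet_loss(x_init, x_deep, x_true):
    # l = l_deep + l_int: both the initial and the deep reconstructions
    # are supervised against the same ground-truth image x_true
    return sq_frobenius(x_deep, x_true) + sq_frobenius(x_init, x_true)

x_true = np.ones((2, 2))
print(mr_ccsnet_loss(np.zeros((2, 2)), np.full((2, 2), 2.0), x_true))  # 8.0
```

Supervising the intermediate output $I(y)$ as well as the final output gives the initial reconstruction network its own gradient signal instead of relying only on the deep network's loss.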
Training datasets: 400 images from BSDS500[2]
Three standard benchmark datasets: Set5[4], Set14[39], BSDS100[2]
In the sensing network, pooling operation loses information about the low-level features.
Attention mechanism can effectively help us in extracting matching features from measurements.
In the real world, because there is noise in the measurements, reusing them multiple times will introduce noise into the reconstruction process.