Jan. 14 - Jan. 25, 2019: two weeks of paper reading
- Paper reading list: Image Denoising
- 1 A Multiscale Image Denoising Algorithm Based on Dilated Residual Convolution Network [link](https://arxiv.org/pdf/1812.09131.pdf).
- 2 Dilated Residual Networks (DRN) [link](https://www.cs.princeton.edu/~funk/drn.pdf).
- 2.1 Dilated convolution.
- 2.2 Residual Network
- 3 Understanding Convolution for Semantic Segmentation [link](http://sunw.csail.mit.edu/abstract/understanding.pdf).
- 4 Learning Deep CNN Denoiser Prior for Image Restoration (IRCNN) [link](http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhang_Learning_Deep_CNN_CVPR_2017_paper.pdf).
- How to solve this type of problem?
- 5 Image Super-Resolution Using Deep Convolutional Networks [link](https://arxiv.org/pdf/1501.00092.pdf).
- 5.1 The three steps of this model
- 5.2 Loss function
- 5.3 The limitations of this model (SRCNN)
- 6 Accurate Image Super-Resolution Using Very Deep Convolutional Networks (VDSR) [link](https://arxiv.org/pdf/1511.04587.pdf).
- 7 Centralized Sparse Representation for Image Restoration [link](http://www4.comp.polyu.edu.hk/~cslzhang/paper/conf/iccv11/ICCV_CSR_Final.pdf).
- 7.1 Introduction
- To do image restoration
- In this paper
- 7.2 Centralized sparse representation modeling
- 7.2.1 The sparse coding noise in image restoration
- 7.2.2 Centralized sparse representation
- Now the problem becomes how to find a reasonable estimate of the unknown vector $\alpha_x$
- Then we can apply an iterative minimization approach to the CSR model.
- 7.3 Algorithm of CSR
- 7.3.1 The determination of parameters $\lambda$ and $r$.
Paper reading list: Image Denoising
1 A Multiscale Image Denoising Algorithm Based on Dilated Residual Convolution Network [link](https://arxiv.org/pdf/1812.09131.pdf).
- Dilated filter
- Multiscale convolution group
2 Dilated Residual Networks (DRN) [link](https://www.cs.princeton.edu/~funk/drn.pdf).
This architecture is the combination of dilated convolutions and residual networks.
2.1 Dilated convolution.
There is a very well-illustrated explanation of dilated convolution here: link.
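To make this concrete, here is a minimal PyTorch sketch (my own illustration, not from the paper): a 3x3 convolution with dilation 2 covers a 5x5 receptive field while still using only 3x3 weights per channel pair.

```python
import torch
import torch.nn as nn

# A 3x3 convolution with dilation=2 "sees" a 5x5 window (receptive field)
# while still using only 3x3 = 9 weights per channel pair.
dilated_conv = nn.Conv2d(in_channels=1, out_channels=1,
                         kernel_size=3, dilation=2, padding=2)

x = torch.randn(1, 1, 32, 32)   # dummy single-channel image
y = dilated_conv(x)
print(y.shape)                  # torch.Size([1, 1, 32, 32]); padding keeps the spatial size
```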
2.2 Residual Network
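As a reminder of the residual idea, a minimal sketch of the standard identity-shortcut block $y = F(x) + x$ (the generic ResNet form, assumed here rather than a detail taken from this paper):

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block: output = F(x) + x (identity shortcut)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        residual = self.relu(self.conv1(x))
        residual = self.conv2(residual)
        return self.relu(x + residual)   # identity shortcut
```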
3 Understanding Convolution for Semantic Segmentation [link](http://sunw.csail.mit.edu/abstract/understanding.pdf).
- dense upsampling convolution (DUC)
- bilinear upsampling (interpolation)
- hybrid dilated convolution (HDC)
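A small sketch of the HDC idea (my own illustration; the dilation rates 1, 2, 5 are just one example of rates without a common factor): stacking dilated convolutions with varying rates covers the receptive field densely and avoids the "gridding" holes produced by repeating a single rate.

```python
import torch.nn as nn

# Hybrid dilated convolution (HDC) sketch: successive 3x3 convolutions with
# dilation rates 1, 2, 5 (no common factor), so the combined receptive field
# has no gridding gaps, unlike stacking dilation=2 repeatedly.
hdc_block = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, dilation=1, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, dilation=2, padding=2), nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, dilation=5, padding=5), nn.ReLU(inplace=True),
)
```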
4 Learning Deep CNN Denoiser Prior for Image Restoration (IRCNN) [link](http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhang_Learning_Deep_CNN_CVPR_2017_paper.pdf).
Image restoration (IR)
The objective of IR: $y = Hx + v$
The purpose of image restoration is to recover the latent clean image $x$ from its degraded observation $y$.
How to solve this type of problem?
- model-based optimization method
NCSR, BM3D, WNNM and so on
- discriminative learning method
MLP, SRCNN, DCNN and so on
There are many different methods within these two classes.
Maybe next time I can give a literature review of these papers.
5 Image Super-Resolution Using Deep Convolutional Networks [link](https://arxiv.org/pdf/1501.00092.pdf).
5.1 The three steps of this model
The low-resolution input is first upscaled to the desired size using bicubic interpolation before being fed into the SRCNN network.
- Patch extraction and representation
  - $X$: ground-truth high-resolution image
  - $Y$: bicubic-upsampled version of the low-resolution image
  - $F_1(Y) = \max(0, W_1 * Y + B_1)$
- Nonlinear mapping
  - $F_2(Y) = \max(0, W_2 * F_1(Y) + B_2)$
- Reconstruction of the image
  - $F(Y) = W_3 * F_2(Y) + B_3$
  - $W_3$: $n_2 \times 1 \times 1 \times C$
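These three operations map directly onto three convolutional layers. A minimal PyTorch sketch (the 9-1-5 kernel sizes and 64/32 channel widths follow the commonly used SRCNN setting and are taken here as assumptions):

```python
import torch.nn as nn

class SRCNN(nn.Module):
    """Minimal SRCNN sketch: 3 conv layers applied to the bicubic-upscaled input Y."""
    def __init__(self, channels=1):
        super().__init__()
        # Patch extraction and representation: F1(Y) = max(0, W1 * Y + B1)
        self.patch_extraction = nn.Conv2d(channels, 64, kernel_size=9, padding=4)
        # Nonlinear mapping: F2(Y) = max(0, W2 * F1(Y) + B2)
        self.mapping = nn.Conv2d(64, 32, kernel_size=1)
        # Reconstruction: F(Y) = W3 * F2(Y) + B3 (no ReLU on the output)
        self.reconstruction = nn.Conv2d(32, channels, kernel_size=5, padding=2)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, y):
        f1 = self.relu(self.patch_extraction(y))
        f2 = self.relu(self.mapping(f1))
        return self.reconstruction(f2)
```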
5.2 Loss function
SRCNN is trained by minimizing the mean squared error (MSE) between the reconstructed images $F(Y)$ and the corresponding ground-truth high-resolution images $X$.
5.3 The limitations of this model (SRCNN)
- It relies on the context of small image regions.
- Training converges too slowly.
- The network only works for a single scale.
6 Accurate Image Super-Resolution Using Very Deep Convolutional Networks (VDSR) [link](https://arxiv.org/pdf/1511.04587.pdf).
HR: high resolution
LR: low resolution
- Context: a very deep network with a large receptive field takes a large image context into account.
- The interpolated low-resolution image is used as input, and the network predicts only the image details.
- Zeros are padded before convolutions to keep the size of the feature maps the same.
- Convergence: residual learning is used to speed up the training.
- Scale factor: a single-model SR approach is proposed. Scales are typically user-specified and can be arbitrary, including fractions.
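Putting these bullets together, a minimal residual-learning sketch (the depth of 20 layers and width of 64 channels follow the typical VDSR setting and are assumptions here):

```python
import torch.nn as nn

class VDSR(nn.Module):
    """Minimal VDSR sketch: deep 3x3 convs predict the residual (details),
    which is added back to the interpolated low-resolution input."""
    def __init__(self, channels=1, depth=20, width=64):
        super().__init__()
        layers = [nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(width, channels, 3, padding=1)]  # residual (details) output
        self.body = nn.Sequential(*layers)

    def forward(self, ilr):             # ilr: interpolated low-resolution image
        return ilr + self.body(ilr)     # residual learning: HR = ILR + predicted details
```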
7 Centralized Sparse Representation for Image Restoration [link](http://www4.comp.polyu.edu.hk/~cslzhang/paper/conf/iccv11/ICCV_CSR_Final.pdf).
7.1 Introduction
$y = Hx + v$
- $H$: degradation matrix
- $y$: observed image
- $v$: additive noise vector
- $x$: original image

$x$ can be represented as a linear combination of a few atoms from a dictionary $\Phi$:
$x \approx \Phi \alpha$
$\alpha_x = argmin_\alpha \Vert \alpha \Vert_0$ s.t. $\Vert x - \Phi \alpha \Vert_2 < \varepsilon$
- $\varepsilon$: a small constant balancing the sparsity and the approximation error.
- $\Vert \cdot \Vert_0$ counts the number of non-zero coefficients in $\alpha$.
For image restoration:
$y = Hx + v$
To reconstruct $x$ from $y$:
Since $x \approx \Phi \alpha$,
$y \approx H \Phi \alpha$
Then:
- $\alpha_y = argmin_\alpha \Vert \alpha \Vert_1$ s.t. $\Vert y - H \Phi \alpha \Vert_2 < \varepsilon$
- Reconstruct $x$:
  - $\hat x = \Phi \alpha_y$
But because $y$ is noise-corrupted, blurred, or incomplete, $\alpha_y$ may deviate much from $\alpha_x$.
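To make the sparse coding step concrete, here is a small scikit-learn sketch of the l1-penalized (Lagrangian) form of this problem, using a random matrix as a stand-in dictionary $\Phi$ and $H = I$ (all of these choices are illustrative assumptions, not the paper's setup):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, M = 64, 256                         # patch dimension n, dictionary size M (n < M)
Phi = rng.standard_normal((n, M))      # stand-in dictionary
alpha_x = np.zeros(M)
alpha_x[rng.choice(M, 5, replace=False)] = rng.standard_normal(5)  # sparse ground truth

x = Phi @ alpha_x                      # clean signal x ~= Phi alpha_x
y = x + 0.05 * rng.standard_normal(n)  # degraded observation (H = identity, plus noise)

# l1-penalized sparse coding: min ||y - Phi alpha||_2^2 + lambda * ||alpha||_1
lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000)
lasso.fit(Phi, y)
alpha_y = lasso.coef_

x_hat = Phi @ alpha_y                  # reconstruction x_hat = Phi alpha_y
print("nonzeros in alpha_y:", np.count_nonzero(alpha_y))
```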
In this paper:
We introduce the concept of sparse coding noise (SCN) to facilitate the discussion of the problem:
$v_\alpha = \alpha_y - \alpha_x$
Given the dictionary $\Phi$,
$v_x = \hat x - x \approx \Phi \alpha_y - \Phi \alpha_x = \Phi v_\alpha$
We propose the centralized sparse representation model to effectively reduce the SCN and thus enhance the sparsity-based IR performance.
7.2 Centralized sparse representation modeling
7.2.1 The sparse coding noise in image restoration
$X \in \mathbb{R}^N$: original image
$x_i = R_i X$, where $R_i$ is the matrix extracting patch $x_i$ from $X$ at location $i$.
Given a dictionary $\Phi \in \mathbb{R}^{n \times M}$, $n < M$,
each patch can be sparsely represented by a set of sparse codes: $x_i \approx \Phi \alpha_i$
In the application of IR, $x$ is not available to code; we only have the degraded observation $y$: $y = Hx + v$
$\alpha_y = argmin_\alpha \{ \Vert y - H \Phi \alpha \Vert_2^2 + \lambda \Vert \alpha \Vert_1 \}$
The image can then be reconstructed as $\hat x = \Phi \alpha_y$.
From the discussion above, we know that $\alpha_y$ will deviate from $\alpha_x$.
So the SCN is $v_\alpha = \alpha_y - \alpha_x$, and $v_\alpha$ determines the IR quality of $\hat x$.
We perform experiments to investigate the statistics of the SCN $v_\alpha$, and the observations motivate us to model the SCN with a Laplacian prior.
Laplacian distribution: $f(x) = \frac{1}{2b} \exp\left(-\frac{\vert x - \nu \vert}{b}\right)$
7.2.2 Centralized sparse representation
From the context mentioned before, we know that to improve the performance of the model, we need to suppress the SCN $v_\alpha$:
$v_\alpha = \alpha_y - \alpha_x$
But in practice, $\alpha_x$ is unknown, so we find a good estimate of $\alpha_x$, denoted $\hat \alpha_x$, so that $\alpha_y - \hat \alpha_x$ can be an estimate of the SCN.
- A new sparse coding model can be:
  - $\alpha_y = argmin_\alpha \{ \Vert y - H \Phi \alpha \Vert_2^2 + \lambda \Vert \alpha \Vert_1 + r \Vert \alpha - \hat \alpha_x \Vert_{l_p} \}$
  - $r$ is a constant.
  - The $l_p$ norm ($p$ can be 1 or 2) measures the distance between $\alpha$ and $\hat \alpha_x$.
- Compared with the previous model, this model enforces $\alpha_y$ to be closer to $\hat \alpha_x$.
Now the problem becomes how to find a reasonable estimate of the unknown vector $\alpha_x$.
Normally, a variable can be estimated by the average of several samples or by its expectation.
Here, we use the expectation to estimate $\alpha_x$:
$\hat \alpha_x = E[\alpha_x]$, and in practice we can approximate $E[\alpha_x]$ by $E[\alpha_y]$, by assuming the SCN is nearly zero-mean.
- Then the model shown before becomes:
  - $\alpha_y = argmin_\alpha \{ \Vert y - H \Phi \alpha \Vert_2^2 + \lambda \Vert \alpha \Vert_1 + r \Vert \alpha - E[\alpha] \Vert_{l_p} \}$
- We call this model centralized sparse representation (CSR).
- For the sparse code $\alpha_i$ of each image patch $i$, $E[\alpha_i]$ can be approximately computed if we have enough samples of $\alpha_i$. $E[\alpha_i]$ can be computed as the weighted average of the sparse code vectors associated with the nonlocal patches similar to patch $i$. Denote by $C_i$ the cluster of nonlocal similar patches found for each patch $i$ via block matching; the sparse codes within each cluster are then averaged.
- Denote by $\alpha_{i,j}$ the sparse code of the $j$-th similar patch found for patch $i$.
- Then $E[\alpha_i] = u_i = \sum_{j \in C_i} \omega_{i,j} \alpha_{i,j}$
- $\omega_{i,j}$ is the weight:
  - $\omega_{i,j} = \exp(-\Vert \hat{x}_i - \hat{x}_{i,j} \Vert_2^2 / h) / W$
  - $\hat{x}_i = \Phi \hat\alpha_i$ and $\hat{x}_{i,j} = \Phi \hat\alpha_{i,j}$ are the estimates of patch $i$ and patch $j$; $W$ is the normalization factor and $h$ is a predetermined scalar.
- $\alpha_y = argmin_\alpha \{ \Vert y - H \Phi \alpha \Vert_2^2 + \lambda \Vert \alpha \Vert_1 + r \sum_{i=1}^{N} \Vert \alpha_i - u_i \Vert_{l_p} \}$
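A tiny NumPy sketch of this nonlocal estimate $u_i = \sum_{j \in C_i} \omega_{i,j} \alpha_{i,j}$ with $\omega_{i,j} = \exp(-\Vert \hat x_i - \hat x_{i,j} \Vert_2^2 / h)/W$ (dummy data; the block-matching search that actually produces the similar patches is not shown):

```python
import numpy as np

def nonlocal_code_average(patch_est, similar_patch_ests, similar_codes, h=80.0):
    """u_i = sum_j w_ij * alpha_ij, with w_ij = exp(-||x_i - x_ij||^2 / h) / W."""
    # squared l2 distances between the current patch estimate and each similar patch
    dists = np.sum((similar_patch_ests - patch_est) ** 2, axis=1)
    w = np.exp(-dists / h)
    w /= w.sum()                      # W: normalization so the weights sum to 1
    return w @ similar_codes          # weighted average of the similar patches' sparse codes

# dummy example: 10 similar patches of dimension 64, sparse codes of length 256
rng = np.random.default_rng(0)
u_i = nonlocal_code_average(rng.standard_normal(64),
                            rng.standard_normal((10, 64)),
                            rng.standard_normal((10, 256)))
print(u_i.shape)                      # (256,)
```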
Then we can apply an iterative minimization approach to the CSR model.
The steps are as follows:
- Initialize $u_i$ as 0, i.e. $u_i^{(-1)} = 0$. Then compute $\alpha_y^{(0)}$, and, using $\alpha_y^{(0)}$, compute $x^{(0)}$ via $x^{(0)} = \Phi \alpha_y^{(0)}$.
- Based on $x^{(0)}$, find the similar patches for each local patch $i$, then update $u_i$ from $\alpha_y^{(0)}$; the updated result, denoted $u_i^{(0)}$, is used in the next round. This procedure is iterated until convergence. In the $j^{th}$ iteration, the sparse coding is performed by:
$\alpha_y^{(j)} = argmin_\alpha \{ \Vert y - H \Phi \alpha \Vert_2^2 + \lambda \Vert \alpha \Vert_1 + r \sum_{i=1}^{N} \Vert \alpha_i - u_i^{(j)} \Vert_{l_p} \}$
During the iterations, the accuracy of the sparse codes $\alpha_y^{(j)}$ is gradually improved.
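The notes above do not spell out how this inner minimization is solved; iterative shrinkage is one common approach for objectives with $l_1$ terms like these, built around the soft-thresholding operator. A generic sketch of that operator (one standard tool, assumed here, not the paper's exact solver):

```python
import numpy as np

def soft_threshold(v, tau):
    """Element-wise soft-thresholding: argmin_a 0.5*(a - v)^2 + tau*|a|."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

# example: shrinking a code vector toward zero with threshold tau
print(soft_threshold(np.array([-1.5, -0.2, 0.0, 0.3, 2.0]), tau=0.5))
# -> [-1.  -0.   0.   0.   1.5]
```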
7.3 Algorithm of CSR
7.3.1 The determination of parameters $\lambda$ and $r$.
It can be empirically found that $\alpha$ and $\theta$ (where $\theta_i = \alpha_i - u_i$) are nearly uncorrelated.
We have also seen before that the SCN can be well characterized by the Laplacian distribution.
Meanwhile, it is also well accepted in the literature that the sparse coefficients $\alpha$ can be characterized by an i.i.d. Laplacian distribution.
$\alpha_y = argmin_\alpha \{ \Vert y - H \Phi \alpha \Vert_2^2 + \lambda \Vert \alpha \Vert_1 + r \sum_{i=1}^{N} \Vert \alpha_i - u_i \Vert_{l_p} \}$
It is common to set $l_p$ equal to 1.
Then the model can be converted to:
$\alpha_y = argmin_\alpha \{ \Vert y - H \Phi \alpha \Vert_2^2 + \lambda \Vert \alpha \Vert_1 + r \sum_{i=1}^{N} \Vert \alpha_i - u_i \Vert_1 \}$
$\alpha_y = argmin_\alpha \{ \Vert y - H \Phi \alpha \Vert_2^2 + \lambda \sum_{i=1}^{N} \Vert \alpha_i \Vert_1 + r \sum_{i=1}^{N} \Vert \theta_i \Vert_1 \}$, where $\theta_i = \alpha_i - u_i$.
Comparing this model with Eq. (18) in the paper, the regularization parameters $\lambda$ and $r$ can be related to the standard deviations of the Laplacian distributions of $\alpha_i$ and $\theta_i$, respectively.
This concludes the notes on this model.