Two training options are considered: training on the corrupted image itself, or training on a corpus drawn from a high-quality image database.
Since K-SVD can only handle small image patches, we extend its deployment to arbitrary image sizes by defining a global image prior that forces sparsity over patches in every location in the image.
Motivation for an image prior:
In addressing general inverse problems in image processing using the Bayesian approach, an image prior is necessary.
Traditionally, this has been handled by choosing a prior based on some simplifying assumptions, such as spatial smoothness, low/max-entropy, or sparsity in some transform domain.
Example-based techniques instead suggest learning the prior from images.
Combining the image prior with sparsity and redundancy:
When this prior-learning idea is merged with sparsity and redundancy, the dictionary becomes the set of parameters to be learned.
We propose to learn the dictionary from examples, in one of two ways:
1) training the dictionary using patches from the corrupted image itself;
2) training on a corpus of patches taken from a high-quality set of images.
When training is done directly on the given image, we shall see how training and denoising fuse naturally into one coherent, iterated process.
Implementation details in this paper:
Since dictionary learning is limited to handling small image patches, a natural difficulty arises: how can we use it for general images of arbitrary size?
We propose a global image prior that forces sparsity over patches in every location in the image (with overlaps).
We define a maximum a posteriori probability (MAP) estimator as the minimizer of a well-defined global penalty term.
Its numerical solution leads to a simple iterated patch-by-patch sparse coding and averaging algorithm.
Summary:
The novelty of this paper includes the way local sparsity and redundancy are used as ingredients in a global Bayesian objective.
Also novel in this work is the idea of training dictionaries for the denoising task, rather than using prechosen ones.
We do that via the introduction of the Sparseland model. Once this is set, we will discuss how local treatment of image patches turns into a global prior in a Bayesian reconstruction framework.
The Sparseland model for a clean patch $\mathbf{x}$:
Image patches are of size $\sqrt{n} \times \sqrt{n}$ pixels.
They are ordered lexicographically as column vectors $\mathbf{x} \in \Re^{n}$.
For the construction of the Sparseland model, we need to define a dictionary (matrix) $\mathbf{D} \in \Re^{n \times k}$ with $k > n$, implying that it is redundant. At the moment, we shall assume that this matrix is known and fixed.
Put loosely, the proposed model suggests that every image patch $\mathbf{x}$ can be represented sparsely over this dictionary, i.e., the solution of
$$\hat{\boldsymbol{\alpha}}=\arg \min _{\boldsymbol{\alpha}}\|\boldsymbol{\alpha}\|_{0} \text { subject to } \mathbf{D} \boldsymbol{\alpha} \approx \mathbf{x}$$
is very sparse.
The notation $\|\boldsymbol{\alpha}\|_{0}$ stands for the count of the nonzero entries in $\boldsymbol{\alpha}$.
The basic idea here is that every signal instance from the family we consider can be represented as a linear combination of a few columns (atoms) from the redundant dictionary $\mathbf{D}$.
This model should be made more precise by replacing the rough constraint $\mathbf{D} \boldsymbol{\alpha} \approx \mathbf{x}$ with a clear requirement that allows a bounded representation error, $\|\mathbf{D} \boldsymbol{\alpha}-\mathbf{x}\|_{2} \leq \epsilon$. Also, one needs to define how deep the required sparsity is, adding a requirement of the form $\|\hat{\boldsymbol{\alpha}}\|_{0} \leq L \ll n$, which states that the sparse representation uses no more than $L$ atoms from the dictionary for every image patch instance.
Adopting this simple formulation, with the triplet $(\epsilon, L, \mathbf{D})$ in place, the model is well defined.
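To make the $(\epsilon, L, \mathbf{D})$ definition concrete, the following is a minimal NumPy sketch that synthesizes one patch from the model. The sizes (`n = 64`, `k = 256`, `L = 4`), the random Gaussian dictionary, and the unit-norm atoms are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 8x8 patches (n = 64), k = 256 atoms, at most L = 4 atoms per patch.
n, k, L = 64, 256, 4

# A random redundant dictionary (k > n); unit-norm columns are an assumption.
D = rng.standard_normal((n, k))
D /= np.linalg.norm(D, axis=0)

# A coefficient vector with exactly L nonzero entries.
alpha = np.zeros(k)
support = rng.choice(k, size=L, replace=False)
alpha[support] = rng.standard_normal(L)

# The patch is a linear combination of the L selected atoms.
x = D @ alpha

l0_norm = np.count_nonzero(alpha)            # the "l0 norm" of alpha
residual = np.linalg.norm(D @ alpha - x)     # representation error (zero here)
print(l0_norm, residual)
```

Here the representation is exact, so any $\epsilon \geq 0$ is satisfied; a real patch would only be approximated within $\epsilon$.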
The Sparseland model for a noisy patch $\mathbf{y}$:
Consider a noisy version of $\mathbf{x}$, denoted $\mathbf{y}$, contaminated by additive zero-mean white Gaussian noise with standard deviation $\sigma$.
The MAP estimator for denoising this image patch is built by solving
$$\hat{\boldsymbol{\alpha}}=\arg \min _{\boldsymbol{\alpha}}\|\boldsymbol{\alpha}\|_{0} \text { subject to }\|\mathbf{D} \boldsymbol{\alpha}-\mathbf{y}\|_{2}^{2} \leq T$$
where $T$ is dictated by $\epsilon$ and $\sigma$. The denoised patch is thus given by $\hat{\mathbf{x}}=\mathbf{D} \hat{\boldsymbol{\alpha}}$.
Notice that the above optimization task can be changed to
$$\hat{\boldsymbol{\alpha}}=\arg \min _{\boldsymbol{\alpha}}\|\mathbf{D} \boldsymbol{\alpha}-\mathbf{y}\|_{2}^{2}+\mu\|\boldsymbol{\alpha}\|_{0}$$
so that the constraint becomes a penalty.
Applying the model to larger images
If we want to handle a larger image $\mathbf{X}$ of size $\sqrt{N} \times \sqrt{N}$ (with $N \gg n$) and are still interested in using the model described above, one option is to redefine the model with a larger dictionary.
Insisting on a small dictionary
However, when we insist on using a specific fixed and small dictionary $\mathbf{D} \in \Re^{n \times k}$, this option no longer exists. Thus, a natural question arises concerning the use of such a small dictionary in the first place.
Two reasons come to mind: 1) when training takes place, only small dictionaries can be composed; and 2) a small dictionary implies locality of the resulting algorithms, which simplifies the overall image treatment.
A heuristic approach is to work on smaller patches of size $\sqrt{n} \times \sqrt{n}$ and tile the results. In doing so, visible artifacts may occur on block boundaries.
One could also propose to work on overlapping patches and average the results in order to prevent such blockiness artifacts.
As we shall see next, a systematic global approach to this problem leads to this very option as a core ingredient of the overall algorithm.
If our knowledge of the unknown large image $\mathbf{X}$ is fully expressed in the fact that every patch in it belongs to the $(\epsilon, L, \mathbf{D})$-Sparseland model, then the natural generalization of the above MAP estimator is to replace the patch-level problem with the global problem
$$\left\{\hat{\boldsymbol{\alpha}}_{ij}, \hat{\mathbf{X}}\right\}=\arg \min _{\boldsymbol{\alpha}_{ij}, \mathbf{X}} \lambda\|\mathbf{X}-\mathbf{Y}\|_{2}^{2}+\sum_{ij} \mu_{ij}\left\|\boldsymbol{\alpha}_{ij}\right\|_{0}+\sum_{ij}\left\|\mathbf{D} \boldsymbol{\alpha}_{ij}-\mathbf{R}_{ij} \mathbf{X}\right\|_{2}^{2} \tag{4}$$
In this expression, the first term is the log-likelihood global force that demands proximity between the measured image $\mathbf{Y}$ and its denoised (and unknown) version $\mathbf{X}$. Put as a constraint, this penalty would have read $\|\mathbf{X}-\mathbf{Y}\|_{2}^{2} \leq \text{Const} \cdot \sigma^{2}$, which reflects the direct relationship between $\lambda$ and $\sigma$.
The second and third terms are the image prior, which makes sure that in the constructed image $\mathbf{X}$ every patch $\mathbf{x}_{ij}=\mathbf{R}_{ij} \mathbf{X}$ of size $\sqrt{n} \times \sqrt{n}$ in every location (hence the summation over $i$, $j$) has a sparse representation with bounded error.
The matrix $\mathbf{R}_{ij}$ is an $n \times N$ matrix that extracts the $(i, j)$ block from the image.
For a $\sqrt{N} \times \sqrt{N}$ image $\mathbf{X}$, the summation over $i$, $j$ includes $(\sqrt{N}-\sqrt{n}+1)^{2}$ items, considering all image patches of size $\sqrt{n} \times \sqrt{n}$ in $\mathbf{X}$ with overlaps.
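The patch-extraction operator and the patch count can be checked numerically. The sketch below is an illustration under assumptions: the helper names `extract_patches` and `selection_matrix` are hypothetical, the image is 16 x 16 with 8 x 8 patches, and "lexicographic" ordering is taken to mean row-major flattening.

```python
import numpy as np

def extract_patches(X, p):
    """Return every overlapping p x p patch of X as a column (row-major flattening)."""
    H, W = X.shape
    cols = []
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            cols.append(X[i:i + p, j:j + p].reshape(-1))
    return np.stack(cols, axis=1)

def selection_matrix(i, j, side, p):
    """Explicit n x N binary matrix R_ij that picks the patch at (i, j)."""
    R = np.zeros((p * p, side * side))
    rows = np.arange(p * p)
    a, b = np.divmod(rows, p)                 # position of each entry inside the patch
    R[rows, (i + a) * side + (j + b)] = 1.0   # matching pixel in the flattened image
    return R

side, p = 16, 8                               # sqrt(N) = 16, sqrt(n) = 8 (assumed sizes)
X = np.arange(side * side, dtype=float).reshape(side, side)

patches = extract_patches(X, p)
num_patches = patches.shape[1]                # (sqrt(N) - sqrt(n) + 1)^2 patches

# R_ij applied to the flattened image gives the same patch as direct slicing.
R = selection_matrix(2, 3, side, p)
same = np.allclose(R @ X.reshape(-1), X[2:2 + p, 3:3 + p].reshape(-1))
print(num_patches, same)
```

In practice one would never form $\mathbf{R}_{ij}$ explicitly; slicing does the same job, and the matrix is only useful for reasoning about the objective.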
As to the coefficients $\mu_{ij}$, those must be location dependent, so as to comply with a set of constraints of the form $\left\|\mathbf{D} \boldsymbol{\alpha}_{ij}-\mathbf{x}_{ij}\right\|_{2}^{2} \leq T$.
When the underlying dictionary $\mathbf{D}$ is assumed known, the proposed penalty term in (4) has two kinds of unknowns: the sparse representations $\hat{\boldsymbol{\alpha}}_{ij}$ per location, and the overall output image $\mathbf{X}$.
Instead of addressing both together, we propose a block-coordinate minimization algorithm that starts with the initialization $\mathbf{X}=\mathbf{Y}$ and then seeks the optimal $\hat{\boldsymbol{\alpha}}_{ij}$.
In doing so, we get a complete decoupling of the minimization task into many smaller ones, one per image patch, each of the form
$$\hat{\boldsymbol{\alpha}}_{ij}=\arg \min _{\boldsymbol{\alpha}} \mu_{ij}\|\boldsymbol{\alpha}\|_{0}+\left\|\mathbf{D} \boldsymbol{\alpha}-\mathbf{x}_{ij}\right\|_{2}^{2}$$
Solving this using orthogonal matching pursuit (OMP) is easy: gather one atom at a time, and stop when the error $\left\|\mathbf{D} \boldsymbol{\alpha}-\mathbf{x}_{ij}\right\|_{2}^{2}$ goes below $T$.
This way, the choice of $\mu_{ij}$ is handled implicitly.
Thus, this stage works as a sliding-window sparse coding stage, operating on one block of $\sqrt{n} \times \sqrt{n}$ at a time.
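The error-driven stopping rule can be sketched as follows. This is a plain textbook-style OMP written for illustration, not the authors' implementation; the function names, the `max_atoms` safeguard, the random dictionary, and the threshold `T = 1.3 * n * sigma**2` are all assumptions made for the demo.

```python
import numpy as np

def omp_error_threshold(D, x, T, max_atoms=None):
    """Greedy OMP: add one atom at a time until ||D a - x||_2^2 <= T.

    Assumes the columns of D have unit l2 norm.
    """
    n, k = D.shape
    if max_atoms is None:
        max_atoms = n
    alpha = np.zeros(k)
    support = []
    coeffs = np.zeros(0)
    residual = x.astype(float).copy()
    while residual @ residual > T and len(support) < max_atoms:
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:          # numerical stall: no new atom helps
            break
        support.append(idx)
        # Refit all selected coefficients by least squares.
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    if support:
        alpha[support] = coeffs
    return alpha

def sparse_code_patches(D, patches, T):
    """Sliding-window stage: code each patch (one column) independently."""
    return np.stack(
        [omp_error_threshold(D, patches[:, c], T) for c in range(patches.shape[1])],
        axis=1,
    )

rng = np.random.default_rng(0)
n, k, sigma = 64, 256, 0.05
D = rng.standard_normal((n, k))
D /= np.linalg.norm(D, axis=0)

alpha_true = np.zeros(k)
alpha_true[rng.choice(k, size=3, replace=False)] = [1.0, -2.0, 1.5]
y = D @ alpha_true + sigma * rng.standard_normal(n)   # noisy patch

T = 1.3 * n * sigma**2            # hypothetical threshold tied to the noise level
alpha_hat = omp_error_threshold(D, y, T)
x_hat = D @ alpha_hat             # denoised patch
print(np.count_nonzero(alpha_hat), np.sum((x_hat - y) ** 2) <= T)
```

Because the loop stops on the residual energy rather than on a fixed atom count, each patch ends up with its own sparsity level, which is how the location-dependent $\mu_{ij}$ never has to be chosen explicitly.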
Given all $\hat{\boldsymbol{\alpha}}_{ij}$, we can now fix them and turn to updating $\mathbf{X}$. Returning to (4), we need to solve
$$\hat{\mathbf{X}}=\arg \min _{\mathbf{X}} \lambda\|\mathbf{X}-\mathbf{Y}\|_{2}^{2}+\sum_{ij}\left\|\mathbf{D} \hat{\boldsymbol{\alpha}}_{ij}-\mathbf{R}_{ij} \mathbf{X}\right\|_{2}^{2}$$
This is a simple quadratic term that has a closed-form solution of the form
$$\hat{\mathbf{X}}=\left(\lambda \mathbf{I}+\sum_{ij} \mathbf{R}_{ij}^{T} \mathbf{R}_{ij}\right)^{-1}\left(\lambda \mathbf{Y}+\sum_{ij} \mathbf{R}_{ij}^{T} \mathbf{D} \hat{\boldsymbol{\alpha}}_{ij}\right)$$
This rather cumbersome expression may mislead, as all it says is that the denoised patches are to be averaged, with some relaxation obtained by also averaging with the original noisy image.
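The averaging interpretation follows because $\sum_{ij} \mathbf{R}_{ij}^{T} \mathbf{R}_{ij}$ is diagonal: each diagonal entry counts how many patches cover that pixel. The sketch below computes the update pixel by pixel and compares it with the explicit matrix formula on a tiny image. The function name `average_patches`, the sizes, the value of `lam`, and the random stand-in for the denoised patches $\mathbf{D}\hat{\boldsymbol{\alpha}}_{ij}$ are assumptions for illustration.

```python
import numpy as np

def average_patches(Y, denoised_patches, p, lam):
    """Closed-form X update as a pixelwise weighted average.

    denoised_patches[:, c] is the flattened (row-major) p x p patch D @ alpha_hat
    for the c-th location, with locations scanned row by row.
    """
    H, W = Y.shape
    numer = lam * Y.astype(float)        # lambda * Y
    denom = lam * np.ones((H, W))        # lambda + number of covering patches
    col = 0
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            numer[i:i + p, j:j + p] += denoised_patches[:, col].reshape(p, p)
            denom[i:i + p, j:j + p] += 1.0
            col += 1
    return numer / denom

side, p, lam = 6, 3, 0.5                 # tiny assumed sizes so the matrix form is cheap
rng = np.random.default_rng(1)
Y = rng.standard_normal((side, side))
Z = rng.standard_normal((p * p, (side - p + 1) ** 2))   # stand-in for D @ alpha_hat_ij

X_hat = average_patches(Y, Z, p, lam)

# Explicit matrix form: (lam*I + sum R^T R)^{-1} (lam*Y + sum R^T D alpha_hat).
N = side * side
A = lam * np.eye(N)
b = lam * Y.reshape(-1)
col = 0
for i in range(side - p + 1):
    for j in range(side - p + 1):
        R = np.zeros((p * p, N))
        a, c = np.divmod(np.arange(p * p), p)
        R[np.arange(p * p), (i + a) * side + (j + c)] = 1.0
        A += R.T @ R
        b += R.T @ Z[:, col]
        col += 1
X_explicit = np.linalg.solve(A, b).reshape(side, side)

matches = np.allclose(X_hat, X_explicit)
print(matches)
```

The pixelwise version needs no matrix inversion at all, which is why the update is cheap even for large images.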
So far, we have seen that the obtained denoising algorithm calls for sparse coding of small patches and an averaging of their outcomes.