Phoenixtree_Zhao

MyDLNote-Enhancment: [SR转文] Deep Unfolding Network for Image Super-Resolution

2020CVPR -- Deep Unfolding Network for Image Super-Resolution

[paper] : Deep Unfolding Network for Image Super-Resolution

[github] : https://github.com/cszn/USRNet

Abstract

Learning-based single image super-resolution (SISR) methods are continuously showing superior effectiveness and efficiency over traditional model-based methods, largely due to the end-to-end training. However, different from model-based methods that can handle the SISR problem with different scale factors, blur kernels and noise levels under a unified MAP (maximum a posteriori) framework, learning-based methods generally lack such flexibility.

提出问题：

首先，肯定了深度学习 SISR 方法的优越表现；然后，指出基于深度学习 SISR 的问题，即：

基于模型的（非深度学习的）SISR 方法在同一标准的最大后验概率（maximum a posteriori，MAP）下，可以对不同的尺度因素，模糊内核和噪音水平的 LR 图像进行研究处理；但基于深度学习的 SISR 缺乏这种灵活度。

To address this issue, this paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods. Specifically, by unfolding the MAP inference via a half-quadratic splitting algorithm, a fixed number of iterations consisting of alternately solving a data subproblem and a prior subproblem can be obtained. The two subproblems then can be solved with neural modules, resulting in an end-to-end trainable, iterative network. As a result, the proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model, while maintaining the advantages of learning-based methods.

解决问题的方法：

为了解决这一问题，本文提出了的端到端可训练展开网络，同时利用了基于学习的方法和基于模型的方法。

具体地，通过半二次分裂算法（half-quadratic splitting algorithm）展开 MAP 推理，可以得到由交替求解一个数据子问题和一个先验子问题组成的固定迭代次数。

这两个子问题可以用神经模块来解决，从而形成一个端到端可训练的迭代网络。

因此，该网络继承了基于模型的方法的灵活性，可以通过单一模型对不同尺度因子的模糊、噪声图像进行超分辨，同时保持了基于学习的方法的优点。

Extensive experiments demonstrate the superiority of the proposed deep unfolding network in terms of flexibility, effectiveness and also generalizability.

实验结果。

Introduction

让我们看看，Introduction 讲了一个什么故事。

Single image super-resolution (SISR) refers to the process of recovering the natural and sharp detailed highresolution (HR) counterpart from a low-resolution (LR) image. It is one of the classical ill-posed inverse problems in low-level computer vision and has a wide range of realworld applications, such as enhancing the image visual quality on high-definition displays [42, 53] and improving the performance of other high-level vision tasks [13].

笼统介绍了 SISR 是干嘛的。

Despite decades of studies, SISR still requires further study for academic and industrial purposes [35, 64]. The difficulty is mainly caused by the inconsistency between the simplistic degradation assumption of existing SISR methods and the complex degradations of real images [16]. Actually, for a scale factor of , the classical (traditional) degradation model of SISR [17, 18, 37] assumes the LR image is a blurred, decimated, and noisy version of an HR image . Mathematically, it can be expressed by

$y= (x \otimes k)\downarrow_s +n$ , (1)

where ⊗ represents two-dimensional convolution of with blur kernel , $\downarrow_s$ denotes the standard s-fold downsampler, i.e., keeping the upper-left pixel for each distinct $s\times s$ patch and discarding the others, and is usually assumed to be additive, white Gaussian noise (AWGN) specified by standard deviation (or noise level) $\sigma$ [71]. With a clear physical meaning, Eq. (1) can approximate a variety of LR images by setting proper blur kernels, scale factors and noises for an underlying HR images. In particular, Eq. (1) has been extensively studied in model-based methods which solve a combination of a data term and a prior term under the MAP framework.

继续详细介绍 SISR，从数学模型角度：学术研究和工业应用之间的差异，主要是由于现有SISR方法的退化假设过于简单，与真实图像的复杂退化不一致造成的。后面，就是具体介绍这个问题的数学含义。对于给定的HR图像，LR 图像由模糊内核、比例因子（降维比例）和噪声决定。

Eq.(1) 在基于模型的方法中得到了广泛的研究，这种方法解决了MAP框架下数据项和先验项的组合。

在 Eq.(1) 基础上，将引出下一段关于基于模型和基于深度学习 SISR 算法的问题。

Though model-based methods are usually algorithmically interpretable, they typically lack a standard criterion for their evaluation because, apart from the scale factor, Eq. (1) additionally involves a blur kernel and added noise. For convenience, researchers resort to bicubic degradation without consideration of blur kernel and noise level [14,56, 60]. However, bicubic degradation is mathematically complicated [25], which in turn hinders the development of model-based methods.

For this reason, recently proposed SISR solutions are dominated by learning-based methods that learn a mapping function from a bicubicly downsampled LR image to its HR estimation. Indeed, significant progress on improving PSNR [26, 70] and perceptual quality [31, 47, 58] for the bicubic degradation has been achieved by learning-based methods, among which convolutional neural network (CNN) based methods are the most popular, due to their powerful learning capacity and the speed of parallel computing. Nevertheless, little work has been done on applying CNNs to tackle Eq. (1) via a single model. Unlike model-based methods, CNNs usually lack flexibility to super-resolve blurry, noisy LR images for different scale factors via a single end-to-end trained model (see Fig. 1).

Figure 1. While a single degradation model (i.e., Eq. (1)) can result in various LR images for an HR image, with different blur kernels, scale factors and noise, the study of learning a single deep model to invert all such LR images to HR image is still lacking.

问题提出：

目前的 SISR 无非两类，基于模型的（非深度学习）和基于深度学习的。这两类方法都有各自的问题。

基于模型的问题：采用双三次退化而不考虑模糊核和噪声水平[14,56,60]。然而，双三次退化在数学上是复杂的[25]，这反过来又阻碍了基于模型的方法的发展。

基于深度学习的：CNNs通常缺乏灵活性，无法通过单一的端到端训练模型，针对不同的尺度因子来实现模糊、含有噪声的 LR 图像的超分辨。

问题指明了工作内容，下一段讲本文的工作了。

In this paper, we propose a deep unfolding super-resolution network (USRNet) to bridge the gap between learning-based methods and model-based methods. On one hand, similar to model-based methods, USRNet can effectively handle the classical degradation model (i.e., Eq. (1)) with different blur kernels, scale factors and noise levels via a single model. On the other hand, similar to learning-based methods, USRNet can be trained in an end-to-end fashion to guarantee effectiveness and efficiency.

To achieve this, we first unfold the model-based energy function via a halfquadratic splitting algorithm. Correspondingly, we can obtain an inference which iteratively alternates between solving two subproblems, one related to a data term and the other to a prior term. We then treat the inference as a deep network, by replacing the solutions to the two subproblems with neural modules.

Since the two subproblems correspond respectively to enforcing degradation consistency knowledge and guaranteeing denoiser prior knowledge, USRNet is well-principled with explicit degradation and prior constraints, which is a distinctive advantage over existing learning-based SISR methods.

It is worth noting that since USRNet involves a hyper-parameter for each subproblem, the network contains an additional module for hyper-parameter generation. Moreover, in order to reduce the number of parameters, all the prior modules share the same architecture and same parameters.

本文工作：

在介绍本文工作时，作者的思路是这样的：

1. 先介绍本文的核心思想：结合基于学习的方法和基于模型的方法。

一方面，与基于模型的方法类似，USRNet 可以有效地处理经典的退化模型(即，式(1)))通过单一模型，具有不同的模糊核、尺度因子和噪声水平。

另一方面，与基于学习的方法类似，USRNet 可以以端到端的方式进行培训，以保证有效性和效率。

2. 核心思想是怎么实现的：

首先，通过半二次分裂算法展开基于模型的能量函数。相应地，可以得到一个推理，它迭代地交替解决两个子问题，一个与数据项有关，另一个与先验项有关。

然后，将推理作为一个深层网络，用神经模块代替这两个子问题的解。

3. 解释提出的网络：

由于这两个子问题分别对应于强化降解一致性知识（enforcing degradation consistency knowledge）和保证去噪器先验知识（guaranteeing denoiser prior knowledge），所以 USRNe t对于显式退化和先验约束是很有原则的，这是与现有的基于学习的SISR 方法的显著优势。

值得注意的是，由于 USRNet 涉及到每个子问题的超参数，因此网络包含一个额外的模块用于生成超参数。

此外，为了减少参数的数量，所有之前的模块共享相同的架构和相同的参数。

（当然，只读到这里呢，完全不懂这几句在说什么，看看正文吧，或许能看明白。）

The main contributions of this work are as follows:

1) An end-to-end trainable unfolding super-resolution network (USRNet) is proposed. USRNet is the first attempt to handle the classical degradation model with different scale factors, blur kernels and noise levels via a single end-to-end trained model.

2) USRNet integrates the flexibility of model-based methods and the advantages of learning-based methods, providing an avenue to bridge the gap between model-based and learning-based methods.

3) USRNet intrinsically imposes a degradation constraint (i.e., the estimated HR image should accord with the degradation process) and a prior constraint (i.e., the estimated HR image should have natural characteristics) on the solution.

4) USRNet performs favorably on LR images with different degradation settings, showing great potential for practical applications.

本文的主要贡献：（不厚道地直接翻译了）

1) 提出了一种端到端可训练展开超分辨率网络 (USRNet)。USRNet 是第一个尝试处理经典退化模型与不同的尺度因子，模糊内核和噪声水平通过一个单一的端到端训练模型。

2) USRNet 融合了基于模型方法的灵活性和基于学习方法的优点，为弥合基于模型方法和基于学习方法之间的鸿沟提供了途径。

3) USRNet 本质上强加了一个退化约束(即，估计的 HR 图像应符合退化过程) 和一个先验约束 (即，估计的 HR 图像应该具有自然特征) 上的解决方案。

4) USRNet 在不同退化设置的 LR 图像上表现良好，显示出巨大的实际应用潜力。

Related work

略

Method

Degradation model: classical vs. bicubic

Since bicubic degradation is well-studied, it is interesting to investigate its relationship to the classical degradation model. Actually, the bicubic degradation can be approximated by setting a proper blur kernel in Eq. (1). To achieve this, we adopt the data-driven method to solve the following kernel estimation problem by minimizing the reconstruction error over a large HR/bicubic-LR pairs $\{(x, y)\}$

$k^{\times s}_{bicubic } = arg min_k \left \| (x \otimes k)\downarrow_s-y \right \| .$ (2)

Fig. 2 shows the approximated bicubic kernels for scale factors 2, 3 and 4. It should be noted that since the downsamlping operation selects the upper-left pixel for each distinct s × s patch, the bicubic kernels for scale factors 2, 3 and 4 have a center shift of 0.5, 1 and 1.5 pixels to the upper-left direction, respectively.

由于双三次降解已经得到了很好的研究，因此研究其与经典降解模型的关系是很有趣的。实际上，可以通过在式 (1) 中设置合适的模糊核来近似双三次退化。为了实现这一点，采用数据驱动的方法来解决下面的核估计问题，即在一个较大的 HR/ bicubicr - LR 对 $\{(x, y)\}$ 上最小化重构误差

$k^{\times s}_{bicubic } = arg min_k \left \| (x \otimes k)\downarrow_s-y \right \| .$ (2)

图2显示了比例因子2、3和4的近似双三次核。需要注意的是，由于 downsamlping 操作为每个不同的 $s\times s$ patch 选择左上角像素，因此比例因子2、3和4的双三次核分别向左上角偏移 0.5、1和1.5像素。

Figure 2. Approximated bicubic kernels for scale factors 2, 3 and 4 under the classical SISR degradation model assumption. Note that these kernels contain negative values.

Unfolding optimization

According to the MAP framework, the HR image could be estimated by minimizing the following energy function

where is the data term, $\Phi (x)$ is the prior term, and $\lambda$ is a trade-off parameter.

给出了 MAP 框架。

In order to obtain an unfolding inference for Eq. (3), the half-quadratic splitting (HQS) algorithm is selected due to its simplicity and fast convergence in many applications. HQS tackles Eq. (3) by introducing an auxiliary variable , leading to the following approximate equivalence

where $\mu$ is the penalty parameter.

给出了 HQS 算法处理 MAP 的形式。（对 HQS 完全不了解哦，有点像是，把第一项的 x 换成 z，再加入一个 L2(x, z) 的惩罚项，让 z 与 x 尽可能相似。）

Such problem can be addressed by iteratively solving subproblems for and

According to Eq. (5), $\mu$ should be large enough so that and are approximately equal to the fixed point. However, this would also result in slow convergence. Therefore, a good rule of thumb is to iteratively increase $\mu$ . For convenience, the $\mu$ in the -th iteration is denoted by $\mu_k$ .

给出了具体解公式（4）的方法，即把（4）拆分从两部分（5）和（6）。（5）用来解。（6）用来解。

其中， $\mu$ 应该足够大，使得和近似等于不动点。然而，这也会导致缓慢的收敛。因此，一个好的经验法则是迭代递增 $\mu$ 。为方便起见，第次迭代中的记为 $\mu_k$ 。

（5）就是前面提到的数据项 data term。

（6）就是前面提到的先验项 prior term。

It can be observed that the data term and the prior term are decoupled into Eq. (5) and Eq. (6), respectively.

For the solution of Eq. (5), the fast Fourier transform (FFT) can be utilized by assuming the convolution is carried out with circular boundary conditions. Notably, it has a closed-form expression [71]

where is defined as

with $\alpha_k \triangleq \mu_k\sigma ^2$ and where the $F(\cdot)$ and $F^{-1} (\cdot)$ denote FFT and inverse FFT, $\overline{F(\cdot)}$ denotes complex conjugate of $F(\cdot)$ , denotes the distinct block processing operator with element-wise multiplication, i.e., applying elementwise multiplication to the $s \times s$ distinct blocks of , $\Downarrow_s$ denotes the distinct block downsampler, i.e., averaging the $s \times s$ distinct blocks, $\uparrow_s$ denotes the standard s-fold upsampler, i.e., upsampling the spatial size by filling the new entries with zeros.

It is especially noteworthy that Eq. (7) also works for the special case of deblurring when .

For the solution of Eq. (6), it is known that, from a Bayesian perspective, it actually corresponds to a denoising problem with noise level $\beta _k \triangleq \sqrt{\lambda /\mu_k}$ [10].

给出了公式（5）和（6）的解。（公式（6）的解我没看懂，或许要参考一下文献 [10]。）

公式（5）的求解采用了快速傅里叶变换，求解过程可能需要参考一下文献 [71]。

值得注意的是，当时，公式（7）也可以适用于去模糊算法（deblurring ）。

Deep unfolding network

Once the unfolding optimization is determined, the next step is to design the unfolding super-resolution network (USRNet). Because the unfolding optimization mainly consists of iteratively solving a data subproblem (i.e., Eq. (5)) and a prior subproblem (i.e., Eq. (6)), USRNet should alternate between a data module D and a prior module P. In addition, as the solutions of the subproblems also take the hyper-parameters αk and βk as input, respectively, a hyper-parameter module H is further introduced into USRNet. Fig. 3 illustrates the overall architecture of USRNet with K iterations, where K is empirically set to 8 for the speed-accuracy trade-off. Next, more details on D, P and H are provided.

一旦展开优化确定，下一步是设计展开超分辨率网络(USRNet)。因为展开优化主要包括迭代求解一个数据子问题 (即式(5)) 和一个先验子问题 (即公式(6))，USRNet 应该轮流一个数据模块 D 和前一个模块 P。此外，子问题的解决方案也把 hyper-parameters $\alpha_k$ $\beta_k$ 作为输入，分别进一步引入 USRNet hyper-parameter 模块 H。图 3 展示了带有个迭代的 USRNet 的总体架构，其中被经验地设置为 8，以权衡速度-准确性。接下来，提供D、P和H的更多细节。

Data module D

The data module plays the role of Eq. (7) which is the closed-form solution of the data subproblem. Intuitively, it aims to find a clearer HR image which minimizes a weighted combination of the data term $\left \| y - (z \otimes k)\downarrow_s \right \|^2$ and the quadratic regularization term $\left \| z- x_{k-1} \right \|^2$ with trade-off hyper-parameter $\alpha _k$ .

Because the data term corresponds to the degradation model, the data module thus not only has the advantage of taking the scale factor and blur kernel as input but also imposes a degradation constraint on the solution. Actually, it is difficult to manually design such a simple but useful multiple-input module. For brevity, Eq. (7) is rewritten as

$z_k = D(x_{k-1}, s, k, y, \alpha_k)$ . (8)

Note that is initialized by interpolating with scale factor via the simplest nearest neighbor interpolation. It should be noted that Eq. (8) contains no trainable parameters, which in turn results in better generalizability due to the complete decoupling between data term and prior term. For the implementation, we use PyTorch where the main FFT and inverse FFT operators can be implemented by torch.rfft and torch.irfft, respectively.

数据模块Eq.(7) 的目的是找到一个更清晰的 HR 图像，使最小化加权组合的数据项和二次正则项。

由于数据项对应于退化模型，因此数据模块不仅具有将比例因子和 blur 内核作为输入的优点，而且对解决方案施加了退化约束。实际上，手工设计这样一个简单但有用的多输入模块是很困难的。

为简洁起见，将式(7)改写为公式（8）。数据模块不包含可训练参数（即数据模块没有用深度学习），由于数据项与先验项完全解耦，因此具有更好的泛化能力。

具体实现，傅里叶变换与反变换可用 torch.rfft and torch.irfft 函数实现。

Figure 3. The overall architecture of the proposed USRNet with iterations. USRNet can flexibly handle the classical degradation (i.e., Eq. (1)) via a single model as it takes the LR image , scale factor , blur kernel and noise level $\sigma$ as input. Specifically, USRNet consists of three main modules, including the data module D that makes HR estimation clearer, the prior module P that makes HR estimation cleaner, and the hyper-parameter module H that controls the outputs of D and P.

Prior module P

The prior module aims to obtain a cleaner HR image by passing through a denoiser with noise level $\beta _k$ . Inspired by [66], we propose a deep CNN denoiser that takes the noise level as input

$x_k = P(z_k, \beta_k)$ . (9)

The proposed denoiser, namely ResUNet, integrates residual blocks [21] into U-Net [45]. U-Net is widely used for image-to-image mapping, while ResNet owes its popularity to fast training and its large capacity with many residual blocks. ResUNet takes the concatenated and noise level map as input and outputs the denoised image . By doing so, ResUNet can handle various noise levels via a single model, which significantly reduces the total number of parameters. Following the common setting of U-Net, ResUNet involves four scales, each of which has an identity skip connection between downscaling and upscaling operations. Specifically, the number of channels in each layer from the first scale to the fourth scale are set to 64, 128, 256 and 512, respectively. For the downscaling and upscaling operations, $2\times2$ strided convolution (SConv) and $2\times2$ transposed convolution (TConv) are adopted, respectively. Note that no activation function is followed by SConv and TConv layers, as well as the first and the last convolutional layers. For the sake of inheriting the merits of ResNet, a group of 2 residual blocks are adopted in the downscaling and upscaling of each scale. As suggested in [36], each residual block is composed of two $3\times 3$ convolution layers with ReLU activation in the middle and an identity skip connection summed to its output.

先验模块的目的是通过带噪声级的降噪器传递来获得更清晰的 HR 图像。受到 [66] 的启发，提出了一种深度CNN降噪器，以噪声水平作为输入。

先验模块由 ResUNet 构成。具体细节包括：

1） U-Net 包括 4 层；

2）每层的下采样用 $2\times2$ 步进卷积 SConv；没有跟激活函数；

3）每层的上采样用 $2\times2$ 反卷积 TConv；没有跟激活函数；

4）每个编解码器包括两个 residual block；卷积是 $3\times 3$ 的；结构如文献【2017CVPRW Enhanced deep residual networks for single 9 image super-resolution】，下图这个样子的。

Hyper-parameter module H

The hyper-parameter module acts as a ‘slide bar’ to control the outputs of the data module and prior module. For example, the solution would gradually approach $x_{k-1}$ as $\alpha _k$ increases. According to the definition of $\alpha _k$ and $\beta _k$ , $\alpha _k$ is determined by $\sigma$ and $\mu _k$ , while $\beta _k$ depends on $\lambda$ and $\mu _k$ . Although it is possible to learn a fixed $\lambda$ and $\mu _k$ , we argue that a performance gain can be obtained if $\lambda$ and $\mu _k$ vary with two key elements, i.e., scale factor and noise level $\sigma$ , that influence the degree of ill-posedness. Let $\alpha = [\alpha _1, \alpha _2, . . . , \alpha _K]$ and $\beta = [\beta_1, \beta_2, . . . , \beta_K]$ , we use a single module to predict $\alpha$ and $\beta$

$[\alpha , \beta ] = H(\sigma , s)$ . (10)

The hyper-parameter module consists of three fully connected layers with ReLU as the first two activation functions and Softplus [19] as the last. The number of hidden nodes in each layer is 64. Considering the fact that $\alpha _k$ and $\beta _k$ should be positive, and Eq. (7) should avoid division by extremely small $\alpha _k$ , the output Softplus layer is followed by an extra addition of 1e-6. We will show how the scale factor and noise level affect the hyper-parameters in Sec. 4.4.

超参数模块由三个完全连接的层组成，ReLU 作为前两个激活函数，Softplus[19] 作为后两个激活函数。每一层的隐藏节点数为 64个。考虑到 $\alpha _k$ 和 $\beta _k$ 应该为正，且 Eq.(7) 应该避免被极小除法，输出 Softplus 层后面是额外添加的 1e-6。在第 4.4 节中展示尺度因子和噪声水平如何影响超参数。

三个模块中，D 用的是基于模型的方法（非深度学习）；P 和 H 是深度学习模型；这就是 Introduction 中说的，本文的方法是结合了二者的方法。

End-to-end training

The end-to-end training aims to learn the trainable parameters of USRNet by minimizing a loss function over a large training data set. Thus, this section mainly describe the training data, loss function and training settings. Following [58], we use DIV2K [3] and Flickr2K [55] as the HR training dataset. The LR images are synthesized via Eq. (1). Although USRNet focuses on SISR, it is also applicable to the case of deblurring with . Hence, the scale factors are chosen from {1, 2, 3, 4}. However, due to limited space, this paper does not consider the deblurring experiments. For the blur kernels, we use anisotropic Gaussian kernels as in [44, 51, 67] and motion kernels as in [5]. We fix the kernel size to 25×25. For the noise level, we set its range to [0, 25].

With regard to the loss function, we adopt the L1 loss for PSNR performance. Following [58], once the model is obtained, we further adopt a weighted combination of L1 loss, VGG perceptual loss and relativistic adversarial loss [24] with weights 1, 1 and 0.005 for perceptual quality performance. We refer to such fine-tuned model as USRGAN. As usual, USRGAN only considers scale factor 4. We do not use additional losses to constrain the intermediate outputs since the above losses work well. One possible reason is that the prior module shares parameters across iterations.

To optimize the parameters of USRNet, we adopt the Adam solver [27] with mini-batch size 128. The learning rate starts from $1 \times 10^{-4}$ and decays by a factor of 0.5 every $4 \times10^4$ iterations and finally ends with $3 \times 10^{-6}$ . It is worth pointing out that due to the infeasibility of parallel computing for different scale factors, each min-batch only involves one random scale factor. For USRGAN, its learning rate is fixed to $1 \times 10^{-5}$ . The patch size of the HR image for both USRNet and USRGAN is set to $96\times 96$ . We train the models with PyTorch on 4 Nvidia Tesla V100 GPUs in Amazon AWS cloud. It takes about two days to obtain the USRNet model.

讲了一下网络参数的设置。

USRNet：只采用 L1 Loss。

USRGAN：采用 L1 Loss、VGG Loss、relativistic adversarial Loss。

二者的学习率设置也不同。

Experiments

We choose the widely-used color BSD68 dataset [40, 46] to quantitatively evaluate different methods. The dataset consists of 68 images with tiny structures and fine textures and thus is challenging to improve the quantitative metrics, such as PSNR. For the sake of synthesizing the corresponding testing LR images via Eq. (1), blur kernels and noise levels should be provided. Generally, it would be helpful to employ a large variety of blur kernels and noise levels for a thorough evaluation, however, it would also give rise to burdensome evaluation process. For this reason, as shown in Table 1, we only consider 12 representative and diverse blur kernels, including 4 isotropic Gaussian kernels with different widths (i.e., 0.7, 1.2, 1.6 and 2.0), 4 anisotropic Gaussian kernels from [67], and 4 motion blur kernels from [5, 33]. While it has been pointed out that anisotropic Gaussian kernels are enough for SISR task [44, 51], the SISR method that can handle more complex blur kernels would be a preferred choice in real applications. Therefore, it is necessary to further analyze the kernel robustness of different methods, we will thus separately report the PSNR results for each blur kernel rather than for each type of blur kernels. Although it has been pointed out that the proper blur kernel should vary with scale factor [64], we argue that the 12 blur kernels are diverse enough to cover a large kernel space. For the noise levels, we choose 2.55 (1%) and 7.65 (3%).

评价实验中的一些设置，如数据集的选择，核的选择，噪声等级的选择。

PSNR results

The compared methods include RCAN [70], ZSSR [51], IKC [20] and IRCNN [65]. Specifically, RCAN is stateof-the-art PSNR oriented method for bicubic degradation; ZSSR is a non-blind zero-shot learning method with the ability to handle Eq. (1) for anisotropic Gaussian kernels; IKC is a blind iterative kernel correction method for isotropic Gaussian kernels; IRCNN a non-blind deep denoiser based plug-and-play method.

Table 1. Average PSNR(dB) results of different methods for different combinations of scale factors, blur kernels and noise levels. The best two results are highlighted in red and blue colors, respectively.

对几个典型的 SISR 算法进行介绍和 PSNR 比较。

Although USRNet is not designed for bicubic degradation, it is interesting to test its results by taking the approximated bicubic kernels in Fig. 2 as input. From Table 2, one can see that USRNet still performs favorably without training on the bicubic kernels.

Table 2. The average PSNR(dB) results of USRNet for bicubic degradation on commonly-used testing datasets.

虽然 USRNet 不是为双三次退化设计的，但是以图2 中近似的双三次核作为输入来测试其结果是很有趣的。从表2 可以看出，USRNet 在没有经过双三次内核训练的情况下仍然表现良好。

Visual results

Figure 4. Visual results of different methods on super-resolving noise-free LR image with scale factor 4. The blur kernel is shown on the upper-right corner of the LR image. Note that RankSRGAN and our USRGAN aim for perceptual quality rather than PSNR value.

Analysis on D and P

Because the proposed USRNet is an iterative method, it is interesting to investigate the HR estimations of data module D and prior module P in different iterations. Fig. 5 shows the results of USRNet and USRGAN in different iterations for an LR image with scale factor 4. As one can see, D and P can facilitate each other for iterative and alternating blur removal and detail recovery. Interestingly, P can also act as a detail enhancer for high-frequency recovery due to the task-specific training. In addition, it does not reduce blur kernel induced degradation which verifies the decoupling between D and P. As a result, the end-to-end trained USRNet has a task-specific advantage over Gaussian denoiser based plug-and-play SISR. To quantitatively analyze the role of D, we have trained an USRNet model with 5 iterations, it turns out that the average PSNR value will decreases about 0.1dB on Gaussian blur kernels and 0.3dB on motion blur kernels. This further indicates that D aims to eliminate blur kernel induced degradation. In addition, one can see that USRGAN has similar results with USRNet in the first few iterations, but will instead recover tiny structures and fine textures in last few iterations.

Figure 5. HR estimations in different iterations of USRNet (top row) and USRGAN (bottom row). The initial HR estimation is the nearest neighbor interpolated version of LR image. The scale factor is 4, the noise level of LR image is 2.55 (1%), the blur kernel is shown on the upper-right corner of .

由于所提出的USRNet是一种迭代方法，因此研究数据模块 D 和先验模块 P 在不同迭代下的 HR 估计是很有意义的。

图5 为比例因子为 4 的 LR 图像的 USRNet 和 USRGAN 在不同迭代下的结果。

D 和 P 可以互相促进迭代和交替模糊去除和细节恢复。有趣的是，由于特定任务的训练，P 也可以作为高频恢复的细节增强剂。此外，它并没有减少模糊核引起的退化，这验证了 D 和 P 之间的解耦，因此，端到端训练的 USRNet 比基于高斯去噪即插即用的SISR 具有任务特定的优势。

为了定量分析 D 的作用，训练了一个 5 次迭代的 USRNet 模型，结果表明，在高斯模糊内核上，平均 PSNR 值会降低约 0.1dB，在运动模糊内核上，平均 PSNR 值会降低约 0.3dB。

这进一步表明 D 的目标是消除模糊核引起的退化。此外，可以看到 USRGAN 在前几个迭代中与 USRNet 有类似的结果，但是在最后几个迭代中将恢复微小的结构和精细的纹理。

Analysis on H

Fig. 6 shows outputs of the hyper-parameter module for different combinations of scale factor and noise level $\sigma$ . It can be observed from Fig. 6(a) that $\alpha$ is positively correlated with $\sigma$ and varies with s. This actually accords with the definition of $\alpha_i$ in Sec. 3.2 and our analysis in Sec. 3.3. From Fig. 6(b), one can see that $\beta$ has a decreasing tendency with the number of iterations and increases with scale factor and noise level. This implies that the noise level of HR estimation is gradually reduced across iterations and complex degradation requires a large $\beta_i$ to tackle with the illposeness. It should be pointed out that the learned hyperparameter setting is in accordance with that of IRCNN [65]. In summary, the learned H is meaningful as it plays the proper role.

Figure 6. Outputs of the hyper-parameter module H, i.e., (a) $\alpha$ and (b) $\beta$ , with respect to different combinations of and $\sigma$ .

图6显示了不同组合 hyper-parameter 模块的输出比例因子和噪声 $\sigma$ 水平。它可以观察到从图 6 (a)， $\alpha$ 与 $\sigma$ 呈正相关，而与相反。这实际上符合 $\alpha_i$ 的定义在3.2节。

从图 6 (b),一个可以看到 $\beta$ 随比例因子和噪声水平的增加，有减少的趋势。

综上所述，所学的 H 是有意义的，因为它发挥了适当的作用。

Generalizability

As mentioned earlier, the proposed method enjoys good generalizability due to the decoupling of data term and prior term. To demonstrate such an advantage, Fig. 7 shows the visual results of USRNet and USRGAN on LR image with a kernel of much larger size than training size of 25×25. It can be seen that both USRNet and USRGAN can produce visually pleasant results, which can be attributed to the trainable parameter-free data module. It is worth pointing out that USRGAN is trained on scale factor 4, while Fig. 7(b) shows its visual result on scale factor 3. This further indicates that the prior module of USRGAN can generalize to other scale factors. In summary, the proposed deep unfolding architecture has superiority in generalizability.

Figure 7. An illustration to show the generalizability of USRNet and USRGAN. The sizes of the kernels in (a) and (c) are 67×67 and 70×70, respectively. The two kernels are chosen from [41].

如前所述，由于数据项和先验项的解耦，该方法具有很好的推广性。为了说明这种优势，图7显示了 USRNet 和 USRGAN 在 LR 图像上的视觉结果，其核尺寸比训练尺寸 $25 \times 25$ 大得多。可以看出，USRNet和USRGAN都可以产生视觉上令人愉悦的结果，这要归功于可训练的无参数数据模块。值得指出的是，USRGAN 在scale factor 4上进行了训练，而图7(b) 显示了其在 scale factor 3 上的可视化结果。这进一步表明 USRGAN 的先验模块可以推广到其他尺度因子。综上所述，所提出的深度展开结构具有普遍的优越性。

Real image super-resolution

Because Eq. (7) is based on the assumption of circular boundary condition, a proper boundary handling for the real LR image is generally required. We use the following three steps to do such pre-processing. First, the LR image is interpolated to the desired size. Second, the boundary handling method proposed in [38] is adopted on the interpolated image with the blur kernel. Last, the downsampled boundaries are padded to the original LR image. Fig. 8 shows the visual result of USRNet on real LR image with scale factor 4. The blur kernel is manually selected as isotropic Gaussian kernel with width 2.2 based on user preference. One can see from Fig. 8 that the proposed USRNet can reconstruct the HR image with improved visual quality.

Figure 8. Visual result of USRNet (×4) on a real LR image.

由于式 (7) 是基于圆形边界条件的假设，因此通常需要对真实 LR 图像进行适当的边界处理。我们使用以下三个步骤来进行这种预处理。首先，将 LR 图像内插到所需的大小。其次，对带模糊核的插值图像采用了 [38] 中提出的边界处理方法。最后，将下采样边界填充到原始 LR 图像。图8 显示了 USRNet 在实际 LR 图像上的可视结果，比例因子为 4。基于用户喜好，手动选择模糊核作为宽度为 2.2 的各向同性高斯核。从图8 可以看出，提出的 USRNet 可以重建出视觉质量得到改善的 HR 图像。

之所以加这一小结，是因为本文的摘要、前言提及过对真实图像有效，本节就是对这几句话做出的验证。

你可能感兴趣的:(深度学习)

机器学习与深度学习间关系与区别 ℒℴѵℯ心·动ꦿ໊ོ꫞ 人工智能学习深度学习 python
一、机器学习概述定义机器学习（MachineLearning,ML）是一种通过数据驱动的方法，利用统计学和计算算法来训练模型，使计算机能够从数据中学习并自动进行预测或决策。机器学习通过分析大量数据样本，识别其中的模式和规律，从而对新的数据进行判断。其核心在于通过训练过程，让模型不断优化和提升其预测准确性。主要类型1.监督学习（SupervisedLearning）监督学习是指在训练数据集中包含输入
将cmd中命令输出保存为txt文本文件落难Coder Windows cmd window
最近深度学习本地的训练中我们常常要在命令行中运行自己的代码，无可厚非，我们有必要保存我们的炼丹结果，但是复制命令行输出到txt是非常麻烦的，其实Windows下的命令行为我们提供了相应的操作。其基本的调用格式就是：运行指令>输出到的文件名称或者具体保存路径测试下，我打开cmd并且ping一下百度：pingwww.baidu.com>./data.txt看下相同目录下data.txt的输出：如果你再
推荐3家毕业AI论文可五分钟一键生成！文末附免费教程！小猪包333 写论文人工智能 AI写作深度学习计算机视觉
在当前的学术研究和写作领域，AI论文生成器已经成为许多研究人员和学生的重要工具。这些工具不仅能够帮助用户快速生成高质量的论文内容，还能进行内容优化、查重和排版等操作。以下是三款值得推荐的AI论文生成器：千笔-AIPassPaper、懒人论文以及AIPaperPass。千笔-AIPassPaper千笔-AIPassPaper是一款基于深度学习和自然语言处理技术的AI写作助手，旨在帮助用户快速生成高质
AI大模型的架构演进与最新发展季风泯灭的季节 AI大模型应用技术二人工智能架构
随着深度学习的发展，AI大模型（LargeLanguageModels,LLMs）在自然语言处理、计算机视觉等领域取得了革命性的进展。本文将详细探讨AI大模型的架构演进，包括从Transformer的提出到GPT、BERT、T5等模型的历史演变，并探讨这些模型的技术细节及其在现代人工智能中的核心作用。一、基础模型介绍：Transformer的核心原理Transformer架构的背景在Transfo
[实践应用] 深度学习之模型性能评估指标 YuanDaima2048 深度学习工具使用深度学习人工智能损失函数性能评估 pytorch python 机器学习
文章总览：YuanDaiMa2048博客文章总览深度学习之模型性能评估指标分类任务回归任务排序任务聚类任务生成任务其他介绍在机器学习和深度学习领域，评估模型性能是一项至关重要的任务。不同的学习任务需要不同的性能指标来衡量模型的有效性。以下是对一些常见任务及其相应的性能评估指标的详细解释和总结。分类任务分类任务是指模型需要将输入数据分配到预定义的类别或标签中。以下是分类任务中常用的性能指标：准确率(
[实践应用] 深度学习之优化器 YuanDaima2048 深度学习工具使用 pytorch 深度学习人工智能机器学习 python 优化器
文章总览：YuanDaiMa2048博客文章总览深度学习之优化器1.随机梯度下降（SGD）2.动量优化（Momentum）3.自适应梯度（Adagrad）4.自适应矩估计（Adam）5.RMSprop总结其他介绍在深度学习中，优化器用于更新模型的参数，以最小化损失函数。常见的优化函数有很多种，下面是几种主流的优化器及其特点、原理和PyTorch实现：1.随机梯度下降（SGD）原理:随机梯度下降通过
生成式地图制图 Bwywb_3 深度学习机器学习深度学习生成对抗网络
生成式地图制图（GenerativeCartography）是一种利用生成式算法和人工智能技术自动创建地图的技术。它结合了传统的地理信息系统（GIS）技术与现代生成模型（如深度学习、GANs等），能够根据输入的数据自动生成符合需求的地图。这种方法在城市规划、虚拟环境设计、游戏开发等多个领域具有应用前景。主要特点：自动化生成：通过算法和模型，系统能够根据输入的地理或空间数据自动生成地图，而无需人工逐
吴恩达深度学习笔记(30)-正则化的解释极客Array
正则化（Regularization）深度学习可能存在过拟合问题——高方差，有两个解决方法，一个是正则化，另一个是准备更多的数据，这是非常可靠的方法，但你可能无法时时刻刻准备足够多的训练数据或者获取更多数据的成本很高，但正则化通常有助于避免过拟合或减少你的网络误差。如果你怀疑神经网络过度拟合了数据，即存在高方差问题，那么最先想到的方法可能是正则化，另一个解决高方差的方法就是准备更多数据，这也是非常
个人学习笔记7-6：动手学深度学习pytorch版-李沐浪子L 深度学习深度学习笔记计算机视觉 python 人工智能神经网络 pytorch
#人工智能##深度学习##语义分割##计算机视觉##神经网络#计算机视觉13.11全卷积网络全卷积网络（fullyconvolutionalnetwork，FCN）采用卷积神经网络实现了从图像像素到像素类别的变换。引入l转置卷积（transposedconvolution）实现的，输出的类别预测与输入图像在像素级别上具有一一对应关系：通道维的输出即该位置对应像素的类别预测。13.11.1构造模型下
深度学习-点击率预估-研究论文2024-09-14速读 sp_fyf_2024 深度学习人工智能
深度学习-点击率预估-研究论文2024-09-14速读1.DeepTargetSessionInterestNetworkforClick-ThroughRatePredictionHZhong,JMa,XDuan,SGu,JYao-2024InternationalJointConferenceonNeuralNetworks,2024深度目标会话兴趣网络用于点击率预测摘要：这篇文章提出了一种新
损失函数与反向传播 Star_. PyTorch pytorch 深度学习 python
损失函数定义与作用损失函数(lossfunction)在深度学习领域是用来计算搭建模型预测的输出值和真实值之间的误差。1.损失函数越小越好2.计算实际输出与目标之间的差距3.为更新输出提供依据（反向传播)常见的损失函数回归常见的损失函数有：均方差（MeanSquaredError，MSE）、平均绝对误差（MeanAbsoluteErrorLoss，MAE）、HuberLoss是一种将MSE与MAE
【深度学习】训练过程中一个OOM的问题，太难查了 weixin_40293999 深度学习深度学习人工智能
现象：各位大佬又遇到过ubuntu的这个问题么？现象是在训练过程中，ssh上不去了，能ping通，没死机，但是ubunutu的pc侧的显示器，鼠标啥都不好用了。只能重启。问题原因：OOM了95G，尼玛！！！！pytorch爆内存了，然后journald假死了，在journald被watchdog干掉之后，系统就崩溃了。这种规模的爆内存一般，即使被oomkill了，也要卡半天的，确实会这样，能不能配
云服务业界动态简报-20180128 Captain7
一、青云青云QingCloud推出深度学习平台DeepLearningonQingCloud，包含了主流的深度学习框架及数据科学工具包，通过QingCloudAppCenter一键部署交付，可以让算法工程师和数据科学家快速构建深度学习开发环境，将更多的精力放在模型和算法调优。二、腾讯云1.腾讯云正式发布腾讯专有云TCE(TencentCloudEnterprise)矩阵，涵盖企业版、大数据版、AI
机器学习VS深度学习 nfgo 机器学习
机器学习（MachineLearning,ML）和深度学习（DeepLearning,DL）是人工智能（AI）的两个子领域，它们有许多相似之处，但在技术实现和应用范围上也有显著区别。下面从几个方面对两者进行区分：1.概念层面机器学习：是让计算机通过算法从数据中自动学习和改进的技术。它依赖于手动设计的特征和数学模型来进行学习，常用的模型有决策树、支持向量机、线性回归等。深度学习：是机器学习的一个子领
大数据毕业设计hadoop+spark+hive知识图谱租房数据分析可视化大屏租房推荐系统 58同城租房爬虫房源推荐系统房价预测系统计算机毕业设计机器学习深度学习人工智能 2401_84572577 程序员大数据 hadoop 人工智能
做了那么多年开发，自学了很多门编程语言，我很明白学习资源对于学一门新语言的重要性，这些年也收藏了不少的Python干货，对我来说这些东西确实已经用不到了，但对于准备自学Python的人来说，或许它就是一个宝藏，可以给你省去很多的时间和精力。别在网上瞎学了，我最近也做了一些资源的更新，只要你是我的粉丝，这期福利你都可拿走。我先来介绍一下这些东西怎么用，文末抱走。（1）Python所有方向的学习路线（
深度学习-13-小语言模型之SmolLM的使用皮皮冰燃深度学习深度学习
文章附录1SmolLM概述1.1SmolLM简介1.2下载模型2运行2.1在CPU/GPU/多GPU上运行模型2.2使用torch.bfloat162.3通过位和字节的量化版本3应用示例4问题及解决4.1attention_mask和pad_token_id报错4.2max_new_tokens=205参考附录1SmolLM概述1.1SmolLM简介SmolLM是一系列尖端小型语言模型，提供三种规
基于深度学习的农作物病害检测 SEU-WYL 深度学习dnn 深度学习人工智能
基于深度学习的农作物病害检测利用卷积神经网络（CNN）、生成对抗网络（GAN）、Transformer等深度学习技术，自动识别和分类农作物的病害，帮助农业工作者提高作物管理效率、减少损失。1.农作物病害检测的挑战病害种类繁多：农作物病害的类型多样，不同病害在同一作物上的表现差异很大，同时同一种病害在不同生长阶段的症状也可能不同。环境影响：天气、光照、湿度等外部环境因素会影响农作物的表现，使得病害检
基于深度学习的文本引导的图像编辑 SEU-WYL 深度学习dnn 深度学习人工智能
基于深度学习的文本引导的图像编辑（Text-GuidedImageEditing）是一种通过自然语言文本指令对图像进行编辑或修改的技术。它结合了图像生成和自然语言处理（NLP）的最新进展，使用户能够通过描述性文本对图像内容进行精确的调整和操控。1.文本引导的图像编辑的挑战文本和图像之间的对齐：如何将文本中的语义信息准确地映射到图像中的特定区域或元素是一个关键挑战。这涉及到多模态数据的对齐和理解。编
深度学习--对抗生成网络（GAN, Generative Adversarial Network） Ambition_LAO 深度学习生成对抗网络
对抗生成网络（GAN,GenerativeAdversarialNetwork）是一种深度学习模型，由IanGoodfellow等人在2014年提出。GAN主要用于生成数据，通过两个神经网络相互对抗，来生成以假乱真的新数据。以下是对GAN的详细阐述，包括其概念、作用、核心要点、实现过程、代码实现和适用场景。1.概念GAN由两个神经网络组成：生成器（Generator）和判别器（Discrimina
深度学习：怎么看pth文件的参数奥利给少年深度学习人工智能
.pth文件是PyTorch模型的权重文件，它通常包含了训练好的模型的参数。要查看或使用这个文件，你可以按照以下步骤操作：1.确保你有模型的定义你需要有创建这个.pth文件时所用的模型的代码。这意味着你需要有模型的类定义和架构。2.加载模型权重使用PyTorch的load_state_dict方法来加载权重。这里是如何操作的：importtorchimporttorch.nnasnn#定义模型结构
chatgpt赋能python：如何在Python中安装Keras库？ turensu ChatGpt python chatgpt keras 计算机
如何在Python中安装Keras库？Keras是一个简单易用的神经网络库，由FrançoisChollet编写。它在Python编程语言中实现了深度学习的功能，可以使您更轻松地构建和试验不同类型的神经网络。如果您是一名Python开发人员，肯定会想知道如何在您的Python项目中安装Keras库。在本文中，我们将向您展示如何安装和配置Keras库。步骤1：安装Python要使用Keras库，您需
如何理解深度学习的训练过程奋斗的草莓熊深度学习人工智能 python scikit-learn virtualenv numpy pandas
文章目录1.训练是干什么？2.预训练模型进行训练，主要更改的是预训练模型的什么东西？1.训练是干什么？以yolov5为例子，训练的目的是把一组输入猫狗图像放到神经网络中，得到一个输出模型，这个模型下次可以直接用来识别哪个是猫，哪个是狗2.预训练模型进行训练，主要更改的是预训练模型的什么东西？超参数（Hyperparameters）：这是模型结构中定义的参数，比如：卷积核大小（kernel_size
Keras深度学习框架入门及实战指南司莹嫣Maude
Keras深度学习框架入门及实战指南keraskeras-team/keras:是一个基于Python的深度学习库，它没有使用数据库。适合用于深度学习任务的开发和实现，特别是对于需要使用Python深度学习库的场景。特点是深度学习库、Python、无数据库。项目地址:https://gitcode.com/gh_mirrors/ke/keras一、项目介绍Keras简介Keras是一款高级神经网络
深度学习驱动的车牌识别：技术演进与未来挑战逼子歌深度学习车牌识别神经网络字符识别 YOLO 卷积神经网络
一、引言1.1研究背景在当今社会，智能交通系统的发展日益重要，而车牌识别作为其关键组成部分，发挥着至关重要的作用。车牌识别技术广泛应用于交通管理、停车场管理、安防监控等领域。在交通管理中，它可以用于车辆识别、交通违法监控和车流统计等，提高交通管理的效率和准确性。在停车场管理中，实现车辆的自动识别和收费，提升管理和服务水平。在安防监控领域，可用于追踪嫌疑人及犯罪行为。深度学习的出现为车牌识别带来了重
每天五分钟玩转深度学习PyTorch：模型参数优化器torch.optim 幻风_huanfeng 深度学习框架pytorch 深度学习 pytorch 人工智能神经网络机器学习优化算法
本文重点在机器学习或者深度学习中，我们需要通过修改参数使得损失函数最小化(或最大化)，优化算法就是一种调整模型参数更新的策略。在pytorch中定义了优化器optim，我们可以使用它调用封装好的优化算法，然后传递给它神经网络模型参数，就可以对模型进行优化。本文是学习第6步(优化器)，参考链接pytorch的学习路线随机梯度下降算法在深度学习和机器学习中，梯度下降算法是最常用的参数更新方法，它的公式
什么是AIGC？有哪些免费工具？ chent_某位 AIGC
AIGC（AIGeneratedContent），即“人工智能生成内容”，是指通过人工智能技术自动生成各种类型的数字内容。AIGC让机器能够根据输入的信息或数据生成符合人类需求的文本、图像、音频、视频等内容，极大提高了内容创作的效率。AIGC的背景与起源随着深度学习和自然语言处理技术的快速发展，人工智能已经不再局限于简单的任务，如分类、预测和数据分析，而是具备了生成内容的能力。生成式AI模型，如O
transformer架构(Transformer Architecture)原理与代码实战案例讲解 AI架构设计之禅大数据AI人工智能 Python入门实战计算科学神经计算深度学习神经网络大数据人工智能大型语言模型 AI AGI LLM Java Python 架构设计 Agent RPA
transformer架构(TransformerArchitecture)原理与代码实战案例讲解关键词：Transformer,自注意力机制,编码器-解码器,预训练,微调,NLP,机器翻译作者：禅与计算机程序设计艺术/ZenandtheArtofComputerProgramming1.背景介绍1.1问题的由来自然语言处理（NLP）领域的发展经历了从规则驱动到统计驱动再到深度学习驱动的三个阶段。
如何有效的学习AI大模型？ Python程序员罗宾学习人工智能语言模型自然语言处理架构
学习AI大模型是一个系统性的过程，涉及到多个学科的知识。以下是一些建议，帮助你更有效地学习AI大模型：基础知识储备：数学基础：学习线性代数、概率论、统计学和微积分等，这些是理解机器学习算法的数学基础。编程技能：掌握至少一种编程语言，如Python，因为大多数AI模型都是用Python实现的。理论学习：机器学习基础：了解监督学习、非监督学习、强化学习等基本概念。深度学习：学习神经网络的基本结构，如卷
【深度学习】【OnnxRuntime】【Python】模型转化、环境搭建以及模型部署的详细教程牙牙要健康深度学习 onnx onnxruntime 深度学习 python 人工智能
【深度学习】【OnnxRuntime】【Python】模型转化、环境搭建以及模型部署的详细教程提示:博主取舍了很多大佬的博文并亲测有效,分享笔记邀大家共同学习讨论文章目录【深度学习】【OnnxRuntime】【Python】模型转化、环境搭建以及模型部署的详细教程前言模型转换--pytorch转onnxWindows平台搭建依赖环境onnxruntime调用onnx模型ONNXRuntime推理核
基于深度学习的多模态信息检索 SEU-WYL 深度学习dnn 深度学习人工智能
基于深度学习的多模态信息检索（MultimodalInformationRetrieval,MMIR）是指利用深度学习技术，从包含多种模态（如文本、图像、视频、音频等）的数据集中检索出满足用户查询意图的相关信息。这种方法不仅可以处理单一模态的数据，还可以在多种模态之间建立关联，从而更准确地满足用户需求。1.多模态信息检索的挑战异构数据表示：多模态数据通常具有不同的特征和表示形式（如文本的词嵌入与图
JVM StackMapTable 属性的作用及理解 lijingyao8206 jvm 字节码 Class文件 StackMapTable
在Java 6版本之后JVM引入了栈图(Stack Map Table)概念。为了提高验证过程的效率，在字节码规范中添加了Stack Map Table属性，以下简称栈图，其方法的code属性中存储了局部变量和操作数的类型验证以及字节码的偏移量。也就是一个method需要且仅对应一个Stack Map Table。在Java 7版
回调函数调用方法百合不是茶 java
最近在看大神写的代码时,.发现其中使用了很多的回调 ,以前只是在学习的时候经常用到 ,现在写个笔记记录一下代码很简单: MainDemo :调用方法得到方法的返回结果
[时间机器]制造时间机器需要一些材料 comsci 制造
根据我的计算和推测,要完全实现制造一台时间机器,需要某些我们这个世界不存在的物质和材料... 甚至可以这样说,这种材料和物质,我们在反应堆中也无法获得......
开口埋怨不如闭口做事邓集海邓集海做人做事工作
“开口埋怨，不如闭口做事。”不是名人名言，而是一个普通父亲对儿子的训导。但是，因为这句训导，这位普通父亲却造就了一个名人儿子。这位普通父亲造就的名人儿子，叫张明正。　　　　张明正出身贫寒，读书时成绩差，常挨老师批评。高中毕业，张明正连普通大学的分数线都没上。高考成绩出来后，平时开口怨这怨那的张明正，不从自身找原因，而是不停地埋怨自己家庭条件不好、埋怨父母没有给他创造良好的学习环境。　　　　
jQuery插件开发全解析，类级别与对象级别开发 IT独行者 jquery 开发插件　函数
jQuery插件的开发包括两种：一种是类级别的插件开发，即给 jQuery添加新的全局函数，相当于给 jQuery类本身添加方法。 jQuery的全局函数就是属于 jQuery命名空间的函数，另一种是对象级别的插件开发，即给 jQuery对象添加方法。下面就两种函数的开发做详细的说明。 1 、类级别的插件开发类级别的插件开发最直接的理解就是给jQuer
Rome解析Rss 413277409 Rome解析Rss
import java.net.URL; import java.util.List; import org.junit.Test; import com.sun.syndication.feed.synd.SyndCategory; import com.sun.syndication.feed.synd.S
RSA加密解密无量加密解密 rsa
RSA加密解密代码代码有待整理 package com.tongbanjie.commons.util; import java.security.Key; import java.security.KeyFactory; import java.security.KeyPair; import java.security.KeyPairGenerat
linux 软件安装遇到的问题 aichenglong linux 遇到的问题 ftp
1 ftp配置中遇到的问题 500 OOPS: cannot change directory 出现该问题的原因:是SELinux安装机制的问题.只要disable SELinux就可以了修改方法:1 修改/etc/selinux/config 中SELINUX=disabled 2 source /etc
面试心得 alafqq 面试
最近面试了好几家公司。记录下；支付宝，面试我的人胖胖的，看着人挺好的；博彦外包的职位，面试失败；阿里金融，面试官人也挺和善，只不过我让他吐血了。。。由于印象比较深，记录下； 1，自我介绍 2，说下八种基本类型；（算上string。楼主才答了3种，哈哈，string其实不是基本类型，是引用类型） 3，什么是包装类，包装类的优点； 4，平时看过什么书？NND，什么书都没看过。。照样
java的多态性探讨百合不是茶 java
java的多态性是指main方法在调用属性的时候类可以对这一属性做出反应的情况 //package 1; class A{ public void test(){ System.out.println("A"); } } class D extends A{ public void test(){ S
网络编程基础篇之JavaScript-学习笔记 bijian1013 JavaScript
1.documentWrite <html> <head> <script language="JavaScript"> document.write("这是电脑网络学校"); document.close(); </script> </h
探索JUnit4扩展：深入Rule bijian1013 JUnit Rule 单元测试
本文将进一步探究Rule的应用，展示如何使用Rule来替代@BeforeClass，@AfterClass，@Before和@After的功能。在上一篇中提到，可以使用Rule替代现有的大部分Runner扩展，而且也不提倡对Runner中的withBefores()，withAfte
[CSS]CSS浮动十五条规则 bit1129 css
这些浮动规则，主要是参考CSS权威指南关于浮动规则的总结，然后添加一些简单的例子以验证和理解这些规则。 1. 所有的页面元素都可以浮动 2. 一个元素浮动后，会成为块级元素，比如<span>,a, strong等都会变成块级元素 3.一个元素左浮动，会向最近的块级父元素的左上角移动，直到浮动元素的左外边界碰到块级父元素的左内边界；如果这个块级父元素已经有浮动元素停靠了
【Kafka六】Kafka Producer和Consumer多Broker、多Partition场景 bit1129 partition
0.Kafka服务器配置 3个broker 1个topic，6个partition，副本因子是2 2个consumer，每个consumer三个线程并发读取 1. Producer package kafka.examples.multibrokers.producers; import java.util.Properties; import java.util.
zabbix_agentd.conf配置文件详解 ronin47 zabbix 配置文件
Aliaskey的别名，例如 Alias=ttlsa.userid:vfs.file.regexp[/etc/passwd,^ttlsa:.:([0-9]+),,,,\1]，或者ttlsa的用户ID。你可以使用key：vfs.file.regexp[/etc/passwd,^ttlsa:.: ([0-9]+),,,,\1]，也可以使用ttlsa.userid。备注: 别名不能重复，但是可以有多个
java--19.用矩阵求Fibonacci数列的第N项 bylijinnan fibonacci
参考了网上的思路，写了个Java版的： public class Fibonacci { final static int[] A={1,1,1,0}; public static void main(String[] args) { int n=7; for(int i=0;i<=n;i++){ int f=fibonac
Netty源码学习-LengthFieldBasedFrameDecoder bylijinnan java netty
先看看LengthFieldBasedFrameDecoder的官方API http://docs.jboss.org/netty/3.1/api/org/jboss/netty/handler/codec/frame/LengthFieldBasedFrameDecoder.html API举例说明了LengthFieldBasedFrameDecoder的解析机制，如下：实
AES加密解密 chicony 加密解密
AES加解密算法，使用Base64做转码以及辅助加密： package com.wintv.common; import javax.crypto.Cipher; import javax.crypto.spec.IvParameterSpec; import javax.crypto.spec.SecretKeySpec; import sun.misc.BASE64Decod
文件编码格式转换 ctrain 编码格式
package com.test; import java.io.File; import java.io.FileInputStream; import java.io.FileOutputStream; import java.io.IOException; import java.io.InputStream; import java.io.OutputStream;
mysql 在linux客户端插入数据中文乱码 daizj mysql 中文乱码
1、查看系统客户端，数据库，连接层的编码查看方法： http://daizj.iteye.com/blog/2174993 进入mysql，通过如下命令查看数据库编码方式： mysql> show variables like 'character_set_%'; +--------------------------+------
好代码是廉价的代码 dcj3sjt126com 程序员读书
长久以来我一直主张：好代码是廉价的代码。当我跟做开发的同事说出这话时，他们的第一反应是一种惊愕，然后是将近一个星期的嘲笑，把它当作一个笑话来讲。当他们走近看我的表情、知道我是认真的时，才收敛一点。当最初的惊愕消退后，他们会用一些这样的话来反驳： “好代码不廉价，好代码是采用经过数十年计算机科学研究和积累得出的最佳实践设计模式和方法论建立起来的精心制作的程序代码。” 我只
Android网络请求库——android-async-http dcj3sjt126com android
在iOS开发中有大名鼎鼎的ASIHttpRequest库，用来处理网络请求操作，今天要介绍的是一个在Android上同样强大的网络请求库android-async-http，目前非常火的应用Instagram和Pinterest的Android版就是用的这个网络请求库。这个网络请求库是基于Apache HttpClient库之上的一个异步网络请求处理库，网络处理均基于Android的非UI线程，通
ORACLE 复习笔记之SQL语句的优化 eksliang SQL优化 Oracle sql语句优化 SQL语句的优化
转载请出自出处：http://eksliang.iteye.com/blog/2097999 SQL语句的优化总结如下 sql语句的优化可以按照如下六个步骤进行：合理使用索引避免或者简化排序消除对大表的扫描避免复杂的通配符匹配调整子查询的性能 EXISTS和IN运算符下面我就按照上面这六个步骤分别进行总结：
浅析：Android 嵌套滑动机制（NestedScrolling） gg163 android 移动开发滑动机制嵌套
谷歌在发布安卓 Lollipop版本之后，为了更好的用户体验，Google为Android的滑动机制提供了NestedScrolling特性 NestedScrolling的特性可以体现在哪里呢？ 比如你使用了Toolbar，下面一个ScrollView，向上滚
使用hovertree菜单作为后台导航 hvt JavaScript jquery .net hovertree asp.net
hovertree是一个jquery菜单插件，官方网址：http://keleyi.com/jq/hovertree/ ，可以登录该网址体验效果。 0.1.3版本：http://keleyi.com/jq/hovertree/demo/demo.0.1.3.htm hovertree插件包含文件： http://keleyi.com/jq/hovertree/css
SVG 教程（二）矩形天梯梦 svg
SVG <rect> SVG Shapes SVG有一些预定义的形状元素，可被开发者使用和操作：矩形 <rect> 圆形 <circle> 椭圆 <ellipse> 线 <line> 折线 <polyline> 多边形 <polygon> 路径 <path>
一个简单的队列 luyulong java 数据结构队列
public class MyQueue { private long[] arr; private int front; private int end; // 有效数据的大小 private int elements; public MyQueue() { arr = new long[10]; elements = 0; front
基础数据结构和算法九：Binary Search Tree sunwinner Algorithm
A binary search tree (BST) is a binary tree where each node has a Comparable key (and an associated value) and satisfies the restriction that the key in any node is larger than the keys in all
项目出现的一些问题和体会 Steven-Walker DAO Web servlet
第一篇博客不知道要写点什么，就先来点近阶段的感悟吧。这几天学了servlet和数据库等知识，就参照老方的视频写了一个简单的增删改查的，完成了最简单的一些功能，使用了三层架构。 dao层完成的是对数据库具体的功能实现，service层调用了dao层的实现方法，具体对servlet提供支持。 &
高手问答：Java老A带你全面提升Java单兵作战能力！ ITeye管理员 java
本期特邀《Java特种兵》作者：谢宇，CSDN论坛ID: xieyuooo 针对JAVA问题给予大家解答，欢迎网友积极提问，与专家一起讨论! 作者简介：淘宝网资深Java工程师，CSDN超人气博主，人称“胖哥”。 CSDN博客地址： http://blog.csdn.net/xieyuooo 作者在进入大学前是一个不折不扣的计算机白痴，曾经被人笑话过不懂鼠标是什么，