Paper Notes (4): RankIQA, Ranking-Based Image Quality Assessment

RankIQA: Learning from Rankings for No-reference Image Quality Assessment
github: https://github.com/xialeiliu/RankIQA

Part 1:

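Definition: No-reference image quality assessment (NR-IQA) aims to predict a quality score for a given image without using its ground-truth (GT) reference image.
Problem: For large datasets, obtaining subjective quality scores through human annotation is very expensive, so labeled data is scarce and training a deep CNN is very difficult.
Observation & Motivation: Applying different levels of degradation to an image yields images of different quality whose relative ranking is known.
Classic method: directly compute the error between the CNN's output score and the GT score of the image.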
Paper method:

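  1. Degrade GT images with distortion factors at different levels to obtain a large number of ranked images.
  2. Train a Siamese network on the ranked images to obtain a ranking score.
  3. Fine-tune the network trained on the ranked images on a labeled dataset.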
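
Step 1 above can be sketched in a few lines. This is a minimal illustration, not the paper's data pipeline: the toy flat-gray "image", the noise levels, and the helper names (`add_gaussian_noise`, `make_ranked_images`, `ranked_pairs`) are all assumptions made for the example, and only Gaussian noise is used as the distortion.

```python
import random
from itertools import combinations


def add_gaussian_noise(pixels, sigma, rng):
    """Return a copy of `pixels` with zero-mean Gaussian noise, clipped to [0, 255]."""
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


def make_ranked_images(pixels, sigmas, seed=0):
    """Degrade one reference image at increasing noise levels.

    Index 0 of the result is the least degraded (highest-quality) version.
    """
    rng = random.Random(seed)
    return [add_gaussian_noise(pixels, s, rng) for s in sorted(sigmas)]


def ranked_pairs(num_levels):
    """All (better, worse) index pairs; a lower index means less degradation."""
    return list(combinations(range(num_levels), 2))


# Toy stand-in for a reference image: 1024 mid-gray pixels (assumption).
reference = [128.0] * 1024
ranked = make_ranked_images(reference, sigmas=[5.0, 10.0, 20.0])
pairs = ranked_pairs(len(ranked))
print(pairs)  # [(0, 1), (0, 2), (1, 2)]
```

No human labels are needed here: the ranking inside each pair follows directly from the degradation level that produced each image.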

Part 2 method:

  1. Train the Siamese network with a pairwise ranking hinge loss:
    [Image 1: formula of the pairwise ranking hinge loss]

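  2. Optimizing the Siamese network's training procedure
    For example, degrade GT image A with three levels of Gaussian noise to obtain A1, A2, and A3. The original Siamese network feeds the three image pairs (A1, A2), (A2, A3), and (A1, A3) into the network for training. The paper instead randomly selects one of these pairs to feed into the network, which greatly speeds up convergence.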
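
To make the loss in item 1 concrete, here is a small sketch of the usual form of a pairwise ranking hinge loss, max(0, f(x_worse) - f(x_better) + margin), evaluated on the A1, A2, A3 example from item 2. The margin value, the toy scores, and the function names are assumptions for illustration; the network itself is replaced by hand-written scores.

```python
import random
from itertools import combinations


def pairwise_hinge(score_better, score_worse, margin=1.0):
    """Zero once the better image outscores the worse one by at least `margin`."""
    return max(0.0, score_worse - score_better + margin)


def batch_ranking_loss(scores, margin=1.0):
    """Mean hinge loss over all pairs; `scores` is ordered from best to worst quality."""
    pairs = list(combinations(range(len(scores)), 2))
    total = sum(pairwise_hinge(scores[i], scores[j], margin) for i, j in pairs)
    return total / len(pairs)


def sample_pair(num_images, rng):
    """Pick one (better, worse) index pair at random instead of using all pairs."""
    return rng.choice(list(combinations(range(num_images), 2)))


# Hypothetical network outputs for A1, A2, A3 (A1 is the least degraded).
well_ordered = [3.0, 1.5, 0.0]
mis_ordered = [0.0, 1.5, 3.0]

print(batch_ranking_loss(well_ordered))  # 0.0
print(batch_ranking_loss(mis_ordered))   # 3.0

i, j = sample_pair(3, random.Random(0))
print(pairwise_hinge(mis_ordered[i], mis_ordered[j]))
```

The loss only penalizes pairs whose predicted order disagrees with the known degradation order, or agrees by less than the margin, so it never needs an absolute quality score.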

Part 3 experiment:

  1. Datasets: The LIVE dataset consists of 808 images generated from 29 original images by distorting them with five types of distortion: Gaussian blur (GB), Gaussian noise (GN), JPEG compression (JPEG), JPEG2000 compression (JP2K), and fast fading (FF). The ground-truth Mean Opinion Score for each image is in the range [0, 100] and is estimated using annotations by 161 human annotators.
    The TID2013 dataset consists of 25 reference images with 3000 distorted images from 24 different distortion types at 5 degradation levels. Mean Opinion Scores are in the range [0, 9]. Distortion types include a range of noise, compression, and transmission artifacts.
  2. Network architectures: Shallow-4conv, AlexNet, and VGG-16.
  3. Evaluation protocols: Linear Correlation Coefficient (LCC) and Spearman Rank Order Correlation Coefficient (SROCC).
  4. Experimental results
    [Images 3-5: experimental results]
  5. Analysis
    Hard negative mining strategy: training starts with 36 pairs, and the number of hard pairs is increased every 5000 iterations, up to a maximum of 72.
    Network performance analysis: The original, high-quality (undistorted) images of the LIVE dataset are randomly split into 80% training and 20% testing samples, and the average LCC and SROCC scores are computed on the testing set after training to convergence. This process is repeated ten times and the results are averaged.
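
The evaluation protocol above can be sketched as follows. This is a standard-library illustration under stated assumptions, not the paper's evaluation code: LCC is computed as the Pearson correlation, SROCC as the Pearson correlation of average ranks, and the split is done over reference-image IDs so that all distorted versions of one original fall on the same side. The helper names, the seed, and the toy scores are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Linear Correlation Coefficient (LCC) between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman Rank Order Correlation Coefficient (SROCC)."""
    return pearson(ranks(x), ranks(y))


def split_references(reference_ids, train_fraction=0.8, seed=0):
    """Shuffle the original-image IDs and split them into train and test sets."""
    ids = sorted(set(reference_ids))
    random.Random(seed).shuffle(ids)
    cut = int(round(train_fraction * len(ids)))
    return ids[:cut], ids[cut:]


# Toy predictions that are monotonic in the subjective scores but not linear.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
subjective = [1.0, 4.0, 9.0, 16.0, 25.0]
print(round(pearson(predicted, subjective), 4))   # 0.9811
print(round(spearman(predicted, subjective), 4))  # 1.0

# Ten random 80/20 splits over the 29 LIVE reference images.
splits = [split_references(range(29), seed=s) for s in range(10)]
print(len(splits[0][0]), len(splits[0][1]))       # 23 6
```

In a full run, LCC and SROCC would be computed on each split's test images and then averaged over the ten repetitions. The toy example also shows why both metrics are reported: SROCC rewards a correct ordering alone, while LCC also depends on how linear the relationship is.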

