IRN: Invertible Image Rescaling, Experimental Verification




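This paper models the downscaling and upscaling that images go through during transmission. High-resolution (HR) digital images are usually downscaled to low-resolution (LR) images to fit various displays or to save storage and bandwidth, and the receiving device then upscales them to restore the original resolution or to enlarge details in the image. This forms an invertible HR → LR → HR process, so in this setting the HR image is available from the start. In a real super-resolution (SR) task, by contrast, the only input is an LR image and the HR image is not available.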


  • The authors' SR test pipeline (HR as input):
import torch

# Forward pass (downscaling)
input = img_HR                                             # shape: (1, 3, 2H, 2W)
img_LR = net(input, rev=False)[:, :3, :, :]                # shape: (1, 12, H, W) --> (1, 3, H, W)

# Reverse pass (upscaling)
# The latent variable z is a randomly sampled tensor that pads the LR image
# up to the channel count the network expects.
input = torch.cat((img_LR, torch.randn(z_shape)), dim=1)   # shape: (1, 12, H, W); z_shape: (1, 9, H, W)
img_SR = net(input, rev=True)[:, :3, :, :]                 # shape: (1, 3, 2H, 2W); the slice keeps the 3 RGB channels

# net(rev=False), forward: HaarDownSampling + InvBlockExp(number=8)
#                                                      --> out_shape: (1, 12, H, W)
# net(rev=True), reverse: InvBlockExp(number=8) + HaarDownSampling, both inverted
#                                                      --> out_shape: (1, 3, 2H, 2W)
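
To make the shape bookkeeping above concrete, here is a minimal pure-Python sketch of a Haar-style 2×2 transform on a single channel. It is not the repository's HaarDownSampling layer: the names `haar_down` and `haar_up`, the list-of-rows image format, the sub-band order, and the division by 4 are all assumptions made for illustration. Applied to each of the 3 RGB channels it gives 3 × 4 = 12 channels at half the resolution, which is how (1, 3, 2H, 2W) becomes (1, 12, H, W).

```python
def haar_down(img):
    """Split one 2H x 2W channel (a list of rows) into four H x W sub-bands."""
    h, w = len(img) // 2, len(img[0]) // 2
    bands = [[[0.0] * w for _ in range(h)] for _ in range(4)]
    for i in range(h):
        for j in range(w):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            bands[0][i][j] = (a + b + c + d) / 4  # average of the 2x2 block
            bands[1][i][j] = (a - b + c - d) / 4  # left column minus right column
            bands[2][i][j] = (a + b - c - d) / 4  # top row minus bottom row
            bands[3][i][j] = (a - b - c + d) / 4  # diagonal difference
    return bands


def haar_up(bands):
    """Exact inverse of haar_down: rebuild the 2H x 2W channel."""
    ll, k1, k2, k3 = bands
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, x, y, z = ll[i][j], k1[i][j], k2[i][j], k3[i][j]
            img[2 * i][2 * j] = s + x + y + z
            img[2 * i][2 * j + 1] = s - x + y - z
            img[2 * i + 1][2 * j] = s + x - y - z
            img[2 * i + 1][2 * j + 1] = s - x - y + z
    return img


# Toy 4 x 4 channel with values 1..16 (hypothetical data).
img = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]
bands = haar_down(img)      # 4 sub-bands, each 2 x 2
restored = haar_up(bands)   # back to 4 x 4
print(len(bands), len(bands[0]), len(bands[0][0]))
print(restored == img)
```

The transform only rearranges information, so nothing is lost at this stage; any loss in the pipeline comes from discarding the 9 non-LR channels and replacing them with a sampled z.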
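  • My simulation of a real SR pipeline (LR input only):
import torch

# img_LR: the given LR image, shape: (1, 3, H, W)
# The latent variable z is a randomly sampled tensor that pads the LR image
# up to the channel count the network expects.
input = torch.cat((img_LR, torch.randn(z_shape)), dim=1)   # shape: (1, 12, H, W); z_shape: (1, 9, H, W)
img_SR = net(input, rev=True)[:, :3, :, :]                 # shape: (1, 3, 2H, 2W); the slice keeps the 3 RGB channels

# net(rev=False), forward: no forward pass
# net(rev=True), reverse: InvBlockExp(number=8) + HaarDownSampling, both inverted
#                                                      --> out_shape: (1, 3, 2H, 2W)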
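
The role of z in both pipelines can be illustrated with a toy coupling block. This is a sketch under stated assumptions, not the InvBlockExp from the IRN code: `F`, `G` and `S` are hypothetical fixed functions standing in for learned sub-networks, and the low- and high-frequency parts are short lists of equal length rather than 3- and 9-channel tensors. It shows that the reverse pass is exact only when it receives the same (y, z) pair the forward pass produced; with a freshly sampled z the reconstruction differs. An untrained toy like this cannot reproduce how small that difference becomes after training.

```python
import math
import random


# Hypothetical stand-ins for the learned sub-networks of one coupling block.
def F(v):
    return [0.5 * t for t in v]


def G(v):
    return [0.25 * t for t in v]


def S(v):  # log-scale term
    return [0.1 * t for t in v]


def forward(x_low, x_high):
    """y_low = x_low + F(x_high); z = x_high * exp(S(y_low)) + G(y_low)."""
    y_low = [a + b for a, b in zip(x_low, F(x_high))]
    z = [h * math.exp(s) + g for h, s, g in zip(x_high, S(y_low), G(y_low))]
    return y_low, z


def reverse(y_low, z):
    """Invert forward(): recover (x_low, x_high) from (y_low, z)."""
    x_high = [(t - g) / math.exp(s) for t, s, g in zip(z, S(y_low), G(y_low))]
    x_low = [a - b for a, b in zip(y_low, F(x_high))]
    return x_low, x_high


def max_err(rec, x_low, x_high):
    """Largest absolute reconstruction error over both parts."""
    return max(abs(a - b) for a, b in zip(rec[0] + rec[1], x_low + x_high))


x_low, x_high = [0.2, 0.4, 0.6], [0.05, -0.10, 0.15]   # toy data
y_low, z_true = forward(x_low, x_high)

rng = random.Random(0)
z_rand = [rng.gauss(0.0, 1.0) for _ in z_true]          # freshly sampled z

err_true = max_err(reverse(y_low, z_true), x_low, x_high)
err_rand = max_err(reverse(y_low, z_rand), x_low, x_high)
print(err_true, err_rand)
```

In the LR-only pipeline there is a second mismatch that this toy does not model: the LR image was not produced by the network's own forward pass.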


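| Setting  | Set5    | Set14   | B100    | Urban   | DIV2K (val) |
|----------|---------|---------|---------|---------|-------------|
| input:HR | 43.9994 | 40.7885 | 41.2929 | 39.9002 | 44.3248     |
| input:LR | 35.4581 | 31.0439 | 30.8835 | 28.4020 | 33.6007     |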
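
The table does not name its metric; the values look like PSNR in dB, and the sketch below rests on that assumption. The function `psnr`, the flat-list image format, and the peak value of 255 are illustrative choices, and the sketch does not address whether the original evaluation used RGB or only the luminance channel.

```python
import math


def psnr(ref, test, peak=255.0):
    """PSNR in dB between two images given as flat sequences of pixel values."""
    if len(ref) != len(test):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio(psnr_gap_db):
    """How many times larger the MSE is for a given PSNR drop in dB."""
    return 10.0 ** (psnr_gap_db / 10.0)


# Toy example: every pixel is off by exactly 1 gray level, so MSE = 1.
ref = [10.0, 20.0, 30.0, 40.0]
test = [11.0, 21.0, 31.0, 41.0]
value = psnr(ref, test)

# Set5 gap between the two rows of the table.
gap = 43.9994 - 35.4581
print(value, mse_ratio(gap))
```

Read this way, the Set5 gap of about 8.5 dB between the two rows corresponds to a mean squared error roughly 7 times larger when only an LR image is given.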


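| Setting  | input_shape  | params     | MAdd    | Flops   | Memory  |
|----------|--------------|------------|---------|---------|---------|
| input:HR | (3, 192, 192)| 1,668,000  | 30.68G  | 15.38G  | 465.36M |
| input:LR | (3, 96, 96)  | 1,668,000  | 7.67G   | 3.84G   | 121.11M |
| RCAN     | (3, 96, 96)  | 15,444,667 | 282.26G | 141.44G | 2.77G   |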


Flops (FLOPs): floating point operations, i.e. the amount of computation, used to measure the complexity of an algorithm or model.
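
As a rough illustration of why the cost in the table scales with input size while the parameter count stays fixed, the sketch below counts multiply-accumulate operations for a single convolution. The function `conv2d_macs` and the layer sizes (3 to 64 channels, 3×3 kernel, output the same size as the input) are hypothetical and not taken from IRN or RCAN.

```python
def conv2d_macs(c_in, c_out, k, h_out, w_out):
    """Multiply-accumulate count of one k x k convolution (bias ignored)."""
    return c_in * c_out * k * k * h_out * w_out


def conv2d_params(c_in, c_out, k):
    """Weight count of the same convolution (bias ignored)."""
    return c_in * c_out * k * k


# Hypothetical layer evaluated at the two input sizes from the table.
macs_192 = conv2d_macs(3, 64, 3, 192, 192)
macs_96 = conv2d_macs(3, 64, 3, 96, 96)

print(macs_96, macs_192, macs_192 // macs_96)
print(conv2d_params(3, 64, 3))  # does not depend on H or W
```

Doubling both spatial dimensions quadruples the operation count, which matches the 30.68G versus 7.67G MAdd entries above. The MAdd column is also about twice the Flops column, which suggests the profiler counts one multiply-add as two operations under MAdd and as one under Flops; that is a reading of the numbers, not something the source states.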


Image rescaling is a different task from super-resolution (see ‘Difference from SR’ in the paper). IRN downscales HR images and reconstruct them from the downscaled LR images, while the ultimate goal of super-resolution is to upscale arbitrary LR images. So in our test code, we only need HR images to verify the performance.

If we just use the architecture of IRN for paired training of bicubic-downscaled LR images and HR images (latent variable z as padding 0), which is the setting of many sr methods, the performance is not as good as them. Reasons include that our invertible architecture is not mainly designed for prior learning, and the parameters are fewer. The improvement of IRN comes from our invertible modeling for downscaling and upscaling.

For details, see GitHub issue #4.


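  • Image rescaling and super-resolution are different tasks. IRN downscales an HR image to obtain an LR image and then reconstructs the HR image from that LR image, whereas the goal of SR is to upscale arbitrary LR images. (This is the blogger's understanding.)

  • If the IRN architecture alone is trained on pairs of bicubic-downscaled LR images and HR images (with the latent variable z used as zero padding), which is the setting of many SR methods, it performs worse than those methods. The reasons include that IRN is not primarily designed for prior learning and that it has few parameters. IRN's gains come from modeling downscaling and upscaling as an invertible process.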

