Paper Skim-Reading Notes: Style Transfer Collection

Style Transfer

A Neural Algorithm of Artistic Style

arXiv’15 (preprint)

Problem

  • People can create unique visual experiences by combining different styles and subjects; the authors want to achieve this with an algorithm

Method

  • Problem: given a content image $C$ and a style image $S$, the goal is to generate a new image $G$
  • Method: define a cost function $J(G)$ that measures how good a generated image is, then minimize $J(G)$ with gradient descent to produce the image
  • The cost function is defined in two parts:
    • Content cost: $J_{content}(C,G)=\frac{1}{2}\|a^{[l][C]}-a^{[l][G]}\|^2$
    • Style cost: $J_{style}(S,G)=\frac{1}{(2n_H^{[l]}n_W^{[l]}n_C^{[l]})^2}\sum_k\sum_{k'}\left(G_{kk'}^{[l][S]}-G_{kk'}^{[l][G]}\right)^2$
      • where $G^{[l]}$ is the Gram matrix of the layer-$l$ activations
    • Finally, two hyperparameters weight the content and style costs in the total cost:
    • $J(G)=\alpha J_{content}(C,G)+\beta J_{style}(S,G)$ (a code sketch of these costs follows this list)
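A minimal PyTorch sketch of the three costs at a single layer $l$; the function names, tensor shapes, and the $\alpha,\beta$ values are illustrative assumptions, not from the paper:

```python
import torch

def gram_matrix(a):
    # a: layer-l activations with shape (n_C, n_H, n_W)
    n_C, n_H, n_W = a.shape
    f = a.reshape(n_C, n_H * n_W)   # unroll the spatial dimensions
    return f @ f.T                  # (n_C, n_C) channel-correlation matrix

def content_cost(a_C, a_G):
    # J_content: half the squared distance between layer-l activations
    return 0.5 * torch.sum((a_C - a_G) ** 2)

def style_cost(a_S, a_G):
    # J_style at one layer: normalized squared distance between Gram matrices
    n_C, n_H, n_W = a_S.shape
    G_S, G_G = gram_matrix(a_S), gram_matrix(a_G)
    return torch.sum((G_S - G_G) ** 2) / (2.0 * n_H * n_W * n_C) ** 2

def total_cost(a_C, a_S, a_G, alpha=10.0, beta=40.0):
    # alpha and beta trade off content vs. style (values here are arbitrary)
    return alpha * content_cost(a_C, a_G) + beta * style_cost(a_S, a_G)
```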

Takeaways

  • Taking inner products between pairs of feature maps to form a Gram matrix captures the texture characteristics of an image

Reference

https://arxiv.org/abs/1508.06576


Image Style Transfer Using Convolutional Neural Networks

CVPR’16

Problem

  • The content features previously extracted from the content image are low-level and carry little semantic information (the lack of image representations)
  • The authors want to extract the style features of an image together with its high-level features for rendering

Method

  • The authors find that higher-layer features of the network generally capture information such as the objects and layout of the input image, while lower-layer features mostly express pixel-level information
  • Therefore, higher-layer features are used when extracting the content representation (see the sketch after this list)
  • The algorithm follows the same idea as A Neural Algorithm of Artistic Style
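A minimal sketch of collecting content and style features from different depths of a pretrained VGG-19; the torchvision layer indices below are an assumed mapping to conv4_2 (content) and conv1_1–conv5_1 (style), the layers used in the paper:

```python
import torch
from torchvision.models import vgg19

# Assumed index mapping in torchvision's vgg19().features:
#   21 -> conv4_2 (content layer in the paper)
#   0, 5, 10, 19, 28 -> conv1_1 ... conv5_1 (style layers in the paper)
CONTENT_LAYERS = {21}
STYLE_LAYERS = {0, 5, 10, 19, 28}

def extract_features(image, features):
    """Collect activations at the chosen layers during one forward pass."""
    content, style = {}, {}
    x = image
    for i, layer in enumerate(features):
        x = layer(x)
        if i in CONTENT_LAYERS:
            content[i] = x
        if i in STYLE_LAYERS:
            style[i] = x
    return content, style

vgg = vgg19(weights="IMAGENET1K_V1").features.eval()
img = torch.rand(1, 3, 224, 224)  # stand-in for a preprocessed input image
content_feats, style_feats = extract_features(img, vgg)
```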

Takeaways

  • The responses at different layers of a network describe different levels of information in an image

Reference

https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Gatys_Image_Style_Transfer_CVPR_2016_paper.pdf


Perceptual Losses for Real-Time Style Transfer and Super-Resolution

ECCV’16

Problem

  • One common approach to image transformation problems:
    • train a feed-forward neural network in a supervised manner
    • use a per-pixel loss function to measure the gap between the output image and the ground-truth image
    • the advantage is that at test time only a single forward pass through the trained network is needed
    • however, the per-pixel losses used by these methods do not capture perceptual differences between output and ground-truth images
  • Perceptual losses address this shortcoming:
    • perceptual losses measure image similarities more robustly than per-pixel losses
    • but the optimization-based approach is slow: Gatys's method re-initializes and re-optimizes for every image to be generated, so it requires a long iterative optimization process
  • The core of this paper combines the advantages of both approaches: at test time, the generator network performs the transformation in real time

Method

  • Train a feed-forward network for the image transformation task, extracting high-level features from a pretrained loss network
    • the network also supports inputs of various sizes
  • Instead of constructing the loss from per-pixel differences, use perceptual loss functions (a sketch follows this list)
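A minimal sketch of the feature-reconstruction (perceptual) loss computed on a fixed pretrained VGG-16, which the paper uses as its loss network; the layer index and helper names are assumptions for illustration:

```python
import torch
from torchvision.models import vgg16

# Fixed pretrained loss network; its weights are never updated.
loss_net = vgg16(weights="IMAGENET1K_V1").features.eval()
for p in loss_net.parameters():
    p.requires_grad_(False)

FEAT_LAYER = 15  # assumed index of relu3_3 in torchvision's vgg16().features

def vgg_features(x, upto=FEAT_LAYER):
    # run x through the loss network and return the chosen activation
    for i, layer in enumerate(loss_net):
        x = layer(x)
        if i == upto:
            return x

def perceptual_loss(y_hat, y_target):
    """Feature-reconstruction loss: mean squared distance between the
    loss network's activations for the output and the target image."""
    return torch.mean((vgg_features(y_hat) - vgg_features(y_target)) ** 2)

# Sketch of a training step, where transform_net is the feed-forward generator:
#   y_hat = transform_net(x)
#   loss = perceptual_loss(y_hat, y_content) + style and other terms ...
```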

Takeaways

  • A feed-forward network can be trained once and then stylize any input with a single forward pass

Reference

https://arxiv.org/abs/1603.08155


Stereoscopic Neural Style Transfer

CVPR’18

Problem

  • This paper presents the first attempt at stereoscopic neural style transfer, which responds to the emerging demand for 3D movies or AR/VR.
  • We start with a careful examination of applying existing monocular style transfer methods to left and right views of stereoscopic images separately. This reveals that the original disparity consistency cannot be well preserved in the final stylization results, which causes 3D fatigue to the viewers.

Method

  • To address this issue, we incorporate a new disparity loss into the widely adopted style loss function by enforcing a bidirectional disparity constraint in non-occluded regions (a sketch follows this list)
  • For a practical real-time solution, we propose the first feed-forward network trained by jointly optimizing a stylization sub-network and a disparity sub-network, integrated in a feature-level middle domain. Our disparity sub-network is also the first end-to-end network for simultaneous bidirectional disparity and occlusion mask estimation. Finally, our network is effectively extended to stereoscopic videos by considering both temporal coherence and disparity consistency
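A minimal sketch of a bidirectional disparity-consistency penalty under simple assumptions (horizontal warping via grid_sample, precomputed disparities and non-occlusion masks); all names here are illustrative, not the paper's actual implementation:

```python
import torch
import torch.nn.functional as F

def warp_horizontal(img, disp):
    """Warp `img` (N, C, H, W) along x by per-pixel disparity `disp` (N, 1, H, W)."""
    n, _, h, w = img.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=img.device),
        torch.linspace(-1, 1, w, device=img.device), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, -1, -1, -1).clone()
    # shift the x sampling coordinates by the disparity, rescaled to [-1, 1] units
    grid[..., 0] += 2.0 * disp.squeeze(1) / max(w - 1, 1)
    return F.grid_sample(img, grid, align_corners=True)

def disparity_consistency_loss(left_sty, right_sty,
                               disp_l, disp_r, mask_l, mask_r):
    """Penalize stylization mismatch between the two views, in both
    directions, restricted to non-occluded regions (masks are 0/1)."""
    right_in_left = warp_horizontal(right_sty, disp_l)
    left_in_right = warp_horizontal(left_sty, disp_r)
    loss_l = torch.sum(mask_l * (left_sty - right_in_left) ** 2)
    loss_r = torch.sum(mask_r * (right_sty - left_in_right) ** 2)
    return (loss_l + loss_r) / left_sty.numel()
```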

Takeaways

  • 2D style transfer techniques can be extended to 3D (stereoscopic) settings

Reference

https://arxiv.org/abs/1802.10591
