The goal is to learn an unconditional generative model that captures the internal statistics of a single training image $x$.
Unlike texture generation, the paper targets general natural images.
For the pyramid $\{x_0,\cdots,x_N\}$ of the input image $x$, there is a corresponding set of generators $\{G_0,\cdots,G_N\}$, where $x_n$ is $x$ downsampled by a factor of $r^n$ and $r>1$ is a hyperparameter. Each $G_n$ is paired with a discriminator $D_n$.
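A minimal sketch of how the scale sizes of such a pyramid could be computed. The function name `pyramid_sizes`, the rounding rule, and the 1-pixel floor are assumptions made for illustration, not details taken from the paper.

```python
def pyramid_sizes(height, width, r, num_scales):
    """Return [(h_0, w_0), ..., (h_N, w_N)], where scale n is the input shrunk by r**n.

    num_scales plays the role of N, so the list has N + 1 entries.
    """
    if r <= 1:
        raise ValueError("r must be greater than 1")
    sizes = []
    for n in range(num_scales + 1):
        factor = r ** n
        # Rounding and the minimum size of 1 pixel are assumptions of this sketch.
        sizes.append((max(1, round(height / factor)), max(1, round(width / factor))))
    return sizes


# Example: a 256x256 image, r = 2, N = 3 (scale 0 is the finest, scale N the coarsest).
sizes = pyramid_sizes(256, 256, r=2.0, num_scales=3)
```

Here `sizes[0]` is the full resolution $x_0$ and `sizes[-1]` is the coarsest scale $x_N$.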
Training starts at the coarsest scale $x_N$, where $G_N$ maps white Gaussian noise $z_N$ to an image $\tilde{x}_N$:
$$\tilde{x}_N=G_N(z_N) \qquad(1)$$
$\tilde{x}_N$ captures the general layout of the image and the global structure of its objects; the subsequent generators $G_n$ ($n<N$) progressively add finer details.
As shown in Figure 5, $G_n$ takes two inputs: (1) white Gaussian noise $z_n$, and (2) an upsampled version $\left(\tilde{x}_{n+1}\right)\uparrow^r$ of the image generated at the previous, coarser scale:
$$\tilde{x}_n=G_n\left(z_n,\left(\tilde{x}_{n+1}\right)\uparrow^r\right),\quad n<N \qquad(2)$$
More specifically, $G_n$ performs the following residual operation:
$$\tilde{x}_n=\left(\tilde{x}_{n+1}\right)\uparrow^r+\psi_n\left(z_n+\left(\tilde{x}_{n+1}\right)\uparrow^r\right) \qquad(3)$$
where $\psi_n$ is a ConvNet with 5 blocks, each of the form Conv(3x3)-BatchNorm-LeakyReLU.
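The coarse-to-fine sampling of Eqs. (1) to (3) can be sketched in plain Python as follows. Everything here is illustrative: `psi` is a fixed stand-in for the learned ConvNet $\psi_n$, the upsampling is nearest-neighbour, images are 2D lists, and treating $G_N(z_N)$ as `psi(z_N)` is an assumption of this sketch.

```python
import random


def upsample(img, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to (out_h, out_w)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
            for i in range(out_h)]


def psi(img):
    """Stand-in for the ConvNet psi_n: a fixed, non-learned pointwise response."""
    return [[0.1 * v for v in row] for row in img]


def generate_scale(z, prev_up=None):
    """Eq. (1) when prev_up is None, otherwise the residual form of Eq. (3)."""
    if prev_up is None:
        return psi(z)  # coarsest scale: noise is the only input (assumed form)
    summed = [[a + b for a, b in zip(zr, pr)] for zr, pr in zip(z, prev_up)]
    residual = psi(summed)
    # Output = upsampled coarser image + residual detail predicted by psi.
    return [[p + d for p, d in zip(pr, dr)] for pr, dr in zip(prev_up, residual)]


def sample(sizes, rng):
    """sizes is ordered fine-to-coarse: sizes[0] is scale 0, sizes[-1] is scale N."""
    x = None
    for h, w in reversed(sizes):  # start at the coarsest scale N
        z = [[rng.gauss(0.0, 1.0) for _ in range(w)] for _ in range(h)]
        prev_up = None if x is None else upsample(x, h, w)
        x = generate_scale(z, prev_up)
    return x


image = sample([(8, 8), (4, 4), (2, 2)], random.Random(0))
```

With zero noise, the residual branch only sees the upsampled image, so each finer scale refines the coarser result rather than generating it from scratch.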
Training proceeds from the coarsest scale to the finest; once a GAN has been trained, it is kept fixed.
For the $n$-th GAN, the loss consists of an adversarial term and a reconstruction term:
$$\underset{G_n}{\min}\ \underset{D_n}{\max}\ \mathcal{L}_{adv}(G_n,D_n)+\alpha\mathcal{L}_{rec}(G_n) \qquad(4)$$
Adversarial loss
The WGAN-GP loss is used.
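For reference, the standard WGAN-GP objective can be written in the max-over-$D_n$ form used by Eq. (4) roughly as below. The symbols $\lambda$ (gradient penalty weight) and $\hat{x}$ (a random interpolate between a real and a generated sample) are notation assumed here, not taken from the notes above.

```latex
\mathcal{L}_{adv}(G_n, D_n) =
  \mathbb{E}\left[ D_n(x_n) \right]
  - \mathbb{E}\left[ D_n(\tilde{x}_n) \right]
  - \lambda\, \mathbb{E}_{\hat{x}}\left[ \left( \left\| \nabla_{\hat{x}} D_n(\hat{x}) \right\|_2 - 1 \right)^2 \right]
```

The discriminator maximizes this quantity while the generator minimizes it; the last term penalizes critic gradients whose norm deviates from 1.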
Reconstruction loss
The model must guarantee that there exists a specific set of noise maps that reconstructs the original image $x$.
To this end, a set $\left\{z_N^{rec},z_{N-1}^{rec},\cdots,z_0^{rec}\right\}=\left\{z^*,0,\cdots,0\right\}$ is chosen in advance, where $z^*$ is a fixed noise map; feeding it through the generators yields $\left\{\tilde{x}_N^{rec},\tilde{x}_{N-1}^{rec},\cdots,\tilde{x}_0^{rec}\right\}$.
Then, for $n<N$,
$$\mathcal{L}_{rec}=\left\|G_n\left(0,\left(\tilde{x}_{n+1}^{rec}\right)\uparrow^r\right)-x_n\right\|^2 \qquad(5)$$
and for $n=N$, $\mathcal{L}_{rec}=\left\|G_N(z^*)-x_N\right\|^2$.
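The reconstruction chain can be sketched as follows, assuming each generator is a callable `G(z, prev_up)` and images are 2D lists. The helper names, the nearest-neighbour upsampling, and the toy generator used in the example are all hypothetical. In actual training each scale is optimized in turn with the coarser generators fixed; this sketch only evaluates the loss at every scale in one pass.

```python
def squared_l2(a, b):
    """Squared L2 distance between two equally sized 2D lists."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def constant(h, w, value=0.0):
    """An h-by-w 2D list filled with a constant value."""
    return [[value] * w for _ in range(h)]


def upsample(img, out_h, out_w):
    """Nearest-neighbour resize of a 2D list (an assumption of this sketch)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
            for i in range(out_h)]


def reconstruction_losses(generators, pyramid, z_star):
    """generators[n] and pyramid[n] are indexed by scale n (0 = finest, N = coarsest)."""
    N = len(pyramid) - 1
    losses = [None] * (N + 1)
    x_rec = generators[N](z_star, None)          # coarsest scale: G_N(z*)
    losses[N] = squared_l2(x_rec, pyramid[N])
    for n in range(N - 1, -1, -1):
        h, w = len(pyramid[n]), len(pyramid[n][0])
        prev_up = upsample(x_rec, h, w)
        x_rec = generators[n](constant(h, w), prev_up)  # zero noise for n < N, Eq. (5)
        losses[n] = squared_l2(x_rec, pyramid[n])
    return losses


def toy_generator(z, prev_up):
    """Hypothetical generator: returns the noise at scale N, else passes prev_up through."""
    return z if prev_up is None else prev_up


pyramid = [constant(4, 4, 2.0), constant(2, 2, 1.0)]   # x_0 (finest), x_1 (coarsest)
losses = reconstruction_losses([toy_generator, toy_generator], pyramid, constant(2, 2, 1.0))
```

In this toy setup the coarsest scale is reconstructed exactly, while the finest scale is off by 1 at each of its 16 pixels.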