Follow me on CSDN: https://spike.blog.csdn.net/
Article URL: https://blog.csdn.net/caroline_wendy/article/details/129939225
Paper: Glow: Generative Flow with Invertible 1×1 Convolutions
Abstract:
Flow-based generative models (Dinh et al., 2014) are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis.
In this paper we propose Glow, a simple type of generative flow using an invertible 1 × 1 convolution.
Using our method we demonstrate a significant improvement in log-likelihood on standard benchmarks.
Perhaps most strikingly, we demonstrate that a generative model optimized towards the plain log-likelihood objective is capable of efficient realistic-looking synthesis and manipulation of large images.
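The main role of the 1×1 convolution is to mix information across channels. Flow-based generative models are generative models built from invertible transformations. PDF stands for Probability Density Function; CDF stands for Cumulative Distribution Function.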
Input x → (transform, log-det1) → z → (inverse transform, log-det2) → x. Then log-det1 + log-det2 = 0, i.e. the two log-determinants are negatives of each other.
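The cancellation can be checked on a toy one-dimensional flow. This is a minimal sketch, not code from the Glow paper: `forward`, `inverse`, `scale`, and `shift` are hypothetical names chosen for illustration, and the base distribution is assumed to be a standard normal.

```python
import math

def forward(x, scale=2.0, shift=1.0):
    """x -> z with z = scale * x + shift; also returns log|dz/dx|."""
    z = scale * x + shift
    return z, math.log(abs(scale))

def inverse(z, scale=2.0, shift=1.0):
    """z -> x, the exact inverse of forward; also returns log|dx/dz|."""
    x = (z - shift) / scale
    return x, -math.log(abs(scale))

def standard_normal_logpdf(z):
    """Log-density of the assumed base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))

x = 0.7
z, logdet_fwd = forward(x)
x_rec, logdet_inv = inverse(z)

# Change of variables: log p(x) = log p_z(z) + log|dz/dx|
log_px = standard_normal_logpdf(z) + logdet_fwd
```

Here `logdet_fwd + logdet_inv` is zero up to floating-point error, and `x_rec` recovers `x`.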
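A permutation matrix has determinant ±1, so its log-absolute-determinant is 0. A triangular matrix (upper or lower) has a determinant equal to the product of its diagonal entries, so a triangular matrix with a unit diagonal also has log-determinant 0. In general, the log of the product becomes a sum of logs of the diagonal entries. For the Glow parameterization $W = PL(U + \mathrm{diag}(s))$ this gives:
$$\log|\det(W)| = \sum \log|s|$$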
Multi-scale architecture. Each flow step consists of: Actnorm, an invertible 1×1 convolution, and an affine coupling layer.
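For each layer: the function, its inverse, and the log-determinant (the layer's contribution to the log-likelihood). NN() is a nonlinear transformation (a neural network), and $\odot$ denotes the element-wise operation, i.e. the Hadamard product.
$[x_1, x_2, x_3] \odot [w_1, w_2, w_3] = [y_1, y_2, y_3]$. At initialization, y follows a standardized distribution (zero mean and unit variance per channel).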
Jacobian matrix:
$$\begin{bmatrix} w_{1} & 0 & 0\\ 0 & w_{2} & 0\\ 0 & 0 & w_{3} \end{bmatrix}$$
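The element-wise scaling and its diagonal Jacobian can be sketched in plain Python. This is an illustrative sketch under assumptions, not the reference implementation: the function names are hypothetical, each sample is a flat vector of three channel values (no spatial dimensions), and a per-channel bias `b` is included as in Glow's actnorm even though the example above shows only the scale.

```python
import math
from statistics import mean, pstdev

def actnorm_init(batch, eps=1e-6):
    """Data-dependent init: pick per-channel w, b so y has zero mean, unit variance."""
    channels = list(zip(*batch))                 # transpose to per-channel columns
    w = [1.0 / (pstdev(c) + eps) for c in channels]
    b = [-mean(c) * wi for c, wi in zip(channels, w)]
    return w, b

def actnorm_forward(x, w, b):
    """y = x ⊙ w + b; the Jacobian is diag(w), so log|det| = sum(log|w|)."""
    y = [xi * wi + bi for xi, wi, bi in zip(x, w, b)]
    logdet = sum(math.log(abs(wi)) for wi in w)
    return y, logdet

def actnorm_inverse(y, w, b):
    """x = (y - b) / w, element-wise."""
    return [(yi - bi) / wi for yi, wi, bi in zip(y, w, b)]

# Hypothetical mini-batch: 4 samples, 3 channels
batch = [[1.0, 10.0, -3.0], [2.0, 14.0, -1.0], [3.0, 12.0, -5.0], [6.0, 20.0, 1.0]]
w, b = actnorm_init(batch)
ys = [actnorm_forward(x, w, b)[0] for x in batch]
logdet = actnorm_forward(batch[0], w, b)[1]
x_rec = actnorm_inverse(ys[0], w, b)
```

For an h × w feature map the same scale is applied at every spatial position, so the log-determinant is multiplied by h · w.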
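A 1×1 convolution multiplies the channel vector at each spatial position by the same C×C weight matrix, so it is a matrix operation. With LU decomposition (L is a lower triangular matrix, U is an upper triangular matrix), the cost of computing the log-determinant drops from $O(C^{3})$ to $O(C)$.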
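The saving can be verified numerically on a small example. The sketch below assumes the parameterization W = P L (U + diag(s)) from the Glow paper, with C = 3 and made-up values for P, L, U, and s. The helpers `matmul` and `det` are hypothetical plain-Python utilities written only for this comparison.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def det(m):
    """Determinant by Gaussian elimination with partial pivoting, O(C^3)."""
    m = [row[:] for row in m]
    n = len(m)
    d = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            d = -d                               # a row swap flips the sign
        d *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
    return d

# Made-up parameters for C = 3 channels
P = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]    # fixed permutation
L = [[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [-0.3, 0.2, 1.0]]   # unit lower triangular
U = [[0.0, 0.4, -0.1], [0.0, 0.0, 0.7], [0.0, 0.0, 0.0]]   # strictly upper triangular
s = [2.0, -0.5, 1.5]                                       # diagonal entries

U_full = [[U[i][j] + (s[i] if i == j else 0.0) for j in range(3)] for i in range(3)]
W = matmul(P, matmul(L, U_full))

logdet_cheap = sum(math.log(abs(si)) for si in s)   # O(C): only s is needed
logdet_full = math.log(abs(det(W)))                 # O(C^3): full determinant
```

Both values agree, since P and L each contribute a log-absolute-determinant of 0.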
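Affine coupling layer: the Jacobian is a block-triangular matrix, so its determinant depends only on the top-left and bottom-right blocks. A simplified diagram is shown below: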
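A coupling layer can be sketched as follows. This follows the Real NVP convention (the first half passes through unchanged and conditions the second half), which is an assumption about the layout. The function `nn` is a hypothetical stand-in for the neural network, hard-coded for 4-dimensional inputs, and the scale uses `exp` for simplicity.

```python
import math

def nn(x_a):
    """Hypothetical stand-in for NN(): maps x_a to (log_s, t). It is never inverted."""
    log_s = [math.tanh(0.5 * x_a[0] - 0.2 * x_a[1]),
             math.tanh(0.1 * x_a[0] + 0.3 * x_a[1])]
    t = [x_a[0] * x_a[1], x_a[0] - x_a[1]]
    return log_s, t

def coupling_forward(x):
    """y_a = x_a; y_b = exp(log_s) ⊙ x_b + t, with (log_s, t) = NN(x_a)."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, t = nn(x_a)
    y_b = [math.exp(ls) * xb + ti for ls, xb, ti in zip(log_s, x_b, t)]
    # Jacobian blocks: identity (top-left) and diag(exp(log_s)) (bottom-right)
    return x_a + y_b, sum(log_s)

def coupling_inverse(y):
    """x_a = y_a; x_b = (y_b - t) ⊙ exp(-log_s), recomputing NN on y_a."""
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    log_s, t = nn(y_a)
    x_b = [(yb - ti) * math.exp(-ls) for ls, yb, ti in zip(log_s, y_b, t)]
    return y_a + x_b, -sum(log_s)

x = [0.3, -1.2, 0.8, 2.0]
y, logdet_fwd = coupling_forward(x)
x_rec, logdet_inv = coupling_inverse(y)
```

Because the unchanged half is available in both directions, the inverse only needs to re-evaluate `nn`, never to invert it, and the log-determinant is just the sum of `log_s`.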
About Real NVP, i.e. real-valued Non-Volume Preserving transformations.
Abstract:
Unsupervised learning of probabilistic models is a central yet challenging problem in machine learning. Specifically, designing models with tractable learning, sampling, inference and evaluation is crucial in solving this task. We extend the space of such models using real-valued non-volume preserving (real NVP) transformations, a set of powerful, stably invertible, and learnable transformations, resulting in an unsupervised learning algorithm with exact log-likelihood computation, exact and efficient sampling, exact and efficient inference of latent variables, and an interpretable latent space. We demonstrate its ability to model natural images on four datasets through sampling, log-likelihood evaluation, and latent variable manipulations.
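In the multi-scale architecture, the latent variables $z_{i}$ (written $z^{(i)}$ in the Real NVP paper) are computed recursively, i.e.:
$$h^{(0)} = x, \qquad (z^{(i+1)}, h^{(i+1)}) = f^{(i+1)}(h^{(i)}), \qquad z^{(L)} = f^{(L)}(h^{(L-1)}), \qquad z = (z^{(1)}, \dots, z^{(L)})$$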
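To make the bookkeeping concrete, the sketch below tracks only tensor shapes through a multi-scale flow. It assumes the common design in which each level applies a 2×2 space-to-depth squeeze, runs shape-preserving flow steps, and then factors out half of the channels as a latent, with the last level keeping everything. The function names and the 3×32×32 input are illustrative choices.

```python
def squeeze(shape):
    """Space-to-depth: (C, H, W) -> (4C, H/2, W/2)."""
    c, h, w = shape
    assert h % 2 == 0 and w % 2 == 0, "spatial dims must be even"
    return (4 * c, h // 2, w // 2)

def multiscale_shapes(input_shape, num_levels):
    """Return the shapes of the latents z_1..z_L emitted by each level."""
    h = input_shape
    z_shapes = []
    for level in range(num_levels):
        h = squeeze(h)
        # The flow steps of this level would run here; they preserve the shape.
        if level < num_levels - 1:
            c, hh, ww = h
            z_shapes.append((c // 2, hh, ww))    # half the channels become z_i
            h = (c - c // 2, hh, ww)             # the rest continues to the next level
        else:
            z_shapes.append(h)                   # last level: everything becomes z_L
    return z_shapes

def numel(shape):
    """Number of elements in a (C, H, W) tensor."""
    c, h, w = shape
    return c * h * w

z_shapes = multiscale_shapes((3, 32, 32), num_levels=3)
total = sum(numel(s) for s in z_shapes)
```

The latents together have exactly as many elements as the input, which is what keeps the overall mapping invertible.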
Flows can also be extended to conditional generation.