1. Map the pixel values of the image $x_0$ to the range $[-1, 1]$:

$$\frac{x}{255} \times 2 - 1, \quad \text{where } x \text{ is a pixel value of the image}$$
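The mapping in step 1 can be sketched in a few lines. This is a minimal illustration that treats an image as a flat list of 8-bit pixel values; the function names `to_model_range` and `to_pixel_range` are made up for this example, and real code would more likely operate on arrays.

```python
def to_model_range(pixels):
    """Map 8-bit pixel values in [0, 255] to floats in [-1, 1]."""
    return [p / 255 * 2 - 1 for p in pixels]


def to_pixel_range(values):
    """Inverse mapping: floats in [-1, 1] back to integers in [0, 255]."""
    return [round((v + 1) / 2 * 255) for v in values]


x0 = to_model_range([0, 51, 255])
print(x0)                  # black maps to -1, white maps to 1
print(to_pixel_range(x0))  # the round trip recovers the original pixels
```

The inverse function is only needed at the very end of sampling, when a generated image is converted back to displayable pixel values.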
2. Generate a noise image of the same size whose pixel values follow the standard normal distribution:

$$\epsilon \sim N(0, 1), \quad \text{where } \epsilon \text{ is a pixel value of the noise image}$$
3. Blend the preprocessed original image with the noise image.

Each pixel of the new image is computed as $\sqrt{\beta} \times \epsilon + \sqrt{1-\beta} \times x$, where $\beta$ lies between 0 and 1.

The pixel values at times $t = 1, 2, 3, \ldots$ are therefore

$$
\begin{aligned}
x_1 &= \sqrt{\beta_1} \times \epsilon_1 + \sqrt{1-\beta_1} \times x_0 \\
x_2 &= \sqrt{\beta_2} \times \epsilon_2 + \sqrt{1-\beta_2} \times x_1 \\
x_3 &= \sqrt{\beta_3} \times \epsilon_3 + \sqrt{1-\beta_3} \times x_2 \\
&\;\;\vdots \\
x_t &= \sqrt{\beta_t} \times \epsilon_t + \sqrt{1-\beta_t} \times x_{t-1}
\end{aligned} \tag{1}
$$
Notes:
1) $\epsilon_t$ is a fresh random sample drawn at every time step $t$.
2) $\beta_t$ differs from step to step, with $0 < \beta_t < 1$ and $\beta_1 < \beta_2 < \cdots < \beta_{T-1} < \beta_T$.
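The step-by-step noising in equation (1) can be sketched as follows. The text only requires an increasing schedule with $0 < \beta_t < 1$; the linear schedule, its endpoints (`1e-4` and `0.02`), the number of steps, and the helper names `linear_betas` and `forward_step` are assumptions made for this illustration.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Increasing schedule beta_1 < ... < beta_T (endpoints are assumed values)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_step(x_prev, beta_t, rng):
    """Apply equation (1) once, pixel by pixel, drawing fresh noise each time."""
    return [
        math.sqrt(beta_t) * rng.gauss(0.0, 1.0) + math.sqrt(1.0 - beta_t) * x
        for x in x_prev
    ]


rng = random.Random(0)
betas = linear_betas(1000)
x = [-1.0, -0.6, 0.2, 1.0]  # a tiny "image" already mapped to [-1, 1]
for beta_t in betas:
    x = forward_step(x, beta_t, rng)
print(x)  # after many steps the values resemble draws from N(0, 1)
```

Each call draws new noise, matching note 1, and the schedule is strictly increasing, matching note 2.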
To simplify the derivation, we introduce a new variable $\alpha_t = 1 - \beta_t$.
Substituting $\beta_t = 1 - \alpha_t$ into equation (1) gives:

$$x_t = \sqrt{1-\alpha_t} \times \epsilon_t + \sqrt{\alpha_t} \times x_{t-1} \tag{2}$$
Goal: express the image $x_t$ at time $t$ directly in terms of the original image $x_0$ at time $t = 0$.
We derive this step by step, starting by expressing $x_t$ in terms of $x_{t-2}$.
From equation (2) we have:

$$x_t = \sqrt{1-\alpha_t} \times \epsilon_t + \sqrt{\alpha_t} \times x_{t-1} \tag{2}$$

$$x_{t-1} = \sqrt{1-\alpha_{t-1}} \times \epsilon_{t-1} + \sqrt{\alpha_{t-1}} \times x_{t-2} \tag{3}$$

Substituting equation (3) into equation (2):

$$
\begin{aligned}
x_t &= \sqrt{1-\alpha_t} \times \epsilon_t + \sqrt{\alpha_t} \times \left(\sqrt{1-\alpha_{t-1}} \times \epsilon_{t-1} + \sqrt{\alpha_{t-1}} \times x_{t-2}\right) \\
&= \sqrt{\alpha_t(1-\alpha_{t-1})} \times \epsilon_{t-1} + \sqrt{1-\alpha_t} \times \epsilon_t + \sqrt{\alpha_t\alpha_{t-1}} \times x_{t-2}
\end{aligned} \tag{4}
$$

(Next, we simplify this expression further.)
Here $\epsilon_t$ and $\epsilon_{t-1}$ are independent random variables with $\epsilon_t \sim N(0, 1)$ and $\epsilon_{t-1} \sim N(0, 1)$.
We will use two properties of the normal distribution:
(1) If $X \sim N(\mu_1, \sigma_1^2)$, then $aX + b \sim N(a\mu_1 + b,\; a^2\sigma_1^2)$.
(2) The sum of two independent normally distributed random variables is again normally distributed.
That is, if $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ are independent, then $X + Y \sim N(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2)$.
For example, suppose a student's exam score is normally distributed with mean 75 and variance 10. If the student sits two exams whose scores are independent, the total score is also normally distributed, with mean 150 and variance 20.
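Both properties can be checked empirically with a quick simulation. The sketch below reuses the exam example (mean 75, variance 10); the sample size, the seed, and the constants $a = 2$, $b = 5$ are arbitrary choices for illustration, and the helper `mean_and_variance` is defined only for this example.

```python
import random


def mean_and_variance(samples):
    """Return the sample mean and the (population) sample variance."""
    mean = sum(samples) / len(samples)
    return mean, sum((s - mean) ** 2 for s in samples) / len(samples)


rng = random.Random(0)
n = 100_000
std = 10.0 ** 0.5  # variance 10, as in the exam example

x = [rng.gauss(75.0, std) for _ in range(n)]
y = [rng.gauss(75.0, std) for _ in range(n)]  # drawn independently of x

# Property (1) with a = 2, b = 5: expect mean 2*75 + 5 = 155, variance 4*10 = 40.
scaled_mean, scaled_var = mean_and_variance([2.0 * v + 5.0 for v in x])

# Property (2): expect mean 75 + 75 = 150, variance 10 + 10 = 20.
total_mean, total_var = mean_and_variance([u + v for u, v in zip(x, y)])

print(scaled_mean, scaled_var)  # close to 155 and 40
print(total_mean, total_var)    # close to 150 and 20
```

The estimates are random, so they match the theoretical values only approximately. Note that property (2) relies on independence: adding a score to itself would give variance 40, not 20.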
By property (1):

$$
\begin{aligned}
\sqrt{\alpha_t(1-\alpha_{t-1})} \times \epsilon_{t-1} &\sim N(0,\; \alpha_t - \alpha_t\alpha_{t-1}) \\
\sqrt{1-\alpha_t} \times \epsilon_t &\sim N(0,\; 1-\alpha_t)
\end{aligned}
$$

By property (2):

$$\sqrt{\alpha_t(1-\alpha_{t-1})} \times \epsilon_{t-1} + \sqrt{1-\alpha_t} \times \epsilon_t \sim N(0,\; 1 - \alpha_t\alpha_{t-1}) \tag{6}$$

Using (6), equation (4) simplifies as follows, where $\epsilon \sim N(0, 1)$ is a single noise term that replaces the two original ones (this technique is called reparameterization):

$$x_t = \sqrt{1 - \alpha_t\alpha_{t-1}} \times \epsilon + \sqrt{\alpha_t\alpha_{t-1}} \times x_{t-2} \tag{7}$$
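The merging of the two noise terms can be verified numerically. The following sketch picks illustrative values for $\alpha_t$ and $\alpha_{t-1}$ (they are not taken from the text), adds the two variances from equation (4), and compares the result with $1 - \alpha_t\alpha_{t-1}$ and with a simulated estimate.

```python
import math
import random

alpha_t, alpha_prev = 0.9, 0.8  # illustrative values of alpha_t and alpha_{t-1}
c_prev = math.sqrt(alpha_t * (1.0 - alpha_prev))  # coefficient of eps_{t-1} in (4)
c_t = math.sqrt(1.0 - alpha_t)                    # coefficient of eps_t in (4)

# Variances of independent Gaussians add up.
merged_variance = c_prev ** 2 + c_t ** 2

# Simulate the sum of the two independent noise terms.
rng = random.Random(0)
samples = [
    c_prev * rng.gauss(0.0, 1.0) + c_t * rng.gauss(0.0, 1.0)
    for _ in range(100_000)
]
mean = sum(samples) / len(samples)
empirical_variance = sum((s - mean) ** 2 for s in samples) / len(samples)

print(merged_variance, 1.0 - alpha_t * alpha_prev, empirical_variance)
```

The first two printed numbers agree up to floating-point rounding, and the third is close to them, which is why a single $\epsilon \sim N(0, 1)$ scaled by $\sqrt{1 - \alpha_t\alpha_{t-1}}$ can stand in for the two separate terms.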
We continue by expressing $x_t$ in terms of $x_{t-3}$.
From equation (2):

$$x_{t-2} = \sqrt{1-\alpha_{t-2}} \times \epsilon_{t-2} + \sqrt{\alpha_{t-2}} \times x_{t-3} \tag{8}$$

Substituting equation (8) into equation (7):

$$
\begin{aligned}
x_t &= \sqrt{1 - \alpha_t\alpha_{t-1}} \times \epsilon + \sqrt{\alpha_t\alpha_{t-1}} \times \left(\sqrt{1-\alpha_{t-2}} \times \epsilon_{t-2} + \sqrt{\alpha_{t-2}} \times x_{t-3}\right) \\
&= \sqrt{1 - \alpha_t\alpha_{t-1}} \times \epsilon + \sqrt{\alpha_t\alpha_{t-1}(1-\alpha_{t-2})} \times \epsilon_{t-2} + \sqrt{\alpha_t\alpha_{t-1}\alpha_{t-2}} \times x_{t-3}
\end{aligned}
$$

Applying reparameterization again (the two noise variances sum to $1 - \alpha_t\alpha_{t-1}\alpha_{t-2}$):

$$x_t = \sqrt{1 - \alpha_t\alpha_{t-1}\alpha_{t-2}} \times \epsilon + \sqrt{\alpha_t\alpha_{t-1}\alpha_{t-2}} \times x_{t-3} \tag{9}$$
From equations (7) and (9), by mathematical induction:

$$x_t = \sqrt{1 - \alpha_t\alpha_{t-1}\cdots\alpha_2\alpha_1} \times \epsilon + \sqrt{\alpha_t\alpha_{t-1}\cdots\alpha_2\alpha_1} \times x_0 \tag{10}$$

Let $\bar\alpha_t = \alpha_t\alpha_{t-1}\cdots\alpha_2\alpha_1$. Equation (10) then takes the compact form

$$x_t = \sqrt{1 - \bar\alpha_t} \times \epsilon + \sqrt{\bar\alpha_t} \times x_0 \tag{10}$$
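Equation (10) means $x_t$ can be sampled in one shot from $x_0$, without looping over the intermediate steps. A minimal sketch, assuming a linear $\beta$ schedule (the schedule and its endpoints are illustrative assumptions, as are the helper names `alpha_bars` and `q_sample`):

```python
import math
import random


def alpha_bars(betas):
    """Cumulative products alpha_bar_t = alpha_1 * ... * alpha_t, alpha_t = 1 - beta_t."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t directly from x_0 via equation (10); t is 1-based."""
    a_bar = alpha_bar[t - 1]
    return [
        math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0) + math.sqrt(a_bar) * x
        for x in x0
    ]


# Assumed linear schedule with 1000 steps.
betas = [1e-4 + i * (0.02 - 1e-4) / 999 for i in range(1000)]
alpha_bar = alpha_bars(betas)

x0 = [-1.0, -0.6, 0.2, 1.0]
print(q_sample(x0, 10, alpha_bar, random.Random(0)))    # early step: still close to x0
print(q_sample(x0, 1000, alpha_bar, random.Random(0)))  # last step: almost pure noise
print(alpha_bar[-1])  # tiny, so the x0 term has nearly vanished at t = T
```

Because $\bar\alpha_t$ is a product of factors below 1, it decreases with $t$, so the weight on $x_0$ shrinks while the weight on the noise grows.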
For the reverse direction, Bayes' rule gives the distribution of $x_{t-1}$ given $x_t$ and $x_0$. The second equality uses the fact that, by equation (2), $x_t$ depends only on $x_{t-1}$, so $P(x_t \mid x_{t-1}, x_0) = P(x_t \mid x_{t-1})$:

$$
\begin{aligned}
P(x_{t-1} \mid x_t, x_0) &= \frac{P(x_t \mid x_{t-1}, x_0)\,P(x_{t-1} \mid x_0)}{P(x_t \mid x_0)} \\
&= \frac{P(x_t \mid x_{t-1})\,P(x_{t-1} \mid x_0)}{P(x_t \mid x_0)}
\end{aligned} \tag{11}
$$
Consider each term in turn.

From equation (2), $x_t = \sqrt{1-\alpha_t} \times \epsilon_t + \sqrt{\alpha_t} \times x_{t-1}$, so given $x_{t-1}$ we have $x_t \sim N(\sqrt{\alpha_t}\,x_{t-1},\; 1-\alpha_t)$ and therefore

$$P(x_t \mid x_{t-1}, x_0) = \frac{1}{\sqrt{2\pi}\sqrt{1-\alpha_t}} \exp\left[-\frac{1}{2}\frac{(x_t - \sqrt{\alpha_t}\,x_{t-1})^2}{1-\alpha_t}\right] \tag{a}$$

From equation (10), $x_t = \sqrt{1-\bar\alpha_t} \times \epsilon + \sqrt{\bar\alpha_t} \times x_0$, so given $x_0$ we have $x_t \sim N(\sqrt{\bar\alpha_t}\,x_0,\; 1-\bar\alpha_t)$ and therefore

$$P(x_t \mid x_0) = \frac{1}{\sqrt{2\pi}\sqrt{1-\bar\alpha_t}} \exp\left[-\frac{1}{2}\frac{(x_t - \sqrt{\bar\alpha_t}\,x_0)^2}{1-\bar\alpha_t}\right] \tag{b}$$

Similarly,

$$P(x_{t-1} \mid x_0) = \frac{1}{\sqrt{2\pi}\sqrt{1-\bar\alpha_{t-1}}} \exp\left[-\frac{1}{2}\frac{(x_{t-1} - \sqrt{\bar\alpha_{t-1}}\,x_0)^2}{1-\bar\alpha_{t-1}}\right] \tag{c}$$
Substituting (a), (b), and (c) into equation (11):

$$
\begin{aligned}
P(x_{t-1} \mid x_t, x_0) &= \frac{\dfrac{1}{\sqrt{2\pi}\sqrt{1-\alpha_t}} \exp\left[-\dfrac{1}{2}\dfrac{(x_t - \sqrt{\alpha_t}\,x_{t-1})^2}{1-\alpha_t}\right] \times \dfrac{1}{\sqrt{2\pi}\sqrt{1-\bar\alpha_{t-1}}} \exp\left[-\dfrac{1}{2}\dfrac{(x_{t-1} - \sqrt{\bar\alpha_{t-1}}\,x_0)^2}{1-\bar\alpha_{t-1}}\right]}{\dfrac{1}{\sqrt{2\pi}\sqrt{1-\bar\alpha_t}} \exp\left[-\dfrac{1}{2}\dfrac{(x_t - \sqrt{\bar\alpha_t}\,x_0)^2}{1-\bar\alpha_t}\right]} \\
&= \frac{1}{\sqrt{2\pi}\left(\dfrac{\sqrt{1-\alpha_t}\sqrt{1-\bar\alpha_{t-1}}}{\sqrt{1-\bar\alpha_t}}\right)} \exp\left[-\frac{1}{2}\frac{\left(x_{t-1} - \left(\dfrac{\sqrt{\alpha_t}(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}x_t + \dfrac{\sqrt{\bar\alpha_{t-1}}(1-\alpha_t)}{1-\bar\alpha_t}x_0\right)\right)^2}{\left(\dfrac{\sqrt{1-\alpha_t}\sqrt{1-\bar\alpha_{t-1}}}{\sqrt{1-\bar\alpha_t}}\right)^2}\right]
\end{aligned}
$$

This is the density of a normal distribution in $x_{t-1}$, so

$$x_{t-1} \mid x_t, x_0 \;\sim\; N\left(\frac{\sqrt{\alpha_t}(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}x_t + \frac{\sqrt{\bar\alpha_{t-1}}(1-\alpha_t)}{1-\bar\alpha_t}x_0,\; \left(\frac{\sqrt{1-\alpha_t}\sqrt{1-\bar\alpha_{t-1}}}{\sqrt{1-\bar\alpha_t}}\right)^2\right) \tag{12}$$
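The algebra between (11) and (12) is easy to get wrong, so a numerical check is useful. The sketch below evaluates the right-hand side of (11) from the densities (a), (b), and (c) in log space and compares it with the normal density whose mean and variance are given by (12). All numeric values are arbitrary test inputs, and the helper names `normal_logpdf` and `posterior_mean_var` are made up for this example.

```python
import math


def normal_logpdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var


def posterior_mean_var(x_t, x0, alpha_t, a_bar_t, a_bar_prev):
    """Mean and variance from equation (12) for a single pixel."""
    mean = (
        math.sqrt(alpha_t) * (1.0 - a_bar_prev) / (1.0 - a_bar_t) * x_t
        + math.sqrt(a_bar_prev) * (1.0 - alpha_t) / (1.0 - a_bar_t) * x0
    )
    var = (1.0 - alpha_t) * (1.0 - a_bar_prev) / (1.0 - a_bar_t)
    return mean, var


# Arbitrary test values; a_bar_t must equal alpha_t * a_bar_prev.
alpha_t, a_bar_prev = 0.9, 0.5
a_bar_t = alpha_t * a_bar_prev
x0, x_t, x_prev = 0.3, -0.2, 0.1

# Right-hand side of (11) in log space: log(a) + log(c) - log(b).
bayes = (
    normal_logpdf(x_t, math.sqrt(alpha_t) * x_prev, 1.0 - alpha_t)
    + normal_logpdf(x_prev, math.sqrt(a_bar_prev) * x0, 1.0 - a_bar_prev)
    - normal_logpdf(x_t, math.sqrt(a_bar_t) * x0, 1.0 - a_bar_t)
)
mean, var = posterior_mean_var(x_t, x0, alpha_t, a_bar_t, a_bar_prev)
closed_form = normal_logpdf(x_prev, mean, var)
print(bayes, closed_form)  # the two values should agree up to rounding
```

Agreement at a handful of points does not replace the derivation, but a mismatch would immediately reveal a sign or index error in the mean or variance.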
From equation (10), $x_t = \sqrt{1-\bar\alpha_t} \times \epsilon + \sqrt{\bar\alpha_t} \times x_0$, we can solve for $x_0$:

$$x_0 = \frac{x_t - \sqrt{1-\bar\alpha_t} \times \epsilon}{\sqrt{\bar\alpha_t}}$$

Substituting this into (12) gives

$$x_{t-1} \mid x_t, x_0 \;\sim\; N\left(\frac{\sqrt{\alpha_t}(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}x_t + \frac{\sqrt{\bar\alpha_{t-1}}(1-\alpha_t)}{1-\bar\alpha_t} \times \frac{x_t - \sqrt{1-\bar\alpha_t} \times \epsilon}{\sqrt{\bar\alpha_t}},\; \left(\frac{\sqrt{1-\alpha_t}\sqrt{1-\bar\alpha_{t-1}}}{\sqrt{1-\bar\alpha_t}}\right)^2\right) \tag{13}$$
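Since $\sqrt{\bar\alpha_{t-1}} / \sqrt{\bar\alpha_t} = 1/\sqrt{\alpha_t}$, the mean obtained after substituting $x_0$ collapses algebraically to $\frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon\right)$. This simplification is not spelled out in the text above; the sketch below checks it numerically with arbitrary test values, using hypothetical helper names.

```python
import math


def mean_from_x0(x_t, x0, alpha_t, a_bar_t, a_bar_prev):
    """Posterior mean written in terms of x_t and x_0."""
    return (
        math.sqrt(alpha_t) * (1.0 - a_bar_prev) / (1.0 - a_bar_t) * x_t
        + math.sqrt(a_bar_prev) * (1.0 - alpha_t) / (1.0 - a_bar_t) * x0
    )


def mean_from_eps(x_t, eps, alpha_t, a_bar_t):
    """Simplified mean written in terms of x_t and the noise eps."""
    return (x_t - (1.0 - alpha_t) / math.sqrt(1.0 - a_bar_t) * eps) / math.sqrt(alpha_t)


# Arbitrary test values; a_bar_t must equal alpha_t * a_bar_prev.
alpha_t, a_bar_prev = 0.9, 0.5
a_bar_t = alpha_t * a_bar_prev
x0, eps = 0.3, -1.2

# Build x_t from x_0 and eps with equation (10), then invert it to recover x_0.
x_t = math.sqrt(1.0 - a_bar_t) * eps + math.sqrt(a_bar_t) * x0
x0_recovered = (x_t - math.sqrt(1.0 - a_bar_t) * eps) / math.sqrt(a_bar_t)

m_x0 = mean_from_x0(x_t, x0_recovered, alpha_t, a_bar_t, a_bar_prev)
m_eps = mean_from_eps(x_t, eps, alpha_t, a_bar_t)
print(m_x0, m_eps)  # both forms give the same mean up to rounding
```

The $\epsilon$ form is convenient because it depends only on $x_t$, the schedule, and the noise, with no explicit reference to $x_0$.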