$$
p(Z \mid X) = \frac{p(X, Z)}{p(X)}
$$
$$
KL(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx
$$
$$
KL(P \,\|\, Q) = \int p(x) \log p(x) \, dx - \int p(x) \log q(x) \, dx = -H(P) + H(P, Q)
$$
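The decomposition above can be checked numerically on a small discrete example. This is only a sketch: the two distributions `p` and `q` are made-up numbers, and the integrals become sums over three outcomes.

```python
import numpy as np

def entropy(p):
    """H(P) = -sum p log p."""
    return -np.sum(p * np.log(p))

def cross_entropy(p, q):
    """H(P, Q) = -sum p log q."""
    return -np.sum(p * np.log(q))

def kl_divergence(p, q):
    """KL(P || Q) = sum p log(p / q)."""
    return np.sum(p * np.log(p / q))

# Hypothetical distributions over three outcomes (illustration only).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

kl = kl_divergence(p, q)
decomposed = -entropy(p) + cross_entropy(p, q)
print(kl, decomposed)  # the two values should agree up to rounding
```

The same check also shows that the divergence is non-negative and asymmetric: swapping `p` and `q` generally gives a different value.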
$$
p(X) = \frac{p(X, Z)}{p(Z \mid X)}
$$
$$
\log p(X) = \log \dots
$$
$$
E_{q(Z)}[\log p(X)] = \int_Z \log p(X) \, q(Z) \, dZ = \log p(X) \int_Z q(Z) \, dZ
$$
Since $\int_Z q(Z) \, dZ = 1$:

$$
E_{q(Z)}[\log p(X)] = \log p(X)
$$
Taking the expectation under $q(Z)$ of the right-hand side as well:

$$
\log p(X) = E_{q(Z)}\left[\log p(X, Z) - \log q(Z) - \log \frac{p(Z \mid X)}{q(Z)}\right] \\
= \int_Z q(Z) \log p(X, Z) \, dZ - \int_Z q(Z) \log q(Z) \, dZ + \int_Z q(Z) \log \frac{q(Z)}{p(Z \mid X)} \, dZ
$$
$$
ELBO = \int_Z q(Z) \log p(X, Z) \, dZ - \int_Z q(Z) \log q(Z) \, dZ
$$
$$
KL(q(Z) \,\|\, p(Z \mid X)) = \int_Z q(Z) \log \frac{q(Z)}{p(Z \mid X)} \, dZ
$$
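The identity $\log p(X) = ELBO + KL(q(Z) \,\|\, p(Z \mid X))$ can be verified on a toy model with a discrete latent variable. The sketch below assumes a latent $Z$ with three states; the joint values `joint[k]` $= p(X, Z = k)$ for the observed $X$ and the approximation `q` are made-up numbers.

```python
import numpy as np

# Hypothetical values of p(X = x_obs, Z = k) for k = 0, 1, 2.
joint = np.array([0.10, 0.25, 0.05])

evidence = joint.sum()          # p(X), marginalizing out Z
posterior = joint / evidence    # p(Z | X) = p(X, Z) / p(X)

# An arbitrary approximating distribution q(Z).
q = np.array([0.2, 0.5, 0.3])

# ELBO = sum q log p(X, Z) - sum q log q
elbo = np.sum(q * np.log(joint)) - np.sum(q * np.log(q))
# KL(q || p(Z | X)) = sum q log(q / p(Z | X))
kl = np.sum(q * np.log(q / posterior))

log_evidence = np.log(evidence)
print(log_evidence, elbo + kl)  # the two values should agree up to rounding
```

Because the KL term is non-negative, the ELBO never exceeds $\log p(X)$, and the gap closes exactly when `q` equals `posterior`.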
$$
q(Z) = \prod_{i=1}^{M} q_i(z_i)
$$
$$
L(q) = \int_Z \log p(X, Z) \, q(Z) \, dZ - \int_Z \log q(Z) \, q(Z) \, dZ \\
= \int_Z \log p(X, Z) \prod_{i=1}^{M} q_i(z_i) \, dZ - \int_Z \sum_{i=1}^{M} \log q_i(z_i) \prod_{i=1}^{M} q_i(z_i) \, dZ
$$
$$
\int_Z \log p(X, Z) \prod_{i=1}^{M} q_i(z_i) \, dZ
= \int_{z_1} \int_{z_2} \dots \int_{z_M} \prod_{i=1}^{M} q_i(z_i) \log p(X, Z) \, dz_1 \, dz_2 \dots dz_M
$$
$$
= \int_{z_j} q_j(z_j) \left[ \int \dots \int \prod_{i \neq j} q_i(z_i) \log p(X, Z) \prod_{i \neq j} dz_i \right] dz_j \\
= \int_{z_j} q_j(z_j) \, E_{\prod_{i \neq j} q_i(z_i)}[\log p(X, Z)] \, dz_j
$$
$$
E_{\prod_{i \neq j} q_i(z_i)}[\log p(X, Z)] = \log \hat{p}(X, z_j)
$$
$$
\int_Z \log p(X, Z) \prod_{i=1}^{M} q_i(z_i) \, dZ = \int_{z_j} q_j(z_j) \log \hat{p}(X, z_j) \, dz_j
$$
$$
\int_Z \sum_{i=1}^{M} \log q_i(z_i) \prod_{i=1}^{M} q_i(z_i) \, dZ
= \int_Z \prod_{i=1}^{M} q_i(z_i) \left[ \log q_1(z_1) + \log q_2(z_2) + \dots + \log q_M(z_M) \right] dZ
$$
$$
\int_{z_1} \int_{z_2} \left[ \log q_1(z_1) + \log q_2(z_2) \right] q_1(z_1) q_2(z_2) \, dz_1 \, dz_2 \\
= \int_{z_1} \int_{z_2} q_1(z_1) q_2(z_2) \log q_1(z_1) \, dz_1 \, dz_2 + \int_{z_1} \int_{z_2} q_1(z_1) q_2(z_2) \log q_2(z_2) \, dz_1 \, dz_2 \\
= \int_{z_1} q_1(z_1) \log q_1(z_1) \left[ \int_{z_2} q_2(z_2) \, dz_2 \right] dz_1 + \int_{z_2} q_2(z_2) \log q_2(z_2) \left[ \int_{z_1} q_1(z_1) \, dz_1 \right] dz_2
$$
Since $\int_{z_2} q_2(z_2) \, dz_2 = 1$ and $\int_{z_1} q_1(z_1) \, dz_1 = 1$, the expression above equals:

$$
\sum_{i=1}^{2} \int_{z_i} q_i(z_i) \log q_i(z_i) \, dz_i
$$
$$
\int_Z \sum_{i=1}^{M} \log q_i(z_i) \prod_{i=1}^{M} q_i(z_i) \, dZ = \sum_{i=1}^{M} \int_{z_i} q_i(z_i) \log q_i(z_i) \, dz_i
$$
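A quick numeric check of this step, as a sketch only: for a factorized distribution, the expected log-density of the joint splits into a sum of per-factor terms. The two factors `q1` and `q2` below are made-up discrete distributions, so the integrals become sums.

```python
import numpy as np

# Hypothetical factors q1(z1) and q2(z2) of a mean-field distribution.
q1 = np.array([0.7, 0.3])
q2 = np.array([0.2, 0.5, 0.3])

# Joint q(z1, z2) = q1(z1) * q2(z2), stored as a 2 x 3 table.
q_joint = np.outer(q1, q2)

# Left-hand side: sum over all (z1, z2) of q(Z) * log q(Z).
lhs = np.sum(q_joint * np.log(q_joint))

# Right-hand side: sum of the per-factor terms.
rhs = np.sum(q1 * np.log(q1)) + np.sum(q2 * np.log(q2))

print(lhs, rhs)  # the two values should agree up to rounding
```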
Keeping only the term that depends on the factor $q_j$ and collecting the remaining terms into a constant $C$:

$$
\int_Z \sum_{i=1}^{M} \log q_i(z_i) \prod_{i=1}^{M} q_i(z_i) \, dZ = \int_{z_j} q_j(z_j) \log q_j(z_j) \, dz_j + C
$$
$$
ELBO = \int_{z_j} q_j(z_j) \log \hat{p}(X, z_j) \, dz_j - \left( \int_{z_j} q_j(z_j) \log q_j(z_j) \, dz_j + C \right) \\
= \int_{z_j} q_j(z_j) \log \frac{\hat{p}(X, z_j)}{q_j(z_j)} \, dz_j - C \\
= -KL(q_j(z_j) \,\|\, \hat{p}(X, z_j)) - C
$$
Therefore, maximizing the ELBO with respect to $q_j$ becomes the problem of minimizing $KL(q_j(z_j) \,\|\, \hat{p}(X, z_j))$. By the definition of the KL divergence, which measures the discrepancy between two distributions, the minimum is reached only when $q_j(z_j) = \hat{p}(X, z_j)$ (up to a normalizing constant).
$$
q_j(z_j) \propto \exp\left( E_{\prod_{i \neq j} q_i(z_i)}[\log p(X, Z)] \right) \\
\log q_j(z_j) = E_{\prod_{i \neq j} q_i(z_i)}[\log p(X, Z)] + \text{const}
$$
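The update rule above is usually applied one factor at a time, cycling through the factors until the ELBO stops improving (often called coordinate ascent variational inference). The following is a minimal sketch under stated assumptions, not a general implementation: the latent variables $z_1$ and $z_2$ are discrete with two states each, the table `joint` holding $p(X, z_1, z_2)$ for the observed $X$ is made up, and the helper names `normalize_log`, `elbo`, and `cavi` are hypothetical.

```python
import numpy as np

def normalize_log(log_w):
    """Turn unnormalized log-weights into a probability vector."""
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    return w / w.sum()

def elbo(log_joint, q1, q2):
    """ELBO = E_q[log p(X, Z)] - E_q[log q(Z)] for q(Z) = q1(z1) q2(z2)."""
    expected_log_joint = np.sum(np.outer(q1, q2) * log_joint)
    expected_log_q = np.sum(q1 * np.log(q1)) + np.sum(q2 * np.log(q2))
    return expected_log_joint - expected_log_q

def cavi(log_joint, n_iters=50):
    """Alternate the updates q_j ∝ exp(E_{q_{-j}}[log p(X, Z)])."""
    n1, n2 = log_joint.shape
    q1 = np.full(n1, 1.0 / n1)  # start from uniform factors
    q2 = np.full(n2, 1.0 / n2)
    history = []
    for _ in range(n_iters):
        q1 = normalize_log(log_joint @ q2)  # expectation over z2
        q2 = normalize_log(q1 @ log_joint)  # expectation over z1
        history.append(elbo(log_joint, q1, q2))
    return q1, q2, history

# Hypothetical joint table p(X, z1, z2); its sum is the evidence p(X).
joint = np.array([[0.30, 0.05],
                  [0.05, 0.20]])
log_joint = np.log(joint)

q1, q2, history = cavi(log_joint)
print(q1, q2)
print(history[-1], np.log(joint.sum()))  # ELBO stays below log p(X)
```

Each update maximizes the ELBO exactly with respect to one factor while the other is held fixed, so the recorded ELBO values should never decrease. Since the true posterior in this toy table is correlated, the factorized approximation cannot match it exactly and the final ELBO remains strictly below $\log p(X)$.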