$\color{red}{\text{Answers are for reference only.}}$
$$E[x]=1\times\lambda+0\times(1-\lambda)=\lambda$$

From Exercise 2.10 in the previous chapter:

$$E[(x-\mu)^2]=E[x^2]-E[x]E[x]$$

Since $E[x]=\lambda=\mu$ and $E[x^2]=1^2\times\lambda+0^2\times(1-\lambda)=\lambda$, we have

$$
\begin{aligned}
E[(x-E[x])^2] & =E[(x-\mu)^2] \\
& =E[x^2]-E[x]E[x] \\
& =1\times\lambda-\lambda^2 \\
& =\lambda(1-\lambda)
\end{aligned}
$$

Q.E.D.
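As a quick sanity check of the result above, the two moments can be computed by direct enumeration over $x\in\{0,1\}$. This is only an illustrative sketch; the function name `bernoulli_moments` is made up for this example.

```python
def bernoulli_moments(lam):
    """Return (mean, variance) of a Bernoulli variable by summing over x in {0, 1}."""
    probs = {0: 1.0 - lam, 1: lam}
    mean = sum(x * p for x, p in probs.items())
    var = sum((x - mean) ** 2 * p for x, p in probs.items())
    return mean, var


# The enumerated moments should match lambda and lambda * (1 - lambda).
for lam in (0.1, 0.5, 0.9):
    mean, var = bernoulli_moments(lam)
    assert abs(mean - lam) < 1e-12
    assert abs(var - lam * (1 - lam)) < 1e-12
```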
Given $\alpha,\beta>1$ and

$$Pr(\lambda)=\frac{\Gamma[\alpha+\beta]}{\Gamma[\alpha]\Gamma[\beta]}\lambda^{\alpha-1}(1-\lambda)^{\beta-1}$$

differentiate with respect to $\lambda$:

$$Pr'(\lambda)=\frac{\Gamma[\alpha+\beta]}{\Gamma[\alpha]\Gamma[\beta]}\left[(\alpha-1)\lambda^{\alpha-2}(1-\lambda)^{\beta-1}-(\beta-1)\lambda^{\alpha-1}(1-\lambda)^{\beta-2}\right]$$

Setting $Pr'(\lambda)=0$ and dividing by $\lambda^{\alpha-2}(1-\lambda)^{\beta-2}$ (nonzero for $0<\lambda<1$) gives $(\alpha-1)(1-\lambda)=(\beta-1)\lambda$, so the mode is located at

$$\lambda_{\text{mode}}=\frac{\alpha-1}{(\alpha-1)+(\beta-1)}$$

Substituting back gives the value of the density at the mode:

$$Pr(\lambda_{\text{mode}})=\frac{\Gamma[\alpha+\beta]}{\Gamma[\alpha]\Gamma[\beta]}\left(\frac{\alpha-1}{(\alpha-1)+(\beta-1)}\right)^{\alpha-1}\left(\frac{\beta-1}{(\alpha-1)+(\beta-1)}\right)^{\beta-1}$$
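The location of the peak can be checked numerically by comparing the closed form with a brute-force grid search over the log-density. This is a rough sketch; `beta_log_pdf`, `beta_mode`, and `grid_argmax` are illustrative names, and the grid resolution is an arbitrary choice.

```python
import math


def beta_log_pdf(lam, alpha, beta):
    """Log of the beta density at lam, for 0 < lam < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return log_norm + (alpha - 1) * math.log(lam) + (beta - 1) * math.log(1 - lam)


def beta_mode(alpha, beta):
    """Closed-form location of the peak, valid for alpha, beta > 1."""
    return (alpha - 1) / (alpha + beta - 2)


def grid_argmax(alpha, beta, n=20000):
    """Brute-force location of the peak on a grid of n midpoints in (0, 1)."""
    grid = [(k + 0.5) / n for k in range(n)]
    return max(grid, key=lambda lam: beta_log_pdf(lam, alpha, beta))


# The two estimates should agree up to the grid spacing.
assert abs(grid_argmax(3.0, 5.0) - beta_mode(3.0, 5.0)) < 1e-4
```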
Set up the system of equations

$$
\begin{cases}
\mu=\dfrac{\alpha}{\alpha+\beta} \\[2ex]
\sigma^2=\dfrac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}
\end{cases}
$$

The first equation gives $\beta=\frac{\alpha}{\mu}-\alpha$. Substituting this into the second equation yields

$$
\begin{cases}
\alpha=\dfrac{\mu^2(1-\mu)}{\sigma^2}-\mu \\[2ex]
\beta=\dfrac{\mu(1-\mu)^2}{\sigma^2}+\mu-1
\end{cases}
$$
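A round trip between parameters and moments is an easy way to check the inversion. The sketch below computes $\alpha$ from the closed form and then recovers $\beta$ from the relation $\beta=\alpha/\mu-\alpha$; the function names are illustrative only.

```python
def beta_moments(alpha, beta):
    """Mean and variance of a beta distribution."""
    s = alpha + beta
    return alpha / s, alpha * beta / (s ** 2 * (s + 1))


def beta_params_from_moments(mu, var):
    """Invert the moment equations; needs 0 < mu < 1 and var < mu * (1 - mu)."""
    alpha = mu ** 2 * (1 - mu) / var - mu
    beta = alpha / mu - alpha  # from mu = alpha / (alpha + beta)
    return alpha, beta


# Parameters -> moments -> parameters should return the starting values.
a, b = beta_params_from_moments(*beta_moments(2.0, 5.0))
assert abs(a - 2.0) < 1e-9 and abs(b - 5.0) < 1e-9
```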
The beta distribution can be written as

$$
\begin{aligned}
Pr(\lambda) & =\frac{\Gamma[\alpha+\beta]}{\Gamma[\alpha]\Gamma[\beta]}\lambda^{\alpha-1}(1-\lambda)^{\beta-1} \\
& =\exp\big[\log[\Gamma[\alpha+\beta]]+(\alpha-1)\log[\lambda]+(\beta-1)\log[1-\lambda]-\log[\Gamma[\alpha]]-\log[\Gamma[\beta]]\big] \\
& =\exp\big[(\alpha-1)\log[\lambda]+(\beta-1)\log[1-\lambda]-\big[\log[\Gamma[\alpha]]+\log[\Gamma[\beta]]-\log[\Gamma[\alpha+\beta]]\big]\big]
\end{aligned}
$$

The exponential family has the form

$$Pr(x|\bm\theta)=a[x]\exp\big[\bm b[\bm\theta]^{\text{T}}\bm c[x]-d[\bm\theta]\big]$$

Comparing the two expressions (with $x=\lambda$ and $\bm\theta=[\alpha,\beta]^{\text{T}}$) gives

$$
\begin{cases}
a[x]=1 \\[1ex]
\bm b[\bm\theta]=\begin{bmatrix} \alpha-1 \\ \beta-1 \end{bmatrix} \\[3ex]
\bm c[x]=\begin{bmatrix} \log[x] \\ \log[1-x] \end{bmatrix} \\[3ex]
d[\bm\theta]=\log[\Gamma[\alpha]]+\log[\Gamma[\beta]]-\log[\Gamma[\alpha+\beta]]
\end{cases}
$$
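To see that the identification of $a$, $\bm b$, $\bm c$, and $d$ reproduces the density, one can evaluate both forms at a few points. This is a minimal sketch with illustrative function names, not a general exponential-family implementation.

```python
import math


def beta_pdf(x, alpha, beta):
    """Beta density in its usual form."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1) * (1 - x) ** (beta - 1)


def beta_expfam(x, alpha, beta):
    """Beta density as a[x] * exp(b . c[x] - d)."""
    a = 1.0
    b = (alpha - 1, beta - 1)
    c = (math.log(x), math.log(1 - x))
    d = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return a * math.exp(b[0] * c[0] + b[1] * c[1] - d)


# Both forms should agree up to floating-point error.
for x in (0.1, 0.4, 0.9):
    assert math.isclose(beta_pdf(x, 2.5, 4.0), beta_expfam(x, 2.5, 4.0), rel_tol=1e-9)
```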
Integrating by parts (for $z>0$):

$$
\begin{aligned}
\Gamma[z+1] & =\int_0^\infty t^{z}\text{e}^{-t}\,\text{d}t \\
& =-\int_0^\infty t^z\,\text{d}(\text{e}^{-t}) \\
& =-\left. t^z\text{e}^{-t}\right|_0^\infty+\int_0^\infty\text{e}^{-t}\,\text{d}(t^z) \\
& =0+z\int_0^\infty\text{e}^{-t}t^{z-1}\,\text{d}t \\
& =z\Gamma[z]
\end{aligned}
$$

Q.E.D.
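The recurrence can also be spot-checked numerically with the standard library's `math.gamma`. The helper name `gamma_recurrence_gap` is illustrative.

```python
import math


def gamma_recurrence_gap(z):
    """Relative difference between Gamma(z + 1) and z * Gamma(z)."""
    lhs = math.gamma(z + 1)
    rhs = z * math.gamma(z)
    return abs(lhs - rhs) / abs(lhs)


# The gap should be at the level of floating-point rounding.
for z in (0.5, 1.0, 3.7, 10.0):
    assert gamma_recurrence_gap(z) < 1e-10

# For integer arguments the recurrence gives Gamma(n + 1) = n!.
assert math.isclose(math.gamma(6), math.factorial(5))
```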
The book's treatment of conjugate distributions is fairly brief; a more detailed explanation can be found at https://www.jianshu.com/p/bb7bce40a15a
We need to show that the prior and the posterior have the same form. With likelihood $Pr(x|\mu)=\text{Norm}_x[\mu,1]$ and prior $Pr(\mu)=\text{Norm}_\mu[\mu_p,\sigma_p^2]$, this means showing that the product is proportional to a normal distribution in $\mu$:

$$Pr(x|\mu)\,Pr(\mu)\propto\text{Norm}_\mu[\widetilde\mu,\widetilde\sigma^2]$$

$$
\begin{aligned}
Pr(x|\mu)\,Pr(\mu) & =\frac{1}{2\pi\sigma_p}\exp\left[-0.5\left[(x-\mu)^2+\frac{(\mu-\mu_p)^2}{\sigma_p^2}\right]\right] \\
& =\frac{1}{2\pi\sigma_p}\exp\left[-0.5\left[x^2-2\mu x+\mu^2+\frac{\mu^2}{\sigma_p^2}-\frac{2\mu\mu_p}{\sigma_p^2}+\frac{\mu_p^2}{\sigma_p^2}\right]\right] \\
& =\frac{1}{2\pi\sigma_p}\exp\left[-0.5\left[x^2+\frac{\mu_p^2}{\sigma_p^2}\right]-0.5\left[\mu^2\left(1+\frac{1}{\sigma_p^2}\right)-2\mu\left(x+\frac{\mu_p}{\sigma_p^2}\right)\right]\right] \\
& \quad\text{(the first bracket does not depend on $\mu$, so it is absorbed into a constant)} \\
& =\kappa_1\exp\left[-0.5\left(\frac{\sigma_p^2+1}{\sigma_p^2}\mu^2-2\,\frac{\sigma_p^2x+\mu_p}{\sigma_p^2}\mu\right)\right] \\
& \quad\text{(complete the square in $\mu$)} \\
& =\kappa_1\exp\left[-0.5\left[\frac{\left(\mu-\frac{\sigma_p^2x+\mu_p}{\sigma_p^2+1}\right)^2}{\frac{\sigma_p^2}{\sigma_p^2+1}}-\text{constant}\right]\right] \\
& =\kappa_2\exp\left[-0.5\,\frac{\left(\mu-\frac{\sigma_p^2x+\mu_p}{\sigma_p^2+1}\right)^2}{\frac{\sigma_p^2}{\sigma_p^2+1}}\right] \\
& \propto\text{Norm}_\mu\left[\frac{\sigma_p^2x+\mu_p}{\sigma_p^2+1},\ \frac{\sigma_p^2}{\sigma_p^2+1}\right]
\end{aligned}
$$

Q.E.D.
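One way to test the derivation numerically is to check that likelihood times prior divided by the claimed posterior does not depend on $\mu$. The sketch below assumes a unit-variance likelihood, as in the derivation; the function names are illustrative. The constant should also equal the marginal density of $x$, which for this model is a normal with mean $\mu_p$ and variance $\sigma_p^2+1$ (a standard result, used here only as an extra check).

```python
import math


def norm_pdf(v, mean, var):
    """Univariate normal density."""
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def posterior_over_mean(x, mu_p, var_p):
    """Posterior mean and variance for mu after one observation x (likelihood variance 1)."""
    post_mean = (var_p * x + mu_p) / (var_p + 1)
    post_var = var_p / (var_p + 1)
    return post_mean, post_var


def kappa(mu, x, mu_p, var_p):
    """Likelihood * prior / posterior; should be the same for every mu."""
    m, v = posterior_over_mean(x, mu_p, var_p)
    return norm_pdf(x, mu, 1.0) * norm_pdf(mu, mu_p, var_p) / norm_pdf(mu, m, v)


values = [kappa(mu, 1.5, -0.5, 2.0) for mu in (-1.0, 0.0, 0.7, 2.0)]
assert max(values) - min(values) < 1e-12
assert math.isclose(values[0], norm_pdf(1.5, -0.5, 3.0), rel_tol=1e-9)
```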
This is similar to Exercise 3.4. The normal distribution can be written as

$$
\begin{aligned}
Pr(x) & =\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-0.5\frac{(x-\mu)^2}{\sigma^2}\right] \\
& =\exp\left[-0.5\left(\frac{x^2}{\sigma^2}-\frac{2\mu x}{\sigma^2}+\frac{\mu^2}{\sigma^2}\right)-\log\left[\sqrt{2\pi\sigma^2}\right]\right]
\end{aligned}
$$

The exponential family has the form

$$Pr(x|\bm\theta)=a[x]\exp\big[\bm b[\bm\theta]^{\text{T}}\bm c[x]-d[\bm\theta]\big]$$

Comparing the two expressions gives

$$
\begin{cases}
a[x]=1 \\[1ex]
\bm b[\bm\theta]=\begin{bmatrix} 1/\sigma^2 \\ \mu/\sigma^2 \end{bmatrix} \\[3ex]
\bm c[x]=\begin{bmatrix} -0.5x^2 \\ x \end{bmatrix} \\[3ex]
d[\bm\theta]=\dfrac{\mu^2}{2\sigma^2}+\log\left[\sqrt{2\pi\sigma^2}\right]
\end{cases}
$$
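The same kind of spot check works for the normal case: evaluate the usual density and the exponential-family form with the $\bm b$, $\bm c$, and $d$ identified above, and compare. The function names are illustrative.

```python
import math


def normal_pdf(x, mu, var):
    """Univariate normal density in its usual form."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def normal_expfam(x, mu, var):
    """Normal density as a[x] * exp(b . c[x] - d) with a[x] = 1."""
    b = (1 / var, mu / var)
    c = (-0.5 * x ** 2, x)
    d = mu ** 2 / (2 * var) + math.log(math.sqrt(2 * math.pi * var))
    return math.exp(b[0] * c[0] + b[1] * c[1] - d)


# Both forms should agree up to floating-point error.
for x in (-2.0, 0.3, 1.8):
    assert math.isclose(normal_pdf(x, 0.5, 1.7), normal_expfam(x, 0.5, 1.7), rel_tol=1e-9)
```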
Given

$$Pr(\mu,\sigma^2)=\frac{\sqrt{\gamma}}{\sigma\sqrt{2\pi}}\frac{\beta^\alpha}{\Gamma[\alpha]}\left(\frac{1}{\sigma^2}\right)^{\alpha+1}\exp\left[-\frac{2\beta+\gamma(\delta-\mu)^2}{2\sigma^2}\right]$$

differentiate with respect to $\mu$:

$$\frac{\partial Pr(\mu,\sigma^2)}{\partial\mu}=\text{constant}\cdot\exp[\cdots]\left(-\frac{1}{2\sigma^2}\right)(2\gamma)(\delta-\mu)(-1)$$

Setting $\frac{\partial Pr(\mu,\sigma^2)}{\partial\mu}=0$ gives

$$\mu=\delta$$

Next, differentiate with respect to $\sigma^2$. Since $Pr(\mu,\sigma^2)\propto(\sigma^2)^{-(\alpha+3/2)}\exp\left[-\frac{2\beta+\gamma(\delta-\mu)^2}{2\sigma^2}\right]$,

$$\frac{\partial Pr(\mu,\sigma^2)}{\partial\sigma^2}=Pr(\mu,\sigma^2)\left[-\frac{\alpha+3/2}{\sigma^2}+\frac{2\beta+\gamma(\delta-\mu)^2}{2\sigma^4}\right]$$

Setting $\frac{\partial Pr(\mu,\sigma^2)}{\partial\sigma^2}=0$ and substituting $\mu=\delta$ gives

$$\sigma^2=\frac{2\beta}{2\alpha+3}$$

In summary, the mode is located at

$$
\begin{cases}
\mu=\delta \\[1ex]
\sigma^2=\dfrac{2\beta}{2\alpha+3}
\end{cases}
$$

Substituting these values back into $Pr(\mu,\sigma^2)$ gives the density at the mode:

$$Pr_{\text{mode}}=\sqrt{\frac{\gamma}{2\pi}}\,\frac{\beta^\alpha}{\Gamma[\alpha]}\left(\frac{2\alpha+3}{2\beta}\right)^{\alpha+3/2}\exp\left[-\frac{2\alpha+3}{2}\right]$$
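The stationary point $\mu=\delta$, $\sigma^2=2\beta/(2\alpha+3)$ can be checked by evaluating the log-density at that point and at small perturbations around it. This is only a local check under arbitrary example parameters; `nig_log_pdf` and `nig_mode` are illustrative names.

```python
import math


def nig_log_pdf(mu, var, alpha, beta, gamma, delta):
    """Log of the normal-scaled inverse gamma density at (mu, var)."""
    return (0.5 * math.log(gamma) - 0.5 * math.log(2 * math.pi * var)
            + alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1) * math.log(var)
            - (2 * beta + gamma * (delta - mu) ** 2) / (2 * var))


def nig_mode(alpha, beta, gamma, delta):
    """Location of the peak: (mu, var) = (delta, 2 * beta / (2 * alpha + 3))."""
    return delta, 2 * beta / (2 * alpha + 3)


params = (2.0, 3.0, 1.5, 0.4)  # alpha, beta, gamma, delta (example values)
m, v = nig_mode(*params)
peak = nig_log_pdf(m, v, *params)
# No nearby point should have a higher log-density than the claimed mode.
for dm in (-0.05, 0.0, 0.05):
    for dv in (-0.05, 0.0, 0.05):
        assert nig_log_pdf(m + dm, v + dv, *params) <= peak + 1e-12
```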
$$
\begin{aligned}
&\left[\prod_{i=1}^I\text{Bern}_{x_i}[\lambda]\right]\cdot\text{Beta}_\lambda[\alpha,\beta] \\
=&\left[\prod_{i=1}^I\lambda^{x_i}(1-\lambda)^{1-x_i}\right]\frac{\Gamma[\alpha+\beta]}{\Gamma[\alpha]\Gamma[\beta]}\lambda^{\alpha-1}(1-\lambda)^{\beta-1} \\
=&\kappa\cdot\frac{\Gamma[\alpha+\beta+I]}{\Gamma\left[\alpha+\sum x_i\right]\Gamma\left[\beta+\sum(1-x_i)\right]}\lambda^{\alpha+\sum x_i-1}(1-\lambda)^{\beta+\sum(1-x_i)-1} \\
=&\kappa\cdot\text{Beta}_\lambda[\widetilde{\alpha},\widetilde{\beta}]
\end{aligned}
$$

where $\widetilde\alpha=\alpha+\sum x_i$, $\widetilde\beta=\beta+\sum(1-x_i)$, and $\kappa=\frac{\Gamma[\alpha+\beta]\,\Gamma[\widetilde\alpha]\,\Gamma[\widetilde\beta]}{\Gamma[\alpha]\,\Gamma[\beta]\,\Gamma[\alpha+\beta+I]}$ does not depend on $\lambda$.

Q.E.D.
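In code, the conjugate update amounts to adding the counts of ones and zeros to $\alpha$ and $\beta$. The sketch below also checks that likelihood times prior divided by the updated beta density is the same for every $\lambda$; all function names are illustrative, and `math.prod` requires Python 3.8 or later.

```python
import math


def beta_pdf(lam, alpha, beta):
    """Beta density at lam."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * lam ** (alpha - 1) * (1 - lam) ** (beta - 1)


def bernoulli_beta_update(xs, alpha, beta):
    """Posterior beta parameters after observing binary data xs."""
    ones = sum(xs)
    return alpha + ones, beta + len(xs) - ones


def kappa(lam, xs, alpha, beta):
    """Likelihood * prior / posterior; should not depend on lam."""
    likelihood = math.prod(lam ** x * (1 - lam) ** (1 - x) for x in xs)
    a_post, b_post = bernoulli_beta_update(xs, alpha, beta)
    return likelihood * beta_pdf(lam, alpha, beta) / beta_pdf(lam, a_post, b_post)


xs = [1, 0, 1, 1, 0]
values = [kappa(lam, xs, 2.0, 3.0) for lam in (0.2, 0.5, 0.8)]
assert max(values) - min(values) < 1e-12
```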
$$
\begin{aligned}
&\left[\prod_{i=1}^I\text{Cat}_{x_i}[\lambda_{1\cdots K}]\right]\cdot\text{Dir}_{\lambda_{1\cdots K}}[\alpha_{1\cdots K}] \\
=&\left[\prod_{i=1}^{I}\prod_{j=1}^{K}\lambda_j^{x_{ij}}\right]\frac{\Gamma\left[\sum_{j=1}^{K}\alpha_j\right]}{\prod_{j=1}^K\Gamma[\alpha_j]}\prod_{j=1}^K\lambda_j^{\alpha_j-1} \\
&\text{($x_{ij}$ is the number of times $x_i$ takes outcome $j$, i.e. 1 if $x_i=j$ and 0 otherwise)} \\
=&\left[\prod_{j=1}^{K}\lambda_j^{\sum_{i=1}^I x_{ij}}\right]\frac{\Gamma\left[\sum_{j=1}^{K}\alpha_j\right]}{\prod_{j=1}^K\Gamma[\alpha_j]}\prod_{j=1}^K\lambda_j^{\alpha_j-1} \\
&\text{(let $\textstyle\sum_{i=1}^I x_{ij}=N_j$, the total number of times outcome $j$ occurs in the $I$ draws)} \\
=&\frac{\Gamma\left[\sum_{j=1}^{K}\alpha_j\right]}{\prod_{j=1}^K\Gamma[\alpha_j]}\prod_{j=1}^K\lambda_j^{\alpha_j-1+N_j} \\
=&\frac{\Gamma\left[\sum_{j=1}^{K}\alpha_j\right]}{\prod_{j=1}^K\Gamma[\alpha_j]}\cdot\frac{\prod_{j=1}^K\Gamma[\alpha_j+N_j]}{\Gamma\left[\sum_{j=1}^{K}(\alpha_j+N_j)\right]}\cdot\frac{\Gamma\left[\sum_{j=1}^{K}(\alpha_j+N_j)\right]}{\prod_{j=1}^K\Gamma[\alpha_j+N_j]}\prod_{j=1}^K\lambda_j^{\alpha_j-1+N_j} \\
&\text{($\because\textstyle\sum_{j=1}^{K}N_j=I$: the counts over all outcomes add up to the number of draws)} \\
=&\frac{\Gamma\left[\sum_{j=1}^{K}\alpha_j\right]}{\prod_{j=1}^K\Gamma[\alpha_j]}\cdot\frac{\prod_{j=1}^K\Gamma[\alpha_j+N_j]}{\Gamma\left[I+\sum_{j=1}^{K}\alpha_j\right]}\cdot\text{Dir}_{\lambda_{1\cdots K}}\left[\widetilde{\alpha}_{1\cdots K}\right] \\
=&\widetilde{\kappa}\cdot\text{Dir}_{\lambda_{1\cdots K}}\left[\widetilde{\alpha}_{1\cdots K}\right]
\end{aligned}
$$

where $\widetilde\alpha_j=\alpha_j+N_j$.

Q.E.D.
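The update $\widetilde\alpha_j=\alpha_j+N_j$ is just count accumulation. The sketch below uses 0-based category indices (an implementation choice, unlike the 1-based $j$ in the derivation) and checks in log space that likelihood times prior divided by the updated Dirichlet is independent of $\lambda_{1\cdots K}$. Function names are illustrative.

```python
import math


def dirichlet_log_pdf(lams, alphas):
    """Log Dirichlet density at the probability vector lams."""
    log_norm = math.lgamma(sum(alphas)) - sum(math.lgamma(a) for a in alphas)
    return log_norm + sum((a - 1) * math.log(l) for a, l in zip(alphas, lams))


def dirichlet_update(xs, alphas):
    """Posterior parameters; xs holds observed category indices in 0..K-1."""
    counts = [0] * len(alphas)
    for x in xs:
        counts[x] += 1  # N_j: number of times category j was observed
    return [a + n for a, n in zip(alphas, counts)]


def log_kappa(lams, xs, alphas):
    """log(likelihood * prior / posterior); should not depend on lams."""
    log_lik = sum(math.log(lams[x]) for x in xs)
    post = dirichlet_update(xs, alphas)
    return log_lik + dirichlet_log_pdf(lams, alphas) - dirichlet_log_pdf(lams, post)


xs = [0, 2, 2, 1, 2]
k1 = log_kappa([0.2, 0.3, 0.5], xs, [1.0, 1.0, 1.0])
k2 = log_kappa([0.6, 0.1, 0.3], xs, [1.0, 1.0, 1.0])
assert abs(k1 - k2) < 1e-9
```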
$$
\begin{aligned}
&\left[\prod_{i=1}^I\text{Norm}_{x_i}\left[\mu,\sigma^2\right]\right]\cdot\text{NormInvGam}_{\mu,\sigma^2}[\alpha,\beta,\gamma,\delta] \\
=&\left[\prod_{i=1}^I\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-0.5\frac{(x_i-\mu)^2}{\sigma^2}\right]\right]\cdot\frac{\sqrt{\gamma}}{\sigma\sqrt{2\pi}}\frac{\beta^{\alpha}}{\Gamma[\alpha]}\left(\frac{1}{\sigma^2}\right)^{\alpha+1}\exp\left[-\frac{2\beta+\gamma(\delta-\mu)^2}{2\sigma^2}\right] \\
=&\kappa\cdot\frac{\sqrt{\widetilde\gamma}\,\widetilde\beta^{\widetilde\alpha}}{\Gamma[\widetilde\alpha]}\left(\frac{1}{\sigma}\right)^{2\alpha+2+I}\frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{\sum_{i=1}^I(x_i-\mu)^2+2\beta+\gamma(\delta-\mu)^2}{2\sigma^2}\right] \\
=&\kappa\cdot\frac{\sqrt{\widetilde\gamma}}{\sigma\sqrt{2\pi}}\frac{\widetilde\beta^{\widetilde\alpha}}{\Gamma[\widetilde\alpha]}\left(\frac{1}{\sigma^2}\right)^{\widetilde\alpha+1}\exp\left[-\frac{2\widetilde\beta+\widetilde\gamma(\widetilde\delta-\mu)^2}{2\sigma^2}\right] \\
=&\kappa\cdot\text{NormInvGam}_{\mu,\sigma^2}[\widetilde\alpha,\widetilde\beta,\widetilde\gamma,\widetilde\delta]
\end{aligned}
$$

Matching the power of $\sigma$ and completing the square in $\mu$ in the exponent gives

$$
\widetilde\alpha=\alpha+\frac{I}{2},\qquad
\widetilde\gamma=\gamma+I,\qquad
\widetilde\delta=\frac{\gamma\delta+\sum_i x_i}{\gamma+I},\qquad
\widetilde\beta=\frac{\sum_i x_i^2}{2}+\beta+\frac{\gamma\delta^2}{2}-\frac{(\gamma\delta+\sum_i x_i)^2}{2(\gamma+I)}
$$

Q.E.D.
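The posterior parameters follow from matching the power of $\sigma$ and completing the square in $\mu$: $\widetilde\alpha=\alpha+I/2$, $\widetilde\gamma=\gamma+I$, $\widetilde\delta=(\gamma\delta+\sum_i x_i)/(\gamma+I)$, and $\widetilde\beta=\sum_i x_i^2/2+\beta+\gamma\delta^2/2-(\gamma\delta+\sum_i x_i)^2/(2(\gamma+I))$. The sketch below implements this update under that assumption and checks that the log of likelihood times prior over posterior is the same at several $(\mu,\sigma^2)$ points. Function names are illustrative.

```python
import math


def nig_log_pdf(mu, var, alpha, beta, gamma, delta):
    """Log of the normal-scaled inverse gamma density at (mu, var)."""
    return (0.5 * math.log(gamma) - 0.5 * math.log(2 * math.pi * var)
            + alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1) * math.log(var)
            - (2 * beta + gamma * (delta - mu) ** 2) / (2 * var))


def nig_update(xs, alpha, beta, gamma, delta):
    """Posterior (alpha, beta, gamma, delta) after observing the data xs."""
    n, s1, s2 = len(xs), sum(xs), sum(x * x for x in xs)
    gamma_post = gamma + n
    delta_post = (gamma * delta + s1) / gamma_post
    beta_post = (s2 / 2 + beta + gamma * delta ** 2 / 2
                 - (gamma * delta + s1) ** 2 / (2 * gamma_post))
    return alpha + n / 2, beta_post, gamma_post, delta_post


def log_kappa(mu, var, xs, prior):
    """log(likelihood * prior / posterior); should not depend on (mu, var)."""
    log_lik = sum(-0.5 * math.log(2 * math.pi * var) - 0.5 * (x - mu) ** 2 / var
                  for x in xs)
    post = nig_update(xs, *prior)
    return log_lik + nig_log_pdf(mu, var, *prior) - nig_log_pdf(mu, var, *post)


xs = [0.2, 1.1, -0.4]
prior = (2.0, 1.0, 0.5, 0.3)  # alpha, beta, gamma, delta (example values)
values = [log_kappa(mu, var, xs, prior) for mu, var in [(0.0, 1.0), (0.5, 2.0), (-1.0, 0.3)]]
assert max(values) - min(values) < 1e-9
```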
To be determined.
$$
\begin{aligned}
&\left[\prod_{i=1}^I\text{Norm}_{\bm x_i}\left[\bm\mu,\bm\Sigma\right]\right]\cdot\text{NorIWis}_{\bm\mu,\bm\Sigma}[\alpha,\bm\Psi,\gamma,\bm\delta] \\
=&\left[\prod_{i=1}^I\frac{1}{(2\pi)^{D/2}|\bm\Sigma|^{1/2}}\exp\left[-0.5(\bm x_i-\bm\mu)^{\text T}\bm\Sigma^{-1}(\bm x_i-\bm\mu)\right]\right]\cdot\frac{\gamma^{D/2}|\bm\Psi|^{\alpha/2}|\bm\Sigma|^{-(\alpha+D+2)/2}}{2^{\alpha D/2}(2\pi)^{D/2}\Gamma_D[\alpha/2]}\exp\left[-0.5\left(\text{Tr}[\bm\Psi\bm\Sigma^{-1}]+\gamma(\bm\mu-\bm\delta)^{\text T}\bm\Sigma^{-1}(\bm\mu-\bm\delta)\right)\right] \\
=&\kappa\cdot\frac{\widetilde{\gamma}^{D/2}|\widetilde{\bm\Psi}|^{\widetilde{\alpha}/2}|\bm\Sigma|^{-(\alpha+D+2+I)/2}}{2^{D(\alpha+I)/2}(2\pi)^{D/2}\Gamma_D[\widetilde{\alpha}/2]}\exp\left[-0.5\left(\sum_{i=1}^I(\bm x_i-\bm\mu)^{\text T}\bm\Sigma^{-1}(\bm x_i-\bm\mu)+\text{Tr}[\bm\Psi\bm\Sigma^{-1}]+\gamma(\bm\mu-\bm\delta)^{\text T}\bm\Sigma^{-1}(\bm\mu-\bm\delta)\right)\right] \\
=&\kappa\cdot\frac{\widetilde{\gamma}^{D/2}|\widetilde{\bm\Psi}|^{\widetilde{\alpha}/2}|\bm\Sigma|^{-(\widetilde\alpha+D+2)/2}}{2^{D\widetilde\alpha/2}(2\pi)^{D/2}\Gamma_D[\widetilde{\alpha}/2]}\exp\left[-0.5\left(\text{Tr}[\widetilde{\bm\Psi}\bm\Sigma^{-1}]+\widetilde\gamma(\bm\mu-\widetilde{\bm\delta})^{\text T}\bm\Sigma^{-1}(\bm\mu-\widetilde{\bm\delta})\right)\right] \\
=&\kappa\cdot\text{NorIWis}_{\bm\mu,\bm\Sigma}[\widetilde\alpha,\widetilde{\bm\Psi},\widetilde\gamma,\widetilde{\bm\delta}]
\end{aligned}
$$

where, writing each quadratic form as a trace and completing the square in $\bm\mu$,

$$
\widetilde\alpha=\alpha+I,\qquad
\widetilde\gamma=\gamma+I,\qquad
\widetilde{\bm\delta}=\frac{\gamma\bm\delta+\sum_i\bm x_i}{\gamma+I},\qquad
\widetilde{\bm\Psi}=\bm\Psi+\gamma\bm\delta\bm\delta^{\text T}+\sum_i\bm x_i\bm x_i^{\text T}-\frac{(\gamma\bm\delta+\sum_i\bm x_i)(\gamma\bm\delta+\sum_i\bm x_i)^{\text T}}{\gamma+I}
$$