$$
\begin{aligned}
q &: \text{output image} \\
p &: \text{input image} \\
I &: \text{guidance image} \\
a_k, b_k &: \text{linear transform model} \\
r &: \text{filter radius} \\
s &: \text{downsampling factor}
\end{aligned}
$$
The paper assumes that the filter output is a linear transform of the guidance image:
$$
q_i = a_k I_i + b_k, \quad \forall i \in \omega_k
$$
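This is the final filtering formula, but $a_k, b_k$ are still unknown and must be solved for. Note also that, so far, $a_k, b_k$ are assumed to be constant within $\omega_k$.
Compared with an ordinary filter such as a Gaussian filter, the biggest difference in this formula is the extra bias term $b_k$.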
To solve for $a_k, b_k$ we need an objective function. As the formula below shows, it is built so that the output $q_i$ stays close to the input $p_i$, with an $\epsilon$ term that penalizes large $a_k$.
We therefore minimize the following objective function:
$$
E(a_k, b_k) = \sum_{i \in \omega_k} \left( (a_k I_i + b_k - p_i)^2 + \epsilon a_k^2 \right)
$$
Note that $q_i$ has already been replaced by $a_k I_i + b_k$ in the expression above.
Solving for $a_k, b_k$ from the objective function
Take the partial derivatives of $E(a_k, b_k)$ with respect to $a_k$ and $b_k$ and set them to zero:
$$
\frac{\partial E}{\partial a_k} = \sum_{i} \left( 2(a_k I_i + b_k - p_i) I_i + 2 \epsilon a_k \right) = 0 \\[4ex]
\frac{\partial E}{\partial b_k} = \sum_{i} 2(a_k I_i + b_k - p_i) = 0
$$
To keep the notation tidy, $i \in \omega_k$ under the summation sign is abbreviated to $i$.
The sums run over $i$ only, so any quantity whose subscript is not $i$ can be moved outside the summation. Dropping the common factor 2 as well, the two equations become:
$$
a_k \sum_{i} I_i I_i + b_k \sum_{i} I_i - \sum_{i} p_i I_i + |\omega| \epsilon a_k = 0 \\[4ex]
a_k \sum_{i} I_i + |\omega| b_k - \sum_{i} p_i = 0
$$
The second equation gives:
$$
b_k = \frac{1}{|\omega|} \sum_{i} p_i - a_k \frac{1}{|\omega|} \sum_{i} I_i \tag{1}
$$
Here $|\omega|$ is the number of pixels in the region $\omega_k$, so $\frac{1}{|\omega|} \sum_{i}$ simply takes the mean.
Substituting $b_k$ into the first equation and simplifying gives the expression for $a_k$:
$$
a_k \sum_{i} I_i I_i + \left( \frac{1}{|\omega|} \sum_{i} p_i - a_k \frac{1}{|\omega|} \sum_{i} I_i \right) \sum_{i} I_i - \sum_{i} p_i I_i + |\omega| \epsilon a_k = 0 \\[4ex]
a_k \left( \sum_{i} I_i I_i - \frac{1}{|\omega|} \sum_{i} I_i \sum_{i} I_i + |\omega| \epsilon \right) + \frac{1}{|\omega|} \sum_{i} p_i \sum_{i} I_i - \sum_{i} p_i I_i = 0 \\[4ex]
\text{Dropping the index } i \text{ under the sums for readability:} \\[4ex]
a_k = \frac{\sum p_i I_i - \frac{1}{|\omega|} \sum p_i \sum I_i}{\sum I_i I_i - \frac{1}{|\omega|} \sum I_i \sum I_i + |\omega| \epsilon} \\[4ex]
\text{Dividing numerator and denominator by } |\omega| \text{:} \\[4ex]
a_k = \frac{\frac{1}{|\omega|} \sum p_i I_i - \frac{1}{|\omega|} \sum p_i \cdot \frac{1}{|\omega|} \sum I_i}{\frac{1}{|\omega|} \sum I_i I_i - \frac{1}{|\omega|} \sum I_i \cdot \frac{1}{|\omega|} \sum I_i + \epsilon} \tag{2}
$$
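As a sanity check on the derivation, the sketch below computes $a_k, b_k$ for a single window in two ways: with the closed forms (1) and (2), and by solving the two zero-gradient equations directly as a 2x2 linear system. The function names `window_coeffs` and `window_coeffs_normal_eq` are illustrative and do not come from the paper.

```python
import numpy as np

def window_coeffs(I_win, p_win, eps):
    """Closed-form a_k, b_k for one window, following formulas (1) and (2)."""
    I_win = np.asarray(I_win, dtype=np.float64).ravel()
    p_win = np.asarray(p_win, dtype=np.float64).ravel()
    mean_I = I_win.mean()
    mean_p = p_win.mean()
    mean_Ip = (I_win * p_win).mean()
    mean_II = (I_win * I_win).mean()
    a = (mean_Ip - mean_p * mean_I) / (mean_II - mean_I * mean_I + eps)
    b = mean_p - a * mean_I
    return a, b

def window_coeffs_normal_eq(I_win, p_win, eps):
    """Same coefficients, obtained by solving the two zero-gradient equations."""
    I_win = np.asarray(I_win, dtype=np.float64).ravel()
    p_win = np.asarray(p_win, dtype=np.float64).ravel()
    n = I_win.size  # |omega|
    # (sum(I*I) + n*eps) * a + sum(I) * b = sum(p*I)
    #  sum(I)            * a + n      * b = sum(p)
    A = np.array([[np.sum(I_win * I_win) + n * eps, np.sum(I_win)],
                  [np.sum(I_win), float(n)]])
    rhs = np.array([np.sum(p_win * I_win), np.sum(p_win)])
    a, b = np.linalg.solve(A, rhs)
    return a, b
```

Both routes should agree up to floating-point error for any window with nonzero variance or nonzero `eps`.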
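This is the form that most directly matches the algorithm flow in the original paper, so there is no need to simplify it further into the paper's notation.
Finally, note that although $a_k, b_k$ were assumed constant within $\omega_k$, the values computed from formulas (1) and (2) are not constant in practice: a pixel $i$ lies in several windows $\omega_k$, and each window yields its own $a_k, b_k$. This variation has to be taken into account, and the usual approach is to average:
$$
q_i = \frac{1}{|\omega|} \sum_{k \in \omega_i} a_k I_i + \frac{1}{|\omega|} \sum_{k \in \omega_i} b_k \tag{3}
$$
Formulas (1), (2), and (3) together make up the complete solution of the standard guided filter.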
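Formulas (1), (2), and (3) can be sketched in NumPy roughly as follows. This is a minimal sketch, not the paper's reference implementation: it assumes single-channel floating-point images of equal shape, and the helper `box_mean` replicates edge pixels at the border, which is a simplification chosen here for brevity.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1)x(2r+1) window; borders use edge replication (an assumption)."""
    img = np.asarray(img, dtype=np.float64)
    k = 2 * r + 1
    h, w = img.shape
    padded = np.pad(img, r, mode="edge")
    # Integral image with a leading row and column of zeros.
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    total = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
             - integral[k:k + h, :w] + integral[:h, :w])
    return total / (k * k)

def guided_filter(I, p, r, eps):
    """Single-channel guided filter following formulas (1), (2), (3)."""
    I = np.asarray(I, dtype=np.float64)
    p = np.asarray(p, dtype=np.float64)
    mean_I = box_mean(I, r)
    mean_p = box_mean(p, r)
    mean_Ip = box_mean(I * p, r)
    mean_II = box_mean(I * I, r)
    a = (mean_Ip - mean_I * mean_p) / (mean_II - mean_I * mean_I + eps)  # (2)
    b = mean_p - a * mean_I                                              # (1)
    return box_mean(a, r) * I + box_mean(b, r)                           # (3)
```

A call such as `guided_filter(I, p, r=4, eps=1e-2)` returns an array with the same shape as `I`; the values of `r` and `eps` here are only placeholders.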
Explaining the paper's formulas for $a_k, b_k$
Let $\mu_k = \frac{1}{|\omega|} \sum I_i$ and $p_k = \frac{1}{|\omega|} \sum p_i$.
Then $a_k, b_k$ can be written as:
$$
a_k = \frac{\frac{1}{|\omega|} \sum p_i I_i - p_k \mu_k}{\frac{1}{|\omega|} \sum I_i^2 - \mu_k^2 + \epsilon} \\[4ex]
b_k = p_k - a_k \mu_k
$$
Now consider the variance formula:
$$
\begin{aligned}
\sigma_k^2 &= \frac{1}{|\omega|} \sum_{i} (I_i - \mu_k)^2 \\[4ex]
&= \frac{1}{|\omega|} \sum_{i} (I_i^2 - 2 I_i \mu_k + \mu_k^2) \\[4ex]
&= \frac{1}{|\omega|} \left( \sum_{i} I_i^2 - 2 \mu_k \sum_{i} I_i + \sum_{i} \mu_k^2 \right) \\[4ex]
& \left( \text{note: } \sum_{i} I_i = |\omega| \mu_k, \quad \sum_{i} \mu_k^2 = |\omega| \mu_k^2 \right) \\[4ex]
&= \frac{1}{|\omega|} \left( \sum_{i} I_i^2 - 2 \mu_k |\omega| \mu_k + |\omega| \mu_k^2 \right) \\[4ex]
&= \frac{1}{|\omega|} \left( \sum_{i} I_i^2 - |\omega| \mu_k^2 \right) \\[4ex]
&= \frac{1}{|\omega|} \sum_{i} I_i^2 - \mu_k^2
\end{aligned}
$$
The covariance formula can be written out in the same way; it is used below when deriving the multi-channel formulas. Here the subscripts 0 and 1 denote two channels of the guidance image:
$$
\begin{aligned}
\sigma_k^{01} &= \frac{1}{|\omega|} \sum_{i} (I_{i0} - \mu_{k0}) (I_{i1} - \mu_{k1}) \\[4ex]
&= \frac{1}{|\omega|} \sum_{i} (I_{i0} I_{i1} - I_{i0} \mu_{k1} - I_{i1} \mu_{k0} + \mu_{k0} \mu_{k1}) \\[4ex]
&= \frac{1}{|\omega|} \left( \sum_{i} I_{i0} I_{i1} - \mu_{k1} \sum_{i} I_{i0} - \mu_{k0} \sum_{i} I_{i1} + \sum_{i} \mu_{k0} \mu_{k1} \right) \\[4ex]
&= \frac{1}{|\omega|} \left( \sum_{i} I_{i0} I_{i1} - |\omega| \mu_{k1} \mu_{k0} - |\omega| \mu_{k0} \mu_{k1} + |\omega| \mu_{k0} \mu_{k1} \right) \\[4ex]
&= \frac{1}{|\omega|} \sum_{i} I_{i0} I_{i1} - \mu_{k1} \mu_{k0}
\end{aligned}
$$
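The two identities can be checked numerically. The snippet below is only an illustration on random data; the arrays `I0` and `I1` stand in for two guidance channels inside one window and are not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
I0 = rng.random(25)  # hypothetical channel 0 values in a 5x5 window
I1 = rng.random(25)  # hypothetical channel 1 values in the same window

# Variance: definition versus the "mean of squares minus squared mean" shortcut.
var_direct = np.mean((I0 - I0.mean()) ** 2)
var_short = np.mean(I0 * I0) - I0.mean() ** 2

# Covariance: definition versus the "mean of products minus product of means" shortcut.
cov_direct = np.mean((I0 - I0.mean()) * (I1 - I1.mean()))
cov_short = np.mean(I0 * I1) - I0.mean() * I1.mean()
```

The shortcut forms matter in practice because each term is a plain window mean, which a box filter can compute for every window at once.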
So $a_k$ can be further written in the form used in the paper:
$$
a_k = \frac{\frac{1}{|\omega|} \sum p_i I_i - p_k \mu_k}{\sigma_k^2 + \epsilon}
$$
For a multi-channel input image, simply filter each channel of the input separately with the guidance image.
When the guidance image has multiple channels, all of its channels jointly affect a single channel of the input image. Following the paper's formula, this can be written as:
$$
q_i = \mathbf{a}_k^T \mathbf{I}_i + b_k, \quad \forall i \in \omega_k
$$
Expanded, this is:
$$
q_i = a_{k0} I_{i0} + a_{k1} I_{i1} + a_{k2} I_{i2} + b_k, \quad \forall i \in \omega_k
$$
The objective function then becomes:
$$
E(a_k, b_k) = \sum_{i \in \omega_k} \left( (a_{k0} I_{i0} + a_{k1} I_{i1} + a_{k2} I_{i2} + b_k - p_i)^2 + \epsilon (a_{k0}^2 + a_{k1}^2 + a_{k2}^2) \right)
$$
(To keep the derivation tidy, the index $i$ under the summation sign is omitted below; keep in mind that every sum runs over $i$.)
We now need $a_{k0}, a_{k1}, a_{k2}, b_k$. Taking the partial derivative of $E(a_k, b_k)$ with respect to each of them and setting it to zero gives:
$$
\frac{\partial E}{\partial a_{k0}} = \sum \left( 2(a_{k0} I_{i0} + a_{k1} I_{i1} + a_{k2} I_{i2} + b_k - p_i) I_{i0} + 2 \epsilon a_{k0} \right) = 0 \\[4ex]
\frac{\partial E}{\partial a_{k1}} = \sum \left( 2(a_{k0} I_{i0} + a_{k1} I_{i1} + a_{k2} I_{i2} + b_k - p_i) I_{i1} + 2 \epsilon a_{k1} \right) = 0 \\[4ex]
\frac{\partial E}{\partial a_{k2}} = \sum \left( 2(a_{k0} I_{i0} + a_{k1} I_{i1} + a_{k2} I_{i2} + b_k - p_i) I_{i2} + 2 \epsilon a_{k2} \right) = 0 \\[4ex]
\frac{\partial E}{\partial b_k} = \sum 2(a_{k0} I_{i0} + a_{k1} I_{i1} + a_{k2} I_{i2} + b_k - p_i) = 0
$$
As in the single-channel derivation, we first obtain the expression for $b_k$:
$$
b_k = \frac{1}{|\omega|} \sum p_i - a_{k0} \frac{1}{|\omega|} \sum I_{i0} - a_{k1} \frac{1}{|\omega|} \sum I_{i1} - a_{k2} \frac{1}{|\omega|} \sum I_{i2} \\[4ex]
b_k = p_k - a_{k0} \mu_{k0} - a_{k1} \mu_{k1} - a_{k2} \mu_{k2}
$$
where $p_k = \frac{1}{|\omega|} \sum p_i$, $\mu_{k0} = \frac{1}{|\omega|} \sum I_{i0}$, $\mu_{k1} = \frac{1}{|\omega|} \sum I_{i1}$, and $\mu_{k2} = \frac{1}{|\omega|} \sum I_{i2}$.
Substituting the expression for $b_k$ into the first partial-derivative equation and rearranging:
$$
\sum (a_{k0} I_{i0}^2 + a_{k1} I_{i0} I_{i1} + a_{k2} I_{i0} I_{i2} + p_k I_{i0} - a_{k0} \mu_{k0} I_{i0} - a_{k1} \mu_{k1} I_{i0} - a_{k2} \mu_{k2} I_{i0} - p_i I_{i0} + \epsilon a_{k0}) = 0 \\[4ex]
a_{k0} \sum I_{i0}^2 + a_{k1} \sum I_{i0} I_{i1} + a_{k2} \sum I_{i0} I_{i2} + p_k \sum I_{i0} - a_{k0} \mu_{k0} \sum I_{i0} - a_{k1} \mu_{k1} \sum I_{i0} - a_{k2} \mu_{k2} \sum I_{i0} - \sum p_i I_{i0} + \sum \epsilon a_{k0} = 0 \\[4ex]
a_{k0} \sum I_{i0}^2 + a_{k1} \sum I_{i0} I_{i1} + a_{k2} \sum I_{i0} I_{i2} + |\omega| p_k \mu_{k0} - |\omega| a_{k0} \mu_{k0}^2 - |\omega| a_{k1} \mu_{k1} \mu_{k0} - |\omega| a_{k2} \mu_{k2} \mu_{k0} - \sum p_i I_{i0} + |\omega| \epsilon a_{k0} = 0 \\[4ex]
a_{k0} \frac{1}{|\omega|} \sum I_{i0}^2 + a_{k1} \frac{1}{|\omega|} \sum I_{i0} I_{i1} + a_{k2} \frac{1}{|\omega|} \sum I_{i0} I_{i2} + p_k \mu_{k0} - a_{k0} \mu_{k0}^2 - a_{k1} \mu_{k1} \mu_{k0} - a_{k2} \mu_{k2} \mu_{k0} - \frac{1}{|\omega|} \sum p_i I_{i0} + \epsilon a_{k0} = 0 \\[4ex]
a_{k0} \left( \frac{1}{|\omega|} \sum I_{i0}^2 - \mu_{k0}^2 + \epsilon \right) + a_{k1} \left( \frac{1}{|\omega|} \sum I_{i0} I_{i1} - \mu_{k0} \mu_{k1} \right) + a_{k2} \left( \frac{1}{|\omega|} \sum I_{i0} I_{i2} - \mu_{k0} \mu_{k2} \right) + p_k \mu_{k0} - \frac{1}{|\omega|} \sum p_i I_{i0} = 0 \\[4ex]
a_{k0} \left( \frac{1}{|\omega|} \sum I_{i0}^2 - \mu_{k0}^2 + \epsilon \right) + a_{k1} \left( \frac{1}{|\omega|} \sum I_{i0} I_{i1} - \mu_{k0} \mu_{k1} \right) + a_{k2} \left( \frac{1}{|\omega|} \sum I_{i0} I_{i2} - \mu_{k0} \mu_{k2} \right) = \frac{1}{|\omega|} \sum p_i I_{i0} - p_k \mu_{k0}
$$
Applying the same steps to the second and third partial-derivative equations, and listing the first one again, we get:
$$
a_{k0} \left( \frac{1}{|\omega|} \sum I_{i0}^2 - \mu_{k0}^2 + \epsilon \right) + a_{k1} \left( \frac{1}{|\omega|} \sum I_{i0} I_{i1} - \mu_{k0} \mu_{k1} \right) + a_{k2} \left( \frac{1}{|\omega|} \sum I_{i0} I_{i2} - \mu_{k0} \mu_{k2} \right) = \frac{1}{|\omega|} \sum p_i I_{i0} - p_k \mu_{k0} \\[4ex]
a_{k0} \left( \frac{1}{|\omega|} \sum I_{i0} I_{i1} - \mu_{k0} \mu_{k1} \right) + a_{k1} \left( \frac{1}{|\omega|} \sum I_{i1}^2 - \mu_{k1}^2 + \epsilon \right) + a_{k2} \left( \frac{1}{|\omega|} \sum I_{i1} I_{i2} - \mu_{k1} \mu_{k2} \right) = \frac{1}{|\omega|} \sum p_i I_{i1} - p_k \mu_{k1} \\[4ex]
a_{k0} \left( \frac{1}{|\omega|} \sum I_{i0} I_{i2} - \mu_{k0} \mu_{k2} \right) + a_{k1} \left( \frac{1}{|\omega|} \sum I_{i1} I_{i2} - \mu_{k1} \mu_{k2} \right) + a_{k2} \left( \frac{1}{|\omega|} \sum I_{i2}^2 - \mu_{k2}^2 + \epsilon \right) = \frac{1}{|\omega|} \sum p_i I_{i2} - p_k \mu_{k2}
$$
The coefficients of the $a_k$ terms above are all variances or covariances. Define:
$$
\sigma_{k}^{mn} = \sigma_{k}^{nm} = \frac{1}{|\omega|} \sum I_{im} I_{in} - \mu_{km} \mu_{kn}
$$
(The derivation of this identity is given earlier in the article.)
The linear system above can then be written as:
$$
\begin{bmatrix}
\sigma_{k}^{00} + \epsilon & \sigma_{k}^{01} & \sigma_{k}^{02} \\
\sigma_{k}^{10} & \sigma_{k}^{11} + \epsilon & \sigma_{k}^{12} \\
\sigma_{k}^{20} & \sigma_{k}^{21} & \sigma_{k}^{22} + \epsilon \\
\end{bmatrix}
\begin{bmatrix} a_{k0} \\ a_{k1} \\ a_{k2} \\ \end{bmatrix}
=
\begin{bmatrix}
\frac{1}{|\omega|} \sum p_i I_{i0} - p_k \mu_{k0} \\
\frac{1}{|\omega|} \sum p_i I_{i1} - p_k \mu_{k1} \\
\frac{1}{|\omega|} \sum p_i I_{i2} - p_k \mu_{k2} \\
\end{bmatrix}
$$
Solution:
$$
\begin{bmatrix} a_{k0} \\ a_{k1} \\ a_{k2} \\ \end{bmatrix}
=
\begin{bmatrix}
\sigma_{k}^{00} + \epsilon & \sigma_{k}^{01} & \sigma_{k}^{02} \\
\sigma_{k}^{10} & \sigma_{k}^{11} + \epsilon & \sigma_{k}^{12} \\
\sigma_{k}^{20} & \sigma_{k}^{21} & \sigma_{k}^{22} + \epsilon \\
\end{bmatrix}^{-1}
\begin{bmatrix} \sigma_{Ip0} \\ \sigma_{Ip1} \\ \sigma_{Ip2} \\ \end{bmatrix}
\tag{4}
$$
In formula (4), $\sigma_{Ip0}, \sigma_{Ip1}, \sigma_{Ip2}$ are the right-hand-side entries of the linear system above, that is, $\sigma_{Ipm} = \frac{1}{|\omega|} \sum p_i I_{im} - p_k \mu_{km}$ for $m = 0, 1, 2$.
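Formula (4) for a single window can be sketched with a generic linear solver in place of a hand-written inverse. This is an illustrative sketch only: the function name `window_coeffs_color` and the array layout (one row per pixel, one column per guidance channel) are assumptions made here, not something specified in the paper.

```python
import numpy as np

def window_coeffs_color(I_win, p_win, eps):
    """a_k (3 values) and b_k for one window with a 3-channel guide.

    I_win: array of shape (n, 3), guidance pixels in the window.
    p_win: array of shape (n,), input pixels in the window.
    """
    I_win = np.asarray(I_win, dtype=np.float64)
    p_win = np.asarray(p_win, dtype=np.float64)
    n = I_win.shape[0]                       # |omega|
    mu = I_win.mean(axis=0)                  # (mu_k0, mu_k1, mu_k2)
    p_mean = p_win.mean()                    # p_k
    # Sigma[m, n] = mean(I_m * I_n) - mu_m * mu_n
    Sigma = I_win.T @ I_win / n - np.outer(mu, mu)
    # cov_Ip[m] = mean(p * I_m) - p_k * mu_m
    cov_Ip = I_win.T @ p_win / n - p_mean * mu
    a = np.linalg.solve(Sigma + eps * np.eye(3), cov_Ip)   # formula (4)
    b = p_mean - a @ mu
    return a, b
```

Using `np.linalg.solve` avoids forming the inverse explicitly; a per-pixel implementation over a whole image would typically batch this or use a closed-form 3x3 inverse instead.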
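Because the inverse of a 3x3 matrix is easy to write out by hand, this can be simplified further by computing the inverse through the adjugate matrix:
$$
A^{-1} = \frac{\operatorname{adj}(A)}{\det(A)}
$$
Here $\operatorname{adj}(A)$ is the adjugate of $A$ and $\det(A)$ is the determinant of $A$.
The determinant of a 3x3 matrix is: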
$$
\det \begin{pmatrix}
a_{00} & a_{01} & a_{02} \\
a_{10} & a_{11} & a_{12} \\
a_{20} & a_{21} & a_{22} \\
\end{pmatrix}
=
\begin{vmatrix}
a_{00} & a_{01} & a_{02} \\
a_{10} & a_{11} & a_{12} \\
a_{20} & a_{21} & a_{22} \\
\end{vmatrix}
= a_{00} a_{11} a_{22} + a_{10} a_{21} a_{02} + a_{20} a_{01} a_{12} - a_{00} a_{12} a_{21} - a_{01} a_{10} a_{22} - a_{02} a_{11} a_{20}
$$
The adjugate of a 3x3 matrix is the transpose of its cofactor matrix:
$$
\operatorname{adj} \begin{pmatrix}
a_{00} & a_{01} & a_{02} \\
a_{10} & a_{11} & a_{12} \\
a_{20} & a_{21} & a_{22} \\
\end{pmatrix}
=
\begin{bmatrix}
+\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} &
-\begin{vmatrix} a_{01} & a_{02} \\ a_{21} & a_{22} \end{vmatrix} &
+\begin{vmatrix} a_{01} & a_{02} \\ a_{11} & a_{12} \end{vmatrix} \\[4ex]
-\begin{vmatrix} a_{10} & a_{12} \\ a_{20} & a_{22} \end{vmatrix} &
+\begin{vmatrix} a_{00} & a_{02} \\ a_{20} & a_{22} \end{vmatrix} &
-\begin{vmatrix} a_{00} & a_{02} \\ a_{10} & a_{12} \end{vmatrix} \\[4ex]
+\begin{vmatrix} a_{10} & a_{11} \\ a_{20} & a_{21} \end{vmatrix} &
-\begin{vmatrix} a_{00} & a_{01} \\ a_{20} & a_{21} \end{vmatrix} &
+\begin{vmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{vmatrix} \\
\end{bmatrix}
=
\begin{bmatrix}
a_{11} a_{22} - a_{12} a_{21} & a_{02} a_{21} - a_{01} a_{22} & a_{01} a_{12} - a_{02} a_{11} \\
a_{12} a_{20} - a_{10} a_{22} & a_{00} a_{22} - a_{02} a_{20} & a_{02} a_{10} - a_{00} a_{12} \\
a_{10} a_{21} - a_{11} a_{20} & a_{01} a_{20} - a_{00} a_{21} & a_{00} a_{11} - a_{10} a_{01}
\end{bmatrix}
$$
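The determinant and adjugate formulas translate directly into code. The sketch below, with the illustrative name `inv3x3_adjugate`, writes them out term by term so the result can be compared against a library inverse; it assumes the matrix is non-singular.

```python
import numpy as np

def inv3x3_adjugate(A):
    """Inverse of a 3x3 matrix as adj(A) / det(A); assumes det(A) != 0."""
    A = np.asarray(A, dtype=np.float64)
    (a00, a01, a02), (a10, a11, a12), (a20, a21, a22) = A
    det = (a00 * a11 * a22 + a10 * a21 * a02 + a20 * a01 * a12
           - a00 * a12 * a21 - a01 * a10 * a22 - a02 * a11 * a20)
    # Adjugate: transpose of the cofactor matrix, written out entry by entry.
    adj = np.array([
        [a11 * a22 - a12 * a21, a02 * a21 - a01 * a22, a01 * a12 - a02 * a11],
        [a12 * a20 - a10 * a22, a00 * a22 - a02 * a20, a02 * a10 - a00 * a12],
        [a10 * a21 - a11 * a20, a01 * a20 - a00 * a21, a00 * a11 - a10 * a01],
    ])
    return adj / det
```

In the guided filter the matrix being inverted is symmetric, so several adjugate entries coincide; the general form is kept here so the check also covers non-symmetric input.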
Now let us expand the matrix-inverse part of formula (4). Note: