Zhou Zhihua's *Machine Learning*: Reading Notes and Exercise Solutions (continuously updated)

Chapter 9

  • 9.1 Proof that the Minkowski distance satisfies the distance-metric properties when $p\geq 1$.

    Recall the definition of the Minkowski distance: $\text{dist}(\mathbf x_i,\mathbf x_j)=\left(\sum_{u=1}^n|x_{iu}-x_{ju}|^p\right)^{1/p}$. Non-negativity and symmetry follow directly from the definition, so the main task is to prove the triangle inequality.

    The proof is as follows.

    Minkowski's inequality states that, for $p\geq 1$,
    $$\left(\sum_{u=1}^n|x_{iu}+x_{ju}|^p\right)^{1/p}\leq \left(\sum_{u=1}^n|x_{iu}|^p\right)^{1/p}+\left(\sum_{u=1}^n|x_{ju}|^p\right)^{1/p}.$$
    Applying it with $x_{iu}-x_{ku}$ and $x_{ku}-x_{ju}$ in place of $x_{iu}$ and $x_{ju}$ gives
    $$\begin{aligned} \text{dist}(\mathbf x_i,\mathbf x_j)&=\left(\sum_{u=1}^n|x_{iu}-x_{ju}|^p\right)^{1/p}\\ &=\left(\sum_{u=1}^n|x_{iu}-x_{ku}+x_{ku}-x_{ju}|^p\right)^{1/p}\\ &\leq\left(\sum_{u=1}^n|x_{iu}-x_{ku}|^p\right)^{1/p}+\left(\sum_{u=1}^n|x_{ku}-x_{ju}|^p\right)^{1/p}\\ &=\text{dist}(\mathbf x_i,\mathbf x_k)+\text{dist}(\mathbf x_k,\mathbf x_j) \end{aligned}$$
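The triangle inequality can also be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `minkowski` and `triangle_holds`, the tolerance, and the random test vectors are illustrative assumptions, not part of the book. A numerical check is only a sanity test, not a substitute for the proof. The last lines show a small counterexample for $p=0.5$, which illustrates why the condition $p\geq 1$ matters.

```python
import random


def minkowski(x, y, p):
    """Minkowski distance between two equal-length sequences, for p > 0."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def triangle_holds(x, y, z, p, tol=1e-9):
    """Check dist(x, y) <= dist(x, z) + dist(z, y), up to a small tolerance."""
    return minkowski(x, y, p) <= minkowski(x, z, p) + minkowski(z, y, p) + tol


def random_vector(n):
    """Hypothetical helper: a random vector with entries in [-5, 5]."""
    return [random.uniform(-5.0, 5.0) for _ in range(n)]


random.seed(0)
for p in (1, 1.5, 2, 3):
    for _ in range(1000):
        x, y, z = random_vector(4), random_vector(4), random_vector(4)
        assert triangle_holds(x, y, z, p)

# For p < 1 the inequality can fail: here dist(x, y) = 4 but the detour costs 1 + 1.
x, y, z = (0.0, 0.0), (1.0, 1.0), (1.0, 0.0)
print(minkowski(x, y, 0.5), minkowski(x, z, 0.5) + minkowski(z, y, 0.5))
print(triangle_holds(x, y, z, 0.5))  # False
```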

    Note: a proof of Minkowski's inequality for the case $p=2$:

    $$\left(\sum_{u=1}^n|x_{iu}+x_{ju}|^2\right)^{1/2}\leq \left(\sum_{u=1}^n|x_{iu}|^2\right)^{1/2}+\left(\sum_{u=1}^n|x_{ju}|^2\right)^{1/2}$$
    Both sides are non-negative, so it suffices to compare their squares. Since $|x_{iu}+x_{ju}|^2\leq |x_{iu}|^2+|x_{ju}|^2+2|x_{iu}||x_{ju}|$, the squared left-hand side is at most $\left(\sum_{u=1}^n|x_{iu}|^2\right)+\left(\sum_{u=1}^n|x_{ju}|^2\right)+2\sum_{u=1}^n|x_{iu}||x_{ju}|$, so it is enough to show:

    $$\left(\sum_{u=1}^n|x_{iu}|^2\right)+\left(\sum_{u=1}^n|x_{ju}|^2\right)+2\sum_{u=1}^n|x_{iu}||x_{ju}|\leq \left(\sum_{u=1}^n|x_{iu}|^2\right)+\left(\sum_{u=1}^n|x_{ju}|^2\right)+2\sqrt{\sum_{u=1}^n|x_{iu}|^2}\sqrt{\sum_{u=1}^n|x_{ju}|^2}$$
    This holds by the Cauchy-Schwarz inequality, $\sum_{u=1}^n|x_{iu}||x_{ju}|\leq\sqrt{\sum_{u=1}^n|x_{iu}|^2}\sqrt{\sum_{u=1}^n|x_{ju}|^2}$.

Chapter 11

  • 11.8 $\mathbf x_{k+1}=\arg\min_{\mathbf x}\frac{L}{2}\|\mathbf x-\mathbf z\|_2^2+\lambda \|\mathbf x\|_1$

    Expanding the objective on the right-hand side by components gives $\frac{L}2\sum_i(x^i-z^i)^2+\lambda\sum_i|x^i|$ (where $x^i$ denotes the $i$-th component of $\mathbf x$). The sum is separable, so each $x^i$ can be minimized independently by setting the partial derivative with respect to $x^i$ to zero:

    When $x^i>0$: $(x^i-z^i)\cdot L+\lambda=0$, which gives $x^{i}=z^i-\frac{\lambda}{L}$. This solution is valid only under the assumption $x^i>0$, so it requires $z^i\gt \frac{\lambda }{L}$.
    Similarly, when $x^i\lt 0$: $(x^i-z^i)\cdot L-\lambda=0$, which gives $x^{i}=z^i+\frac{\lambda}{L}$. This solution is valid only under the assumption $x^i<0$, so it requires $z^i\lt -\frac{\lambda }{L}$.
    In the remaining case $|z^i|\leq\frac{\lambda}{L}$, neither stationary point is consistent with its sign assumption, so the convex objective attains its minimum at the non-differentiable point $x^i=0$.
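The case analysis above corresponds to what is commonly called soft thresholding: shift $z^i$ toward zero by $\lambda/L$, and return zero when $|z^i|\leq\lambda/L$. The following is a minimal sketch in plain Python; the names `soft_threshold`, `objective`, and `solve`, as well as the sample values of `lam`, `L`, and `z_vec`, are assumptions chosen for illustration. The brute-force grid search at the end is only a sanity check that the closed form is not beaten by any grid point.

```python
def soft_threshold(z, lam, L):
    """Minimizer of (L/2)*(x - z)**2 + lam*|x| over a single scalar x."""
    t = lam / L
    if z > t:       # case x > 0
        return z - t
    if z < -t:      # case x < 0
        return z + t
    return 0.0      # case |z| <= lam / L


def objective(x, z, lam, L):
    """Per-component objective (L/2)*(x - z)**2 + lam*|x|."""
    return 0.5 * L * (x - z) ** 2 + lam * abs(x)


def solve(z_vec, lam, L):
    """Apply the scalar solution to every component of z."""
    return [soft_threshold(z, lam, L) for z in z_vec]


lam, L = 1.0, 2.0            # illustrative values, so lam / L = 0.5
z_vec = [2.0, -2.0, 0.3]
x_vec = solve(z_vec, lam, L)
print(x_vec)  # [1.5, -1.5, 0.0]

# Sanity check: no point on a fine grid achieves a lower objective value.
grid = [k / 1000.0 for k in range(-5000, 5001)]
for z, x in zip(z_vec, x_vec):
    best = min(objective(g, z, lam, L) for g in grid)
    assert objective(x, z, lam, L) <= best + 1e-12
```

Because the objective is separable, the vector problem reduces to independent scalar problems, which is why `solve` is just a per-component map.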
