Introduction to Machine Learning (Zhang Zhihua): Applications of Positive Definite Kernels

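Preface

These are study notes for the course taught by the Peking University lecturer. The concepts are explained clearly and accessibly, which makes it much easier to grasp the fundamentals and, from there, the related techniques.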

basic concepts

If a kernel function is positive definite, then its Gram matrix on any finite set of points is P.S.D.
$\{x_1,\dots,x_n\} \subset X \Rightarrow K_0(x_i,x_j)=g(x_i)g(x_j)$
$\Rightarrow K_0=[g(x_1),\dots,g(x_n)]'\,[g(x_1),\dots,g(x_n)]$ (a column vector times a row vector, so $K_0$ is P.S.D.)
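The rank-one construction above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the choice $g=\sin$, the sample points, and the helper name `rank_one_gram` are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def rank_one_gram(g, xs):
    """Gram matrix of K_0(x, y) = g(x) g(y) on the points xs."""
    v = np.array([g(x) for x in xs], dtype=float)  # vector (g(x_1), ..., g(x_n))
    return np.outer(v, v)                          # K_0 = v v^T

xs = np.linspace(-2.0, 2.0, 5)      # arbitrary sample points (assumption)
K0 = rank_one_gram(np.sin, xs)      # g = sin is an arbitrary illustrative choice

# A symmetric matrix is P.S.D. when all eigenvalues are >= 0 (up to rounding).
eigvals = np.linalg.eigvalsh(K0)
is_psd = bool(eigvals.min() > -1e-10)
```

Since $c' K_0 c = (\sum_i c_i g(x_i))^2 \ge 0$ for any vector $c$, the check should succeed for any real-valued $g$.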

Thm:

Let $F$ be a probability measure on the half-line $\mathbb{R}_+$ such that $0<\int_0^{\infty} s\,dF(s)<\infty$
and $\ell(F,u)=\int_0^{\infty} \exp(-ts\phi)\,dF(s)$ is P.D. for all $t>0$;
Examples:
polynomial kernel.
RBF (Gaussian) kernel.
Two advantages:
1. low dimension $\to$ $\infty$ dimension
2. normalize.
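The two example kernels can be sketched as follows. This is an illustration under stated assumptions: the parameter names `degree`, `c`, and `gamma`, the random test data, and the reading of "normalize" as $K(x,x)=1$ for the RBF kernel are my assumptions, not taken from the lecture.

```python
import numpy as np

def polynomial_kernel(X, Y, degree=2, c=1.0):
    """Polynomial kernel K(x, y) = (x . y + c)^degree, evaluated pairwise."""
    return (X @ Y.T + c) ** degree

def rbf_kernel(X, Y, gamma=0.5):
    """RBF (Gaussian) kernel K(x, y) = exp(-gamma * ||x - y||^2), evaluated pairwise."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negatives from rounding

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))          # 6 points in R^3 (arbitrary test data)

K_poly = polynomial_kernel(X, X)
K_rbf = rbf_kernel(X, X)

# Both Gram matrices should be P.S.D. up to rounding error.
min_eig_poly = np.linalg.eigvalsh(K_poly).min()
min_eig_rbf = np.linalg.eigvalsh(K_rbf).min()
```

The diagonal of `K_rbf` is all ones, which is one way to read the "normalize" remark: every point has unit norm in the feature space of the RBF kernel.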

Levy distribution

$\left(\frac{B}{2\pi}\right)^{1/2}\exp\left(\sqrt{2B}\right) \;\Big|\; f(s)=\sqrt{\frac{t}{2\pi}}\,u^{-3/2}\exp\left(-\frac{t}{2u}\right)du$
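The density in the line above has the form of the standard Lévy density with scale $t$, whose Laplace transform has the known closed form $E[e^{-sU}]=\exp(-\sqrt{2ts})$. The sketch below checks this numerically, assuming NumPy; it uses the standard fact that $U=t/Z^2$ with $Z\sim N(0,1)$ is Lévy distributed. The function name, the grid settings, and the link to the exponential expression in the notes are my assumptions.

```python
import numpy as np

def levy_laplace_numeric(t, s, n=200001, z_max=12.0):
    """Approximate E[exp(-s U)] for U ~ Levy(scale t), using U = t / Z**2, Z ~ N(0, 1)."""
    z = np.linspace(1e-6, z_max, n)  # start just above 0 to avoid division by zero
    # Density of |Z| multiplied by exp(-s * t / z^2).
    y = np.sqrt(2.0 / np.pi) * np.exp(-0.5 * z**2 - s * t / z**2)
    # Trapezoid rule written out by hand.
    return float(np.sum((y[1:] + y[:-1]) * np.diff(z)) / 2.0)

t, s = 1.0, 0.5                      # arbitrary illustrative values
numeric = levy_laplace_numeric(t, s)
closed_form = float(np.exp(-np.sqrt(2.0 * t * s)))
```

Under these assumptions, mixing $\exp(-ts\phi)$ over a Lévy-distributed $s$ would give a kernel of the form $\exp(-c\sqrt{\phi})$, which is how a Laplacian-type kernel can arise from Gaussian-type ones.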
if $\phi(x)=K^{+1/2}$
$K^{\frac{1}{2}} \cdot K^{\frac{1}{2}}=K^T$
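A symmetric square root of a P.S.D. Gram matrix can be computed from its eigendecomposition. The sketch below assumes NumPy; the helper name `psd_sqrt` and the random test matrix are hypothetical. Row $i$ of $K^{1/2}$ then serves as a finite-dimensional feature vector for $x_i$.

```python
import numpy as np

def psd_sqrt(K):
    """Symmetric square root of a symmetric P.S.D. matrix via eigendecomposition."""
    w, V = np.linalg.eigh(K)         # K = V diag(w) V^T
    w = np.clip(w, 0.0, None)        # remove tiny negative eigenvalues from rounding
    return (V * np.sqrt(w)) @ V.T    # V diag(sqrt(w)) V^T

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 3))
K = A @ A.T                          # a P.S.D. Gram matrix (arbitrary test data)

K_half = psd_sqrt(K)
K_rebuilt = K_half @ K_half          # should reproduce K (K is symmetric, so K = K^T)
```

Because `K_half` is symmetric, `K_half @ K_half` equals `K_half @ K_half.T`, so entry $(i,j)$ of $K$ is the inner product of rows $i$ and $j$ of `K_half`.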

Thm

Let $K: X \times X \to \mathbb{R}$ be a P.D. kernel. Then there exist a Hilbert space $H$ and a map $\phi: X \to H$ such that
$\forall x,y \in X,\ K(x,y)=\langle\phi(x),\phi(y)\rangle$
Three kernels.
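For a concrete instance of the theorem, the homogeneous degree-2 polynomial kernel on $\mathbb{R}^2$ has an explicit three-dimensional feature map. The sketch below assumes NumPy; the specific kernel and test points are chosen for illustration and are not from the lecture.

```python
import numpy as np

def phi(x):
    """Explicit feature map for K(x, y) = (x . y)^2 on R^2."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2.0) * x1 * x2, x2**2])

def K(x, y):
    """Homogeneous polynomial kernel of degree 2."""
    return float(np.dot(x, y)) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

lhs = K(x, y)                          # kernel evaluated directly
rhs = float(np.dot(phi(x), phi(y)))    # inner product in the feature space H = R^3
```

Here $x \cdot y = 1$, so both sides equal 1. The kernel gives the inner product in $H$ without ever forming $\phi$ explicitly, which matters when $H$ is high- or infinite-dimensional.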
