Comparison of Several Algorithms (Draft)

| Algorithm | Model | Strategy (loss function) | Solver |
| --- | --- | --- | --- |
| Linear regression | $f(x)=W^T x + b$ | Least squares: $L(W,b)=(f(x)-y)^2$ | Gradient descent, Newton's method |
| LASSO regression | $f(x)=W^T x + b$ | Least squares with an $L_1$ penalty: $L(W,b)=(f(x)-y)^2+\lambda\Vert W\Vert_1$ | Coordinate descent |
| Ridge regression | $f(x)=W^T x + b$ | Least squares with an $L_2$ penalty: $L(W,b)=(f(x)-y)^2+\frac{1}{2}\lambda\Vert W\Vert^2$ | Gradient descent |
| Logistic regression | $f(x)=\frac{1}{1+e^{-(W^T x + b)}}$ | Cross-entropy loss: $-\ln p(y\mid x)=-\frac{1}{m}\sum_{i=1}^m\bigl(y\ln\hat y+(1-y)\ln(1-\hat y)\bigr)$, where $\hat y=\frac{1}{1+e^{-(W^T x + b)}}$ | Gradient descent, Newton's method |
| Perceptron | $f(x)=\operatorname{sign}(W^T x + b)$, where sign is the sign function | Make the distance of misclassified points to the current separating hyperplane as small as possible: $L(w,b)=-\sum_{x_i\in M} y_i(w\cdot x_i+b)$, where $M$ is the set of misclassified points | Stochastic gradient descent: update the parameters once for each misclassified sample found |
| K-nearest neighbors | $y=\arg\max_{c_j}\sum_{x_i\in N_K(x)} I(y_i=c_j)$ | – | – |
| Naive Bayes | $P(Y=c_k\mid X=x)=\frac{P(Y=c_k)\prod_j P(X^{(j)}=x^{(j)}\mid Y=c_k)}{\sum_k P(Y=c_k)\prod_j P(X^{(j)}=x^{(j)}\mid Y=c_k)}$, under the conditional independence assumption | Expected risk minimization, equivalent to posterior maximization: $y=f(x)=\arg\max_{c_k} P(Y=c_k)\prod_j P(X^{(j)}=x^{(j)}\mid Y=c_k)$ | Maximum likelihood estimation: $P(Y=c_k)=\frac{\sum_{i=1}^N I(y_i=c_k)}{N}$, $P(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^N I(x_i^{(j)}=a_{jl},\,y_i=c_k)}{\sum_{i=1}^N I(y_i=c_k)}$ |
| SVM (linearly separable) | $f(x)=\operatorname{sign}(W^T x + b)$ | Margin maximization / hinge loss: $\min \frac{1}{2}\Vert w\Vert^2$ subject to $y_i(W^T x_i+b)-1\ge 0$ | Lagrangian duality, SMO |
| SVM (approximately linearly separable) | $f(x)=\operatorname{sign}(W^T x + b)$ | Margin maximization / hinge loss: $\min \frac{1}{2}\Vert w\Vert^2+C\sum_{i=1}^N\xi_i$ subject to $y_i(w\cdot x_i+b)\ge 1-\xi_i$ | Lagrangian duality, SMO |
| SVM (nonlinear, with kernel) | $f(x)=\operatorname{sign}(W^T x + b)$ | Margin maximization / hinge loss, in dual form: $\min_{\alpha}\frac{1}{2}\sum_{i=1}^N\sum_{j=1}^N\alpha_i\alpha_j y_i y_j K(x_i, x_j)-\sum_{i=1}^N\alpha_i$ subject to $\sum_{i=1}^N\alpha_i y_i=0$ and $0\le\alpha_i\le C$ | Lagrangian duality, SMO |

Minimal Python sketches of a few of the solvers listed above follow.
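To make the model / strategy / solver columns concrete, here is a minimal sketch of linear regression trained by batch gradient descent on the mean squared loss. The learning rate and iteration count are illustrative assumptions, not values from the table.

```python
import numpy as np

def linear_regression_gd(X, y, lr=0.01, n_iters=1000):
    """Fit f(x) = W^T x + b by gradient descent on mean((f(x) - y)^2)."""
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        error = X @ W + b - y
        # Gradients of the mean squared loss w.r.t. W and b
        grad_W = (2.0 / n_samples) * (X.T @ error)
        grad_b = (2.0 / n_samples) * error.sum()
        W -= lr * grad_W
        b -= lr * grad_b
    return W, b
```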
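The coordinate descent solver listed for LASSO exploits the fact that, with all other coordinates fixed, the optimal $w_j$ has a closed form via soft-thresholding. A minimal sketch, assuming the objective $\frac{1}{2}\Vert y-Xw\Vert^2+\lambda\Vert w\Vert_1$ with the bias omitted for brevity; `lam` and `n_iters` are illustrative.

```python
import numpy as np

def soft_threshold(rho, lam):
    # S(rho, lam) = sign(rho) * max(|rho| - lam, 0)
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_cd(X, y, lam=0.1, n_iters=100):
    """Coordinate descent for (1/2)||y - Xw||^2 + lam * ||w||_1."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        for j in range(X.shape[1]):
            # Partial residual with feature j's current contribution removed
            r_j = y - X @ w + X[:, j] * w[j]
            rho_j = X[:, j] @ r_j
            z_j = X[:, j] @ X[:, j]
            # Closed-form single-coordinate minimizer
            w[j] = soft_threshold(rho_j, lam) / z_j
    return w
```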
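For logistic regression, the gradient of the averaged cross-entropy loss takes the compact form $\frac{1}{m}X^T(\hat y - y)$, which makes gradient descent straightforward. A minimal sketch, assuming labels in $\{0, 1\}$; hyperparameters are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_regression_gd(X, y, lr=0.1, n_iters=1000):
    """Minimize -1/m * sum(y ln y_hat + (1-y) ln(1-y_hat)) by gradient descent."""
    m, n = X.shape
    W = np.zeros(n)
    b = 0.0
    for _ in range(n_iters):
        y_hat = sigmoid(X @ W + b)
        # Gradient of the average cross-entropy loss
        W -= lr * (X.T @ (y_hat - y)) / m
        b -= lr * (y_hat - y).mean()
    return W, b
```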
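The perceptron row describes the classic mistake-driven update: a sample with $y_i(w\cdot x_i+b)\le 0$ is misclassified, and one stochastic gradient step moves the hyperplane toward it. A minimal sketch, assuming labels in $\{-1, +1\}$.

```python
import numpy as np

def perceptron(X, y, lr=1.0, max_epochs=100):
    """Update (w, b) once per misclassified sample, as in the table."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:  # misclassified point
                w += lr * yi * xi
                b += lr * yi
                mistakes += 1
        if mistakes == 0:  # converged: every point classified correctly
            break
    return w, b
```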
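The maximum likelihood estimates in the naive Bayes row are plain frequency counts over discrete features. A minimal sketch without smoothing (real implementations usually add Laplace smoothing to avoid zero probabilities); `naive_bayes_mle` and `predict` are hypothetical helper names.

```python
import numpy as np

def naive_bayes_mle(X, y):
    """Estimate P(Y=c_k) and P(X^(j)=v | Y=c_k) by counting."""
    classes = np.unique(y)
    prior = {c: np.mean(y == c) for c in classes}  # P(Y=c_k)
    cond = {}                                      # (j, value, class) -> prob
    for c in classes:
        Xc = X[y == c]
        for j in range(X.shape[1]):
            values, counts = np.unique(Xc[:, j], return_counts=True)
            for v, cnt in zip(values, counts):
                cond[(j, v, c)] = cnt / len(Xc)
    return prior, cond, classes

def predict(x, prior, cond, classes):
    """y = argmax_c P(Y=c) * prod_j P(X^(j)=x^(j) | Y=c)."""
    def score(c):
        p = prior[c]
        for j, v in enumerate(x):
            p *= cond.get((j, v, c), 0.0)  # unseen value -> probability 0
        return p
    return max(classes, key=score)
```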
