Derivation of the Logistic Regression (Logit Regression) Model

The Sigmoid Function

$$ y = \frac{1}{1 + e^{-z}} $$
The sigmoid function is chosen because, in a binary classification task, simply using the unit-step function causes trouble for the subsequent optimization: the step function is discontinuous and not differentiable at $z = 0$. The sigmoid function therefore serves as a smooth surrogate.
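As a minimal sketch of this point (my addition, not from the original post; the helper names `unit_step` and `sigmoid` are illustrative), the code below contrasts the unit-step function with its sigmoid surrogate:

```python
import numpy as np

def unit_step(z):
    """Unit-step function: jumps at z = 0, gradient is 0 elsewhere and undefined at 0."""
    return np.where(z > 0, 1.0, np.where(z < 0, 0.0, 0.5))

def sigmoid(z):
    """Smooth, everywhere-differentiable surrogate for the unit step."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-6, 6, 7)
print(unit_step(z))                    # jumps abruptly at z = 0
print(sigmoid(z))                      # changes smoothly, sigmoid(0) = 0.5
print(sigmoid(z) * (1 - sigmoid(z)))   # derivative sigma'(z) = sigma(z) * (1 - sigma(z)), defined everywhere
```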

Derivation of the Logistic Regression Model

The underlying principle is not repeated here. Following the watermelon book (西瓜书, p. 59), the log-likelihood of logistic regression is
$$
\begin{aligned}
l(w,b) &= \sum_{i=1}^{m} \ln p(y_i \mid x_i; w, b) \\
&= \sum_{i=1}^{m} \ln\!\left( y_i\, p_1(\hat{x}_i; \beta) + (1 - y_i)\, p_0(\hat{x}_i; \beta) \right) \\
&= \sum_{i=1}^{m} \ln\!\left( y_i \frac{e^{\beta^T \hat{x}_i}}{1 + e^{\beta^T \hat{x}_i}} + (1 - y_i) \frac{1}{1 + e^{\beta^T \hat{x}_i}} \right)
\end{aligned}
$$

where $\beta = (w; b)$, $\hat{x}_i = (x_i; 1)$, $p_1(\hat{x}_i;\beta) = p(y=1 \mid \hat{x}_i)$ and $p_0(\hat{x}_i;\beta) = p(y=0 \mid \hat{x}_i) = 1 - p_1(\hat{x}_i;\beta)$.
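As a quick numerical sanity check (my addition, not part of the book's derivation; the variable values are made up), the sketch below evaluates $p_1$ and $p_0$ for an augmented sample $\hat{x}_i$ and confirms that they sum to one and that the term $y_i p_1 + (1 - y_i) p_0$ picks out the probability of the observed label:

```python
import numpy as np

def p1(x_hat, beta):
    """p(y = 1 | x) = e^{beta^T x_hat} / (1 + e^{beta^T x_hat})."""
    z = beta @ x_hat
    return np.exp(z) / (1.0 + np.exp(z))

def p0(x_hat, beta):
    """p(y = 0 | x) = 1 / (1 + e^{beta^T x_hat}) = 1 - p1."""
    return 1.0 / (1.0 + np.exp(beta @ x_hat))

beta = np.array([0.8, -0.5, 0.3])     # beta = (w; b)
x_hat = np.array([1.2, 2.0, 1.0])     # x_hat = (x; 1), bias component appended

assert np.isclose(p1(x_hat, beta) + p0(x_hat, beta), 1.0)
for y in (0, 1):
    # y * p1 + (1 - y) * p0 is the probability assigned to the observed label y
    print(y, y * p1(x_hat, beta) + (1 - y) * p0(x_hat, beta))
```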
Consider the two cases $y_i = 0$ and $y_i = 1$ for each summand:

  • When $y_i = 1$, each term reduces to $\ln\frac{e^{\beta^T \hat{x}_i}}{1 + e^{\beta^T \hat{x}_i}}$, so
    $$l(w,b) = \sum_{i=1}^{m} \left( \beta^T \hat{x}_i - \ln\left(1 + e^{\beta^T \hat{x}_i}\right) \right)$$
  • When $y_i = 0$, each term reduces to $\ln\frac{1}{1 + e^{\beta^T \hat{x}_i}}$, so
    $$l(w,b) = \sum_{i=1}^{m} \left( -\ln\left(1 + e^{\beta^T \hat{x}_i}\right) \right)$$
    Combining the two cases (since $y_i \in \{0, 1\}$, the term $y_i \beta^T \hat{x}_i$ equals $\beta^T \hat{x}_i$ when $y_i = 1$ and vanishes when $y_i = 0$), $l(w,b)$ can be written in a single expression, checked numerically in the sketch after this list:
    $$l(w,b) = \sum_{i=1}^{m} \left( y_i \beta^T \hat{x}_i - \ln\left(1 + e^{\beta^T \hat{x}_i}\right) \right)$$
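As a final check (my own addition, using random toy data rather than anything from the book), the sketch below verifies numerically that the combined closed form $\sum_i \left( y_i \beta^T \hat{x}_i - \ln(1 + e^{\beta^T \hat{x}_i}) \right)$ agrees with summing $\ln p(y_i \mid x_i; w, b)$ term by term:

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 50, 3
X_hat = np.hstack([rng.normal(size=(m, d)), np.ones((m, 1))])  # append 1 so that x_hat = (x; 1)
beta = rng.normal(size=d + 1)                                   # beta = (w; b)
y = rng.integers(0, 2, size=m)                                  # labels in {0, 1}

z = X_hat @ beta
p1 = np.exp(z) / (1.0 + np.exp(z))   # p(y = 1 | x)
p0 = 1.0 - p1                        # p(y = 0 | x)

# Direct form: sum over i of ln(y_i * p1 + (1 - y_i) * p0)
ll_direct = np.sum(np.log(y * p1 + (1 - y) * p0))

# Combined closed form: sum over i of ( y_i * beta^T x_hat_i - ln(1 + e^{beta^T x_hat_i}) )
ll_closed = np.sum(y * z - np.log1p(np.exp(z)))

print(ll_direct, ll_closed)
assert np.isclose(ll_direct, ll_closed)
```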
