Implementing Logistic Regression in Python

References

https://zhuanlan.zhihu.com/p/27699336?utm_source=weibo&utm_medium=social
https://zhuanlan.zhihu.com/p/27188729
https://www.cnblogs.com/Finley/p/5325417.html
https://zhuanlan.zhihu.com/p/30659982

Concepts

Logistic regression maps an output on $(-\infty, \infty)$ into the range $[0, 1]$, which can be interpreted as the probability $P$ of some outcome.

The Logistic model:

$$P\{Y=1\} = \frac{1}{1+e^{-z}} = \frac{1}{1+e^{-(\beta_0+\beta_1 x_1+\dots+\beta_n x_n)}}$$

The Logit model is the inverse transform of the Logistic model:

$$\ln\frac{P\{Y=1\}}{P\{Y=0\}} = \beta_0+\beta_1 x_1+\dots+\beta_n x_n$$

where $\frac{P\{Y=1\}}{P\{Y=0\}} = \mathrm{Odds} = \frac{P}{1-P}$.
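The sigmoid/logit pair above can be checked numerically. A minimal sketch (the function names `sigmoid` and `logit` are my own, not from the original post):

```python
import numpy as np

def sigmoid(z):
    """Map any real z into (0, 1): P{Y=1} = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    """Inverse of the sigmoid: ln(p / (1 - p)), i.e. the log-odds."""
    return np.log(p / (1.0 - p))

z = np.array([-5.0, 0.0, 5.0])
p = sigmoid(z)
print(p[1])                      # sigmoid(0) = 0.5
print(np.allclose(logit(p), z))  # logit recovers the original z: True
```

Note that `logit` blows up at $p = 0$ or $p = 1$, which is exactly why the model works on the log-odds scale rather than on probabilities directly.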

Code

# encoding: utf-8
import pylab as pl
import pandas as pd
import numpy as np
import statsmodels.api as sm

# Read the data
df0 = pd.read_excel('')

# Preprocessing
df0['intercept'] = 1.0  # sm.Logit() does not add a constant term automatically; add one by hand

# Dummy-code (one-hot encode) the categorical variable x1_rank with pandas get_dummies()
x1_ranks = pd.get_dummies(df0['x1_rank'], prefix='x1_rank')
# print(x1_ranks)
# Drop one level as the reference (base) category
cols_to_keep = ['y', 'x2', 'intercept']
df1 = df0[cols_to_keep].join(x1_ranks.loc[:, :'x1_rank_3'])  # .ix was removed from pandas; use .loc
# print(df1)

# Fit the model
x_columns = df1.columns[1:]  # predictor column names: everything in df1 except y
logit = sm.Logit(df1['y'], df1[x_columns])  # instantiate the model
result = logit.fit()  # fit
print(result.summary())  # print the fitted summary
