Logistic Regression

Regression

Linear Regression

When both x and y are continuous variables, use a regression model.

y = β0 + β1x + ε

x: explanatory/predictor

y: response

ε: error term; it is usually dropped because the model focuses on the average outcome

Coefficients: least squares estimation — find the line that minimizes the sum of the squared residuals (SSE).

S(β0, β1) = Σ [yi - (β0 + β1xi)]^2. Take the partial derivatives with respect to β0 and β1, set them to 0, and solve the two equations to obtain the coefficients.
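The normal equations above have a well-known closed-form solution: b1 is the covariance of x and y divided by the variance of x, and b0 follows from the means. A minimal sketch in pure Python, using hypothetical toy data:

```python
# Closed-form least squares for y = b0 + b1*x.
# Toy data (hypothetical): y is roughly 2 + 3x plus noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.1, 7.9, 11.2, 13.8, 17.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Setting dS/dβ0 = dS/dβ1 = 0 yields the normal equations, whose solution is:
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
     / sum((x - x_bar) ** 2 for x in xs)
b0 = y_bar - b1 * x_bar

# SSE: the quantity the fitted line minimizes
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
print(b0, b1, sse)
```

Any other choice of (b0, b1) on this data produces a larger SSE, which is exactly what "least squares" means.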

Why Least Square Estimation?

1. It is the most commonly used method.

2. Computing the least squares line is widely supported in statistical software.

3. In many applications, a residual twice as large as another residual is more than twice as bad. For example, being off by 4 is usually more than twice as bad as being off by 2. Squaring the residuals accounts for this discrepancy.

Logistic Regression

Predictor variables: continuous or categorical; response: a two-level categorical variable.

What can logistic regression do?

It is a supervised learning method used for classification: train a model on historical data, then use it to predict which class new data belongs to. E.g., the probability that a new event belongs to class A versus not belonging to A.
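The "train on historical data, then classify new events" workflow can be sketched with a minimal logistic regression fitted by stochastic gradient descent on the log-loss. The data, learning rate, and epoch count below are illustrative assumptions, not part of the original notes:

```python
import math

def sigmoid(z):
    """Map the linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(b0 + b1*x) by stochastic gradient descent."""
    b0 = b1 = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            # (y - p) is the gradient of the log-likelihood for one point
            b0 += lr * (y - p)
            b1 += lr * (y - p) * x
    return b0, b1

# Hypothetical historical data: larger x makes class A (y = 1) more likely
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
b0, b1 = fit(xs, ys)

# Predict the probability that a new event (x = 2.8) belongs to class A
p_new = sigmoid(b0 + b1 * 2.8)
```

In practice a library implementation (e.g. scikit-learn's `LogisticRegression` or statsmodels' `Logit`) would be used instead of hand-rolled gradient descent; this sketch only makes the train-then-predict loop concrete.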

How it works

[Figure 1: the logistic regression model]
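The original figure is not reproduced here; the relationship it illustrates is the logit link between the success probability p_i and the linear combination of predictors (written here for k predictors, matching the logit(pi) notation used in the conditions below):

```latex
\operatorname{logit}(p_i) = \ln\frac{p_i}{1 - p_i}
  = \beta_0 + \beta_1 x_{1,i} + \cdots + \beta_k x_{k,i},
\qquad
p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_{1,i} + \cdots + \beta_k x_{k,i})}}
```

The second equation is just the first one solved for p_i: the sigmoid squeezes the linear predictor into (0, 1) so it can be read as a probability.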


Conditions

1. Each outcome Yi is independent of the other outcomes.

2. Each predictor xi is linearly related to logit(pi) if all other predictors are held constant.

Building the logistic model with many variables

Prefer the model with the lowest Akaike information criterion (AIC), typically found through a backward elimination strategy: start with all predictors and repeatedly drop the one whose removal lowers AIC the most, stopping when no removal helps.

In logistic regression, p-values are calculated using the normal distribution rather than the t-distribution.
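The backward elimination loop can be sketched independently of any particular fitting library. Here `fit_model` is a hypothetical callback that fits a logistic regression on the given predictor names and returns its AIC (2k − 2·log-likelihood); the loop itself only compares AIC values:

```python
def backward_eliminate(predictors, fit_model):
    """Drop predictors one at a time while doing so lowers the AIC.

    fit_model(list_of_predictor_names) -> AIC of the fitted model
    (hypothetical callback; plug in statsmodels, R, etc. in practice).
    """
    current = list(predictors)
    best_aic = fit_model(current)
    while len(current) > 1:
        # Try dropping each remaining predictor; keep the best single drop.
        trials = [(fit_model([q for q in current if q != p]), p)
                  for p in current]
        aic, drop = min(trials)
        if aic >= best_aic:
            break  # no removal improves AIC; stop
        best_aic = aic
        current.remove(drop)
    return current, best_aic
```

Because AIC penalizes each extra parameter by 2, removing a predictor is kept only when the fit worsens by less than that penalty — which is why the procedure tends to discard weak predictors and retain strong ones.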
