[Python] Drawing R-Style Linear Regression Diagnostic Plots

[References]

1. How to draw the same linear regression diagnostic plots in Python as in R?

2. 6 ways to run a "simple" regression (using 6 tools)

(1) Original article: https://underthecurve.github.io/jekyll/update/2016/07/01/one-regression-six-ways.html#Python

(2) Script: https://github.com/OpenNewsLabs/one-regression-six-ways/blob/master/Python/statsmodels_method.py

3. What the 4 plots mean

 

[Supplement]

import os
import math
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
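
The snippets that follow use a fitted model `lm` and a response `y` that the post never defines. A minimal sketch of one way to obtain them is shown here. The data are synthetic and the column names `x` and `y` are assumptions made purely for illustration; they are not the data from the referenced article.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data, for illustration only (not the data used in the referenced article)
rng = np.random.default_rng(42)
df = pd.DataFrame({'x': rng.uniform(0, 10, size=100)})
df['y'] = 2.0 + 0.5 * df['x'] + rng.normal(scale=1.0, size=100)

# Fit ordinary least squares; `lm` is the fitted results object
lm = smf.ols('y ~ x', data=df).fit()
y = df['y']  # observed response values

print(lm.params)
```

With `lm` and `y` defined this way, the `results` DataFrame can be built as written, substituting your own data and formula.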

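# Assumes `y` holds the observed response and `lm` is a fitted statsmodels OLS results object.
results = pd.DataFrame({'index': y, # observed y values
                        'resids': lm.resid, # raw residuals
                        'std_resids': lm.resid_pearson, # Pearson residuals: residuals divided by the residual standard error
                        'fitted': lm.predict() # fitted (predicted) y values
                       })
# Note: R's plot.lm uses leverage-adjusted standardized residuals; lm.get_influence().resid_studentized_internal is the closer match.
print(results.head())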

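# 1. Show each plot separately
## raw residuals vs. fitted
# Residuals vs. fitted: the x-axis is the fitted values, the y-axis is the residuals.
residsvfitted = plt.plot(results['fitted'], results['resids'],  'o')
l = plt.axhline(y = 0, color = 'grey', linestyle = 'dashed') # draw the horizontal line y = 0
plt.xlabel('Fitted values')
plt.ylabel('Residuals')
plt.title('Residuals vs Fitted')
plt.show() # plt.show() takes no plot object; it displays the current figure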
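
R's `plot(lm)` overlays a smoothed trend line on the residuals-vs-fitted panel, which the plot above omits. A minimal sketch of adding one with the `lowess` smoother from statsmodels follows. The `fitted` and `resids` arrays are hypothetical stand-ins for `results['fitted']` and `results['resids']`, and `frac=2/3` is an assumed smoothing span, not a value taken from the post.

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.nonparametric.smoothers_lowess import lowess

# Hypothetical stand-ins for results['fitted'] and results['resids']
rng = np.random.default_rng(0)
fitted = np.linspace(0.0, 10.0, 50)
resids = rng.normal(scale=1.0, size=50)

# lowess(endog, exog) returns rows sorted by exog:
# column 0 is the x value, column 1 is the smoothed y value
smooth = lowess(resids, fitted, frac=2/3)

fig, ax = plt.subplots()
ax.plot(fitted, resids, 'o')
ax.plot(smooth[:, 0], smooth[:, 1], color='red')  # smoothed trend
ax.axhline(y=0, color='grey', linestyle='dashed')
ax.set_xlabel('Fitted values')
ax.set_ylabel('Residuals')
ax.set_title('Residuals vs Fitted')
```

Call `plt.show()` afterwards to display the figure. A trend line that stays close to zero suggests no obvious unmodelled structure in the mean.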


## q-q plot
# Normal Q-Q plot of the residuals: shows whether the residuals follow a normal distribution.
qqplot = sm.qqplot(results['std_resids'], line='s')
plt.xlabel('Theoretical quantiles')
plt.ylabel('Sample quantiles')
plt.title('Normal Q-Q')
plt.show()


## scale-location
# Scale-location: the x-axis is the fitted values, the y-axis is the square root of the absolute standardized residuals.
scalelocplot = plt.plot(results['fitted'], abs(results['std_resids'])**.5,  'o')
plt.xlabel('Fitted values')
plt.ylabel('Square Root of |standardized residuals|')
plt.title('Scale-Location')
plt.show()


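## residuals vs. leverage
# Residuals vs. leverage: studentized residuals against leverage (hat values); marker size reflects Cook's distance, which measures how influential each observation is.
residsvlevplot = sm.graphics.influence_plot(lm, criterion = 'Cooks', size = 2)
plt.xlabel('Leverage') # influence_plot puts leverage on the x-axis
plt.ylabel('Studentized residuals') # and studentized residuals on the y-axis
plt.title('Residuals vs Leverage')
plt.show()
plt.close()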


# 2. Draw all four plots on one canvas
fig = plt.figure(figsize = (10, 10), dpi = 100)

ax1 = fig.add_subplot(2, 2, 1)
ax1.plot(results['fitted'], results['resids'],  'o')
ax1.axhline(y = 0, color = 'grey', linestyle = 'dashed') # draw on ax1 explicitly instead of relying on the current axes
ax1.set_xlabel('Fitted values')
ax1.set_ylabel('Residuals')
ax1.set_title('Residuals vs Fitted')


ax2 = fig.add_subplot(2, 2, 2)
sm.qqplot(results['std_resids'], line='s', ax = ax2)
ax2.set_title('Normal Q-Q')


ax3 = fig.add_subplot(2, 2, 3)
ax3.plot(results['fitted'], abs(results['std_resids'])**.5,  'o')
ax3.set_xlabel('Fitted values')
ax3.set_ylabel('Sqrt(|standardized residuals|)')
ax3.set_title('Scale-Location')

ax4 = fig.add_subplot(2, 2, 4)
sm.graphics.influence_plot(lm, criterion = 'Cooks', size = 2, ax = ax4)

plt.tight_layout()
plt.show()

 
