Contents
(a) Fit the model
(b) Transform the data and fit model 2
(c) Plot the residuals, remove the outliers [0, 1, 22], then fit model 3
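import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

data=pd.read_csv('C:/Users/可乐怪/Desktop/csv/P174.csv')
data.insert(1,'constant',1)  # add an intercept column of ones
model=sm.OLS(data['R'],data[['constant','P']]).fit()  # regress R on P
print(model.summary())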
                            OLS Regression Results
==============================================================================
Dep. Variable:                        R   R-squared:                     0.126
Model:                              OLS   Adj. R-squared:                0.104
Method:                   Least Squares   F-statistic:                   5.636
Date:                  Sun, 07 Nov 2021   Prob (F-statistic):           0.0226
Time:                          12:21:40   Log-Likelihood:              -157.64
No. Observations:                    41   AIC:                           319.3
Df Residuals:                        39   BIC:                           322.7
Df Model:                             1
Covariance Type:              nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
constant       7.6041      2.406      3.160      0.003       2.737      12.471
P              0.3527      0.149      2.374      0.023       0.052       0.653
==============================================================================
Omnibus:                         18.608   Durbin-Watson:                 0.340
Prob(Omnibus):                    0.000   Jarque-Bera (JB):             27.578
Skew:                             1.308   Prob(JB):                   1.03e-06
Kurtosis:                         6.050   Cond. No.                       21.5
==============================================================================
R^2 = 0.126, so the fit is inadequate: P explains only about 13% of the variation in R.
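To make the R^2 figure concrete: with a single predictor, R^2 is 1 - SSE/SST, which equals the squared sample correlation between the two variables. The sketch below checks this on made-up numbers (not the P174.csv data); the function names `r_squared` and `pearson_r` are illustrative helpers, not part of statsmodels.

```python
import math

# Hypothetical toy data, NOT the values from P174.csv.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]


def r_squared(x, y):
    """R^2 of the least-squares line y = b0 + b1*x, computed as 1 - SSE/SST."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx            # slope
    b0 = my - b1 * mx         # intercept
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1 - sse / sst


def pearson_r(x, y):
    """Sample correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


print(r_squared(x, y), pearson_r(x, y) ** 2)  # the two numbers agree
```

One consequence is that swapping the response and the predictor leaves R^2 unchanged in a one-predictor regression.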
# Transformed variables: P1 = 1/P and R1 = R/P
data['P1']=1/data['P']
data['R1']=data['R']/data['P']
model2=sm.OLS(data['R1'],data[['constant','P1']]).fit()  # regress R/P on 1/P
print(model2.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                       R1   R-squared:                     0.497
Model:                              OLS   Adj. R-squared:                0.484
Method:                   Least Squares   F-statistic:                   38.47
Date:                  Sun, 07 Nov 2021   Prob (F-statistic):         2.71e-07
Time:                          12:22:08   Log-Likelihood:              -63.837
No. Observations:                    41   AIC:                           131.7
Df Residuals:                        39   BIC:                           135.1
Df Model:                             1
Covariance Type:              nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
constant       0.2577      0.255      1.012      0.318      -0.257       0.773
P1             5.4243      0.875      6.202      0.000       3.655       7.193
==============================================================================
Omnibus:                          2.318   Durbin-Watson:                 0.548
Prob(Omnibus):                    0.314   Jarque-Bera (JB):              1.285
Skew:                             0.315   Prob(JB):                      0.526
Kurtosis:                         3.597   Cond. No.                       4.96
==============================================================================
R^2 = 0.497: the transformation improves the fit considerably, but the fit is still not good.
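The post does not say why R/P is regressed on 1/P. One common reading, which is an assumption here, is that this is the textbook remedy for errors whose standard deviation grows in proportion to P: dividing R = b0 + b1*P + e through by P gives R/P = b1 + b0*(1/P) + e/P. Under that reading, OLS on the transformed variables minimizes the same objective as weighted least squares on the original variables with weights 1/P^2, and the coefficient roles swap (the `constant` of model 2 estimates the original slope, and the `P1` coefficient estimates the original intercept). The sketch below checks the equivalence on made-up numbers; `ols` and `wls` are illustrative helpers written for this example.

```python
# Hypothetical toy data, NOT the values from P174.csv.
P = [2.0, 4.0, 5.0, 8.0, 10.0, 12.0, 15.0, 20.0]
R = [7.5, 8.0, 10.5, 9.0, 14.0, 8.5, 16.0, 11.0]


def wls(x, y, w):
    """Weighted least squares for y = b0 + b1*x; returns (b0, b1)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def ols(x, y):
    """Ordinary least squares is WLS with all weights equal to 1."""
    return wls(x, y, [1.0] * len(x))


# OLS on the transformed variables, as in model 2: R/P regressed on 1/P.
P1 = [1 / p for p in P]
R1 = [r / p for r, p in zip(R, P)]
const_t, slope_t = ols(P1, R1)

# WLS on the original variables with weights 1/P^2.
b0, b1 = wls(P, R, [1 / p ** 2 for p in P])

print(const_t, b1)   # transformed intercept matches the original slope
print(slope_t, b0)   # transformed slope matches the original intercept
```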
# Internally studentized residuals of the first model, plotted against its fitted values
outliers = model.get_influence()
fitvalue=model.fittedvalues
res=outliers.resid_studentized_internal
# Points with |residual| > 2 are drawn in red and labelled with the magazine name
color=['r' if abs(i)>2 else 'b' for i in res]
for i in range(len(res)):
    if abs(res[i])>2:
        plt.annotate("(%s,%s)" % (data['Magazine'][i],round(res[i],2)),xy=(fitvalue[i],res[i]),size=15)
plt.scatter(fitvalue,res,c=color)
plt.xlabel('Predicted Values',size=15)
plt.ylabel('Residuals',size=15)
plt.title('Figure1',size=20)
plt.show()
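For reference, the internally studentized residual used above is the raw residual divided by its estimated standard error, r_i = e_i / (s * sqrt(1 - h_ii)), where s^2 = SSE/(n - 2) and, for a one-predictor model, the leverage is h_ii = 1/n + (x_i - mean(x))^2 / Sxx. The sketch below computes this by hand on made-up data with one planted outlier and applies the same |r| > 2 rule; `studentized_residuals` is an illustrative helper, not a statsmodels function.

```python
import math


def studentized_residuals(x, y):
    """Internally studentized residuals of the least-squares line y = b0 + b1*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    resid = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)           # residual mean square
    lev = [1 / n + (a - mx) ** 2 / sxx for a in x]     # leverages h_ii
    return [e / math.sqrt(s2 * (1 - h)) for e, h in zip(resid, lev)]


# Hypothetical toy data: y = 2x exactly, except index 5, which is shifted up by 8.
x = [float(i) for i in range(1, 11)]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 20.0, 14.0, 16.0, 18.0, 20.0]

r = studentized_residuals(x, y)
flagged = [i for i, v in enumerate(r) if abs(v) > 2]   # same threshold as the plot above
print(flagged)
```

The threshold of 2 is the rule of thumb the plotting code already uses; it is a screening device rather than a formal test.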
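# Drop the outliers named in the contents (rows 0, 1, 22); this step was missing from the listing
data_new=data.drop([0,1,22])
model3=sm.OLS(data_new['P'],data_new[['constant','R']]).fit()  # note: P regressed on R
print(model3.summary())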
                            OLS Regression Results
==============================================================================
Dep. Variable:                        P   R-squared:                     0.680
Model:                              OLS   Adj. R-squared:                0.671
Method:                   Least Squares   F-statistic:                   76.52
Date:                  Sun, 07 Nov 2021   Prob (F-statistic):         1.95e-10
Time:                          12:22:22   Log-Likelihood:              -99.080
No. Observations:                    38   AIC:                           202.2
Df Residuals:                        36   BIC:                           205.4
Df Model:                             1
Covariance Type:              nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
constant       3.2227      0.808      3.990      0.000       1.585       4.861
R              0.5493      0.063      8.747      0.000       0.422       0.677
==============================================================================
Omnibus:                          7.557   Durbin-Watson:                 2.507
Prob(Omnibus):                    0.023   Jarque-Bera (JB):              6.292
Skew:                             0.805   Prob(JB):                     0.0430
Kurtosis:                         4.176   Cond. No.                       19.1
==============================================================================
outliers2 = model3.get_influence()
fitvalue2=model3.fittedvalues
res2=outliers2.resid_studentized_internal
plt.scatter(fitvalue2,res2,color='r')
plt.xlabel('Predicted',size=15)
plt.ylabel('Residuals',size=15)
plt.title('Figure2',size=20)
plt.show()
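R^2 is now 0.68, a reasonably good fit, and the residual plot shows no trend as the fitted values of P change. Note that model 3 regresses P on R, the reverse of model 1; with a single predictor R^2 is the squared correlation between the two variables, so it is the same in either direction and can still be compared with the 0.126 of model 1.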