Automated scatter plots and linear regression in Python

Bundle the scatter plot and the linear regression into a single function:

from plotnine import aes, geom_point, ggplot
import statsmodels.formula.api as smf

def linear_regression(y, x, df):
    """Scatter plot plus an OLS fit of y on x.

    y and x are both continuous variables: y is the dependent variable,
    x the independent variable; both name columns of df.
    """
    # Scatter plot
    print(ggplot(df, aes(y=y, x=x))
          + geom_point(size=3))
    # Linear regression
    model = smf.ols(formula=f"{y} ~ {x}", data=df).fit()
    print(model.summary())  # e.g. R-squared = 0.05 would mean x explains only 5% of the variation in y

Save the function linear_regression(y, x, df) above in its own file, linear_regression.py (see http://t.csdn.cn/ovlRZ), and this piece of statistics is automated.

Prepare the data:

import pandas as pd

df = pd.DataFrame()
df['RBC'] = [5, 2, 8, 9, 10]
df['Hb'] = [15, 22, 7, 8, 5]

df

The data have two columns (i.e. two continuous variables, "RBC" and "Hb"):

   RBC  Hb
0    5  15
1    2  22
2    8   7
3    9   8
4   10   5
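Before fitting, it can help to quantify how strong the linear relationship is. A minimal sketch using pandas' built-in Pearson correlation on the same toy data:

```python
import pandas as pd

# Same toy data as above
df = pd.DataFrame({"RBC": [5, 2, 8, 9, 10], "Hb": [15, 22, 7, 8, 5]})

# Pearson correlation between the two continuous variables
r = df["RBC"].corr(df["Hb"])
print(round(r, 3))  # → -0.986: a strong negative linear relationship
```

Note that r squared (about 0.973) matches the R-squared reported in the regression output further down.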

Calling linear_regression(y, x, df) then draws the scatter plot and runs the linear regression automatically:

linear_regression(y='Hb', x='RBC', df=df)

The results are as follows:

[Figure 1: scatter plot of Hb against RBC]

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                     Hb   R-squared:                       0.973
Model:                            OLS   Adj. R-squared:                  0.963
Method:                 Least Squares   F-statistic:                     106.2
Date:                Sun, 18 Jun 2023   Prob (F-statistic):            0.00195
Time:                        16:27:09   Log-Likelihood:                -7.2944
No. Observations:                   5   AIC:                             18.59
Df Residuals:                       3   BIC:                             17.81
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     25.7944      1.520     16.966      0.000      20.956      30.633
RBC           -2.1168      0.205    -10.307      0.002      -2.770      -1.463
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   2.992
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.547
Skew:                          -0.766   Prob(JB):                        0.761
Kurtosis:                       2.475   Cond. No.                         19.0
==============================================================================

The fitted regression equation is Hb = -2.1168*RBC + 25.7944, where -2.1168 is the slope coefficient and 25.7944 is the intercept.
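Rather than copying the coefficients out of the printed summary, you can read them from the fitted model's params attribute and use them for prediction. A short sketch on the same data (the RBC value of 6 is a hypothetical input chosen for illustration):

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({"RBC": [5, 2, 8, 9, 10], "Hb": [15, 22, 7, 8, 5]})
model = smf.ols(formula="Hb ~ RBC", data=df).fit()

slope = model.params["RBC"]            # ≈ -2.1168
intercept = model.params["Intercept"]  # ≈ 25.7944

# Predicted Hb for a hypothetical RBC value of 6
print(round(slope * 6 + intercept, 2))  # → 13.09
```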
