[Repost] Simple linear regression in R 18.06.18

Original: https://mp.weixin.qq.com/s/SGIvgqX7mLv563fqYmze9g


- Goal: fit a regression equation relating average weekly exercise time (minute) to average vital capacity (VC)

- Input:

minute <- c(110,118,120,123,131,137,144,149,152,160)
VC <- c(5283,5299,5358,5292,5602,6014,5830,6102,6075,6411)
lrdata <- data.frame(minute,VC)


model <- lm(VC~minute,data=lrdata)
summary(model)

- Output:

Call:
lm(formula = VC ~ minute, data = lrdata)
Residuals:
    Min      1Q  Median      3Q     Max 
-162.71  -64.39  -30.81   62.18  225.39 
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 2521.184    342.088   7.370 7.84e-05 ***
minute        23.850      2.528   9.435 1.31e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 126.2 on 8 degrees of freedom
Multiple R-squared:  0.9175,    Adjusted R-squared:  0.9072 
F-statistic: 89.01 on 1 and 8 DF,  p-value: 1.309e-05  
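
As a cross-check on the coefficient table, the least-squares estimates for a single predictor can be computed directly: the slope is the sample covariance of minute and VC divided by the variance of minute, and the intercept follows from the two means. The sketch below is only an illustration; the names slope, intercept and r_squared are introduced here and do not appear in the original post.

```r
# Same data as in the post
minute <- c(110,118,120,123,131,137,144,149,152,160)
VC <- c(5283,5299,5358,5292,5602,6014,5830,6102,6075,6411)

# Closed-form least-squares estimates for one predictor
slope <- cov(minute, VC) / var(minute)
intercept <- mean(VC) - slope * mean(minute)

# In simple regression, R-squared equals the squared Pearson correlation
r_squared <- cor(minute, VC)^2

print(c(intercept = intercept, slope = slope, r_squared = r_squared))
```

The three values should agree with the Estimate column and the Multiple R-squared line above, giving the fitted equation VC = 2521.18 + 23.85 * minute.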

- Input:

cbind(coef=coef(model), confint(model))

- Output: coefficients and their 95% confidence intervals

                  coef      2.5 %     97.5 %
(Intercept) 2521.18375 1732.32765 3310.03985
minute        23.84982   18.02041   29.67924
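
The interval reported by confint() for each coefficient is the estimate plus or minus a t critical value times its standard error, with the residual degrees of freedom (8 here). A minimal sketch of that calculation, assuming the same data and model as above; est, se, t_crit and manual_ci are illustrative names, not part of the original post.

```r
minute <- c(110,118,120,123,131,137,144,149,152,160)
VC <- c(5283,5299,5358,5292,5602,6014,5830,6102,6075,6411)
model <- lm(VC ~ minute)

# Estimates and standard errors from the coefficient table
est <- coef(summary(model))[, "Estimate"]
se <- coef(summary(model))[, "Std. Error"]

# Two-sided 95% critical value of the t distribution
t_crit <- qt(0.975, df = df.residual(model))

manual_ci <- cbind(coef = est,
                   lower = est - t_crit * se,
                   upper = est + t_crit * se)
print(manual_ci)
```

The lower and upper columns should match the 2.5 % and 97.5 % columns returned by confint(model).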

- Input:

predict(model,newdata=data.frame(minute=c(140,145,150)),interval = "confidence")
# Predicted mean VC at the given minute values, with 95% confidence intervals for the mean

- Output:

      fit      lwr      upr
1 5860.159 5762.545 5957.773
2 5979.408 5868.588 6090.228
3 6098.657 5969.302 6228.013
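
The confidence interval for the mean response at a value x0 is the fitted value plus or minus t * s * sqrt(1/n + (x0 - mean(x))^2 / Sxx), where s is the residual standard error and Sxx is the sum of squared deviations of minute from its mean. The following sketch recomputes the table above by hand under that formula; x0, sxx, se_mean and manual are names chosen for this illustration.

```r
minute <- c(110,118,120,123,131,137,144,149,152,160)
VC <- c(5283,5299,5358,5292,5602,6014,5830,6102,6075,6411)
model <- lm(VC ~ minute)

x0 <- c(140, 145, 150)
n <- length(minute)
s <- summary(model)$sigma                 # residual standard error
sxx <- sum((minute - mean(minute))^2)     # sum of squared deviations of x

# Fitted values and the standard error of the estimated mean at each x0
fit <- unname(coef(model)[1] + coef(model)[2] * x0)
se_mean <- s * sqrt(1 / n + (x0 - mean(minute))^2 / sxx)
t_crit <- qt(0.975, df = n - 2)

manual <- cbind(fit = fit,
                lwr = fit - t_crit * se_mean,
                upr = fit + t_crit * se_mean)
print(manual)
```

The interval widens as x0 moves away from the mean of minute (134.4), which is why the row for 150 is wider than the row for 140.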

- Input:

library(ggplot2)
pre <- predict(model,newdata=data.frame(minute),interval = "prediction")
int <- predict(model,newdata=data.frame(minute),interval = "confidence")
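    # interval = "none" returns only the fitted values; "prediction" adds the prediction interval for an individual observation; "confidence" adds the confidence interval for the mean response
    # Example (steel-plate breaking strength): 95% confidence interval (562.931, 575.483), 95% prediction interval (556.186, 582.227)
    # Confidence interval: we are 95% confident that the mean breaking strength lies within (562.931, 575.483)
    # Prediction interval: we are 95% confident that the breaking strength of any single steel plate lies within (556.186, 582.227); the prediction interval is generally the one to use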
newlr <- cbind(lrdata,pre)
ggplot(newlr,aes(x=minute,y=VC))+geom_point(size=3,colour="blue")+  # Scatter plot of the raw data
geom_segment(aes(x=minute, xend=minute, y=VC, yend=fit),size=1,linetype=2,colour="red")+  # Dashed segment from each observed value to its fitted value
geom_smooth(method=lm,se=TRUE)+   # Fitted regression line with its confidence band
geom_line(aes(y=lwr), color = "red", linetype = "dashed",size=1)+  # Lower bound of the prediction interval
geom_line(aes(y=upr), color = "red", linetype = "dashed",size=1)   # Upper bound of the prediction interval

- Output:

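[Figure 1: scatter plot of VC against minute with the fitted regression line, its confidence band, and the dashed prediction-interval bounds]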
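
The prediction interval is always wider than the confidence interval because it adds the variance of a single new observation to the variance of the estimated mean: the term under the square root becomes 1 + 1/n + (x0 - mean(x))^2 / Sxx. A short sketch comparing the two at the observed minute values; the data frame widths and its column names are assumptions made for this illustration.

```r
minute <- c(110,118,120,123,131,137,144,149,152,160)
VC <- c(5283,5299,5358,5292,5602,6014,5830,6102,6075,6411)
model <- lm(VC ~ minute)

pre <- predict(model, newdata = data.frame(minute), interval = "prediction")
int <- predict(model, newdata = data.frame(minute), interval = "confidence")

# Half-widths of both intervals at each observed minute value
widths <- data.frame(minute,
                     ci_half = int[, "upr"] - int[, "fit"],
                     pi_half = pre[, "upr"] - pre[, "fit"])
print(widths)

# The squared half-widths differ by the constant (t * s)^2
t_crit <- qt(0.975, df = df.residual(model))
s <- summary(model)$sigma
print(range(widths$pi_half^2 - widths$ci_half^2))
print((t_crit * s)^2)
```

Both printed quantities at the end should coincide, which shows that the extra width of the prediction band comes entirely from the residual variance of an individual observation.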



I have not yet studied the material below.

Standardized regression coefficients, method 1:

- Input:

install.packages("sjstats")
library(sjstats)
std_beta(model,type="std",ci.lvl=0.95)

- Output:

term   std.estimate std.error conf.low conf.high
minute        0.958     0.102    0.759      1.16

Standardized regression coefficients, method 2:

- Input:

install.packages("lm.beta")
library(lm.beta)
stdco <- lm.beta(model)
summary(stdco)

- Output:

Call:
lm(formula = VC ~ minute, data = lrdata)
Residuals:
    Min      1Q  Median      3Q     Max 
-162.71  -64.39  -30.81   62.18  225.39 
Coefficients:
             Estimate Standardized Std. Error t value Pr(>|t|)    
(Intercept) 2521.1837       0.0000   342.0879   7.370 7.84e-05 ***
minute        23.8498       0.9579     2.5279   9.435 1.31e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 126.2 on 8 degrees of freedom
Multiple R-squared:  0.9175,    Adjusted R-squared:  0.9072 
F-statistic: 89.01 on 1 and 8 DF,  p-value: 1.309e-05 
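
Both packages report a standardized slope of about 0.958. With a single predictor this value can be reproduced in base R: it is the raw slope rescaled by sd(x) / sd(y), which in simple regression equals the Pearson correlation between minute and VC. A minimal sketch without any extra packages; b_std and r are names introduced for this illustration.

```r
minute <- c(110,118,120,123,131,137,144,149,152,160)
VC <- c(5283,5299,5358,5292,5602,6014,5830,6102,6075,6411)
model <- lm(VC ~ minute)

# Standardized slope: raw slope times sd(x) / sd(y)
b_std <- unname(coef(model)["minute"]) * sd(minute) / sd(VC)

# With one predictor this equals the Pearson correlation
r <- cor(minute, VC)

print(c(b_std = b_std, r = r))
```

Squaring either value gives the Multiple R-squared shown in the summary output.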
