Logistic Regression + ROC Study Notes

zhuang xiaojin

August 16, 2021

Step 1: Load the Data

rm(list = ls())
data("iris")
head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Step 2: Fit the Logistic Regression Model

#make this example reproducible
set.seed(1)

#Use about two-thirds of the dataset as the training set and the rest as the testing set
#(sample() normalizes the weights 0.6 and 0.3 to 2/3 and 1/3)
in_train <- sample(c(TRUE, FALSE), nrow(iris), replace=TRUE, prob=c(0.6,0.3))
train <- iris[in_train, ]
test <- iris[!in_train, ]
names(iris)
## [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
paste(names(iris),collapse = "+")
## [1] "Sepal.Length+Sepal.Width+Petal.Length+Petal.Width+Species"
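#fit logistic regression model
#Note: Species has three levels. With family="binomial", glm() treats the first
#level (setosa) as failure and all other levels as success, so this fit is
#effectively setosa vs. non-setosa. Setosa is very easy to separate, so glm()
#may warn about non-convergence or fitted probabilities of 0 or 1.
model <- glm(Species~Sepal.Length+Sepal.Width+Petal.Length+Petal.Width, 
             family="binomial", 
             data=train)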
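
Because Species has three levels, it may help to see which response the binomial glm() actually models. The sketch below spells out the equivalent 0/1 recoding; `is_not_setosa` is a hypothetical name used only for illustration and does not appear in the original code.

```r
data("iris")

# The first factor level is the "failure" class for a binomial glm()
levels(iris$Species)[1]

# Hypothetical explicit 0/1 response equivalent to what glm() models above:
# 0 = setosa (first level), 1 = any other species
is_not_setosa <- as.integer(iris$Species != levels(iris$Species)[1])

# Class counts of the recoded response
table(is_not_setosa)
```

If a different two-class comparison is wanted, one option is to build such a 0/1 column explicitly before fitting, so the positive class is stated in the code rather than implied by factor level order.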

Step 3: Calculate the AUC of the Model

Next, we use the auc() function from the pROC package to compute the model's AUC. The function uses the following syntax:

auc(response, predicted)
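
To make the returned value concrete: AUC can be read as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The sketch below computes this by hand in base R on made-up toy vectors, assuming higher scores indicate cases; `manual_auc` is a hypothetical helper, not part of pROC.

```r
# Toy data (made up for illustration): 0 = control, 1 = case
response  <- c(0, 0, 1, 1)
predicted <- c(0.1, 0.4, 0.35, 0.8)

# Hypothetical helper: fraction of case/control pairs ranked correctly
manual_auc <- function(response, predicted) {
  cases    <- predicted[response == 1]
  controls <- predicted[response == 0]
  wins <- outer(cases, controls, ">")   # case scored above control
  ties <- outer(cases, controls, "==")  # equal scores count as one half
  mean(wins + 0.5 * ties)
}

manual_auc(response, predicted)  # 0.75 (3 of 4 pairs ranked correctly)
```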

Here is how to use this function in our example:

#calculate the predicted probability (of being non-setosa) for each row in the test dataset
predicted <- predict(model, test, type="response")

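#calculate AUC
library(pROC)
#Build a ROC object and compute the AUC. Because test$Species has three levels,
#roc() compares only the first two (setosa as controls, versicolor as cases).
roc1 <- roc(test$Species, predicted)
roc1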
## 
## Call:
## roc.default(response = test$Species, predictor = predicted)
## 
## Data: predicted in 19 controls (test$Species setosa) < 16 cases (test$Species versicolor).
## Area under the curve: 1
auc(test$Species, predicted)
## Area under the curve: 1
plot(x = roc(response = test$Species, predictor = predicted, 
             percent = TRUE, ci = TRUE, of = "se", 
             sp = seq(0, 100, 5)), ci.type="shape")
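plot(roc1, # pass a different roc object here (e.g. a second model's) to plot that curve instead
       print.auc=TRUE,print.auc.x=0.5,print.auc.y=0.5, # print the AUC value on the plot at coordinates (x,y)
       auc.polygon=TRUE, auc.polygon.col="skyblue", # fill color for the area under the ROC curve
       max.auc.polygon=TRUE, # also shade the maximum possible AUC area (the whole plot region)
       grid=c(0.1,0.2), grid.col=c("green", "red"), # grid spacing of 0.1 and 0.2, and the grid line colors
       print.thres=TRUE, print.thres.cex=0.8,  # print the best threshold on the plot, text scaled to 0.8
       legacy.axes=TRUE)  # x-axis runs from 0 to 1 and shows 1 - specificity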
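
The "best" threshold that print.thres marks is, by default in pROC, the one that maximizes Youden's J statistic (sensitivity + specificity - 1). The base R sketch below illustrates that idea on made-up scores, assuming higher scores indicate cases. `youden_threshold` is a hypothetical helper; pROC places its candidate thresholds between observed scores, so the value it reports can differ slightly from this one.

```r
# Toy data (made up for illustration): 0 = control, 1 = case
response  <- c(0, 0, 0, 1, 1, 0, 1, 1, 1)
predicted <- c(0.05, 0.2, 0.3, 0.45, 0.55, 0.6, 0.7, 0.8, 0.95)

# Hypothetical helper: try each observed score as a cutoff and keep the one
# with the largest Youden's J
youden_threshold <- function(response, predicted) {
  thresholds <- sort(unique(predicted))
  j <- sapply(thresholds, function(t) {
    pred_pos <- predicted >= t                # classify as case at or above t
    sens <- mean(pred_pos[response == 1])     # true positive rate
    spec <- mean(!pred_pos[response == 0])    # true negative rate
    sens + spec - 1
  })
  thresholds[which.max(j)]
}

youden_threshold(response, predicted)  # 0.45
```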

ggroc1 <- ggroc(roc1,
                  legacy.axes = TRUE,
                  linetype = 2, size = 1, # set the line type and width of the curve
                  colour = "#CC6666")
ggroc1

References

I recently found an excellent website for learning statistics. The author's code is very concise and clear, so let's take a look together.

  • Statology is a site that makes learning statistics easy by explaining topics in simple and straightforward ways
