R | Visualization | Decision Trees

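Decision trees are particularly useful for classification: once fitted, they can assign a class to new observations, for example whether a patient survives or which treatment to choose. Compared with rules written out as text, a graphical display of the classification result is far more intuitive. This post covers how to visualize decision tree results.

Core functions: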
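As a rough illustration of both points, the sketch below fits a small classification tree, prints its splits as text rules, and classifies one new case. It assumes the kyphosis data that ships with the rpart package (not used elsewhere in this post), and the new patient's values are invented for illustration.

```r
library(rpart)

# Assumption: kyphosis (bundled with rpart) stands in for any labelled data set.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Text form of the tree: one line per node, harder to scan than a plot.
print(fit)

# Hypothetical new patient; the values are invented for illustration.
new_case <- data.frame(Age = 60, Number = 4, Start = 10)
pred <- predict(fit, newdata = new_case, type = "class")
print(pred)
```

The printed node list carries the same information as a plotted tree, which is why the plotting functions in this post are mostly a matter of readability.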

  • rpart.plot()

  • rattle::fancyRpartPlot()

Method 1: rpart.plot()

Example data: ptitanic (from the rpart.plot package)

ptitanic: the Titanic data without passenger names and other details.

Binary tree (two-class response)

library(pacman)     # provides p_load()
p_load(rpart.plot)  # also attaches rpart
data("ptitanic")
head(ptitanic,3)
##   pclass survived    sex     age sibsp parch
## 1    1st survived female 29.0000     0     0
## 2    1st survived   male  0.9167     1     2
## 3    1st     died female  2.0000     1     2
binary.model <- rpart(survived~.,data = ptitanic,cp=0.02)
rpart.plot(binary.model,
           type = 1,               # plot style: label all nodes, not only the leaves
           box.palette = "yellow"  # node fill colour
           )
[Figure 1: rpart.plot output]

To tell the nodes apart, pass several colours to box.palette:

rpart.plot(binary.model,type = 2,box.palette = c("pink","gray"))
[Figure 2: adjusted node colours]

Multiple splits (continuous response)

anova.model <- rpart(Mileage~.,data = cu.summary)
rpart.plot(anova.model,
           shadow.col = "gray",
           main="miles per gallon\n(continuous response)\n")
[Figure 3: continuous response]
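For a numeric response such as Mileage, rpart selects the "anova" method on its own, and each node's value is the mean response of the cars that reach it. A minimal sketch of how this could be checked, using only rpart and the same cu.summary data:

```r
library(rpart)

anova.model <- rpart(Mileage ~ ., data = cu.summary)

# rpart chose a regression ("anova") tree because Mileage is numeric.
print(anova.model$method)

# The first row of the frame is the root node; compare its value with the
# mean Mileage over the rows that have a non-missing Mileage.
root_value   <- anova.model$frame$yval[1]
overall_mean <- mean(cu.summary$Mileage, na.rm = TRUE)
print(c(root = root_value, mean = overall_mean))
```

The numbers shown inside the boxes of the plotted tree are these node means.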

Multiple splits (categorical response)

multi.class.model <- rpart(Reliability~.,data = cu.summary)
rpart.plot(multi.class.model,
           main="vehicle reliability\n(multi class response)")
[Figure 4: categorical response]
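When the response is a factor with more than two levels, the fitted tree is a classification tree and each node carries one probability per class. The sketch below, which assumes only rpart and cu.summary, looks at those per-class probabilities directly:

```r
library(rpart)

multi.class.model <- rpart(Reliability ~ ., data = cu.summary)

# Reliability is a factor, so rpart fits a classification ("class") tree.
print(multi.class.model$method)

# One column per Reliability level; each row sums to 1.
probs <- predict(multi.class.model, type = "prob")
print(head(round(probs, 2), 3))
```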

Method 2: rattle::fancyRpartPlot()

p_load(rattle)
p_load(RColorBrewer)
p_load(biotops)

# Prepare the example data
set.seed(42)
ds     <- weather
target <- "RainTomorrow"
risk   <- "RISK_MM"
ignore <- c("Date", "Location", risk)
vars   <- setdiff(names(ds), ignore)
nobs   <- nrow(ds)
form   <- formula(paste(target, "~ ."))
train  <- sample(nobs, 0.7*nobs)
test   <- setdiff(seq_len(nobs), train)
actual <- ds[test, target]
risks  <- ds[test, risk]
fit <- rpart(form,data = ds[train,vars])
fancyRpartPlot(fit,
               type = 0,                       # adjust the plot style
               palettes = c("Greys","Blues")   # adjust the colour palettes
               )

[Figure 5: fancyRpartPlot output]
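The script above builds test and actual but never uses them. One plausible use is to score the tree on the held-out rows. The sketch below shows that pattern on the built-in iris data, an assumption made here so that it runs with rpart alone; the variable names mirror the ones above.

```r
library(rpart)

set.seed(42)
ds     <- iris                      # stand-in for the weather data (assumption)
target <- "Species"
nobs   <- nrow(ds)
form   <- formula(paste(target, "~ ."))
train  <- sample(nobs, floor(0.7 * nobs))
test   <- setdiff(seq_len(nobs), train)
actual <- ds[test, target]

fit <- rpart(form, data = ds[train, ])

# Predicted classes for the held-out rows, then a confusion matrix.
predicted <- predict(fit, newdata = ds[test, ], type = "class")
confusion <- table(actual = actual, predicted = predicted)
accuracy  <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(accuracy)
```

A tree that looks tidy in a plot can still generalize poorly, so checking it against held-out rows is a useful companion to the visualization.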

References:
rpart.plot: https://CRAN.R-project.org/package=rpart.plot
rattle: https://rattle.togaware.com/
