Implementing a Decision Tree in R

Note: this article builds a decision tree in R and produces a nicer-looking plot of the tree structure.


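Contents

  • Implementing a Decision Tree in R
  • Data description
  • I. Installing the required R packages
  • II. Implementation
    • 1. Reading the data
    • 2. Splitting into training and test sets
    • 3. Building the decision tree and plotting it
    • 4. Testing the model
  • Summary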


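Data description

The group column is the class label and takes one of two values, case or control. The remaining columns are predictor variables.
A sample of the data: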

> mydata
    age    sex dishistory index1 index2 index3 index4 index5 index6   group
1    80   male        yes     26    4.2    1.4   53.0   50.0   0.10    case
2    80   male        yes     22    6.6    2.0   63.1   67.2   1.90 control
3    80   male        yes     24    6.9    1.8   58.1   50.3   2.10 control
4    80   male        yes     21    6.9    2.9   57.3   53.0  -2.10 control
5    73   male        yes     28    4.3    0.9   72.0   70.0   3.00    case
6    73   male        yes     22    6.3    0.8   50.2   50.3  -1.40 control
7    73   male        yes     25    6.9    1.6   53.1   61.0   1.90 control
8    73   male        yes     22    5.6    0.9   53.4   53.0  -1.60 control
9    82   male        yes     27    5.7    0.8   69.3   71.2   3.40    case
10   82   male        yes     24    7.1    1.8   59.0   51.0   0.40 control
11   82   male        yes     22    5.9    2.1   59.1   53.1  -2.90 control
12   82   male        yes     21    5.9    3.0   62.4   64.0  -1.30 control
13   77   male        yes     23    5.0    1.5   72.3   62.8   1.60    case
14   77   male        yes     25    6.1    2.9   49.9   50.1  -1.30 control
15   77   male        yes     23    7.2    2.6   54.2   51.0   0.60 control
16   77   male        yes     24    5.1    1.7   51.6   64.1  -1.30 control
17   76   male        yes     22    4.9    1.9   53.4   51.9  -0.60    case
18   76   male        yes     22    6.4    1.5   65.2   60.3  -1.60 control
19   76   male        yes     26    6.0    2.3   51.3   67.2  -1.20 control
20   76   male        yes     24    6.0    2.6   53.4   51.3  -2.10 control
21   68   male        yes     26    4.8    1.8   65.5   67.5   0.10    case
22   68   male        yes     22    6.5    2.3   51.3   50.6  -1.90 control
23   68   male        yes     20    6.9    1.7   59.6   62.7   0.90 control
24   68   male        yes     22    6.9    2.8   57.2   52.3  -1.90 control
25   75   male         no     26    6.3    2.6   69.3   62.7   0.40    case
26   75   male         no     21    6.4    2.3   53.1   64.9   1.00 control
27   75   male         no     22    4.7    1.6   59.1   50.1  -2.30 control
28   75   male         no     25    6.9    2.8   57.2   53.0   0.60 control
29   81   male        yes     27    4.2    1.4   64.8   63.8   0.20    case
30   81   male        yes     20    5.8    2.9   52.9   53.1  -1.30 control
31   81   male        yes     22    6.9    2.6   56.2   56.4  -1.60 control
32   81   male        yes     21    5.9    1.2   57.4   52.1  -1.90 control
33   69   male        yes     27    5.4    1.8   68.2   64.1   0.60    case
34   69   male        yes     23    7.2    3.0   68.4   61.2   0.90 control
35   69   male        yes     23    9.8    1.8   63.1   49.3  -1.10 control
36   69   male        yes     21    7.2    2.5   60.5   62.9   1.50 control
37   84   male        yes     22    5.2    2.9   69.8   65.4   0.30    case
38   84   male        yes     21    6.2    1.4   51.3   51.3   1.90 control
39   84   male        yes     22    6.8    2.9   65.2   60.2   0.10 control
40   84   male        yes     24    6.2    2.0   59.1   52.1   1.50 control
41   72   male        yes     21    5.1    1.2   71.2   66.1   1.30    case
42   72   male        yes     21    7.1    1.7   55.9   49.0  -2.30 control
43   72   male        yes     21    7.1    3.0   61.3   62.1  -1.80 control
44   72   male        yes     27    5.9    1.7   56.1   55.1   0.90 control
45   77   male        yes     22    4.7    2.1   70.6   67.2   1.60    case
46   77   male        yes     22    7.2    3.1   64.9   51.4   0.60 control
47   77   male        yes     22    5.1    2.6   69.0   53.6  -2.80 control
48   77   male        yes     21    7.2    2.9   55.9   65.2   0.40 control
49   80   male        yes     26    4.6    1.5   69.0   68.9   0.80    case
50   80   male        yes     21    7.1    2.6   68.4   52.8   0.90 control
51   80   male        yes     21    6.0    2.6   56.2   55.3  -1.80 control
52   80   male        yes     22    7.1    2.8   52.9   50.1  -1.30 control
53   67   male        yes     27    5.8    1.4   62.5   69.0   1.40    case
54   67   male        yes     21    6.0    2.6   57.3   50.5  -0.20 control
55   67   male        yes     23    5.6    2.1   58.1   53.6  -1.40 control
56   67   male        yes     22    6.1    1.2   70.0   52.0  -1.82 control
57   76   male        yes     24    6.0    1.7   59.8   70.2   1.20    case
58   76   male        yes     26    6.1    1.8   57.3   66.2  -0.20 control
59   76   male        yes     22    6.5    2.9   68.4   64.0  -2.10 control
60   76   male        yes     22    7.1    2.6   55.3   59.7   1.90 control
61   81   male        yes     24    5.9    2.4   72.1   61.2   2.10    case
62   81   male        yes     21    7.3    3.1   60.5   53.1   0.90 control
63   81   male        yes     22    5.2    1.6   56.8   48.9   1.50 control
64   81   male        yes     24    5.8    1.0   54.2   50.1  -1.90 control
65   92   male        yes     23    5.5    2.6   60.4   60.5   2.00    case
66   91   male        yes     21    7.2    2.8   67.5   49.3  -1.82 control
67   92   male        yes     24    6.6    2.0   68.4   59.7  -1.30 control
68   92   male        yes     20    5.4    2.6   57.3   50.1   0.90 control
69   83   male        yes     26    4.4    1.8   69.4   69.3   2.20    case
70   83   male        yes     24    7.2    1.6   65.2   62.8  -1.60 control
71   83   male        yes     20    6.9    1.2   52.0   60.2  -2.60 control
72   90 female        yes     23    6.7    0.9   51.6   52.9  -2.90 control
73   64   male        yes     28    5.2    1.7   68.9   58.1  -1.20    case
74   64   male        yes     21    6.5    3.2   59.1   51.0  -2.90 control
75   64   male        yes     23    6.3    3.0   54.0   51.3   0.80 control
76   64   male        yes     22    5.3    1.9   57.2   59.4  -2.30 control
77   78   male        yes     28    4.8    2.6   67.5   57.3  -1.80    case
78   78   male        yes     20    5.4    1.1   58.7   53.0   0.60 control
79   78   male        yes     25    6.4    2.9   52.0   64.0   1.50 control
80   78   male        yes     23    7.2    2.9   58.1   59.0   0.80 control
81   70   male        yes     29    4.7    1.1   71.6   56.4  -2.60    case
82   70   male        yes     22    6.4    2.6   69.0   48.0  -1.10 control
83   70   male        yes     23    7.2    2.4   58.3   62.8   0.90 control
84   70   male        yes     23    6.5    2.1   64.9   50.1  -1.60 control
85   81   male        yes     29    4.8    2.1   71.9   55.2  -2.10    case
86   81   male        yes     23    5.6    1.2   53.1   28.1  -1.30 control
87   81   male        yes     21    5.6    1.7   57.2   52.1  -0.20 control
88   81   male        yes     23    6.2    3.0   61.3   51.3  -1.90 control
89   73   male        yes     26    5.2    2.1   72.0   69.1   0.10    case
90   73   male        yes     22    6.5    2.8   50.6   62.9  -1.30 control
91   73   male        yes     22    6.9    1.6   50.6   49.6  -1.60 control
92   73   male        yes     21    6.3    1.0   57.2   68.9  -1.30 control
93   77   male         no     27    5.1    2.0   70.9   55.6   0.60    case
94   77   male         no     20    7.2    2.6   64.9   62.9  -1.82 control
95   84 female         no     27    7.1    1.8   56.2   61.2  -1.90 control
96   84 female         no     20    6.4    2.1   54.0   51.0  -2.60 control
97   68   male        yes     27    5.6    1.8   70.5   70.0   0.90    case
98   68   male        yes     21    6.2    1.6   56.1   50.4  -1.80 control
99   68   male        yes     24    5.8    2.7   56.8   50.9  -2.90 control
100  68   male        yes     24    6.9    2.4   54.2   61.2   0.80 control

Note: only part of the data is shown above; the full data set contains 392 rows.


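I. Installing the required R packages

Building the decision tree and producing the styled plot of its structure requires the following R packages: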

  1. rpart
  2. tibble
  3. bitops
  4. rattle
  5. rpart.plot
  6. RColorBrewer

Load them as follows:

library(rpart)
library(tibble)
library(bitops)
library(rattle)
library(rpart.plot)
library(RColorBrewer)
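
The library() calls above stop with an error if a package has not been installed yet. A minimal sketch for finding the missing packages first; the helper name missing_pkgs is hypothetical, and the install.packages() call is left commented out because it downloads from CRAN:

```r
# Packages used in this article.
pkgs <- c("rpart", "tibble", "bitops", "rattle", "rpart.plot", "RColorBrewer")

# Hypothetical helper: return the entries of `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_pkgs(pkgs)
print(to_install)

# Uncomment to install whatever is missing (needs network access to CRAN):
# if (length(to_install) > 0) install.packages(to_install)
```

Checking with requireNamespace() avoids reinstalling packages that are already present.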

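II. Implementation

1. Reading the data

# Read the CSV file. stringsAsFactors = TRUE turns the text columns (sex,
# dishistory, group) into factors; the default has been FALSE since R 4.0.0.
mydata <- read.table("D:\\Rprojects\\tree.csv", header = TRUE, sep = ",",
                     stringsAsFactors = TRUE)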
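
Before modelling, it is worth checking that the file was parsed as expected. The sketch below reads three rows copied from the sample above out of an inline string rather than the real file, so that it runs on its own. The names csv_text and demo are for illustration only, and the header row is an assumption based on the printed column names:

```r
# Three rows copied from the sample shown earlier, as inline CSV text.
# The header line is assumed to match the printed column names.
csv_text <- "age,sex,dishistory,index1,index2,index3,index4,index5,index6,group
80,male,yes,26,4.2,1.4,53.0,50.0,0.10,case
80,male,yes,22,6.6,2.0,63.1,67.2,1.90,control
75,male,no,26,6.3,2.6,69.3,62.7,0.40,case"

demo <- read.table(text = csv_text, header = TRUE, sep = ",",
                   stringsAsFactors = TRUE)

str(demo)             # column types: numeric predictors, factor labels
table(demo$group)     # number of rows per class
colSums(is.na(demo))  # missing values per column
```

The same three checks can be run on mydata once the real file has been read.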

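2. Splitting into training and test sets

260 rows are sampled at random as the training set; the remaining 132 rows form the test set.

sub <- sample(1:392, 260)  # row indices of the training set
train <- mydata[sub, ]     # training set
test <- mydata[-sub, ]     # test set: all rows not sampled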
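
Because sample() draws a different split on every run, the fitted tree and the confusion matrix will vary between runs. A sketch of a reproducible split, assuming an arbitrary seed of 123 and a toy data frame (toy) standing in for mydata:

```r
set.seed(123)  # arbitrary seed; any fixed value makes the split repeatable

# Toy stand-in for mydata with the same number of rows (392).
toy <- data.frame(id = seq_len(392),
                  group = rep(c("case", "control", "control", "control"), 98))

n_train <- 260
idx <- sample(seq_len(nrow(toy)), n_train)  # training row indices
toy_train <- toy[idx, ]
toy_test  <- toy[-idx, ]

nrow(toy_train)  # 260
nrow(toy_test)   # 132
# The two sets share no rows:
length(intersect(toy_train$id, toy_test$id))  # 0
```

Using seq_len(nrow(...)) instead of a hard-coded 1:392 keeps the split correct if the number of rows changes.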

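3. Building the decision tree and plotting it

Fit the decision tree on the training set (note that sex is not included among the predictors):

model <- rpart(group ~ age + dishistory + index1 + index2 + index3 + index4 + index5 + index6,
               data = train)
fancyRpartPlot(model)  # draw the tree with rattle's styled plot

The resulting tree is shown in the figure below:
[Figure 2: decision tree drawn by fancyRpartPlot()]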
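
A fitted tree can also be inspected as text, which helps when the plot is hard to read. A minimal sketch that uses the kyphosis data set shipped with rpart in place of the article's data; method = "class" is written out explicitly here, although rpart selects it automatically when the response is a factor:

```r
library(rpart)

# kyphosis ships with rpart; Kyphosis (absent/present) is the class label.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class")

print(fit)               # split rules, node sizes, and class proportions
fit$variable.importance  # relative importance of each predictor
head(predict(fit, kyphosis, type = "class"))  # predicted labels
```

The same calls work on the model object from this article, for example print(model).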

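4. Testing the model

Evaluate the model on the test set:

x <- subset(test, select = -group)         # predictors only
pred <- predict(model, x, type = "class")  # predicted class labels
k <- test[, "group"]                       # true labels
table(pred, k)                             # rows: predicted, columns: actual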

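The resulting confusion matrix (rows are predicted labels, columns are actual labels):

             actual
predicted    case  control
  case         31        1
  control       7       93

With case as the positive class:

Accuracy: (31+93)/(31+93+7+1) ≈ 0.939
Sensitivity: 31/(31+7) ≈ 0.816
Specificity: 93/(93+1) ≈ 0.989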
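
Treating case as the positive class, the three metrics can be computed directly from the confusion matrix. The sketch below enters the counts reported above by hand; the variable names are chosen for illustration:

```r
# Confusion matrix from the article: rows = predicted, columns = actual.
cm <- matrix(c(31, 7, 1, 93), nrow = 2,
             dimnames = list(predicted = c("case", "control"),
                             actual    = c("case", "control")))

tp <- cm["case", "case"]        # predicted case, actually case
fn <- cm["control", "case"]     # predicted control, actually case
fp <- cm["case", "control"]     # predicted case, actually control
tn <- cm["control", "control"]  # predicted control, actually control

accuracy    <- (tp + tn) / sum(cm)
sensitivity <- tp / (tp + fn)   # share of actual cases that were detected
specificity <- tn / (tn + fp)   # share of actual controls correctly identified

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity), 3)
```

With a real run, cm can be replaced by the output of table(pred, k), which has the same row and column layout.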


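Summary

This article covered the basic steps for building a decision tree in R and using fancyRpartPlot() to produce a nicer-looking plot of the tree structure. The plot's appearance can be adjusted to taste.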
