Visualizing a Decision Tree (Using sklearn.tree's export_graphviz Method)

GraphViz installation and configuration guide: https://yunyaniu.blog.csdn.net/article/details/79008351

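When I tried to render the .dot file with GraphViz from the command line, I got this error: 'dot' is not recognized as an internal or external command, operable program or batch file. The message means that Windows cannot find dot.exe, typically because the GraphViz bin directory is not on the PATH.

A workaround is to open the .dot file directly with gvedit.exe, which renders the decision tree.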

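1. Find gvedit.exe in the bin directory of the GraphViz installation, as shown:

[Figure 1: gvedit.exe in the GraphViz bin directory]

2. Open it. The interface looks like this:

[Figure 2: the gvedit main window]

3. Open the .dot file generated by our program; gvedit renders the decision tree.

4. Then click the button shown in the figure to export the tree and save it as a PNG or PDF file.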

Reference blog post: https://blog.csdn.net/dengdengma520/article/details/79593020
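As an alternative to the GUI steps above, the rendering can also be scripted once dot.exe is reachable. The following is a minimal sketch, not part of the original post: the helper name render_dot and its parameters are hypothetical, and it assumes GraphViz is installed. It checks for the dot executable first, so the missing-command case is reported instead of crashing.

```python
import shutil
import subprocess


def render_dot(dot_file, out_file, fmt="png", search_path=None):
    """Render dot_file to out_file with GraphViz's dot, if dot can be found.

    Returns True after a successful render and False when no dot
    executable is found. search_path=None means "use the PATH variable".
    (Hypothetical helper for illustration only.)
    """
    dot_exe = shutil.which("dot", path=search_path)
    if dot_exe is None:
        # Same situation as the "'dot' is not recognized ..." error:
        # the GraphViz bin directory is not on the search path.
        return False
    # Equivalent to running: dot -Tpng wine.dot -o wine.png
    subprocess.run([dot_exe, f"-T{fmt}", dot_file, "-o", out_file], check=True)
    return True
```

Calling render_dot("wine.dot", "wine.png") after the script below has written wine.dot would produce the image. If it returns False, adding the GraphViz bin directory to PATH (or passing that directory as search_path) should resolve it.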

The program I used is as follows:

# -*- coding: utf-8 -*-
__author__ = 'fankai'
from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split, cross_val_score
import pandas as pd
import numpy as np
from sklearn.tree import export_graphviz
# Load the dataset and separate the features from the class labels
wine = load_wine()
x = wine.data
y = wine.target
# print(pd.DataFrame(x))
# print(pd.DataFrame(y))

xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.3)
# Build the model
clf = tree.DecisionTreeClassifier()
clf.fit(xtrain, ytrain)
# Evaluate the model with 10-fold cross-validation
score = cross_val_score(clf, x, y, cv=10, scoring='accuracy')
print(np.mean(score))
print(clf.feature_importances_)

# Visualize the decision tree
feature_name = ['酒精','苹果酸','灰','灰的碱性','镁','总酚','类黄酮','非黄烷类酚类','花青素','颜色强度','色调','od280/od315稀释葡萄酒','脯氨酸']
with open('./wine.dot', 'w', encoding='utf-8') as f:
    # export_graphviz writes into f and returns None here, so f is not rebound
    export_graphviz(clf, feature_names=feature_name, out_file=f)

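The code above has one problem: the Chinese text in the generated decision tree comes out garbled.

A fix for the garbled text is described at https://blog.csdn.net/qq_22194911/article/details/80882853, but it looks rather complicated.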
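One commonly used remedy for garbled Chinese labels in GraphViz output is to make the graph use a font that contains CJK glyphs. The sketch below is an illustration under assumptions, not necessarily what the linked post does: the helper name set_dot_font is hypothetical, and "SimHei" is only an example of a font that would have to be installed on the machine. It rewrites the DOT text so that node and edge labels use the chosen font.

```python
import re


def set_dot_font(dot_source, font):
    """Return DOT text whose node and edge labels use the given font.

    Existing fontname="..." attributes are replaced. If there are none,
    default node/edge font statements are inserted after the opening brace.
    (Hypothetical helper for illustration only.)
    """
    pattern = re.compile(r'fontname\s*=\s*"[^"]*"')
    replacement = f'fontname="{font}"'
    if pattern.search(dot_source):
        # A function is used so the font name is inserted literally.
        return pattern.sub(lambda _match: replacement, dot_source)
    head, brace, tail = dot_source.partition("{")
    defaults = f"\nnode [{replacement}] ;\nedge [{replacement}] ;"
    return head + brace + defaults + tail


# Example: patch the file written by the script above, keeping UTF-8 encoding.
# with open("wine.dot", encoding="utf-8") as f:
#     patched = set_dot_font(f.read(), "SimHei")  # assumed font name
# with open("wine.dot", "w", encoding="utf-8") as f:
#     f.write(patched)
```

Recent scikit-learn releases also accept a fontname argument in export_graphviz, which may make this post-processing unnecessary if the installed version supports it.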

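I took a simpler route and replaced the Chinese feature names with the English names that ship with the dataset:

wine.feature_names

The revised code is as follows: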

# -*- coding: utf-8 -*-
__author__ = 'fankai'

from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split, cross_val_score
import pandas as pd
import numpy as np
from sklearn.tree import export_graphviz

# Load the dataset and separate the features from the class labels
wine = load_wine()
x = wine.data
y = wine.target
# print(pd.DataFrame(x))
# print(pd.DataFrame(y))
print(wine.feature_names)


xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.3)
# Build the model
clf = tree.DecisionTreeClassifier()
clf.fit(xtrain, ytrain)
# Evaluate the model with 10-fold cross-validation
score = cross_val_score(clf, x, y, cv=10, scoring='accuracy')
print(np.mean(score))
print(clf.feature_importances_)

# Visualize the decision tree
# feature_name = ['酒精','苹果酸','灰','灰的碱性','镁','总酚','类黄酮','非黄烷类酚类','花青素','颜色强度','色调','od280/od315稀释葡萄酒','脯氨酸']
with open('./wine.dot', 'w', encoding='utf-8') as f:
    # Use the dataset's English feature names to avoid garbled labels
    export_graphviz(clf, feature_names=wine.feature_names, out_file=f)

The final generated decision tree looks like this:

[Figure 3: the generated decision tree]
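For reference, here is a compact variant of the same workflow, offered as a sketch rather than the author's method. It assumes a reasonably recent scikit-learn, fixes random_state so the tree is reproducible, scores the model on the held-out split, and passes out_file=None so that export_graphviz returns the DOT source as a string. The class_names and filled arguments are optional additions for a more readable picture.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz

wine = load_wine()
xtrain, xtest, ytrain, ytest = train_test_split(
    wine.data, wine.target, test_size=0.3, random_state=0
)

# random_state makes the fitted tree identical from run to run
clf = DecisionTreeClassifier(random_state=0)
clf.fit(xtrain, ytrain)

# Accuracy on the 30% of samples that were held out from training
test_accuracy = clf.score(xtest, ytest)
print(test_accuracy)

# With out_file=None the DOT source is returned as a string
dot_source = export_graphviz(
    clf,
    feature_names=wine.feature_names,
    class_names=list(wine.target_names),
    filled=True,
    out_file=None,
)

with open("wine.dot", "w", encoding="utf-8") as f:
    f.write(dot_source)
```

The resulting wine.dot can be opened in gvedit.exe exactly as described in the steps earlier.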

 

 

 

 

 

 

 
