Decision Tree Learning

1.0 Theory

Conditional entropy

Information gain

Information gain ratio
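The three quantities above can be made concrete with a small sketch. This is a minimal pure-Python illustration using the standard textbook definitions: entropy H(Y), conditional entropy H(Y|X), information gain H(Y) - H(Y|X), and gain ratio (the gain divided by the entropy of the feature itself). The function names and the toy data are made up for this example and are not part of scikit-learn.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy H(Y), in bits, of a sequence of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def conditional_entropy(feature, labels):
    """H(Y | X): label entropy inside each feature value, weighted by how often that value occurs."""
    total = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    return sum(len(ys) / total * entropy(ys) for ys in groups.values())


def information_gain(feature, labels):
    """How much knowing the feature reduces the entropy of the labels."""
    return entropy(labels) - conditional_entropy(feature, labels)


def information_gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    if split_info == 0:  # a feature with a single value cannot split anything
        return 0.0
    return information_gain(feature, labels) / split_info


# Toy data (made up for illustration): the feature separates the labels perfectly.
weather = ["sunny", "sunny", "rainy", "rainy"]
play = ["yes", "yes", "no", "no"]
print(information_gain(weather, play), information_gain_ratio(weather, play))
```

A feature that tells you nothing about the label (for example one that alternates "a", "b", "a", "b" against the same labels) gives an information gain of zero, which is why a tree builder would prefer the first feature.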

2.0 sklearn.tree

First off, the getting-started code given on http://scikit-learn.org has a problem...

from sklearn.datasets import load_iris
from sklearn import tree
from sklearn.externals.six import StringIO
import pydot

dot_data = StringIO()
iris = load_iris()
clf = tree.DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)
tree.export_graphviz(clf, out_file=dot_data)
graph = pydot.graph_from_dot_data(dot_data.getvalue())
graph.write_pdf("iris.pdf")

Pasted in exactly like that, the first error it throws is:

AttributeError: 'list' object has no attribute 'write_pdf'

Which left me deep in thought...

Then Stack Overflow told me that pydot has since been updated, and that I should use the "plus" version (pydotplus) instead...
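The 'list' in the error message suggests that, in the installed pydot version, graph_from_dot_data returned a list of graphs rather than a single graph object. A minimal sketch of a workaround, assuming that really is the cause: unwrap the first element before calling write_pdf. The helper name first_graph is made up for this example and is not part of pydot.

```python
def first_graph(parsed):
    """Return one graph object, whether the parser gave back a graph or a list of graphs."""
    if isinstance(parsed, (list, tuple)):
        if not parsed:
            raise ValueError("the DOT source did not contain any graph")
        return parsed[0]
    return parsed


# Intended use with the snippet above (needs pydot and GraphViz installed):
#   graph = first_graph(pydot.graph_from_dot_data(dot_data.getvalue()))
#   graph.write_pdf("iris.pdf")
```

Written this way, the same line works whether the parser hands back a single graph or a list.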

So, without further ado, on to pydotplus!

Sure enough, the error changed! (I knew it wouldn't go that smoothly...)

InvocationException: GraphViz's executables not found

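Back to Google in a hurry, and this time Stack Overflow told me: kid, you either haven't installed GraphViz or haven't added it to your environment variables (PATH)!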

Oh, I see! Time to install GraphViz~

Look up a GraphViz installation guide, install it, and restart the IDE.
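Before rerunning the script, it can help to confirm that Python can actually see GraphViz. The sketch below assumes the missing executable is `dot` (GraphViz's layout program) and uses only the standard library. The helper names are made up for this example, and the install directory in the commented usage is a hypothetical location, so substitute wherever GraphViz ended up on your machine.

```python
import os
import shutil


def find_dot():
    """Return the full path of GraphViz's `dot` executable, or None if it is not on PATH."""
    return shutil.which("dot")


def add_to_path(directory):
    """Append a directory to PATH for the current Python process only."""
    paths = [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    if directory not in paths:
        os.environ["PATH"] = os.pathsep.join(paths + [directory])


if find_dot() is None:
    print("dot not found on PATH; install GraphViz or add its bin directory to PATH")
else:
    print("dot found at", find_dot())

# Hypothetical Windows install location, adjust to your own:
#   add_to_path(r"C:\Program Files\Graphviz\bin")
```

Changing PATH from inside Python only affects the running process, which is handy for a quick test; setting it in the system environment variables and restarting the IDE, as described above, makes it permanent.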

from sklearn.datasets import load_iris
from sklearn import tree
from io import StringIO  # sklearn.externals.six is no longer shipped with newer scikit-learn releases
from IPython.display import Image
import numpy as np
import pandas as pd
import os
import pydotplus

# Train a decision tree on the iris dataset
iris = load_iris()
test = tree.DecisionTreeClassifier()
test = test.fit(iris.data, iris.target)

# Export the tree as DOT text and parse it with pydotplus
dot_data = StringIO()
tree.export_graphviz(test, out_file=dot_data)
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())

Perfect~

Next up: figuring out how to actually render the figure....
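One possible way to get the figure, sketched under assumptions: save the DOT text to a file and let GraphViz's `dot` command-line tool render it. This route bypasses pydotplus and uses only the standard library. The function names and file names are made up for this example, and it presumes `dot` is on PATH.

```python
import shutil
import subprocess


def dot_command(dot_path, out_path, fmt="pdf"):
    """Build the GraphViz command line that renders a .dot file into out_path."""
    return ["dot", f"-T{fmt}", dot_path, "-o", out_path]


def render(dot_path, out_path, fmt="pdf"):
    """Run GraphViz. Returns False if `dot` is missing, True if rendering succeeded."""
    if shutil.which("dot") is None:
        return False
    subprocess.run(dot_command(dot_path, out_path, fmt), check=True)
    return True


# Intended use after the script above:
#   with open("iris.dot", "w") as f:
#       f.write(dot_data.getvalue())
#   render("iris.dot", "iris.pdf")
```

Alternatively, calling graph.write_pdf("iris.pdf") on the pydotplus graph, as the very first snippet tried to do with pydot, should work once GraphViz can be found.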
