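There are many machine learning algorithms, but they can be grouped into four categories: classification, regression, clustering, and dimensionality reduction. Which one applies depends on your target variable and your data modeling goals. We will save clustering and dimensionality reduction for another time and focus here on classification and regression. A continuous target variable calls for a regression algorithm, and a discrete target variable calls for a classification algorithm. Note that logistic regression, despite having "regression" in its name, is actually a classification algorithm. Our problem is predicting whether a passenger survived, which is a discrete target variable, so we will start our analysis with classification algorithms from the sklearn library. We will use cross-validation and scoring metrics, discussed in later sections, to rank and compare the performance of our algorithms.
Important: When it comes to data modeling, the beginner's question is always, "What is the best machine learning algorithm?" To answer it, the beginner must learn the No Free Lunch Theorem (NFLT) of machine learning. In short, NFLT states that there is no super algorithm that works best in all situations for all datasets. So the best approach is to try multiple MLAs, tune them, and compare them for your specific scenario. That said, some good research has been done to compare algorithms, such as Caruana & Niculescu-Mizil 2006 (including a video lecture on MLA comparisons), Ogutu et al. 2011 done by the NIH for genomic selection, Fernandez-Delgado et al. 2014 comparing 179 classifiers from 17 families, and the Thoma 2016 sklearn comparison. There is also a school of thought that says more data beats a better algorithm.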
# Machine Learning Algorithm (MLA) initialization
MLA = [
    # Ensemble Methods
    # Gaussian Processes
    # Naive Bayes
    # Nearest Neighbor
    # Trees
    # Discriminant Analysis
    # xgboost: http://xgboost.readthedocs.io/en/latest/model.html
    # (the estimator instances for each group go here; the class names evaluated are listed in the results table)
]
# split dataset in cross-validation with this splitter class: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.ShuffleSplit.html#sklearn.model_selection.ShuffleSplit
# note: this is an alternative to train_test_split
cv_split = model_selection.ShuffleSplit(n_splits=10, test_size=.3, train_size=.6,
                                        random_state=0)  # run model 10x with 60/30 split intentionally leaving out 10%
# create table to compare MLA metrics
MLA_columns = ['MLA Name', 'MLA Parameters', 'MLA Train Accuracy Mean', 'MLA Test Accuracy Mean', 'MLA Test Accuracy 3*STD', 'MLA Time']
MLA_compare = pd.DataFrame(columns=MLA_columns)
# create table to compare MLA predictions
MLA_predict = data1[Target].copy()  # copy so new columns are not written to a view of data1
# index through MLA and save performance to table
row_index = 0
for alg in MLA:
    # set name and parameters
    MLA_name = alg.__class__.__name__
    MLA_compare.loc[row_index, 'MLA Name'] = MLA_name
    MLA_compare.loc[row_index, 'MLA Parameters'] = str(alg.get_params())
    # score model with cross validation: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_validate.html#sklearn.model_selection.cross_validate
    # return_train_score=True is required for 'train_score' in recent scikit-learn versions, where it defaults to False
    cv_results = model_selection.cross_validate(alg, data1[data1_x_bin], data1[Target], cv=cv_split, return_train_score=True)
    MLA_compare.loc[row_index, 'MLA Time'] = cv_results['fit_time'].mean()
    MLA_compare.loc[row_index, 'MLA Train Accuracy Mean'] = cv_results['train_score'].mean()
    MLA_compare.loc[row_index, 'MLA Test Accuracy Mean'] = cv_results['test_score'].mean()
    # if this is an unbiased random sample (and the scores are roughly normal), mean +/- 3 standard deviations (std) should cover about 99.7% of the subsets
    MLA_compare.loc[row_index, 'MLA Test Accuracy 3*STD'] = cv_results['test_score'].std() * 3  # let's know the worst that can happen!
    # save MLA predictions - see section 6 for usage
    alg.fit(data1[data1_x_bin], data1[Target])
    MLA_predict[MLA_name] = alg.predict(data1[data1_x_bin])
    row_index += 1
# print and sort table: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort_values.html
MLA_compare.sort_values(by=['MLA Test Accuracy Mean'], ascending=False, inplace=True)
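To make the `cv_split` setting above more concrete, here is a minimal standard-library sketch of what a shuffle-split does: each of the 10 rounds reshuffles all row indices and carves out 30% for testing and 60% for training, leaving the remaining 10% unused. The function name `shuffle_split` and the rounding with `int()` are assumptions made for illustration; scikit-learn's `ShuffleSplit` may round the sizes slightly differently, and the row count of 891 is taken from the classification report later in this article.

```python
import random

def shuffle_split(n_samples, n_splits=10, test_size=0.3, train_size=0.6, seed=0):
    """Yield (train_idx, test_idx) pairs, each from an independent random shuffle."""
    rng = random.Random(seed)
    n_test = int(n_samples * test_size)    # simplified rounding (assumption)
    n_train = int(n_samples * train_size)
    for _ in range(n_splits):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        # first n_test indices form the test set, the next n_train the train set,
        # and whatever is left over is deliberately not used in this round
        yield idx[n_test:n_test + n_train], idx[:n_test]

splits = list(shuffle_split(891))
train_idx, test_idx = splits[0]
print(len(splits), len(train_idx), len(test_idx))  # 10 534 267
```

Unlike k-fold cross-validation, the test sets of different rounds may overlap, because every round starts from a fresh shuffle.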
Index | MLA Name | MLA Parameters | MLA Train Accuracy Mean | MLA Test Accuracy Mean | MLA Test Accuracy 3*STD | MLA Time
21 | XGBClassifier | {'base_score': 0.5, 'booster': 'gbtree', 'cols... | 0.856367 | 0.829478 | 0.0527546 | 0.0338062
4 | RandomForestClassifier | {'bootstrap': True, 'class_weight': None, 'cri... | 0.892322 | 0.826493 | 0.0679525 | 0.0147755
14 | SVC | {'C': 1.0, 'cache_size': 200, 'class_weight': ... | 0.837266 | 0.826119 | 0.0453876 | 0.0445107
3 | GradientBoostingClassifier | {'criterion': 'friedman_mse', 'init': None, 'l... | 0.866667 | 0.822761 | 0.0498731 | 0.0715864
15 | | | 0.835768 | 0.822761 | 0.0493681 | 0.0524707
2 | ExtraTreesClassifier | {'bootstrap': False, 'class_weight': None, 'cr... | 0.895131 | 0.821269 | 0.0690863 | 0.0144257
17 | DecisionTreeClassifier | {'class_weight': None, 'criterion': 'gini', 'm... | 0.895131 | 0.81903 | 0.0575704 | 0.00189724
1 | BaggingClassifier | {'base_estimator': None, 'bootstrap': True, 'b... | 0.890449 | 0.813806 | 0.0614041 | 0.0157245
13 | KNeighborsClassifier | {'algorithm': 'auto', 'leaf_size': 30, 'metric... | 0.850375 | 0.813806 | 0.0690863 | 0.00233798
18 | ExtraTreeClassifier | {'class_weight': None, 'criterion': 'gini', 'm... | 0.895131 | 0.812687 | 0.0634811 | 0.00160697
0 | AdaBoostClassifier | {'algorithm': 'SAMME.R', 'base_estimator': Non... | 0.820412 | 0.81194 | 0.0498606 | 0.072931
5 | GaussianProcessClassifier | {'copy_X_train': True, 'kernel': None, 'max_it... | 0.871723 | 0.810448 | 0.0492537 | 0.350273
20 | QuadraticDiscriminantAnalysis | {'priors': None, 'reg_param': 0.0, 'store_cova... | 0.821536 | 0.80709 | 0.0810389 | 0.0176577
8 | RidgeClassifierCV | {'alphas': (0.1, 1.0, 10.0), 'class_weight': N... | 0.796629 | 0.79403 | 0.0360302 | 0.0105472
19 | LinearDiscriminantAnalysis | {'n_components': None, 'priors': None, 'shrink... | 0.796816 | 0.79403 | 0.0360302 | 0.00550387
16 | LinearSVC | {'C': 1.0, 'class_weight': None, 'dual': True,... | 0.79794 | 0.793657 | 0.0400646 | 0.0274618
6 | LogisticRegressionCV | {'Cs': 10, 'class_weight': None, 'cv': None, '... | 0.797004 | 0.790672 | 0.0653582 | 0.129134
12 | GaussianNB | {'priors': None} | 0.794757 | 0.781343 | 0.0874568 | 0.00183613
11 | BernoulliNB | {'alpha': 1.0, 'binarize': 0.0, 'class_prior':... | 0.785768 | 0.775373 | 0.0570347 | 0.00200269
7 | PassiveAggressiveClassifier | {'C': 1.0, 'average': False, 'class_weight': N... | 0.734457 | 0.730597 | 0.148826 | 0.00238907
10 | Perceptron | {'alpha': 0.0001, 'class_weight': None, 'eta0'... | 0.740075 | 0.728731 | 0.162221 | 0.00185683
9 | SGDClassifier | {'alpha': 0.0001, 'average': False, 'class_wei... | 0.737079 | 0.726119 | 0.17372 | 0.00182471
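The "3*STD" column is easier to read with a small worked example. The sketch below uses only the standard library and a made-up list of ten per-split test accuracies (hypothetical values, not taken from the table) to show how a mean and a 3-standard-deviation band are derived. The population standard deviation is used because NumPy's `std()`, which produces the column above, defaults to that form. The 99.7% coverage figure only holds if the scores are roughly normally distributed.

```python
import statistics

# Hypothetical per-split test accuracies, for illustration only
test_scores = [0.82, 0.84, 0.80, 0.83, 0.81, 0.85, 0.79, 0.82, 0.83, 0.81]

mean = statistics.mean(test_scores)
std = statistics.pstdev(test_scores)  # population std, like NumPy's default (ddof=0)
low, high = mean - 3 * std, mean + 3 * std

print('mean={:.4f}  3*std={:.4f}  band=[{:.4f}, {:.4f}]'.format(mean, 3 * std, low, high))
```

Read this way, a model with a high mean but a wide band (such as the linear models at the bottom of the table) is less dependable than one with a slightly lower mean and a narrow band.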
#barplot using https://seaborn.pydata.org/generated/seaborn.barplot.html
sns.barplot(x='MLA Test Accuracy Mean', y = 'MLA Name', data = MLA_compare, color = 'm')
#prettify using pyplot: https://matplotlib.org/api/pyplot_api.html
plt.title('Machine Learning Algorithm Accuracy Score \n')
plt.xlabel('Accuracy Score (%)')
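Our accuracy is increasing, but can we do better? Are there any signals in our data? To illustrate this, we are going to build our own decision tree model, because it is the easiest to conceptualize and requires only simple addition and multiplication. When creating a decision tree, you want to ask questions that segment your target response, placing survived/1 and dead/0 into homogeneous subgroups. This is part science and part art, so let's play the 21-question game to show how it works. If you want to follow along on your own, download the train dataset and import it into Excel. Create a pivot table with Survived in the columns, count and % of row count in the values, and the features described below in the rows.
Remember, the name of the game is to use a decision tree model to create subgroups so that survived/1 lands in one bucket and dead/0 in another. Our rule of thumb is majority rule: if the majority, or 50% or more, survived, then everybody in the subgroup is predicted survived/1; if 50% or fewer survived, then everybody in the subgroup is predicted dead/0. In addition, we stop if a subgroup is smaller than 10 and/or our model accuracy plateaus or decreases. Got it? Let's go!
Question 3A (going down the female branch, count = 314): Are you in class 1, 2, or 3? In class 1 the majority (97%) survived, and in class 2 the majority (92%) survived. Since the dead subgroup is less than 10%, we stop going down this branch. Class 3 is an even 50-50 split, so no new information is gained to improve our model.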
#IMPORTANT: This is a handmade model for learning purposes only.
#However, it is possible to create your own predictive model without a fancy algorithm :)
import random
#coin flip model with random 1/survived 0/died
#iterate over dataFrame rows as (index, Series) pairs: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iterrows.html
for index, row in data1.iterrows():
    #random number generator: https://docs.python.org/2/library/random.html
    if random.random() > .5:     # Random float x, 0.0 <= x < 1.0
        data1.loc[index, 'Random_Predict'] = 1 #predict survived/1 (.loc replaces set_value, which newer pandas removed)
    else:
        data1.loc[index, 'Random_Predict'] = 0 #predict died/0
#score random guess of survival. Use shortcut 1 = Right Guess and 0 = Wrong Guess
#the mean of the column will then equal the accuracy
data1['Random_Score'] = 0 #assume prediction wrong
data1.loc[(data1['Survived'] == data1['Random_Predict']), 'Random_Score'] = 1 #set to 1 for correct prediction
print('Coin Flip Model Accuracy: {:.2f}%'.format(data1['Random_Score'].mean()*100))
#we can also use scikit's accuracy_score function to save us a few lines of code
print('Coin Flip Model Accuracy w/SciKit: {:.2f}%'.format(metrics.accuracy_score(data1['Survived'], data1['Random_Predict'])*100))
Coin Flip Model Accuracy: 47.81%
Coin Flip Model Accuracy w/SciKit: 47.81%
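A coin flip is one baseline; another common one is to always predict the majority class. The short sketch below is an illustration rather than part of the original notebook. It uses the class counts that appear in the classification report later in this article (549 died, 342 survived) and shows the accuracy obtained by answering "died" for everyone.

```python
# Class counts from the training data: 549 died (0), 342 survived (1)
died, survived = 549, 342
total = died + survived

# Always predicting the majority class is correct for every member of that class
majority_accuracy = max(died, survived) / total
print('Majority-Class Model Accuracy: {:.2f}%'.format(majority_accuracy * 100))  # 61.62%
```

Any model worth keeping should beat this figure, not just the roughly 50% of a coin flip.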
#group by or pivot table: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html
pivot_female = data1[data1.Sex=='female'].groupby(['Sex','Pclass', 'Embarked','FareBin'])['Survived'].mean()
print('Survival Decision Tree w/Female Node: \n',pivot_female)
pivot_male = data1[data1.Sex=='male'].groupby(['Sex','Title'])['Survived'].mean()
print('\n\nSurvival Decision Tree w/Male Node: \n',pivot_male)
Survival Decision Tree w/Female Node:
Sex Pclass Embarked FareBin
female 1 C (14.454, 31.0] 0.666667
(31.0, 512.329] 1.000000
Q (31.0, 512.329] 1.000000
S (14.454, 31.0] 1.000000
(31.0, 512.329] 0.955556
2 C (7.91, 14.454] 1.000000
(14.454, 31.0] 1.000000
(31.0, 512.329] 1.000000
Q (7.91, 14.454] 1.000000
S (7.91, 14.454] 0.875000
(14.454, 31.0] 0.916667
(31.0, 512.329] 1.000000
3 C (-0.001, 7.91] 1.000000
(7.91, 14.454] 0.428571
(14.454, 31.0] 0.666667
Q (-0.001, 7.91] 0.750000
(7.91, 14.454] 0.500000
(14.454, 31.0] 0.714286
S (-0.001, 7.91] 0.533333
(7.91, 14.454] 0.448276
(14.454, 31.0] 0.357143
(31.0, 512.329] 0.125000
Name: Survived, dtype: float64
Survival Decision Tree w/Male Node:
Sex Title
male Master 0.575000
Misc 0.250000
Mr 0.156673
Name: Survived, dtype: float64
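The majority rule can be applied mechanically to the pivot output above. The following sketch copies the third-class female survival rates from that output into a plain dictionary and flags the subgroups whose rate falls below 0.5. The helper name `majority_vote` is hypothetical, and treating an exact 0.5 as survived/1 is an assumption, since the rule of thumb leaves ties open.

```python
# Survival rates copied from the female pivot output above (Pclass 3 only)
female_class3 = {
    ('C', '(-0.001, 7.91]'): 1.000000,
    ('C', '(7.91, 14.454]'): 0.428571,
    ('C', '(14.454, 31.0]'): 0.666667,
    ('Q', '(-0.001, 7.91]'): 0.750000,
    ('Q', '(7.91, 14.454]'): 0.500000,
    ('Q', '(14.454, 31.0]'): 0.714286,
    ('S', '(-0.001, 7.91]'): 0.533333,
    ('S', '(7.91, 14.454]'): 0.448276,
    ('S', '(14.454, 31.0]'): 0.357143,
    ('S', '(31.0, 512.329]'): 0.125000,
}

def majority_vote(rate, threshold=0.5):
    """Predict survived/1 when at least `threshold` of the subgroup survived (tie handling is an assumption)."""
    return 1 if rate >= threshold else 0

# Subgroups where the female default of survived/1 should flip back to died/0
flip_to_died = [key for key, rate in female_class3.items() if majority_vote(rate) == 0]
print(flip_to_died)
```

Three of the four flagged subgroups embarked at S with a fare above the lowest bin, which is the pattern the handmade model condenses into a single rule (class 3, embarked S, Fare > 8).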
#handmade data model using brain power (and Microsoft Excel Pivot Tables for quick calculations)
def mytree(df):
    #initialize table to store predictions
    Model = pd.DataFrame(data = {'Predict':[]})
    male_title = ['Master'] #survived titles
    for index, row in df.iterrows():
        #Question 1: Were you on the Titanic; majority died
        Model.loc[index, 'Predict'] = 0
        #Question 2: Are you female; majority survived
        if (df.loc[index, 'Sex'] == 'female'):
            Model.loc[index, 'Predict'] = 1
        #Question 3A Female - Class and Question 4 Embarked gain minimum information
        #Question 5B Female - FareBin; set anything less than .5 in female node decision tree back to 0
        if ((df.loc[index, 'Sex'] == 'female') &
            (df.loc[index, 'Pclass'] == 3) &
            (df.loc[index, 'Embarked'] == 'S') &
            (df.loc[index, 'Fare'] > 8)
           ):
            Model.loc[index, 'Predict'] = 0
        #Question 3B Male: Title; set anything greater than .5 to 1 for majority survived
        if ((df.loc[index, 'Sex'] == 'male') &
            (df.loc[index, 'Title'] in male_title)
           ):
            Model.loc[index, 'Predict'] = 1
    return Model
#model data
Tree_Predict = mytree(data1)
print('Decision Tree Model Accuracy/Precision Score: {:.2f}%\n'.format(metrics.accuracy_score(data1['Survived'], Tree_Predict)*100))
#Accuracy Summary Report with http://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html#sklearn.metrics.classification_report
#Where recall score = (true positives)/(true positive + false negative) w/1 being best:http://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score
#And F1 score = harmonic mean of precision and recall w/1 being best: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score
print(metrics.classification_report(data1['Survived'], Tree_Predict))
Decision Tree Model Accuracy/Precision Score: 82.04%
precision recall f1-score support
0 0.82 0.91 0.86 549
1 0.82 0.68 0.75 342
avg / total 0.82 0.82 0.82 891
#Plot Accuracy Summary
#Credit: http://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html
import itertools
def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    if normalize:
        #np.newaxis turns the row totals into a column so each row is divided by its own total
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')
    print(cm)
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
# Compute confusion matrix
cnf_matrix = metrics.confusion_matrix(data1['Survived'], Tree_Predict)
np.set_printoptions(precision=2)
class_names = ['Dead', 'Survived']
# Plot non-normalized confusion matrix
plt.figure()
plot_confusion_matrix(cnf_matrix, classes=class_names,
                      title='Confusion matrix, without normalization')
# Plot normalized confusion matrix
plt.figure()
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')
Confusion matrix, without normalization
[[497 52]
[108 234]]
Normalized confusion matrix
[[ 0.91 0.09]
[ 0.32 0.68]]
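The numbers in the classification report can be recovered by hand from the confusion matrix above. The sketch below does that arithmetic for the survived/1 class with plain Python; the variable names are chosen for illustration, and the row/column layout follows scikit-learn's convention of true labels in rows and predicted labels in columns.

```python
# Confusion matrix printed above: rows are true labels, columns are predicted labels
tn, fp = 497, 52    # true Dead:     predicted Dead, predicted Survived
fn, tp = 108, 234   # true Survived: predicted Dead, predicted Survived

accuracy = (tp + tn) / (tp + tn + fp + fn)           # share of all passengers predicted correctly
precision = tp / (tp + fp)                           # of those predicted survived, how many did
recall = tp / (tp + fn)                              # of those who survived, how many were found
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of precision and recall

print('accuracy={:.4f} precision={:.2f} recall={:.2f} f1={:.2f}'.format(accuracy, precision, recall, f1))
# accuracy=0.8204 precision=0.82 recall=0.68 f1=0.75
```

The recall of 0.68 is the same value as the bottom-right cell of the normalized matrix: the handmade tree misses about a third of the actual survivors.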
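Some advantages of decision trees are:
1. Simple to understand and to interpret. Trees can be visualized.
2. Requires little data preparation. Other techniques often require data normalization, creation of dummy variables, and removal of blank values. Note, however, that the decision trees used here do not support missing values.
3. The cost of using the tree (i.e., predicting data) is logarithmic in the number of data points used to train the tree.
4. Able to handle both numerical and categorical data. Other techniques are usually specialized for datasets that have only one type of variable. See the algorithms for more information.
5. Able to handle multi-output problems.
6. Uses a white box model. If a given situation is observable in the model, the condition is easily explained with boolean logic. By contrast, in a black box model (e.g., an artificial neural network), results may be harder to interpret.
7. Possible to validate a model using statistical tests. That makes it possible to account for the reliability of the model.
8. Performs well even if its assumptions are somewhat violated by the true model from which the data were generated.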
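The disadvantages of decision trees include:
1. Decision tree learners can create over-complex trees that do not generalize the data well. This is called overfitting. Mechanisms such as pruning (not currently supported), setting the minimum number of samples required at a leaf node, or setting the maximum depth of the tree are necessary to avoid this problem.
2. Decision trees can be unstable, because small variations in the data might result in a completely different tree being generated. This problem is mitigated by using decision trees within an ensemble.
3. The problem of learning an optimal decision tree is known to be NP-complete under several aspects of optimality, even for simple concepts. Consequently, practical decision tree learning algorithms are based on heuristics such as the greedy algorithm, where a locally optimal decision is made at each node. Such algorithms cannot guarantee to return the globally optimal decision tree. This can be mitigated by training multiple trees in an ensemble learner, where the features and samples are randomly sampled with replacement.
4. Some concepts are hard to learn because decision trees do not express them easily, such as XOR, parity, or multiplexer problems.
5. Decision tree learners create biased trees if some classes dominate. It is therefore recommended to balance the dataset prior to fitting the decision tree.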
class sklearn.tree.DecisionTreeClassifier(criterion='gini', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, class_weight=None, presort=False)
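The `criterion` parameter in the signature above selects how the quality of a split is measured. As a rough illustration (a sketch with the standard textbook formulas, not scikit-learn's internal code), the two supported options can be computed from the class counts in a node. The example node uses the overall counts of 549 died and 342 survived that appear in the classification report.

```python
import math

def gini(counts):
    """Gini impurity: 1 - sum(p_k^2); 0 means the node is pure."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def entropy(counts):
    """Shannon entropy in bits: -sum(p_k * log2(p_k)); empty classes are skipped."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

root = [549, 342]  # died, survived at the root of the tree
print('gini={:.4f} entropy={:.4f}'.format(gini(root), entropy(root)))
```

A split is considered good when the weighted impurity of the child nodes is clearly lower than that of the parent; both measures are 0 for a pure node and largest for a 50-50 node.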
#base model
dtree = tree.DecisionTreeClassifier(random_state = 0)
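base_results = model_selection.cross_validate(dtree, data1[data1_x_bin], data1[Target], cv = cv_split, return_train_score = True) #return_train_score is needed for 'train_score' in recent scikit-learn versions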
dtree.fit(data1[data1_x_bin], data1[Target])
print('BEFORE DT Parameters: ', dtree.get_params())
print("BEFORE DT Training w/bin score mean: {:.2f}". format(base_results['train_score'].mean()*100))
print("BEFORE DT Test w/bin score mean: {:.2f}". format(base_results['test_score'].mean()*100))
print("BEFORE DT Test w/bin score 3*std: +/- {:.2f}". format(base_results['test_score'].std()*100*3))
#print("BEFORE DT Test w/bin set score min: {:.2f}". format(base_results['test_score'].min()*100))
#tune hyper-parameters: http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier
param_grid = {'criterion': ['gini', 'entropy'],  #scoring methodology; two supported formulas for calculating information gain - default is gini
              #'splitter': ['best', 'random'], #splitting methodology; two supported strategies - default is best
              'max_depth': [2,4,6,8,10,None], #max depth tree can grow; default is none
              #'min_samples_split': [2,5,10,.03,.05], #minimum subset size BEFORE new split (fraction is % of total); default is 2
              #'min_samples_leaf': [1,5,10,.03,.05], #minimum subset size AFTER new split (fraction is % of total); default is 1
              #'max_features': [None, 'auto'], #max features to consider when performing split; default none or all
              'random_state': [0] #seed or control random number generator: https://www.quora.com/What-is-seed-in-random-number-generation
             }
#choose best model with grid_search: #http://scikit-learn.org/stable/modules/grid_search.html#grid-search
#note: scoring is 'roc_auc' here while the base model above reports the default accuracy, so the BEFORE and AFTER numbers are on different metrics
#return_train_score=True is needed for 'mean_train_score' in recent scikit-learn versions
tune_model = model_selection.GridSearchCV(tree.DecisionTreeClassifier(), param_grid=param_grid, scoring = 'roc_auc', cv = cv_split, return_train_score = True)
tune_model.fit(data1[data1_x_bin], data1[Target])
print('AFTER DT Parameters: ', tune_model.best_params_)
print("AFTER DT Training w/bin score mean: {:.2f}". format(tune_model.cv_results_['mean_train_score'][tune_model.best_index_]*100))
print("AFTER DT Test w/bin score mean: {:.2f}". format(tune_model.cv_results_['mean_test_score'][tune_model.best_index_]*100))
print("AFTER DT Test w/bin score 3*std: +/- {:.2f}". format(tune_model.cv_results_['std_test_score'][tune_model.best_index_]*100*3))
#duplicates gridsearchcv
#tune_results = model_selection.cross_validate(tune_model, data1[data1_x_bin], data1[Target], cv = cv_split)
#print('AFTER DT Parameters: ', tune_model.best_params_)
#print("AFTER DT Training w/bin set score mean: {:.2f}". format(tune_results['train_score'].mean()*100))
#print("AFTER DT Test w/bin set score mean: {:.2f}". format(tune_results['test_score'].mean()*100))
#print("AFTER DT Test w/bin set score min: {:.2f}". format(tune_results['test_score'].min()*100))
BEFORE DT Parameters: {'class_weight': None, 'criterion': 'gini', 'max_depth': None, 'max_features': None, 'max_leaf_nodes': None, 'min_impurity_decrease': 0.0, 'min_impurity_split': None, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'presort': False, 'random_state': 0, 'splitter': 'best'}
BEFORE DT Training w/bin score mean: 89.51
BEFORE DT Test w/bin score mean: 82.09
BEFORE DT Test w/bin score 3*std: +/- 5.57
AFTER DT Parameters: {'criterion': 'gini', 'max_depth': 4, 'random_state': 0}
AFTER DT Training w/bin score mean: 89.35
AFTER DT Test w/bin score mean: 87.40
AFTER DT Test w/bin score 3*std: +/- 5.00
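It helps to know how much work a grid search does before launching one. The sketch below enumerates the active (uncommented) entries of the grid used above with `itertools.product` and multiplies by the 10 shuffle-split rounds. It is a standard-library illustration of the counting only, not a replacement for `GridSearchCV`.

```python
import itertools

# Active entries of the parameter grid used above
param_grid = {
    'criterion': ['gini', 'entropy'],
    'max_depth': [2, 4, 6, 8, 10, None],
    'random_state': [0],
}

keys = list(param_grid)
# Every combination of one value per key is one candidate model
candidates = [dict(zip(keys, values))
              for values in itertools.product(*(param_grid[k] for k in keys))]

n_splits = 10  # from the ShuffleSplit configuration
print(len(candidates), 'candidates x', n_splits, 'splits =', len(candidates) * n_splits, 'fits')
# 12 candidates x 10 splits = 120 fits
```

Uncommenting the other grid entries multiplies the candidate count quickly, which is why they are left disabled here.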
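As stated at the beginning, more predictor variables do not make a better model, but the right predictors do. So another step in data modeling is feature selection. Sklearn has several options; we will use recursive feature elimination (RFE) with cross validation (CV).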
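The elimination loop itself is simple, and a toy version may make it easier to follow. The sketch below is a simplified illustration and not scikit-learn's implementation: it repeatedly ranks the remaining features and drops the weakest one until a fixed number remain. The function names and the importance values are hypothetical; in RFECV the ranking comes from refitting the estimator each round, and the number of features to keep is chosen by cross-validated score rather than fixed in advance.

```python
def recursive_feature_elimination(features, importance_fn, n_keep):
    """Drop the least important feature, re-rank, and repeat until n_keep remain."""
    remaining = list(features)
    while len(remaining) > n_keep:
        scores = importance_fn(remaining)  # stands in for refitting on the current subset
        weakest = min(remaining, key=lambda f: scores[f])
        remaining.remove(weakest)
    return remaining

# Hypothetical fixed importances standing in for a fitted model's feature importances
toy_importance = {'Sex': 0.50, 'Pclass': 0.20, 'Title': 0.15, 'FareBin': 0.10, 'Embarked': 0.05}

def toy_importance_fn(subset):
    return {f: toy_importance[f] for f in subset}

selected = recursive_feature_elimination(list(toy_importance), toy_importance_fn, n_keep=3)
print(selected)  # ['Sex', 'Pclass', 'Title']
```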
#base model
print('BEFORE DT RFE Training Shape Old: ', data1[data1_x_bin].shape)
print('BEFORE DT RFE Training Columns Old: ', data1[data1_x_bin].columns.values)
print("BEFORE DT RFE Training w/bin score mean: {:.2f}". format(base_results['train_score'].mean()*100))
print("BEFORE DT RFE Test w/bin score mean: {:.2f}". format(base_results['test_score'].mean()*100))
print("BEFORE DT RFE Test w/bin score 3*std: +/- {:.2f}". format(base_results['test_score'].std()*100*3))
#feature selection
dtree_rfe = feature_selection.RFECV(dtree, step = 1, scoring = 'accuracy', cv = cv_split)
dtree_rfe.fit(data1[data1_x_bin], data1[Target])
#transform x&y to reduced features and fit new model
#alternative: can use pipeline to reduce fit and transform steps: http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
X_rfe = data1[data1_x_bin].columns.values[dtree_rfe.get_support()]
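rfe_results = model_selection.cross_validate(dtree, data1[X_rfe], data1[Target], cv = cv_split, return_train_score = True) #return_train_score is needed for 'train_score' in recent scikit-learn versions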
print('AFTER DT RFE Training Shape New: ', data1[X_rfe].shape)
print('AFTER DT RFE Training Columns New: ', X_rfe)
print("AFTER DT RFE Training w/bin score mean: {:.2f}". format(rfe_results['train_score'].mean()*100))
print("AFTER DT RFE Test w/bin score mean: {:.2f}". format(rfe_results['test_score'].mean()*100))
print("AFTER DT RFE Test w/bin score 3*std: +/- {:.2f}". format(rfe_results['test_score'].std()*100*3))
#tune rfe model
#return_train_score=True is needed for 'mean_train_score' in recent scikit-learn versions
rfe_tune_model = model_selection.GridSearchCV(tree.DecisionTreeClassifier(), param_grid=param_grid, scoring = 'roc_auc', cv = cv_split, return_train_score = True)
rfe_tune_model.fit(data1[X_rfe], data1[Target])
print('AFTER DT RFE Tuned Parameters: ', rfe_tune_model.best_params_)
#index the results with rfe_tune_model.best_index_ (not tune_model.best_index_) so the scores belong to this search
print("AFTER DT RFE Tuned Training w/bin score mean: {:.2f}". format(rfe_tune_model.cv_results_['mean_train_score'][rfe_tune_model.best_index_]*100))
print("AFTER DT RFE Tuned Test w/bin score mean: {:.2f}". format(rfe_tune_model.cv_results_['mean_test_score'][rfe_tune_model.best_index_]*100))
print("AFTER DT RFE Tuned Test w/bin score 3*std: +/- {:.2f}". format(rfe_tune_model.cv_results_['std_test_score'][rfe_tune_model.best_index_]*100*3))
#Graph MLA version of Decision Tree: http://scikit-learn.org/stable/modules/generated/sklearn.tree.export_graphviz.html
import graphviz
dot_data = tree.export_graphviz(dtree, out_file=None,
feature_names = data1_x_bin, class_names = True,
filled = True, rounded = True)
graph = graphviz.Source(dot_data)