Da Shixiong's Data Analysis Study Notes (32): Model Evaluation (Part 1)


I. Evaluating classification models

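1. Binary classification
  • Binary classification is classification in which the label takes one of two classes; it is a common task type in data mining.
  • The class of greater interest is usually defined as the positive class and encoded as 1.
  • The other class is defined as the negative class and encoded as 0.
  • Sometimes the model does not output 0 or 1 directly but the probability that a sample belongs to the positive class.
  • In that case a threshold (for example 0.5) must be chosen: a probability above the threshold is labeled 1, otherwise 0.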
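The thresholding step described above can be sketched as follows. The helper name `to_labels` and the probabilities are made up for illustration; the only rule taken from the text is "above the threshold is 1, otherwise 0".

```python
import numpy as np

def to_labels(proba, threshold=0.5):
    """Map positive-class probabilities to labels: 1 if proba > threshold, else 0."""
    return (np.asarray(proba) > threshold).astype(int)

# Made-up model outputs (probability of the positive class).
proba = np.array([0.91, 0.50, 0.35, 0.72])
labels = to_labels(proba)
print(labels)  # [1 0 0 1]  (0.50 is not strictly above the threshold)
```

Raising or lowering `threshold` trades missed positives against false alarms, which is why the choice of threshold matters for the metrics discussed later.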
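2. Confusion matrix
  • Comparing the true classes of the test set with the model's final predictions gives four possible outcomes:
Name                                  Actual  Predicted
TP (True Positive)                    1       1
FN (False Negative), a miss           1       0
FP (False Positive), a false alarm    0       1
TN (True Negative)                    0       0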
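  • Counting the samples that fall into each of the four outcomes above and arranging the counts as a matrix gives the confusion matrix:
/ 0 1
0 Y_00 Y_01
1 Y_10 Y_11
  • Each row of the matrix corresponds to an actual class.
  • Each column of the matrix corresponds to a predicted class.
  • Placing the four outcomes at their positions in the matrix gives:
/ 0 1
0 TN FP
1 FN TP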
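  • The entries on the diagonal are the correctly classified samples.
  • The entries off the diagonal are the misclassified samples.
  • An ideal model therefore produces a diagonal matrix; failing that, a model is still acceptable when the diagonal entries account for the large majority of the total.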
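As a minimal sketch of the layout described above (rows are actual classes, columns are predicted classes), the matrix can be counted by hand. The function name `confusion_matrix_2x2` and the labels are made up for illustration.

```python
import numpy as np

def confusion_matrix_2x2(y_true, y_pred):
    """Count (actual, predicted) pairs; rows = actual class, columns = predicted class."""
    cm = np.zeros((2, 2), dtype=int)
    for actual, predicted in zip(y_true, y_pred):
        cm[actual, predicted] += 1
    return cm

# Made-up true labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

cm = confusion_matrix_2x2(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()  # row-major order: TN, FP, FN, TP
print(cm)
```

`sklearn.metrics.confusion_matrix` uses the same row/column convention, so in practice it can replace the hand-written loop.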
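3. Key metrics
  • The following key metrics can be computed from the confusion matrix:
  • Accuracy (Accuracy Rate): (TP + TN) / (TP + FN + FP + TN), the share of all samples that are classified correctly.
  • Recall (True Positive Rate, TPR): TP / (TP + FN), the share of actual positives that the model finds.
  • Precision: TP / (TP + FP), the share of predicted positives that are actually positive.
  • F-measure: the harmonic mean of precision and recall, 2 × Precision × Recall / (Precision + Recall), which balances the two in a single number.
  • False acceptance rate (FPR, False Positive Rate): FP / (FP + TN), the share of actual negatives that are wrongly accepted as positive.
  • False rejection rate (FRR): FN / (TP + FN), the share of actual positives that are wrongly rejected; it equals 1 - Recall.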
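The metrics listed above can be computed directly from the four counts. This is a sketch with made-up counts; the function name `binary_metrics` is hypothetical, and it assumes every denominator is non-zero.

```python
def binary_metrics(tp, fn, fp, tn):
    """Key metrics from confusion-matrix counts (assumes no denominator is zero)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "recall": recall,
        "precision": precision,
        "f_measure": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),   # false acceptance rate
        "frr": fn / (tp + fn),   # false rejection rate, equals 1 - recall
    }

# Made-up counts for illustration.
m = binary_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts accuracy is 0.7 and recall is 0.8, while precision is only about 0.667, which illustrates why a single metric rarely tells the whole story.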
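4. Multiclass confusion matrix
  • Unlike binary classification, every class in a multiclass problem is of interest.
  • A multiclass problem also has a confusion matrix, and its diagonal again holds the correctly classified counts.
/ 0 1 2
0 Y_00 Y_01 Y_02
1 Y_10 Y_11 Y_12
2 Y_20 Y_21 Y_22
  • Accuracy: computed the same way as in the binary case, i.e. the share of samples on the diagonal.
  • Recall / F-measure can be computed in one of two ways:
  1. Pool the TP, FN, etc. over all classes first, then apply the binary formulas (micro-averaging).
  2. Treat each class in turn as the positive class, compute a recall / F-measure for each, then take the weighted or unweighted mean (weighted or macro-averaging).
  • If the model outputs class probabilities rather than hard labels, ROC and AUC can be used.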
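The two averaging strategies for recall listed above can be sketched on a made-up 3×3 confusion matrix (rows are actual classes, columns are predicted classes). The matrix values are illustrative only.

```python
import numpy as np

# Made-up multiclass confusion matrix.
cm = np.array([[5, 1, 0],
               [2, 6, 2],
               [0, 1, 3]])

tp = np.diag(cm)              # correct predictions per class
support = cm.sum(axis=1)      # number of actual samples per class
fn = support - tp             # samples of each class that were missed

# Strategy 1: pool the counts over all classes, then apply the binary formula.
micro_recall = tp.sum() / (tp.sum() + fn.sum())

# Strategy 2: one recall per class, then an unweighted or weighted mean.
per_class_recall = tp / (tp + fn)
macro_recall = per_class_recall.mean()
weighted_recall = np.average(per_class_recall, weights=support)

print(micro_recall, macro_recall, weighted_recall)
```

For single-label problems the pooled (micro) recall coincides with accuracy, and so does the support-weighted mean; only the unweighted (macro) mean treats small and large classes equally.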
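4.1 ROC
  • The ROC curve (Receiver Operating Characteristic curve) makes it easy to read off the classifier's discriminating power at any threshold.
  • First, sort the scores output by the model from highest to lowest.
  • Then take each score in turn as the threshold, compute the key metrics (FPR and TPR) at that threshold, plot the points with FPR on the x-axis and TPR on the y-axis, and join them into a curve.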
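The procedure above (sort by score, then sweep the threshold) can be sketched as below. The function name `roc_points` and the data are made up, and the sketch assumes there are no tied scores and that both classes are present; `sklearn.metrics.roc_curve` handles the general case.

```python
import numpy as np

def roc_points(y_true, scores):
    """ROC points obtained by lowering the threshold one sample at a time."""
    y_true = np.asarray(y_true)
    order = np.argsort(-np.asarray(scores), kind="stable")  # highest score first
    y_sorted = y_true[order]
    tps = np.cumsum(y_sorted)        # true positives when the top-k are labeled 1
    fps = np.cumsum(1 - y_sorted)    # false positives when the top-k are labeled 1
    tpr = np.concatenate([[0.0], tps / tps[-1]])
    fpr = np.concatenate([[0.0], fps / fps[-1]])
    return fpr, tpr

# Made-up labels and scores (4 positives, 4 negatives).
y_true = [1, 1, 0, 1, 1, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.3, 0.1]

fpr, tpr = roc_points(y_true, scores)
print(fpr)
print(tpr)
```

Each positive sample moves the curve up and each negative sample moves it right, so a model that ranks positives first hugs the top-left corner.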
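4.2 AUC
  • AUC (Area Under Curve) is defined as the area under the ROC curve, that is, the area between the curve and the x-axis.
  • Because the ROC curve usually lies above the line y = x, AUC usually falls between 0.5 and 1; a value below 0.5 means the model ranks samples worse than random guessing.
  • The closer AUC is to 1.0, the more reliable the classifier; at 0.5 it does no better than random guessing and has no practical value.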
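As a sketch of the definition above, the area can be computed with the trapezoidal rule. The ROC points and scores below are made-up toy values. The second calculation, counting how often a positive outscores a negative, is a well-known equivalent view of AUC and is included only as a cross-check.

```python
import numpy as np

# Toy ROC points (made-up), ordered by increasing FPR.
fpr = np.array([0.0, 0.0, 0.0, 0.25, 0.25, 0.25, 0.5, 0.75, 1.0])
tpr = np.array([0.0, 0.25, 0.5, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0])

# Trapezoidal rule: width of each step times the average height over the step.
auc_value = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

# Cross-check: share of (positive, negative) pairs in which the positive scores higher.
pos_scores = np.array([0.95, 0.9, 0.7, 0.6])
neg_scores = np.array([0.8, 0.5, 0.3, 0.1])
pairwise = float((pos_scores[:, None] > neg_scores[None, :]).mean())

print(auc_value, pairwise)  # 0.875 0.875
```

The pairwise view explains the 0.5 baseline: a model that scores samples at random puts a positive above a negative about half the time.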
4.3 Gain chart
  • A gain chart reflects the classifier's overall performance at a macro level.
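The text does not define the gain chart precisely, so the sketch below assumes one common definition, the cumulative gains curve: after sorting samples by score, it shows what share of all positives is captured within the top x% of samples. The function name `cumulative_gains` and the data are made up.

```python
import numpy as np

def cumulative_gains(y_true, scores):
    """Share of all positives captured within the top-k highest-scored samples."""
    y_true = np.asarray(y_true)
    order = np.argsort(-np.asarray(scores), kind="stable")  # highest score first
    captured = np.cumsum(y_true[order]) / y_true.sum()
    fraction_targeted = np.arange(1, len(y_true) + 1) / len(y_true)
    return fraction_targeted, captured

# Made-up labels and scores (4 positives, 4 negatives).
y_true = [1, 1, 0, 1, 1, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.3, 0.1]

fraction_targeted, captured = cumulative_gains(y_true, scores)
for x, y in zip(fraction_targeted, captured):
    print(f"top {x:.1%} of samples -> {y:.0%} of positives")
```

In this toy data the top half of the samples already contains 75% of the positives, whereas a random ordering would be expected to capture only about 50%.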
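4.4 KS chart
  • A KS chart uses the gap between TPR and FPR to show how well the classifier separates the positive class from the negative class.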
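A minimal sketch of the idea above: the KS value is commonly taken as the largest gap between TPR and FPR over all thresholds. The function name `ks_statistic` and the data are made up, and the sketch assumes no tied scores.

```python
import numpy as np

def ks_statistic(y_true, scores):
    """Largest TPR - FPR gap and the score threshold at which it occurs."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    order = np.argsort(-scores, kind="stable")  # highest score first
    y_sorted = y_true[order]
    tpr = np.cumsum(y_sorted) / y_sorted.sum()
    fpr = np.cumsum(1 - y_sorted) / (1 - y_sorted).sum()
    gap = tpr - fpr
    best = int(np.argmax(gap))
    return float(gap[best]), float(scores[order][best])

# Made-up labels and scores (4 positives, 4 negatives).
y_true = [1, 1, 0, 1, 1, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.3, 0.1]

ks, threshold = ks_statistic(y_true, scores)
print(ks, threshold)  # 0.75 0.6
```

Plotting `tpr` and `fpr` against the threshold position and marking the widest gap gives the KS chart itself; a larger gap means the two classes are better separated.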
5. Code implementation
>>> import os
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.metrics import roc_curve, auc, roc_auc_score
>>> from sklearn.preprocessing import StandardScaler
>>> from tensorflow.keras.models import Sequential
>>> from tensorflow.keras.layers import Dense, Activation
>>> from tensorflow.keras.optimizers import SGD

>>> df = pd.read_csv(os.path.join(".", "data", "WA_Fn-UseC_-HR-Employee-Attrition.csv"))

>>> # Feature: JobLevel as an (n, 1) column. Positive class (1): JobSatisfaction > 1.
>>> # The same binarization is used for both training and evaluation.
>>> X = df[["JobLevel"]].to_numpy(dtype="float32")
>>> Y = (df["JobSatisfaction"] > 1).astype(int).to_numpy()

>>> # Split into 60% training, 20% validation and 20% test.
>>> X_tt, X_validation, Y_tt, Y_validation = train_test_split(X, Y, test_size=0.2)
>>> X_train, X_test, Y_train, Y_test = train_test_split(X_tt, Y_tt, test_size=0.25)

>>> # Fit the scaler on the training set only, then apply it to every split.
>>> scaler = StandardScaler().fit(X_train)
>>> X_train, X_validation, X_test = [scaler.transform(x) for x in (X_train, X_validation, X_test)]

>>> mdl = Sequential()
>>> mdl.add(Dense(50))
>>> mdl.add(Activation("sigmoid"))
>>> mdl.add(Dense(2))
>>> mdl.add(Activation("softmax"))
>>> mdl.compile(loss="mean_squared_error", optimizer=SGD(learning_rate=0.05))
>>> # One-hot targets: column 0 = negative class, column 1 = positive class.
>>> mdl.fit(X_train, np.eye(2)[Y_train], epochs=50, batch_size=800)

>>> f = plt.figure()
>>> xy_lst = [(X_train, Y_train), (X_validation, Y_validation), (X_test, Y_test)]
>>> for i, (X_part, Y_part) in enumerate(xy_lst):
...     # Column 1 of the softmax output is the predicted probability of the positive class.
...     Y_pred = mdl.predict(X_part)[:, 1]
...     f.add_subplot(1, 3, i + 1)
...     fpr, tpr, threshold = roc_curve(Y_part, Y_pred)
...     plt.plot(fpr, tpr)
...     print("NN", "AUC", auc(fpr, tpr))
...     print("NN", "AUC_Score", roc_auc_score(Y_part, Y_pred))
...     print("=" * 40)
>>> plt.show()
Epoch 1/50
2/2 [==============================] - 0s 1ms/step - loss: 0.2929
Epoch 2/50
2/2 [==============================] - 0s 1ms/step - loss: 0.2168
Epoch 3/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1858
Epoch 4/50
2/2 [==============================] - 0s 1ms/step - loss: 0.1738
Epoch 5/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1673
Epoch 6/50
2/2 [==============================] - 0s 0s/step - loss: 0.1638
Epoch 7/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1623
Epoch 8/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1616
Epoch 9/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1613
Epoch 10/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1606
Epoch 11/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1602
Epoch 12/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1599
Epoch 13/50
2/2 [==============================] - 0s 1ms/step - loss: 0.1598
Epoch 14/50
2/2 [==============================] - 0s 0s/step - loss: 0.1596
Epoch 15/50
2/2 [==============================] - 0s 1ms/step - loss: 0.1596
Epoch 16/50
2/2 [==============================] - 0s 0s/step - loss: 0.1596
Epoch 17/50
2/2 [==============================] - 0s 1ms/step - loss: 0.1596
Epoch 18/50
2/2 [==============================] - 0s 0s/step - loss: 0.1596
Epoch 19/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 20/50
2/2 [==============================] - 0s 0s/step - loss: 0.1596
Epoch 21/50
2/2 [==============================] - 0s 0s/step - loss: 0.1596
Epoch 22/50
2/2 [==============================] - 0s 1ms/step - loss: 0.1595
Epoch 23/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 24/50
2/2 [==============================] - 0s 0s/step - loss: 0.1596
Epoch 25/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 26/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1595
Epoch 27/50
2/2 [==============================] - 0s 0s/step - loss: 0.1595
Epoch 28/50
2/2 [==============================] - 0s 1ms/step - loss: 0.1595
Epoch 29/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1595
Epoch 30/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 31/50
2/2 [==============================] - 0s 0s/step - loss: 0.1596
Epoch 32/50
2/2 [==============================] - 0s 0s/step - loss: 0.1596
Epoch 33/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 34/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1595
Epoch 35/50
2/2 [==============================] - 0s 0s/step - loss: 0.1595
Epoch 36/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1595
Epoch 37/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 38/50
2/2 [==============================] - 0s 0s/step - loss: 0.1596
Epoch 39/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 40/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1595
Epoch 41/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 42/50
2/2 [==============================] - 0s 1ms/step - loss: 0.1595
Epoch 43/50
2/2 [==============================] - 0s 1ms/step - loss: 0.1596
Epoch 44/50
2/2 [==============================] - 0s 0s/step - loss: 0.1597
Epoch 45/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 46/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 47/50
2/2 [==============================] - 0s 1ms/step - loss: 0.1596
Epoch 48/50
2/2 [==============================] - 0s 0s/step - loss: 0.1595
Epoch 49/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1596
Epoch 50/50
2/2 [==============================] - 0s 1000us/step - loss: 0.1595
28/28 [==============================] - 0s 444us/step
NN AUC 0.4800573010558846
NN AUC_Score 0.4800573010558846
========================================
10/10 [==============================] - 0s 556us/step
NN AUC 0.5361630625365283
NN AUC_Score 0.5361630625365283
========================================
10/10 [==============================] - 0s 667us/step
NN AUC 0.5459870673259795
NN AUC_Score 0.5459870673259795
========================================
