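Machine learning homework problem: cost curve
A binary classification problem with 0 as the positive class and 1 as the negative class; the predicted probabilities of the positive class are given below. Costs are unequal: the cost matrix is cost01=1, cost10=20, costii=0 (i=0, i=1). Plot the expected total cost. (Implement it in code.)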
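For reference, the quantities behind a cost curve can be written out explicitly. The formulas below are the usual cost-curve definitions, stated here as an assumption because the problem does not spell them out: $p$ is the prior probability of the positive class (class 0), $cost_{01}$ is taken to be the cost of predicting class 1 for a sample whose true class is 0, and $cost_{10}$ is the cost of the opposite mistake.

```latex
\begin{align}
P(+)_{cost} &= \frac{p \cdot cost_{01}}{p \cdot cost_{01} + (1 - p) \cdot cost_{10}} \\
cost_{norm} &= \frac{FNR \cdot p \cdot cost_{01} + FPR \cdot (1 - p) \cdot cost_{10}}
                    {p \cdot cost_{01} + (1 - p) \cdot cost_{10}} \\
            &= FNR \cdot P(+)_{cost} + FPR \cdot \bigl(1 - P(+)_{cost}\bigr)
\end{align}
```

Under this convention each threshold contributes a straight line from (0, FPR) to (1, FNR) in the plane with $P(+)_{cost}$ on the horizontal axis and $cost_{norm}$ on the vertical axis, and the expected total cost is the area under the lower envelope of those lines.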
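# Compute the expected total cost
# 0 is the positive class / 1 is the negative class
label = [0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]
score = [0.9, 0.8, 0.7, 0.6, 0.55, 0.54, 0.53, 0.52, 0.51, 0.505,
         0.4, 0.39, 0.38, 0.37, 0.36, 0.35, 0.34, 0.33, 0.30, 0.1]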
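# Compute TPR/FPR and plot the ROC curve
# Use each score as a threshold, classify, and compute TPR/FPR, which in turn give FNR/FPR
# The outer loop picks the threshold
# The inner loop walks over score to assign the predicted labels
# TPR/FPR/FNR are then computed on every iteration
# Define a function to make the computation easier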
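def cost_curve_FPR_FNR(true, predict):
    count = len(predict)
    true_1 = sum(true)        # number of negative samples (label 1)
    true_0 = count - true_1   # number of positive samples (label 0)
    TP = 0
    FP = 0
    for i in range(count):
        if predict[i] == true[i] and predict[i] == 0:
            TP += 1           # predicted positive, actually positive
        elif predict[i] != true[i] and predict[i] == 0:
            FP += 1           # predicted positive, actually negative
    TPR = TP / true_0
    FNR = 1 - TPR  # false negative rate
    FPR = FP / true_1
    return TPR, FPR, FNR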
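As a cross-check on the function above, the same three rates can be computed without an explicit loop. This is only a sketch: `rates_at_threshold` is a hypothetical helper name, not part of the original solution, and it assumes the same rule of predicting class 0 when the score is at least the threshold.

```python
import numpy as np


def rates_at_threshold(labels, scores, threshold):
    """Hypothetical helper: return (TPR, FPR, FNR) with class 0 as the positive class."""
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    predicted_positive = scores >= threshold  # predict class 0 when score >= threshold
    actual_positive = labels == 0
    tp = np.sum(predicted_positive & actual_positive)
    fp = np.sum(predicted_positive & ~actual_positive)
    tpr = tp / np.sum(actual_positive)
    fpr = fp / np.sum(~actual_positive)
    return float(tpr), float(fpr), float(1 - tpr)


label = [0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]
score = [0.9, 0.8, 0.7, 0.6, 0.55, 0.54, 0.53, 0.52, 0.51, 0.505,
         0.4, 0.39, 0.38, 0.37, 0.36, 0.35, 0.34, 0.33, 0.30, 0.1]

# Threshold 0.54 marks the six highest scores as positive.
tpr, fpr, fnr = rates_at_threshold(label, score, 0.54)
print(tpr, fpr, fnr)
```

With the threshold at 0.54, five of the ten positives and one of the ten negatives are predicted positive, which is an easy case to verify by hand against the loop version.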
# Build the lists of TPR/FPR/FNR values
count = len(score)
TPR_array = []
FPR_array = []
FNR_array = []
for i in range(count):
    # Use score[i] as the threshold: predict positive (0) when score >= threshold
    predict = []
    for j in range(count):
        if score[j] >= score[i]:
            predict.append(0)
        else:
            predict.append(1)
    TPR, FPR, FNR = cost_curve_FPR_FNR(label, predict)
    TPR_array.append(TPR)
    FPR_array.append(FPR)
    FNR_array.append(FNR)
# Plot the figures
import matplotlib.pyplot as plt
import numpy as np
fig, axis = plt.subplots(1, 2, figsize=(15, 6))
axis = axis.ravel()
axis[0].plot(FPR_array, TPR_array, 'r')
axis[0].set_title('ROC Curve')
axis[0].set_xlabel('FPR')
axis[0].set_ylabel('TPR')
# Each threshold becomes one line segment from (0, FPR) to (1, FNR)
axis[1].plot((0, 1), (FPR_array, FNR_array), 'b')
axis[1].set_title('Cost Curve')
axis[1].set_xlabel('P(+)cost')
axis[1].set_ylabel('Normalized expected cost')
plt.show()
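The plot draws one line segment per threshold, but the problem asks for the expected total cost, which in the usual cost-curve reading is the area under the lower envelope of those segments. The sketch below estimates that area numerically. It rests on assumptions not stated in the problem: cost01 is taken as the cost of predicting class 1 for a true class 0 sample, an extra threshold of +inf is added so the "predict nothing as positive" classifier is included, and the names `envelope`, `expected_total_cost`, and `x_op` are illustrative.

```python
import numpy as np

label = [0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]
score = [0.9, 0.8, 0.7, 0.6, 0.55, 0.54, 0.53, 0.52, 0.51, 0.505,
         0.4, 0.39, 0.38, 0.37, 0.36, 0.35, 0.34, 0.33, 0.30, 0.1]
cost01, cost10 = 1, 20  # assumed: cost01 = cost of calling a class-0 sample class 1

labels = np.asarray(label)
scores = np.asarray(score)
is_pos = labels == 0  # class 0 is the positive class

# One threshold per score, plus +inf so that FPR=0, FNR=1 is represented.
thresholds = np.append(scores, np.inf)
fpr = np.array([np.mean(scores[~is_pos] >= t) for t in thresholds])
fnr = np.array([np.mean(scores[is_pos] < t) for t in thresholds])

# Each threshold gives the line cost(x) = FPR + (FNR - FPR) * x on x in [0, 1].
x = np.linspace(0.0, 1.0, 1001)
lines = fpr[:, None] + (fnr - fpr)[:, None] * x[None, :]
envelope = lines.min(axis=0)  # lower envelope over all thresholds

# Area under the envelope via the trapezoidal rule.
expected_total_cost = float(np.sum((envelope[1:] + envelope[:-1]) / 2 * np.diff(x)))

# Operating point for the given costs, using the empirical class prior p.
p = float(is_pos.mean())
x_op = p * cost01 / (p * cost01 + (1 - p) * cost10)
cost_at_op = float(np.interp(x_op, x, envelope))

print("expected total cost (area):", expected_total_cost)
print("P(+)cost at empirical prior:", x_op, "normalized cost there:", cost_at_op)
```

Plotting `envelope` against `x` on the cost-curve axes would outline the shaded region whose area is reported. The grid of 1001 points is an arbitrary choice; an exact answer would intersect the segments analytically.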
The final figure is shown below: