Plotting the ROC Curve

A Few Concepts

[Figure 1]

Setting

The linear combination of AdaBoost's base classifiers:

f(x) = \sum_{m=1}^{M} \alpha_m G_m(x)

The final classifier:

G(x) = \mathrm{sign}(f(x)) = \mathrm{sign}\left( \sum_{m=1}^{M} \alpha_m G_m(x) \right)
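The combination above can be sketched with a toy ensemble. The stumps and weights below are hypothetical, purely for illustration:

```python
import numpy as np

# Hypothetical weak classifiers (decision stumps) G_m(x) returning +/-1,
# and their AdaBoost weights alpha_m -- illustrative values only.
stumps = [lambda x: np.where(x > 0.3, 1.0, -1.0),
          lambda x: np.where(x > 0.6, 1.0, -1.0)]
alphas = [0.7, 0.4]

def f(x):
    # Weighted sum of base-classifier outputs: f(x) = sum_m alpha_m * G_m(x)
    return sum(a * g(x) for a, g in zip(alphas, stumps))

def G(x):
    # Final classifier: the sign of the weighted sum
    return np.sign(f(x))

x = np.array([0.1, 0.5, 0.9])
# f(x) = [-1.1, 0.3, 1.1], so G(x) = [-1, 1, 1]
```

Note that f(x) carries more information than G(x): its magnitude acts as a confidence score, which is exactly what the ROC construction below thresholds over.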

Here we are given {f(x_i) | i = 1, 2, …, N} and {label_i | i = 1, 2, …, N}: the former is the weighted combination of base-classifier outputs for each sample x_i, and the latter is the corresponding label data.

Next, we draw the ROC curve from these two arrays.
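The pairing of the two arrays can be made concrete with toy stand-ins (the values below are hypothetical): sweeping a threshold over the sorted scores is what produces one ROC point per sample.

```python
import numpy as np

# Toy stand-ins for the two arrays described above:
# predStrengths[i] = f(x_i), classLabels[i] = label_i in {+1, -1}.
predStrengths = np.array([0.9, -0.2, 0.55, -1.1, 0.3])
classLabels   = np.array([1.0, -1.0, 1.0, -1.0, -1.0])

# Visiting the scores in ascending order gives the sequence of thresholds:
order = predStrengths.argsort()
# order == [3, 1, 4, 2, 0], i.e. thresholds -1.1, -0.2, 0.3, 0.55, 0.9
```

Each threshold splits the samples into predicted positives (score above it) and predicted negatives (score at or below it), yielding one (FPR, TPR) point.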

Plotting

[Figure 2: the resulting ROC plot]

Plotting code:

import numpy as np
import matplotlib.pyplot as plt

# predStrengths and classLabels are both ndarrays with 299 elements.
ySum = 0.0                               # accumulates rectangle heights for the AUC
N = classLabels.shape[0]                 # total number of samples
numPosClas = np.sum(classLabels == 1.0)  # number of positive samples
yStep = 1.0 / numPosClas                 # TPR (y-axis) step: denominator is the positive count
xStep = 1.0 / (N - numPosClas)           # FPR (x-axis) step: denominator is the negative count
srtidxs = predStrengths.argsort()        # indices of the scores in ascending order

fig = plt.figure()
fig.clf()
ax = plt.subplot(111)

cur = (1.0, 1.0)  # start at the top-right corner: with the lowest threshold,
                  # every sample is predicted positive, so TPR = FPR = 1
for idx in srtidxs:
    # Sweep the threshold from the smallest score to the largest; samples with
    # scores above the threshold are predicted positive, the rest negative.
    if classLabels[idx] == 1.0:
        # A positive sample drops below the threshold and becomes a false
        # negative, so the TPR decreases by one step.
        delX = 0
        delY = yStep
    else:
        # A negative sample drops below the threshold and becomes a true
        # negative, so the FPR decreases by one step.
        delX = xStep
        delY = 0
        # Each time the x coordinate (FPR) moves, add the current height
        # cur[1] to ySum; ySum * xStep is then the area under the curve.
        ySum += cur[1]

    ax.plot([cur[0], cur[0] - delX], [cur[1], cur[1] - delY], c='b')
    cur = (cur[0] - delX, cur[1] - delY)  # the curve is traced from (1,1) down to (0,0)
ax.plot([0, 1], [0, 1], 'b--')  # diagonal reference line from (0,0) to (1,1)

auc = "%.4f" % (ySum * xStep)  # area under the curve
plt.xlabel('False positive rate', fontsize=15)
plt.ylabel('True positive rate', fontsize=15)
plt.title('ROC curve (AUC = ' + auc + ')', fontsize=15)

ax.axis([0, 1, 0, 1])
fig.savefig('roc.png', dpi=300, bbox_inches='tight')
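The rectangle-sum AUC in the loop above can be sanity-checked without any plotting. Below is a minimal sketch of the same accumulation logic, run on hypothetical toy scores where the answer is known in advance (perfectly separated scores give AUC = 1):

```python
import numpy as np

def roc_auc(predStrengths, classLabels):
    # Same rectangle-sum AUC as the plotting loop, minus matplotlib.
    numPos = np.sum(classLabels == 1.0)
    numNeg = classLabels.shape[0] - numPos
    yStep, xStep = 1.0 / numPos, 1.0 / numNeg
    cur_y, ySum = 1.0, 0.0
    for idx in predStrengths.argsort():
        if classLabels[idx] == 1.0:
            cur_y -= yStep   # a positive drops below the threshold: TPR falls
        else:
            ySum += cur_y    # FPR falls: add one rectangle of height cur_y
    return ySum * xStep

# Perfectly separated scores: every positive outranks every negative.
scores = np.array([0.9, 0.8, 0.1, 0.2])
labels = np.array([1.0, 1.0, -1.0, -1.0])
# roc_auc(scores, labels) == 1.0
```

Like the plotting loop, this sketch does not handle tied scores specially; ties would ideally be traversed as a single diagonal segment rather than axis-aligned steps.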
