Computing the PR curve in Python with sklearn.metrics.precision_recall_curve

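A PR curve plots precision against recall, with recall on the horizontal axis and precision on the vertical axis. To obtain the points of the curve, choose a series of thresholds and compute the recall and precision at each one.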

precision = tp / (tp + fp)

recall = tp / (tp + fn)

Here tp, fp, and fn are the numbers of true positives, false positives, and false negatives.
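As a quick illustration of the two formulas, here is a minimal sketch that turns the three counts into precision and recall. The helper name `precision_recall` and the fallback values used when a denominator is zero are assumptions made for this example, not part of scikit-learn.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) computed from confusion counts.

    Hypothetical helper for illustration only.
    """
    # Assumed convention: precision is 1.0 when nothing is predicted positive
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    # Assumed convention: recall is 0.0 when there are no positive samples
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# 2 true positives, 1 false positive, 0 false negatives
p, r = precision_recall(tp=2, fp=1, fn=0)
print(round(p, 2), r)  # 0.67 1.0
```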

The PR curve can be computed with sklearn.metrics.precision_recall_curve:

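from sklearn.metrics import precision_recall_curve

# Ground-truth binary labels and the predicted scores for the positive class
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# precision and recall each have one more element than thresholds
precision, recall, thresholds = precision_recall_curve(y_true, y_score)
print(precision)
print(recall)
print(thresholds)
# Output as reported in the original post. Newer scikit-learn releases may
# also report the lowest score (0.1) as a threshold, adding a leading point.
"""
[0.66666667 0.5 1. 1.]
[1.  0.5 0.5 0. ]
[0.35 0.4  0.8 ]
"""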

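Here y_true holds the ground-truth labels, y_score holds the predicted probabilities, and thresholds holds the decision thresholds: a sample is predicted positive when y_score >= threshold and negative when y_score < threshold. precision and recall contain one more element than thresholds; the final pair (precision=1, recall=0) has no corresponding threshold. For each index into thresholds: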

  • index=0, thresholds[index]=0.35: the predicted labels are [0,1,1,1], so tp=2, fp=1, fn=0, giving precision=0.67 and recall=1
  • index=1, thresholds[index]=0.4: the predicted labels are [0,1,0,1], so tp=1, fp=1, fn=1, giving precision=0.5 and recall=0.5
  • index=2, thresholds[index]=0.8: the predicted labels are [0,0,0,1], so tp=1, fp=0, fn=1, giving precision=1 and recall=0.5
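The per-threshold walkthrough above can be reproduced by hand in plain Python. This is only a sketch for checking the numbers: the function name `pr_at_threshold` is hypothetical, and the code assumes at least one sample is predicted positive and at least one sample is truly positive, so neither division hits zero.

```python
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]


def pr_at_threshold(y_true, y_score, threshold):
    """Return (y_pred, tp, fp, fn, precision, recall) at one threshold."""
    # A sample is predicted positive when its score is >= the threshold
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumes tp + fp > 0 and tp + fn > 0 (true for the thresholds below)
    return y_pred, tp, fp, fn, tp / (tp + fp), tp / (tp + fn)


thresholds = (0.35, 0.4, 0.8)
results = [pr_at_threshold(y_true, y_score, th) for th in thresholds]

for th, (y_pred, tp, fp, fn, precision, recall) in zip(thresholds, results):
    print(th, y_pred, tp, fp, fn, round(precision, 2), recall)
```

Each printed row should match the corresponding bullet: the predicted labels, the three counts, and the resulting precision and recall.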
