In Python, the data for a precision-recall curve is computed with:
sklearn.metrics.precision_recall_curve(y_true, probas_pred, pos_label=None, sample_weight=None)
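This function computes precision-recall pairs from the true labels and the predicted scores by varying the decision threshold. It returns the points of the curve and does not draw anything itself.
Note: this function is restricted to binary classification tasks.
Precision is tp / (tp + fp), where tp is the number of true positives and fp is the number of false positives.
Recall is tp / (tp + fn), where tp is the number of true positives and fn is the number of false negatives.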
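To make the two definitions concrete, here is a minimal sketch that counts tp, fp and fn by hand at a single threshold. The helper name `precision_recall_at` is hypothetical (it is not part of scikit-learn), and the sketch assumes that a sample is predicted positive when its score is greater than or equal to the threshold.

```python
import numpy as np


def precision_recall_at(y_true, y_scores, threshold):
    """Hypothetical helper: precision and recall at one decision threshold."""
    # Assumption: score >= threshold means "predicted positive".
    y_pred = y_scores >= threshold
    positives = y_true == 1
    tp = np.sum(y_pred & positives)    # predicted positive, actually positive
    fp = np.sum(y_pred & ~positives)   # predicted positive, actually negative
    fn = np.sum(~y_pred & positives)   # predicted negative, actually positive
    # Note: tp + fp is zero when nothing is predicted positive; this sketch
    # does not guard against that case.
    return tp / (tp + fp), tp / (tp + fn)


y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

for t in (0.35, 0.4, 0.8):
    p, r = precision_recall_at(y_true, y_scores, t)
    print(f"threshold={t}: precision={p:.3f}, recall={r:.3f}")
# threshold=0.35: precision=0.667, recall=1.000
# threshold=0.4: precision=0.500, recall=0.500
# threshold=0.8: precision=1.000, recall=0.500
```

Raising the threshold turns fewer samples into positive predictions, which is why recall can only stay the same or drop while precision may move in either direction.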
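Parameters:
y_true: array, shape = [n_samples]
True binary labels.
probas_pred: array, shape = [n_samples]
Estimated probabilities or decision function output.
pos_label: int or str (default=None)
Label of the positive class.
sample_weight: array-like, shape = [n_samples], optional
Sample weights.
Returns:
precision: array, shape = [n_thresholds + 1]
Precision values; the last element is 1.
recall: array, shape = [n_thresholds + 1]
Recall values, in decreasing order; the last element is 0.
thresholds: array, shape = [n_thresholds]
Increasing thresholds on the scores used to compute precision and recall.
Example: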
import numpy as np
from sklearn.metrics import precision_recall_curve
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)
>>> precision
array([0.66666667, 0.5, 1., 1.])
>>> recall
array([1., 0.5, 0.5, 0.])
>>> thresholds
array([0.35, 0.4, 0.8])
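Since precision_recall_curve only returns arrays, drawing the curve needs a plotting library. The following is a minimal sketch using matplotlib, which is an assumption here (the original example does not plot anything); the output file name `pr_curve.png` and the choice of a step plot are also illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)

fig, ax = plt.subplots()
# A step plot avoids implying precision values between the computed points.
ax.step(recall, precision, where="post")
ax.set_xlabel("Recall")
ax.set_ylabel("Precision")
ax.set_title("Precision-recall curve")
fig.savefig("pr_curve.png")  # illustrative file name
```

Note that precision and recall each have one more element than thresholds, because the final point (precision 1, recall 0) has no corresponding threshold.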
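sklearn.metrics.average_precision_score computes the average precision (AP) from prediction scores. AP summarizes the precision-recall curve as the weighted mean of the precision at each threshold, with the increase in recall from the previous threshold as the weight, so it corresponds to the area under the precision-recall curve. The value lies between 0 and 1, and higher is better.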
sklearn.metrics.average_precision_score(y_true, y_score, average='macro', pos_label=1, sample_weight=None)
Parameters:
y_true: array, shape = [n_samples] or [n_samples, n_classes]
True binary labels or binary label indicators.
y_score: array, shape = [n_samples] or [n_samples, n_classes]
Target scores; these can be probability estimates of the positive class, confidence values, or non-thresholded decision values.
average: string, [None, 'micro', 'macro' (default), 'samples', 'weighted']
pos_label: int or str (default=1)
Returns:
average_precision: float
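The returned value can be reproduced by hand from the output of precision_recall_curve, which may help clarify what "area under the curve" means here. The sketch below assumes the step-wise definition AP = sum over n of (R_n - R_{n-1}) * P_n, with no interpolation between points; the variable names `manual_ap` and `library_ap` are illustrative.

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

precision, recall, _ = precision_recall_curve(y_true, y_scores)

# recall is returned in decreasing order, so np.diff(recall) is non-positive;
# negating the sum gives the recall-weighted mean of precision.
manual_ap = -np.sum(np.diff(recall) * precision[:-1])
library_ap = average_precision_score(y_true, y_scores)

print(manual_ap, library_ap)
```

Both numbers should agree. A trapezoidal area over the same points would generally give a different value, because it interpolates linearly between points.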
Example:
import numpy as np
from sklearn.metrics import average_precision_score
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
average_precision_score(y_true, y_scores)
# 0.83...
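The average parameter only matters when y_true is a multilabel indicator matrix. As a rough illustration, the sketch below uses a small made-up data set (the arrays `Y_true` and `Y_score` are hypothetical) to compare the per-label scores with the 'macro' and 'micro' averages.

```python
import numpy as np
from sklearn.metrics import average_precision_score

# Hypothetical multilabel data: 4 samples, 2 labels.
Y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
Y_score = np.array([[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.7, 0.1]])

# average=None returns one AP value per label.
per_label = average_precision_score(Y_true, Y_score, average=None)
# 'macro' is the unweighted mean of the per-label values.
macro = average_precision_score(Y_true, Y_score, average="macro")
# 'micro' treats every (sample, label) entry as one binary decision.
micro = average_precision_score(Y_true, Y_score, average="micro")

print(per_label, macro, micro)
```

With 'macro', every label counts equally regardless of how many positives it has, whereas 'micro' pools all entries, so labels with more positives have more influence.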