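Predictive models are widely used for diagnosis and prognosis. Their value is usually assessed with statistical metrics such as sensitivity, specificity, area under the ROC curve (AUC), and calibration, but none of these metrics captures the clinical utility of a particular model. Decision curve analysis (DCA) is a widely used method for measuring clinical utility.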
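A predictive model typically outputs a value between 0 and 1, the predicted probability $p_i$. Given a pre-specified threshold probability $p_t$ (also called the cutoff value or probability threshold), a patient is classified as positive when $p_i > p_t$ and as negative when $p_i < p_t$. Patients are thus split into two groups: those predicted positive, who receive the intervention, and those predicted negative, who do not. The predicted-positive group contains both true positives (TP) and false positives (FP). Treating a true positive brings benefit, whereas treating a false positive causes harm. Choosing a different threshold probability changes the ratio of TP to FP, and with it the balance between benefit and harm.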
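To make the effect of the threshold concrete, here is a minimal sketch that counts TP and FP at several thresholds. The eight probabilities and labels are made up for illustration, and `count_tp_fp` is a hypothetical helper name, not something from the original code.

```python
import numpy as np

# Hypothetical predicted probabilities and true labels for 8 patients
p = np.array([0.05, 0.20, 0.35, 0.40, 0.55, 0.70, 0.85, 0.95])
y = np.array([0, 0, 1, 0, 0, 1, 1, 1])


def count_tp_fp(p, y, p_t):
    """Count true and false positives when patients with p > p_t are treated."""
    treated = p > p_t
    tp = int(np.sum(treated & (y == 1)))
    fp = int(np.sum(treated & (y == 0)))
    return tp, fp


# Raising the threshold treats fewer patients, so both counts can only shrink
for p_t in (0.1, 0.3, 0.5, 0.8):
    tp, fp = count_tp_fp(p, y, p_t)
    print(f"p_t={p_t:.1f}: TP={tp}, FP={fp}")
```

In this toy data, moving the threshold from 0.1 to 0.8 removes all false positives but also loses half of the true positives, which is exactly the trade-off that net benefit is meant to weigh.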
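To account for benefit and harm at the same time, decision curve analysis quantifies the clinical utility of a model as its net benefit.
For a diagnostic test with total sample size $n$ and threshold $p_t$, the results can be laid out in a 2×2 table: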
| | Gold standard (+) | Gold standard (−) |
|---|---|---|
| Model (+) | TP | FP |
| Model (−) | FN | TN |
The net benefit for the predicted-positive (treated) group is:

$$\text{net benefit treated} = \frac{TP}{n} - \frac{FP}{n} \cdot \left(\frac{p_t}{1 - p_t}\right)$$
The net benefit for the predicted-negative (untreated) group is:

$$\text{net benefit untreated} = \frac{TN}{n} - \frac{FN}{n} \cdot \left(\frac{1 - p_t}{p_t}\right)$$
Decision curve analysis defines the following relationship between these quantities:

$$\frac{\text{net benefit treated} - \text{net benefit treat all}}{\left(\frac{p_t}{1 - p_t}\right)} = \text{net benefit untreated}$$
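From this, we can compute the net benefit of the treat-all strategy, in which every patient receives the intervention regardless of the model's prediction. Under treat all, every patient with the condition (TP + FN) counts as a true positive and every patient without it (TN + FP) counts as a false positive, so:

$$\text{net benefit treat all} = \frac{TP + FN}{n} - \frac{TN + FP}{n} \cdot \left(\frac{p_t}{1 - p_t}\right)$$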
Under the treat-none strategy, no patient receives the intervention regardless of the model's prediction, so its net benefit is always 0.
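As a quick numerical check of the formulas above, the following sketch evaluates them on one 2×2 table. The counts and the threshold are invented for illustration, and `net_benefits` is a hypothetical helper, not part of the original post.

```python
import math


def net_benefits(tp, fp, fn, tn, p_t):
    """Return (treated, untreated, treat_all) net benefit for one 2x2 table."""
    n = tp + fp + fn + tn
    odds = p_t / (1 - p_t)  # weight of one false positive relative to one true positive
    treated = tp / n - fp / n * odds
    untreated = tn / n - fn / n * (1 - p_t) / p_t
    treat_all = (tp + fn) / n - (tn + fp) / n * odds
    return treated, untreated, treat_all


# Hypothetical counts for 100 patients at threshold p_t = 0.2
tp, fp, fn, tn = 30, 10, 10, 50
p_t = 0.2
treated, untreated, treat_all = net_benefits(tp, fp, fn, tn, p_t)
print(f"treated={treated:.3f}, untreated={untreated:.3f}, treat_all={treat_all:.3f}")

# The relationship between the three quantities holds on this table
lhs = (treated - treat_all) / (p_t / (1 - p_t))
print(math.isclose(lhs, untreated))
```

With these assumed counts the model's net benefit is slightly above that of treat all, and both are above the treat-none baseline of 0.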
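The decision curve is then the plot of net benefit (y-axis) against the probability threshold (x-axis).
With the theory in place, on to practice!
To plot a model's decision curve, we only need two things from the model: the predicted probability for each sample (y_pred_score) and the true class of each sample (y_label).
The benefit a model brings comes entirely from the patients it predicts as positive, because only they receive an intervention different from what they would otherwise get. The net benefit treated is therefore the net benefit of the model: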
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix


def calculate_net_benefit_model(thresh_group, y_pred_score, y_label):
    """Net benefit of treating the patients the model predicts positive, per threshold."""
    y_pred_score = np.asarray(y_pred_score)
    n = len(y_label)
    net_benefit_model = []
    for thresh in thresh_group:
        # A patient is predicted positive when the score is strictly above the threshold
        y_pred_label = (y_pred_score > thresh).astype(int)
        # labels=[0, 1] keeps the matrix 2x2 even when only one class is predicted
        tn, fp, fn, tp = confusion_matrix(y_label, y_pred_label, labels=[0, 1]).ravel()
        net_benefit = (tp / n) - (fp / n) * (thresh / (1 - thresh))
        net_benefit_model.append(net_benefit)
    return np.array(net_benefit_model)


def calculate_net_benefit_all(thresh_group, y_label):
    """Net benefit of the treat-all strategy, per threshold."""
    y_label = np.asarray(y_label)
    thresh_group = np.asarray(thresh_group, dtype=float)
    n = len(y_label)
    # Under treat all, every true case is a TP and every non-case is an FP
    tp = np.sum(y_label == 1)
    fp = n - tp
    return (tp / n) - (fp / n) * (thresh_group / (1 - thresh_group))


def plot_DCA(ax, thresh_group, net_benefit_model, net_benefit_all):
    # Plot the three strategies
    ax.plot(thresh_group, net_benefit_model, color='crimson', label='Model')
    ax.plot(thresh_group, net_benefit_all, color='black', label='Treat all')
    ax.plot((0, 1), (0, 0), color='black', linestyle=':', label='Treat none')

    # Fill: highlight where the model beats both treat all and treat none
    y2 = np.maximum(net_benefit_all, 0)
    y1 = np.maximum(net_benefit_model, y2)
    ax.fill_between(thresh_group, y1, y2, color='crimson', alpha=0.2)

    # Figure configuration: cosmetic details
    ax.set_xlim(0, 1)
    # Adjust the y-axis limits around the model curve
    ax.set_ylim(net_benefit_model.min() - 0.15, net_benefit_model.max() + 0.15)
    ax.set_xlabel(
        xlabel='Threshold Probability',
        fontdict={'family': 'Times New Roman', 'fontsize': 15}
    )
    ax.set_ylabel(
        ylabel='Net Benefit',
        fontdict={'family': 'Times New Roman', 'fontsize': 15}
    )
    ax.grid(True, which='major')
    ax.spines['right'].set_color((0.8, 0.8, 0.8))
    ax.spines['top'].set_color((0.8, 0.8, 0.8))
    ax.legend(loc='upper right')
    return ax


if __name__ == '__main__':
    # Construct a model whose classification performance is not very good
    y_pred_score = np.arange(0, 1, 0.001)
    y_label = np.array([1]*25 + [0]*25 + [0]*450 + [1]*25 + [0]*25 + [1]*25 + [0]*25 + [1]*25 + [0]*25 + [1]*25 + [0]*25 + [1]*25 + [0]*25 + [1]*25 + [0]*25 + [1]*25 + [0]*50 + [1]*125)

    thresh_group = np.arange(0, 1, 0.01)
    net_benefit_model = calculate_net_benefit_model(thresh_group, y_pred_score, y_label)
    net_benefit_all = calculate_net_benefit_all(thresh_group, y_label)

    fig, ax = plt.subplots()
    ax = plot_DCA(ax, thresh_group, net_benefit_model, net_benefit_all)
    # fig.savefig('fig1.png', dpi=300)
    plt.show()
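Because of sampling error, the result of a single model fit may be biased. In practice, bootstrapping or k-fold cross-validation is commonly used to correct the net benefit, and the same resampling also yields a confidence interval for it.
How it works: given the net benefit estimates from the bootstrap or k-fold replicates, a confidence interval can be obtained by normal approximation, relying on the central limit theorem.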
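The normal-approximation idea could be sketched roughly as follows for the bootstrap case. This is an assumption-laden illustration, not the original author's implementation: `net_benefit` and `bootstrap_net_benefit_ci` are hypothetical helper names, the data are synthetic, and the sketch only resamples the predictions of an already fitted model. A full optimism correction would also refit the model on each bootstrap sample, which is not shown here.

```python
import numpy as np


def net_benefit(y_true, y_score, p_t):
    """Net benefit of treating patients whose score exceeds p_t."""
    y_true = np.asarray(y_true)
    treated = np.asarray(y_score) > p_t
    n = len(y_true)
    tp = np.sum(treated & (y_true == 1))
    fp = np.sum(treated & (y_true == 0))
    return tp / n - fp / n * p_t / (1 - p_t)


def bootstrap_net_benefit_ci(y_true, y_score, p_t, n_boot=1000, seed=0):
    """Bootstrap mean and 95% normal-approximation CI of the net benefit."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    n = len(y_true)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        estimates[b] = net_benefit(y_true[idx], y_score[idx], p_t)
    mean = estimates.mean()
    se = estimates.std(ddof=1)  # bootstrap standard error
    z = 1.96  # approximate 97.5th percentile of the standard normal
    return mean, mean - z * se, mean + z * se


# Synthetic data (assumed): scores are noisy but informative about the label
rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, size=500)
y_score = np.clip(0.25 + 0.5 * y_true + rng.normal(0, 0.2, size=500), 0, 1)

mean, lower, upper = bootstrap_net_benefit_ci(y_true, y_score, p_t=0.3)
print(f"net benefit at p_t=0.3: {mean:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Repeating this over every value in a threshold grid would give a confidence band around the whole decision curve. For k-fold cross-validation the same normal approximation applies to the per-fold estimates, although with only a handful of folds the approximation is cruder.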