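Paper walkthrough | The Lancet: Machine learning-based prediction of adverse events following an acute coronary syndrome: a modelling study of pooled datasets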
DOI: 10.1016/S0140-6736(20)32519-8
Background
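Patients with acute coronary syndrome (ACS) are at high risk of both ischaemic and bleeding events, and both drive poor outcomes.
Risk assessment plays a central role in the clinical management of each patient and matters for choosing the best drug therapy for secondary prevention.
Current tools for predicting ischaemic and bleeding events after ACS are still not accurate enough to support individualised patient management.
Machine learning methods may overcome some limitations of current analytical approaches to risk prediction, and their effectiveness has been demonstrated in several cardiovascular applications.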
Methods and Results
Datasets
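- To build the machine learning models, the study used a derivation cohort of 19826 adult patients (≥18 years) with ACS, followed up for 1 year.
- To evaluate model performance, the study used an external validation cohort of 3444 adult patients admitted to hospital with ACS, followed up for 1 and 2 years.
Clinical and therapeutic characteristics of the study population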
Study outcomes:
Feature selection
The structured dataset included 25 variables:
- 16 clinical variables
- 5 therapeutic variables
- 2 angiographic variables
- 2 procedural variables
Machine learning algorithms
- K-Nearest Neighbours (KNN)
- Naive Bayes (NB)
- Random Forest (RF)
- Adaptive Boosting (ADB)
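Evaluation of the machine learning (ML) algorithms
Learning metrics
- ROC curves
- AUC values
- Calibration plots (deciles of observed vs. predicted risk)
Other evaluation metrics are as follows: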
ROC curves and AUC values
Death, ReAMI and BARC-major bleeding prediction
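As a reminder of what an AUC value measures, here is a minimal sketch in plain Python. It is not the paper's code: the function name `roc_auc` and the toy labels and scores are assumptions made up for illustration. The AUC is computed as the probability that a randomly chosen patient with the event receives a higher predicted score than a randomly chosen patient without it, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of event vs. non-event scores.

    labels: 1 for patients with the event, 0 otherwise.
    scores: predicted event probabilities (higher means higher risk).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # event case ranked above non-event case
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data invented for this example, not taken from the paper.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.05, 0.10, 0.35, 0.40, 0.60, 0.70, 0.20, 0.90]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The quadratic loop is fine for a toy example; for cohort-sized data a rank-based implementation from a library would be the practical choice.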
Performance metrics and algorithm choice for the PRAISE score
Observed vs. Predicted Risk
- Calibration plots
- Observed vs. predicted decile risk comparative bar plots
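The observed vs. predicted comparison by decile can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name `decile_calibration` and the simulated data are assumptions. Patients are sorted by predicted probability, split into ten equal-sized groups, and the mean predicted risk in each group is compared with the observed event rate.

```python
import random


def decile_calibration(y_true, y_prob, n_bins=10):
    """Return (decile, n, mean predicted risk, observed event rate) per bin."""
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    n = len(order)
    rows = []
    for b in range(n_bins):
        # Integer arithmetic splits the sorted patients into near-equal bins.
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue
        mean_pred = sum(y_prob[i] for i in idx) / len(idx)
        observed = sum(y_true[i] for i in idx) / len(idx)
        rows.append((b + 1, len(idx), mean_pred, observed))
    return rows


# Simulated, perfectly calibrated data (an assumption for illustration only).
rng = random.Random(0)
y_prob = [rng.random() * 0.3 for _ in range(1000)]
y_true = [1 if rng.random() < p else 0 for p in y_prob]

table = decile_calibration(y_true, y_prob)
for decile, n, pred, obs in table:
    print(f"decile {decile:2d}  n={n:3d}  predicted={pred:.3f}  observed={obs:.3f}")
```

In a well-calibrated model the two columns track each other closely. A model that underestimates risk in the top decile would show an observed rate above the predicted one in the last row.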
PRAISE model
AUCs for death, myocardial infarction, and major bleeding for the training, internal validation, and external validation datasets at 1-year follow-up
Risk of observed death, myocardial infarction, and major bleeding according to deciles of event probability based on PRAISE scores
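Relative feature importance
- whether the variable was selected to split the data at a node during training
- how much the squared error improved as a result
In short, a variable receives a high relative importance when it is used for splits in many heavily weighted trees and those splits produce high purity.
scaled importance: the ratio of each variable's relative importance to the highest relative importance among all variables.
feature importance weight on the PRAISE risk prediction: the ratio of each variable's relative importance to the sum of the relative importances of all variables.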
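The two normalisations above differ only in the denominator, which a short sketch makes concrete. The feature names and relative importance values below are hypothetical placeholders, not values from the paper, and the function names are assumptions for this example.

```python
def scaled_importance(relative):
    """Divide each relative importance by the largest one (top feature = 1.0)."""
    top = max(relative.values())
    return {name: value / top for name, value in relative.items()}


def importance_weight(relative):
    """Divide each relative importance by the sum (weights add up to 1.0)."""
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}


# Hypothetical relative importances, invented for illustration.
relative = {"feature_a": 40.0, "feature_b": 30.0, "feature_c": 20.0, "feature_d": 10.0}

scaled = scaled_importance(relative)
weights = importance_weight(relative)
for name in relative:
    print(f"{name}: scaled={scaled[name]:.2f}  weight={weights[name]:.2f}")
```

Scaled importance is convenient for ranking against the strongest predictor, while the weight reads as each variable's share of the total importance.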
Radar plot for the eight most important predictors of death, myocardial infarction, and major bleeding
Classes of risk
stratified by deciles of event probability according to the corresponding PRAISE score
- low risk: first to sixth deciles;
- intermediate risk: seventh to ninth deciles;
- and high risk: tenth decile
Compared with low risk, being categorised as being at intermediate risk and high risk was associated with increased (p<0·0001) event occurrence for all the PRAISE scores.
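The mapping from deciles to risk classes can be sketched in a few lines. This is a simplified illustration under stated assumptions: deciles are assigned here by ranking within the sample at hand, whereas a deployed score would more plausibly use fixed probability cut-offs. The function names and probabilities are made up.

```python
from collections import Counter


def assign_deciles(probs):
    """Rank-based decile (1 to 10) for each predicted probability."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    n = len(probs)
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles


def risk_class(decile):
    """Deciles 1-6 are low risk, 7-9 intermediate, 10 high."""
    if not 1 <= decile <= 10:
        raise ValueError("decile must be between 1 and 10")
    if decile <= 6:
        return "low"
    if decile <= 9:
        return "intermediate"
    return "high"


# Invented predicted probabilities for 20 patients.
probs = [i / 100 for i in range(1, 21)]
classes = [risk_class(d) for d in assign_deciles(probs)]
counts = Counter(classes)
print(counts)
```

With this scheme 60% of patients fall in the low class, 30% in the intermediate class, and 10% in the high class by construction.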
Cross-classification of myocardial infarction and major bleeding risk classes and illustration of the hypothetical trade-off between the two types of risk
PRAISE score with a lower number of patient features
Discussion
- Our score offers very high accuracy in detecting the risk of all-cause death after an ACS in a population treated with current standard therapies.
- According to such stratification, a tenth of patients (the highest decile) would be classified at discharge as being at high risk of either death, recurrent myocardial infarction, or major bleeding, thus being candidates for a tighter follow-up.
Limitations
- The first is the observational retrospective design of the two registries composing the derivation cohort.
- A further possible limitation of our approach can be identified in the slight underestimation by the adaptive boosting classifier among high-risk patients.
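Afterword:
My expertise is limited, so this write-up inevitably contains some oversights, and corrections from readers are very welcome. You are welcome to follow my Bilibili channel, 木舟笔记, for more video walkthroughs.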