Machine Learning-Based Plasma Protein Risk Score Improves Atrial Fibrillation Prediction Over Clinical and Genomic Models.

Journal metrics: impact factor 6; Zone 2 (Medicine); JCR Q1, Cardiac & Cardiovascular Systems
Min Seo Kim, Shaan Khurshid, Shinwan Kany, Lu-Chen Weng, Sarah Urbut, Carolina Roselli, Leonoor Wijdeveld, Sean J Jurgens, Joel T Rämö, Patrick T Ellinor, Akl C Fahed
Circulation: Genomic and Precision Medicine, e004943. Journal Article. Published 2025-06-17.
DOI: 10.1161/CIRCGEN.124.004943 (https://doi.org/10.1161/CIRCGEN.124.004943)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12257488/pdf/

Abstract

Background: Clinical factors discriminate incident atrial fibrillation (AF) risk with moderate accuracy, with only modest improvement after incorporation of polygenic risk scores. Whether emerging large-scale proteomic profiling can augment AF risk estimation is unknown.

Methods: In the UK Biobank cohort, we derived and validated a machine learning model to predict incident AF risk using serum proteins (Pro-AF). We compared Pro-AF to a validated clinical risk score (Cohorts for Aging and Genomic Epidemiology-Atrial Fibrillation) and an AF polygenic risk score. Models were evaluated in a multiply resampled test set from nested cross-validation (internal test set), and a sample of UK Biobank participants separate from model development (hold-out test set). Metrics included discrimination of 5-year incident AF using time-dependent area under the receiver operating characteristic curve and net reclassification.

Results: Trained in 32 631 UK Biobank participants, Pro-AF predicts incident AF using 121 protein levels (out of 2911 protein analytes). When assessed in the internal test set comprising 30 632 individuals (mean age 57±8 years, 54% women, 2045 AF events) and hold-out test set comprising 13 998 individuals (mean age 57±8 years, 54% women, 870 AF events), discrimination of 5-year incident AF was highest using Pro-AF (area under the receiver operating characteristic curve internal: 0.761 [95% CI, 0.745-0.780], hold-out: 0.763 [0.734-0.784]), followed by Cohorts for Aging and Genomic Epidemiology-Atrial Fibrillation (0.719 [0.700-0.737]; 0.702 [0.668-0.730]) and the polygenic risk score (0.686 [0.668-0.702]; 0.682 [0.660-0.710]). AF risk estimates were well-calibrated, and the addition of Pro-AF led to substantial continuous net reclassification improvement over Cohorts for Aging and Genomic Epidemiology-Atrial Fibrillation (eg, internal test set 0.410 [0.330-0.492]). A simplified Pro-AF including only the 5 most influential proteins (NT-proBNP, EDA2R [ectodysplasin A2 receptor], NPPB [B-type natriuretic peptide], BCAN [brevican core protein], and GDF15 [growth/differentiation factor 15]), retained favorable discriminative value (area under the receiver operating characteristic curve internal: 0.750 [0.733-0.768]; hold-out: 0.759 [0.732-0.790]).

Conclusions: A machine learning-based protein score discriminates 5-year incident AF risk favorably compared with clinical and genetic risk factors. Large-scale proteomic analysis may assist in the prioritization of individuals at risk for AF for screening and related preventive interventions.
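
The continuous net reclassification improvement reported in the Results can be illustrated with a minimal sketch. It assumes simple binary outcomes with no censoring, whereas the study evaluated 5-year incident AF with time-dependent methods, so this is not the authors' implementation. The function name `continuous_nri` and the toy risk values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def continuous_nri(
    old_risk: Sequence[float],
    new_risk: Sequence[float],
    event: Sequence[bool],
) -> float:
    """Category-free NRI for binary outcomes (assumption: no censoring).

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    where "up"/"down" mean the new model's risk is higher/lower than
    the old model's risk for the same individual.
    """
    if not (len(old_risk) == len(new_risk) == len(event)):
        raise ValueError("inputs must have equal length")

    up_e = down_e = n_e = 0  # counts among individuals with the event
    up_n = down_n = n_n = 0  # counts among individuals without the event
    for old, new, had_event in zip(old_risk, new_risk, event):
        moved_up = new > old
        moved_down = new < old  # ties count as neither up nor down
        if had_event:
            n_e += 1
            up_e += moved_up
            down_e += moved_down
        else:
            n_n += 1
            up_n += moved_up
            down_n += moved_down

    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")

    nri_events = (up_e - down_e) / n_e
    nri_nonevents = (down_n - up_n) / n_n
    return nri_events + nri_nonevents


# Toy data, made up for illustration (not from the study).
old = [0.10, 0.20, 0.30, 0.05, 0.15, 0.25]   # e.g. clinical score alone
new = [0.20, 0.30, 0.25, 0.02, 0.10, 0.30]   # e.g. clinical score + protein score
evt = [True, True, True, False, False, False]
nri = continuous_nri(old, new, evt)
print(f"continuous NRI = {nri:.3f}")
```

A positive value means the new model tends to move predicted risk upward for people who develop the outcome and downward for those who do not. The statistic ranges from -2 to 2, and a bootstrap over individuals is one common way to obtain a confidence interval.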
