R2ROC: an efficient method of comparing two or more correlated AUC from out-of-sample prediction using polygenic scores.

IF 3.8 · Zone 2 (Biology) · Q2 GENETICS & HEREDITY
Human Genetics Pub Date : 2024-10-01 Epub Date: 2024-06-20 DOI:10.1007/s00439-024-02682-1
Md Moksedul Momin, Naomi R Wray, S Hong Lee
Human Genetics, pages 1193-1205. Publication type: Journal Article.
Citations: 0

Abstract

Polygenic risk scores (PRSs) enable early prediction of disease risk. Evaluating PRS performance for binary traits commonly relies on the area under the receiver operating characteristic curve (AUC). However, the widely used DeLong's method for comparative significance tests suffers from limitations, including computational time and the lack of a one-to-one mapping between test statistics based on AUC and R². To overcome these limitations, we propose a novel approach that leverages the Delta method to derive the variance and covariance of AUC values, enabling a comprehensive and efficient comparative significance test. Our approach offers notable advantages over DeLong's method, including reduced computation time (up to 150-fold), making it suitable for large-scale analyses and ideal for integration into machine learning frameworks. Furthermore, our method allows for a direct one-to-one mapping between AUC and R² values for comparative significance tests, providing enhanced insights into the relationship between these measures and facilitating their interpretation. We validated our proposed approach through simulations and applied it to real data comparing PRSs for diabetes and coronary artery disease (CAD) prediction in a cohort of 28,880 European individuals. The PRSs were derived using genome-wide association study summary statistics from two distinct sources. Our approach enabled a comprehensive and informative comparison of the PRSs, shedding light on their respective predictive abilities for diabetes and CAD. This advancement contributes to the assessment of genetic risk factors and personalized disease prediction, supporting better healthcare decision-making.
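
For orientation, the baseline that the abstract contrasts against, DeLong's test for two correlated AUCs, can be sketched as follows. This is a minimal illustration and not the R2ROC Delta-method procedure, whose formulas are not given in the abstract. The function names `delong_components` and `delong_test` and the simulated scores are hypothetical, introduced only for this example. The sketch builds the full case-by-control comparison matrix, so its cost grows with the product of the number of cases and controls, which is the kind of computational burden the abstract refers to.

```python
import math
import numpy as np

def delong_components(scores, labels):
    """Return AUC and DeLong structural components for one score (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    cases, controls = scores[labels], scores[~labels]
    # psi[i, j] = 1 if case i outranks control j, 0.5 on ties, 0 otherwise.
    diff = cases[:, None] - controls[None, :]
    psi = (diff > 0) + 0.5 * (diff == 0)
    v10 = psi.mean(axis=1)  # one value per case
    v01 = psi.mean(axis=0)  # one value per control
    return v10.mean(), v10, v01

def delong_test(scores_a, scores_b, labels):
    """Two-sided DeLong z-test for AUC(a) - AUC(b) measured on the same individuals."""
    auc_a, v10_a, v01_a = delong_components(scores_a, labels)
    auc_b, v10_b, v01_b = delong_components(scores_b, labels)
    m, n = len(v10_a), len(v01_a)
    # 2x2 covariance matrices of the structural components (rows = scores).
    s10 = np.cov(np.vstack([v10_a, v10_b]))
    s01 = np.cov(np.vstack([v01_a, v01_b]))
    s = s10 / m + s01 / n
    var_diff = s[0, 0] + s[1, 1] - 2.0 * s[0, 1]
    if var_diff <= 0:  # e.g. identical scores: no evidence of a difference
        return auc_a, auc_b, 0.0, 1.0
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return auc_a, auc_b, z, p

# Simulated data (an assumption for illustration, not the paper's cohort):
# two noisy scores that share a common signal g, so their AUCs are correlated.
rng = np.random.default_rng(0)
n_ind = 500
g = rng.normal(size=n_ind)
labels = (g + rng.normal(size=n_ind)) > 1.0
prs_a = g + rng.normal(scale=1.0, size=n_ind)
prs_b = g + rng.normal(scale=2.0, size=n_ind)
auc_a, auc_b, z, p = delong_test(prs_a, prs_b, labels)
print(f"AUC a={auc_a:.3f}, AUC b={auc_b:.3f}, z={z:.2f}, p={p:.3g}")
```

Because both scores are evaluated on the same individuals, the off-diagonal covariance term matters: ignoring it would treat the two AUCs as independent and misstate the variance of their difference.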

Source journal: Human Genetics (Biology: Genetics)
CiteScore: 10.80
Self-citation rate: 3.80%
Articles published: 94
Review time: 1 month
About the journal: Human Genetics is a monthly journal publishing original and timely articles on all aspects of human genetics. The Journal particularly welcomes articles in the areas of Behavioral genetics, Bioinformatics, Cancer genetics and genomics, Cytogenetics, Developmental genetics, Disease association studies, Dysmorphology, ELSI (ethical, legal and social issues), Evolutionary genetics, Gene expression, Gene structure and organization, Genetics of complex diseases and epistatic interactions, Genetic epidemiology, Genome biology, Genome structure and organization, Genotype-phenotype relationships, Human Genomics, Immunogenetics and genomics, Linkage analysis and genetic mapping, Methods in Statistical Genetics, Molecular diagnostics, Mutation detection and analysis, Neurogenetics, Physical mapping and Population Genetics. Articles reporting animal models relevant to human biology or disease are also welcome. Preference will be given to those articles which address clinically relevant questions or which provide new insights into human biology. Unless reporting entirely novel and unusual aspects of a topic, clinical case reports, cytogenetic case reports, papers on descriptive population genetics, articles dealing with the frequency of polymorphisms or additional mutations within genes in which numerous lesions have already been described, and papers that report meta-analyses of previously published datasets will normally not be accepted. The Journal typically will not consider for publication manuscripts that report merely the isolation, map position, structure, and tissue expression profile of a gene of unknown function unless the gene is of particular interest or is a candidate gene involved in a human trait or disorder.