利用Rhapsody-2对致病性错义变异进行准确鉴定和机制评价

IF 9.4 1区 综合性期刊 Q1 MULTIDISCIPLINARY SCIENCES
Anupam Banerjee, Anthony T. Bogetti, Ivet Bahar
{"title":"利用Rhapsody-2对致病性错义变异进行准确鉴定和机制评价","authors":"Anupam Banerjee, Anthony T. Bogetti, Ivet Bahar","doi":"10.1073/pnas.2418100122","DOIUrl":null,"url":null,"abstract":"Understanding the effects of missense mutations or single amino acid variants (SAVs) on protein function is crucial for elucidating the molecular basis of diseases/disorders and designing rational therapies. We introduce here <jats:italic>Rhapsody-2</jats:italic> , a machine learning tool for discriminating pathogenic and neutral SAVs, significantly expanding on a precursor limited by the availability of structural data. With the advent of AlphaFold2 as a powerful tool for structure prediction, <jats:italic>Rhapsody-2</jats:italic> is trained on a significantly expanded dataset of 117,525 SAVs corresponding to 12,094 human proteins reported in the ClinVar database. Adopting a broad set of descriptors composed of sequence evolutionary, structural, dynamic, and energetics features in the training algorithm, <jats:italic>Rhapsody-2</jats:italic> achieved an AUROC of 0.94 in 10-fold cross-validation when all SAVs of a particular test protein (mutant) were excluded from the training set. Benchmarking against a variety of testing datasets demonstrated the high performance of <jats:italic>Rhapsody-2</jats:italic> . While sequence evolutionary descriptors play a dominant role in pathogenicity prediction, those based on structural dynamics provide a mechanistic interpretation. Notably, residues involved in allosteric communication and those distinguished by pronounced fluctuations in the high-frequency modes of motion or subject to spatial constraints in soft modes usually give rise to pathogenicity when mutated. Overall, <jats:italic>Rhapsody-2</jats:italic> provides an efficient and transparent tool for accurately predicting the pathogenicity of SAVs and unraveling the mechanistic basis of the observed behavior, thus advancing our understanding of genotype-to-phenotype relations.","PeriodicalId":20548,"journal":{"name":"Proceedings of the National Academy of Sciences of the United States of America","volume":"8 1","pages":""},"PeriodicalIF":9.4000,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Accurate identification and mechanistic evaluation of pathogenic missense variants with Rhapsody-2\",\"authors\":\"Anupam Banerjee, Anthony T. Bogetti, Ivet Bahar\",\"doi\":\"10.1073/pnas.2418100122\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Understanding the effects of missense mutations or single amino acid variants (SAVs) on protein function is crucial for elucidating the molecular basis of diseases/disorders and designing rational therapies. We introduce here <jats:italic>Rhapsody-2</jats:italic> , a machine learning tool for discriminating pathogenic and neutral SAVs, significantly expanding on a precursor limited by the availability of structural data. With the advent of AlphaFold2 as a powerful tool for structure prediction, <jats:italic>Rhapsody-2</jats:italic> is trained on a significantly expanded dataset of 117,525 SAVs corresponding to 12,094 human proteins reported in the ClinVar database. Adopting a broad set of descriptors composed of sequence evolutionary, structural, dynamic, and energetics features in the training algorithm, <jats:italic>Rhapsody-2</jats:italic> achieved an AUROC of 0.94 in 10-fold cross-validation when all SAVs of a particular test protein (mutant) were excluded from the training set. Benchmarking against a variety of testing datasets demonstrated the high performance of <jats:italic>Rhapsody-2</jats:italic> . While sequence evolutionary descriptors play a dominant role in pathogenicity prediction, those based on structural dynamics provide a mechanistic interpretation. Notably, residues involved in allosteric communication and those distinguished by pronounced fluctuations in the high-frequency modes of motion or subject to spatial constraints in soft modes usually give rise to pathogenicity when mutated. Overall, <jats:italic>Rhapsody-2</jats:italic> provides an efficient and transparent tool for accurately predicting the pathogenicity of SAVs and unraveling the mechanistic basis of the observed behavior, thus advancing our understanding of genotype-to-phenotype relations.\",\"PeriodicalId\":20548,\"journal\":{\"name\":\"Proceedings of the National Academy of Sciences of the United States of America\",\"volume\":\"8 1\",\"pages\":\"\"},\"PeriodicalIF\":9.4000,\"publicationDate\":\"2025-05-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the National Academy of Sciences of the United States of America\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://doi.org/10.1073/pnas.2418100122\",\"RegionNum\":1,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MULTIDISCIPLINARY SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the National Academy of Sciences of the United States of America","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1073/pnas.2418100122","RegionNum":1,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
引用次数: 0

摘要

了解错义突变或单氨基酸变异(sav)对蛋白质功能的影响对于阐明疾病/失调的分子基础和设计合理的治疗方法至关重要。我们在这里介绍Rhapsody-2,这是一种用于区分致病性和中性sav的机器学习工具,在受结构数据可用性限制的前体上进行了显著扩展。随着AlphaFold2作为结构预测的强大工具的出现,Rhapsody-2在ClinVar数据库中报告的117,525个sav对应的12,094个人类蛋白质的显著扩展数据集上进行训练。在训练算法中采用由序列进化、结构、动态和能量学特征组成的广泛描述符,当从训练集中排除特定测试蛋白(突变体)的所有sav时,Rhapsody-2在10倍交叉验证中获得了0.94的AUROC。针对各种测试数据集的基准测试证明了Rhapsody-2的高性能。虽然序列进化描述符在致病性预测中起主导作用,但基于结构动力学的描述符提供了一种机制解释。值得注意的是,参与变构通讯的残基和那些在高频运动模式中以明显波动为特征的残基,或在软模式中受空间限制的残基,在突变时通常会产生致病性。总之,Rhapsody-2为准确预测sav的致病性和揭示观察到的行为的机制基础提供了一个高效透明的工具,从而促进了我们对基因型-表型关系的理解。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Accurate identification and mechanistic evaluation of pathogenic missense variants with Rhapsody-2
Understanding the effects of missense mutations or single amino acid variants (SAVs) on protein function is crucial for elucidating the molecular basis of diseases/disorders and designing rational therapies. We introduce here Rhapsody-2 , a machine learning tool for discriminating pathogenic and neutral SAVs, significantly expanding on a precursor limited by the availability of structural data. With the advent of AlphaFold2 as a powerful tool for structure prediction, Rhapsody-2 is trained on a significantly expanded dataset of 117,525 SAVs corresponding to 12,094 human proteins reported in the ClinVar database. Adopting a broad set of descriptors composed of sequence evolutionary, structural, dynamic, and energetics features in the training algorithm, Rhapsody-2 achieved an AUROC of 0.94 in 10-fold cross-validation when all SAVs of a particular test protein (mutant) were excluded from the training set. Benchmarking against a variety of testing datasets demonstrated the high performance of Rhapsody-2 . While sequence evolutionary descriptors play a dominant role in pathogenicity prediction, those based on structural dynamics provide a mechanistic interpretation. Notably, residues involved in allosteric communication and those distinguished by pronounced fluctuations in the high-frequency modes of motion or subject to spatial constraints in soft modes usually give rise to pathogenicity when mutated. Overall, Rhapsody-2 provides an efficient and transparent tool for accurately predicting the pathogenicity of SAVs and unraveling the mechanistic basis of the observed behavior, thus advancing our understanding of genotype-to-phenotype relations.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
19.00
自引率
0.90%
发文量
3575
审稿时长
2.5 months
期刊介绍: The Proceedings of the National Academy of Sciences (PNAS), a peer-reviewed journal of the National Academy of Sciences (NAS), serves as an authoritative source for high-impact, original research across the biological, physical, and social sciences. With a global scope, the journal welcomes submissions from researchers worldwide, making it an inclusive platform for advancing scientific knowledge.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信