FiTMuSiC: leveraging structural and (co)evolutionary data for protein fitness prediction

Impact factor 3.8 · Zone 3 (Medicine) · Q2, Genetics & Heredity
Matsvei Tsishyn, Gabriel Cia, Pauline Hermans, Jean Kwasigroch, Marianne Rooman, Fabrizio Pucci
Journal: Human Genomics. Published: 2024-04-16. DOI: 10.1186/s40246-024-00605-9 (https://doi.org/10.1186/s40246-024-00605-9)

Abstract

Systematically predicting the effects of mutations on protein fitness is essential for understanding genetic diseases. Indeed, predictions complement experimental efforts in analyzing how variants lead to dysfunctional proteins that in turn can cause diseases. Here we present our new fitness predictor, FiTMuSiC, which leverages structural, evolutionary and coevolutionary information. We show that FiTMuSiC predicts fitness with high accuracy despite the simplicity of its underlying model: it was among the top predictors on the hydroxymethylbilane synthase (HMBS) target of the sixth round of the Critical Assessment of Genome Interpretation challenge (CAGI6) and performs as well as much more complex deep learning models such as AlphaMissense. To further demonstrate FiTMuSiC’s robustness, we compared its predictions with in vitro activity data on HMBS, variant fitness data on human glucokinase (GCK), and variant deleteriousness data on HMBS and GCK. These analyses further confirm FiTMuSiC’s accuracy, which compares favorably with that of other predictors. Additionally, FiTMuSiC returns two scores that separately describe the functional and structural effects of the variant, thus providing mechanistic insight into why the variant leads to fitness loss or gain. We also provide an easy-to-use webserver at https://babylone.ulb.ac.be/FiTMuSiC, which is freely available for academic use and requires no bioinformatics expertise, making the tool accessible to the entire scientific community.
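Comparing a predictor's scores with experimental measurements, as the abstract describes, is often done with a rank correlation. The sketch below is only an illustration of that idea: it assumes Spearman's rank correlation as the metric (the abstract does not say which metric the authors used), and the function names and the toy numbers are hypothetical, not data from the paper.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(predicted), average_ranks(measured))


if __name__ == "__main__":
    # Hypothetical toy values for four variants (not from the paper).
    predicted_scores = [0.9, 0.2, 0.5, 0.7]
    measured_fitness = [1.0, 0.1, 0.4, 0.8]
    print(spearman(predicted_scores, measured_fitness))
```

A rank-based measure is a natural fit here because predicted scores and measured fitness values are usually on different scales; only the ordering of variants needs to agree.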
Source journal
Human Genomics (Genetics & Heredity)
CiteScore: 6.00
Self-citation rate: 2.20%
Articles published: 55
Review time: 11 weeks
Journal description: Human Genomics is a peer-reviewed, open access, online journal that focuses on the application of genomic analysis in all aspects of human health and disease, as well as genomic analysis of drug efficacy and safety, and comparative genomics. Topics covered by the journal include, but are not limited to: pharmacogenomics, genome-wide association studies, genome-wide sequencing, exome sequencing, next-generation deep-sequencing, functional genomics, epigenomics, translational genomics, expression profiling, proteomics, bioinformatics, animal models, statistical genetics, genetic epidemiology, human population genetics and comparative genomics.