Site-specific structure and stability constrained substitution models improve phylogenetic inference

Impact factor: 6.1 | Zone 1 (Biology) | JCR Q1 (Evolutionary Biology)
Ivan Lorca-Alonso, Fernando Otero-de-Navascues, Miguel Arenas, Ugo Bastolla
Systematic Biology, published 2025-04-24. DOI: 10.1093/sysbio/syaf007 (https://doi.org/10.1093/sysbio/syaf007)
Citations: 0

Abstract

In previous studies, we presented our site-specific Stability Constrained substitution models of Protein Evolution (Stab-CPE), which define fitness as the probability of finding a protein folded in its native state but ignore changes in the native structure. Stab-CPE models predict more realistic evolutionary variability across protein sites; nevertheless, they still differ qualitatively from observed data and appear too tolerant to mutations. Here we present novel structurally constrained substitution models (Str-CPE) that define fitness based on the structural deformation produced by a mutation, which we predict using an extension of Julián Echave's linearly forced elastic network model. Compared to our previous Stab-CPE models, the novel Str-CPE models are more stringent (they predict lower sequence entropy and substitution rate), assign higher likelihood to multiple sequence alignments (MSAs) that include one or more known structures, and better predict the observed conservation across sites. Models that combine Str-CPE and Stab-CPE are even more stringent and fit the empirical MSAs better. We collectively refer to our models as Structure and Stability Constrained substitution models of Protein Evolution (SSCPE). For distantly related proteins, we find that phylogenies inferred under the SSCPE models are more similar to the corresponding reference phylogenies, inferred using structural distances, than phylogenies inferred under traditional empirical substitution models. Therefore, SSCPE models seem to be much better-fitting substitution models for deep phylogeny inference. The SSCPE models have been implemented in the Perl-based program SSCPE.pl, which uses RAxML-NG to infer phylogenies under the SSCPE model given a concatenated MSA and a list of protein structures that match the sequences in the MSA.
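The abstract names two quantities without giving formulas: fitness as the probability that a protein is folded, and per-site sequence entropy as a measure of how tolerant a site is to substitutions. The sketch below illustrates both under stated assumptions. It assumes a two-state thermodynamic model for the folded probability and plain Shannon entropy over the residues in each alignment column; neither is confirmed by the abstract as the exact definition used in the SSCPE models, and the function names are hypothetical, not part of SSCPE.pl.

```python
import math
from collections import Counter


def folded_probability(delta_g, kT=0.593):
    """Two-state folded fraction (assumed form, not taken from the paper).

    delta_g: folding free energy in kcal/mol (negative means stable).
    kT: thermal energy in kcal/mol (about 0.593 near 298 K).
    """
    return 1.0 / (1.0 + math.exp(delta_g / kT))


def site_entropy(column, gap_chars="-."):
    """Shannon entropy (in nats) of the residues in one alignment column."""
    residues = [c for c in column.upper() if c not in gap_chars]
    if not residues:
        return 0.0  # an all-gap column carries no residue information
    n = len(residues)
    counts = Counter(residues)
    return sum(-(k / n) * math.log(k / n) for k in counts.values())


def per_site_entropy(msa):
    """Entropy of every column of an MSA given as equal-length strings."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [site_entropy("".join(col)) for col in zip(*msa)]


if __name__ == "__main__":
    toy_msa = ["ACDA", "ACEA", "ACD-"]  # toy alignment, not real data
    print(per_site_entropy(toy_msa))
    print(folded_probability(-5.0))
```

In this picture, a "more stringent" model is one whose predicted per-site amino acid distributions have lower entropy, which can then be compared against the entropies observed in the empirical MSA columns.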
Source journal
Systematic Biology (Biology: Evolutionary Biology)
CiteScore: 13.00
Self-citation rate: 7.70%
Articles published: 70
Review time: 6-12 weeks
About the journal: Systematic Biology is the bimonthly journal of the Society of Systematic Biologists. Papers for the journal are original contributions to the theory, principles, and methods of systematics as well as phylogeny, evolution, morphology, biogeography, paleontology, genetics, and the classification of all living things. A Points of View section offers a forum for discussion, while book reviews and announcements of general interest are also featured.