GTRpmix: A Linked General Time-Reversible Model for Profile Mixture Models.

IF 11.0; Zone 1, Biology; JCR Q1, Biochemistry & Molecular Biology
Hector Banos, Thomas K F Wong, Justin Daneau, Edward Susko, Bui Quang Minh, Robert Lanfear, Matthew W Brown, Laura Eme, Andrew J Roger
Journal: Molecular Biology and Evolution. Published: 2024-09-04. DOI: 10.1093/molbev/msae174 (https://doi.org/10.1093/molbev/msae174). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11371462/pdf/

Abstract

Profile mixture models capture distinct biochemical constraints on the amino acid substitution process at different sites in proteins. These models feature a mixture of time-reversible models with a common matrix of exchangeabilities and distinct sets of equilibrium amino acid frequencies known as profiles. Combining the exchangeability matrix with each profile generates the matrix of instantaneous rates of amino acid exchange for that profile. Currently, empirically estimated exchangeability matrices (e.g. the LG matrix) are widely used for phylogenetic inference under profile mixture models. However, these were estimated using a single profile and are unlikely to be optimal for profile mixture models. Here, we describe the GTRpmix model, which allows maximum likelihood estimation of a common exchangeability matrix under any profile mixture model. We show that exchangeability matrices estimated under profile mixture models differ from the LG matrix and dramatically improve model fit and topological estimation accuracy for empirical test cases. Because the GTRpmix model is computationally expensive, we provide two exchangeability matrices, estimated from large concatenated phylogenomic supermatrices, to be used for phylogenetic analyses. One, called Eukaryotic Linked Mixture (ELM), is designed for phylogenetic analysis of proteins encoded by nuclear genomes of eukaryotes, and the other, Eukaryotic and Archaeal Linked Mixture (EAL), for reconstructing relationships between eukaryotes and Archaea. These matrices, combined with profile mixture models, fit data better and give improved topology estimation relative to the LG matrix combined with the same mixture models. Starting with version 2.3.1, IQ-TREE2 allows users to estimate linked exchangeabilities (i.e. amino acid exchange rates) under profile mixture models.
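The step of combining one shared exchangeability matrix with each profile can be illustrated with a minimal sketch. It assumes the standard time-reversible construction, in which the off-diagonal rate from state i to state j is the exchangeability s_ij times the target frequency pi_j. The three-state alphabet, the numbers, and the names `rate_matrix` and `is_reversible` are hypothetical, chosen only for illustration. The per-class normalization to one expected substitution per unit time is also an assumption, not a convention taken from the paper.

```python
def rate_matrix(exch, profile, normalize=True):
    """Build a rate matrix Q with Q[i][j] = exch[i][j] * profile[j] for i != j."""
    n = len(profile)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exch[i][j] * profile[j]
        # Diagonal makes each row sum to zero (q[i][i] is still 0.0 here).
        q[i][i] = -sum(q[i])
    if normalize:
        # Assumed convention: scale so the expected substitution rate is 1.
        rate = -sum(profile[i] * q[i][i] for i in range(n))
        q = [[x / rate for x in row] for row in q]
    return q


def is_reversible(q, profile, tol=1e-12):
    """Check detailed balance: profile[i] * Q[i][j] == profile[j] * Q[j][i]."""
    n = len(profile)
    return all(
        abs(profile[i] * q[i][j] - profile[j] * q[j][i]) < tol
        for i in range(n)
        for j in range(n)
    )


# Hypothetical toy data: one symmetric exchangeability matrix shared
# ("linked") across two profiles on a three-state alphabet.
EXCH = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 3.0],
    [2.0, 3.0, 0.0],
]
PROFILES = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]

# One rate matrix per profile, all built from the same exchangeabilities.
RATE_MATRICES = [rate_matrix(EXCH, p) for p in PROFILES]

if __name__ == "__main__":
    for profile, q in zip(PROFILES, RATE_MATRICES):
        print("profile:", profile, "reversible:", is_reversible(q, profile))
        for row in q:
            print("  ", [round(x, 4) for x in row])
```

For amino acid data the same construction applies with 20 states. The exchangeabilities are the parameters shared across all mixture classes, while each class contributes its own profile.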

Source journal
Molecular Biology and Evolution (Biology: Evolutionary Biology)
CiteScore: 19.70
Self-citation rate: 3.70%
Articles published: 257
Review time: 1 month
Journal overview: Molecular Biology and Evolution publishes research at the interface of molecular (including genomics) and evolutionary biology. It considers manuscripts containing patterns, processes, and predictions at all levels of organization: population, taxonomic, functional, and phenotypic. It is interested in fundamental discoveries, new and improved methods, resources, technologies, and theories advancing evolutionary research. It also publishes balanced reviews of recent developments in genome evolution and forward-looking perspectives suggesting future directions in molecular evolution applications.