MixtureFinder:估计DNA混合模型用于系统发育分析。

IF 11 1区 生物学 Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY
Huaiyan Ren, Thomas K F Wong, Bui Quang Minh, Robert Lanfear
{"title":"MixtureFinder:估计DNA混合模型用于系统发育分析。","authors":"Huaiyan Ren, Thomas K F Wong, Bui Quang Minh, Robert Lanfear","doi":"10.1093/molbev/msae264","DOIUrl":null,"url":null,"abstract":"<p><p>In phylogenetic studies, both partitioned models and mixture models are used to account for heterogeneity in molecular evolution among the sites of DNA sequence alignments. Partitioned models require the user to specify the grouping of sites into subsets, and then assume that each subset of sites can be modeled by a single common process. Mixture models do not require users to prespecify subsets of sites, and instead calculate the likelihood of every site under every model, while co-estimating the model weights and parameters. While much research has gone into the optimization of partitioned models by merging user-specified subsets, there has been less attention paid to the optimization of mixture models for DNA sequence alignments. In this study, we first ask whether a key assumption of partitioned models-that each user-specified subset can be modeled by a single common process-is supported by the data. Having shown that this is not the case, we then design, implement, test, and apply an algorithm, MixtureFinder, to select the optimum number of classes for a mixture model of Q-matrices for the standard models of DNA sequence evolution. We show this algorithm performs well on simulated and empirical datasets and suggest that it may be useful for future empirical studies. MixtureFinder is available in IQ-TREE2, and a tutorial for using MixtureFinder can be found here: http://www.iqtree.org/doc/Complex-Models#mixture-models.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":11.0000,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11704958/pdf/","citationCount":"0","resultStr":"{\"title\":\"MixtureFinder: Estimating DNA Mixture Models for Phylogenetic Analyses.\",\"authors\":\"Huaiyan Ren, Thomas K F Wong, Bui Quang Minh, Robert Lanfear\",\"doi\":\"10.1093/molbev/msae264\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>In phylogenetic studies, both partitioned models and mixture models are used to account for heterogeneity in molecular evolution among the sites of DNA sequence alignments. Partitioned models require the user to specify the grouping of sites into subsets, and then assume that each subset of sites can be modeled by a single common process. Mixture models do not require users to prespecify subsets of sites, and instead calculate the likelihood of every site under every model, while co-estimating the model weights and parameters. While much research has gone into the optimization of partitioned models by merging user-specified subsets, there has been less attention paid to the optimization of mixture models for DNA sequence alignments. In this study, we first ask whether a key assumption of partitioned models-that each user-specified subset can be modeled by a single common process-is supported by the data. Having shown that this is not the case, we then design, implement, test, and apply an algorithm, MixtureFinder, to select the optimum number of classes for a mixture model of Q-matrices for the standard models of DNA sequence evolution. We show this algorithm performs well on simulated and empirical datasets and suggest that it may be useful for future empirical studies. MixtureFinder is available in IQ-TREE2, and a tutorial for using MixtureFinder can be found here: http://www.iqtree.org/doc/Complex-Models#mixture-models.</p>\",\"PeriodicalId\":18730,\"journal\":{\"name\":\"Molecular biology and evolution\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":11.0000,\"publicationDate\":\"2025-01-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11704958/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Molecular biology and evolution\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1093/molbev/msae264\",\"RegionNum\":1,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"BIOCHEMISTRY & MOLECULAR BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Molecular biology and evolution","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/molbev/msae264","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}
引用次数: 0

摘要

在系统发育研究中,分区模型和混合模型都被用来解释DNA序列比对位点之间分子进化的异质性。分区模型要求用户指定将站点分组为子集,然后假设站点的每个子集都可以通过单个公共流程建模。混合模型不需要用户预先指定站点子集,而是计算每个站点在每个模型下的可能性,同时共同估计模型权重和参数。虽然许多研究都是通过合并用户指定的子集来优化分割模型,但对DNA序列比对混合模型的优化关注较少。在本研究中,我们首先要问的是,数据是否支持分区模型的一个关键假设——每个用户指定的子集都可以由单个公共过程建模。在证明情况并非如此之后,我们随后设计、实现、测试和应用MixtureFinder算法,为DNA序列进化的标准模型的Q矩阵混合模型选择最佳数量的类。我们表明该算法在模拟和实证数据集上表现良好,并建议它可能对未来的实证研究有用。MixtureFinder在IQ-TREE2中可用,使用MixtureFinder的教程可以在这里找到:http://www.iqtree.org/doc/Complex-Models#mixture-models。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
MixtureFinder: Estimating DNA Mixture Models for Phylogenetic Analyses.

In phylogenetic studies, both partitioned models and mixture models are used to account for heterogeneity in molecular evolution among the sites of DNA sequence alignments. Partitioned models require the user to specify the grouping of sites into subsets, and then assume that each subset of sites can be modeled by a single common process. Mixture models do not require users to prespecify subsets of sites, and instead calculate the likelihood of every site under every model, while co-estimating the model weights and parameters. While much research has gone into the optimization of partitioned models by merging user-specified subsets, there has been less attention paid to the optimization of mixture models for DNA sequence alignments. In this study, we first ask whether a key assumption of partitioned models-that each user-specified subset can be modeled by a single common process-is supported by the data. Having shown that this is not the case, we then design, implement, test, and apply an algorithm, MixtureFinder, to select the optimum number of classes for a mixture model of Q-matrices for the standard models of DNA sequence evolution. We show this algorithm performs well on simulated and empirical datasets and suggest that it may be useful for future empirical studies. MixtureFinder is available in IQ-TREE2, and a tutorial for using MixtureFinder can be found here: http://www.iqtree.org/doc/Complex-Models#mixture-models.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Molecular biology and evolution
Molecular biology and evolution 生物-进化生物学
CiteScore
19.70
自引率
3.70%
发文量
257
审稿时长
1 months
期刊介绍: Molecular Biology and Evolution Journal Overview: Publishes research at the interface of molecular (including genomics) and evolutionary biology Considers manuscripts containing patterns, processes, and predictions at all levels of organization: population, taxonomic, functional, and phenotypic Interested in fundamental discoveries, new and improved methods, resources, technologies, and theories advancing evolutionary research Publishes balanced reviews of recent developments in genome evolution and forward-looking perspectives suggesting future directions in molecular evolution applications.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信