The structure of deviations from maximum parsimony for densely-sampled data and applications for clade support estimation.

ArXiv, Pub Date: 2025-09-12
William Howard-Snyder, Will Dumm, Mary Barker, Ognian Milanov, Claris Winston, David H Rich, Marc A Suchard, Frederick A Matsen IV
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12440060/pdf/

Abstract

How do phylogenetic reconstruction algorithms go astray when they return incorrect trees? This simple question has not been answered in detail, even for maximum parsimony (MP), the simplest phylogenetic criterion. Understanding MP has recently gained relevance in the regime of extremely dense sampling, where each virus sample commonly differs by zero or one mutation from another previously sampled virus. Although recent research shows that evolutionary histories in this regime are close to being maximally parsimonious, the structure of their deviations from MP is not yet understood. In this paper, we develop algorithms to understand how the correct tree deviates from being MP in the densely sampled case. By applying these algorithms to simulations that realistically mimic the evolution of SARS-CoV-2, we find that simulated trees frequently deviate from maximally parsimonious trees only locally, through simple structures in which the same mutation appears independently on sister branches. We leverage this insight to design approaches for sampling near-MP trees and using them to efficiently estimate clade supports.
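The kind of local deviation described in the abstract can be illustrated with a toy single-site example. The sketch below is not the authors' algorithm; it scores two small trees with the standard Sankoff dynamic program under unit mutation costs. The nested-tuple tree encoding, the function names `sankoff` and `parsimony_score`, and the four-sample data are all assumptions made for illustration.

```python
from math import inf

STATES = "ACGT"

def sankoff(tree):
    """Map each state s to the minimum number of mutations in the
    subtree, given that the subtree's root carries state s."""
    if isinstance(tree, str):
        # Leaf: the observed nucleotide at this site costs 0, others are impossible.
        return {s: (0 if s == tree else inf) for s in STATES}
    child_costs = [sankoff(child) for child in tree]
    # For each candidate root state, let every child pick its cheapest state,
    # paying 1 whenever the child's state differs from the root's.
    return {
        s: sum(min(c[t] + (s != t) for t in STATES) for c in child_costs)
        for s in STATES
    }

def parsimony_score(tree):
    """Minimum number of mutations needed to explain the leaves on this tree."""
    return min(sankoff(tree).values())

# Hypothetical "true" tree: one parent with four sampled children. The A->G
# mutation arose independently on two sister branches.
true_tree = ("A", "A", "G", "G")

# Regrouped tree: the two G samples form a clade, so one mutation suffices.
regrouped_tree = ("A", "A", ("G", "G"))

true_score = parsimony_score(true_tree)
regrouped_score = parsimony_score(regrouped_tree)
print(true_score, regrouped_score)
```

In this toy case the true tree needs two mutations while the regrouped tree needs one, so the true tree is not MP, yet the two trees differ only in how the two G-bearing sister branches are attached.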
