Coping with Ineffective Overlap in Multilocus Phylogenetics
Ana Serra Silva, Karen Siu-Ting, Christopher J Creevey, Davide Pisani, Mark Wilkinson
Systematic Biology, published 2025-07-03. DOI: 10.1093/sysbio/syaf044 (https://doi.org/10.1093/sysbio/syaf044)
Abstract
Missing data is a long-standing issue in phylogenetic inference, which often results in high levels of taxonomic instability, obscuring otherwise well-supported relationships. Multiple approaches have been developed to deal with the negative effects of ineffective overlap on tree resolution, often by identifying taxa for removal. Here we repurpose concatabominations, a heuristic method developed to identify unstable taxa in morphological data matrices, and combine it with a novel gene-tree jackknifing on matrix representation of trees to identify candidates for targeted sequencing. Using a multilocus caecilian dataset, we illustrate the method's capacity to identify candidate taxa and loci for additional sequencing, compare the results to those of the mathematics-based gene sampling sufficiency approach, and explore the terrace space associated with the multilocus dataset. We show that our approach yields tractable numbers of loci/taxa for targeted sequencing that successfully mitigate topological instability due to ineffective overlap, even when modest amounts of data are added.
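To make the notion of overlap concrete, the following is a minimal sketch of a crude first-pass diagnostic: listing taxon pairs that are never sampled for the same locus. It is not the concatabominations or gene-tree jackknifing procedure described in the abstract, and sharing a locus does not by itself guarantee effective overlap. The function name, the `coverage` structure, and all taxon and locus names are hypothetical placeholders.

```python
from itertools import combinations


def pairs_without_shared_locus(coverage):
    """Return sorted taxon pairs that are never sampled for the same locus.

    coverage maps each locus name to the set of taxa sequenced for it.
    This is an illustrative check only, not the published method.
    """
    # Collect every taxon that appears in at least one locus.
    taxa = sorted(set().union(*coverage.values()))
    missing = []
    for a, b in combinations(taxa, 2):
        # A pair overlaps directly if some locus was sequenced for both taxa.
        if not any(a in sampled and b in sampled for sampled in coverage.values()):
            missing.append((a, b))
    return missing


# Hypothetical toy data; taxon and locus names are placeholders.
coverage = {
    "locus1": {"taxonA", "taxonB", "taxonC"},
    "locus2": {"taxonA", "taxonD"},
    "locus3": {"taxonB", "taxonD"},
}
no_overlap = pairs_without_shared_locus(coverage)
print(no_overlap)
```

In this toy example, taxonC and taxonD never co-occur in a locus, so any locus sequenced for both would be one naive candidate for additional sequencing.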
About the journal:
Systematic Biology is the bimonthly journal of the Society of Systematic Biologists. Papers for the journal are original contributions to the theory, principles, and methods of systematics as well as phylogeny, evolution, morphology, biogeography, paleontology, genetics, and the classification of all living things. A Points of View section offers a forum for discussion, while book reviews and announcements of general interest are also featured.