Phylogenomic reconstruction influenced by assembly and annotation parameters: Using whole genome data to unravel the relationships of Spionidae (Annelida)
Viktoria E. Bogantes, Karin Meißner, D. S. Waits, K. Kocot, K. M. Halanych
DOI: 10.1111/zsc.12676 (https://doi.org/10.1111/zsc.12676). Published 2024-06-14.
Abstract
Most efforts at improving accuracy in phylogenomic reconstructions have focused on improving tree‐building methods or orthology determination. Even though the use of whole genome sequence or transcriptome data is increasing, the degree to which accurate genome assembly and annotation influence phylogenetic inference has not been well explored. Here, we use low‐coverage whole genome sequencing of spionid annelids to explore the impact of different assemblers and annotation strategies on tree reconstruction. We also produce a phylogenetic hypothesis that spans the breadth of Spionidae, examining the current systematics of the group, which is based on morphological parsimony analyses and classical taxonomy. Our results show that both assembly and annotation can have important consequences for the pool of loci that may be available for tree reconstruction. When an identical phylogenomic pipeline is used, differences in assembly and annotation can account for variation in reconstructed topologies. Interestingly, the completeness and depth of the data used for training annotation software (i.e. data from model systems) appear to be more important, by some measures, than the degree of phylogenetic relatedness of the organism from which training data are drawn. Despite variation in recovered topologies, the recognised subfamily Spioninae is nested within Nerininae, suggesting that diagnostic characters of Nerininae (e.g. thick egg membrane, short‐headed sperm) are symplesiomorphies of Spionidae rather than apomorphies of a particular subclade. With the increased use of genomic data, our results advocate for a broader consideration of how assembly and annotation may impact data matrices used in phylogenomic analyses.