CUSCO: A Tool for Curating Single-Copy Orthologs and Extracting Marker Genes for Phylogenetic Tree Construction With Extra Samples
Authors: Takashi Seiko, Koki Nagasawa, Ken Naito
Journal: Ecological Research, volume 40, issue 6 (published 2025-10-09)
DOI: 10.1111/1440-1703.70015
URL: https://esj-journals.onlinelibrary.wiley.com/doi/10.1111/1440-1703.70015
Single-copy orthologs are often used to reconstruct phylogenetic trees of life. A set of single-copy orthologs can be prepared by building a new database that includes the species/strains of interest and performing a homology search, but these steps are time-consuming when working with a large number of samples. To address this issue, more efficient and practical approaches are needed. Here, we developed a new pipeline called CUSCO, which uses a reference set of protein sequences to reconstruct phylogenetic trees from the genome sequences of desired samples, including polyploid genomes or genomes without annotation. As a benchmark, we compared the CUSCO pipeline with OrthoFinder. The CUSCO pipeline reproduced the same tree topologies as those reconstructed using single-copy orthologs selected by OrthoFinder, in a significantly shorter runtime. The pipeline also includes a function that identifies a minimal set of marker genes; the species tree reconstructed from this set is comparable to the one reconstructed from single-copy orthologs. We also verified that the minimal set of marker genes identified by CUSCO accurately reproduces the tree topology obtained from the whole genome dataset. Sequencing these marker genes enables rapid and cost-effective inference of the phylogenetic position of newly sampled species. Now that genomes can be sequenced easily and inexpensively, the speed and accuracy of CUSCO facilitate large-scale phylogenomic analyses on a desktop computer. Availability and implementation: https://github.com/seikot345/CUSCO/.
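The abstract reports that two pipelines produced the same tree topologies but does not say how the trees were compared. One common way to make such a check concrete is to compare the non-trivial bipartitions (splits) of two unrooted trees, which is the basis of the Robinson-Foulds distance. The sketch below is illustrative only: the function names `parse_newick`, `bipartitions`, and `rf_distance` are hypothetical and are not part of CUSCO or OrthoFinder, and the parser assumes simple Newick strings without quoted labels.

```python
from typing import FrozenSet, Set


def parse_newick(newick: str):
    """Parse a simple Newick string into nested lists of leaf names.

    Branch lengths and internal node labels are ignored (assumption:
    no quoted labels, no whitespace before parentheses).
    """
    text = newick.strip().rstrip(";")
    pos = 0

    def read_label() -> str:
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        # Drop an optional ":length" suffix.
        return text[start:pos].split(":")[0].strip()

    def read_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [read_node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(read_node())
            pos += 1  # consume ")"
            read_label()  # skip internal label and branch length
            return children
        return read_label()

    return read_node()


def bipartitions(newick: str) -> Set[FrozenSet[str]]:
    """Return the non-trivial splits of a tree, treated as unrooted."""
    tree = parse_newick(newick)
    clades: Set[FrozenSet[str]] = set()

    def leaves(node) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        clades.add(below)
        return below

    all_leaves = leaves(tree)
    reference = min(all_leaves)  # fixed leaf used to pick a canonical side
    splits: Set[FrozenSet[str]] = set()
    for side in clades:
        other = all_leaves - side
        if len(side) < 2 or len(other) < 2:
            continue  # trivial split, present in every tree
        splits.add(other if reference in side else side)
    return splits


def rf_distance(newick_a: str, newick_b: str) -> int:
    """Count splits found in exactly one of the two trees."""
    return len(bipartitions(newick_a) ^ bipartitions(newick_b))


if __name__ == "__main__":
    tree_a = "((A:0.1,B:0.2):0.05,(C:0.1,D:0.3):0.05);"
    tree_b = "((D,C),(B,A));"
    tree_c = "((A,C),(B,D));"
    print(rf_distance(tree_a, tree_b))  # identical topology
    print(rf_distance(tree_a, tree_c))  # different topology
```

A distance of zero means the two trees share every non-trivial split, so they have the same unrooted topology regardless of branch lengths or the order in which children are written. The sketch assumes both trees contain the same set of leaf names.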
About the journal:
Ecological Research has been published in English by the Ecological Society of Japan since 1986. Ecological Research publishes original papers on all aspects of ecology, in both aquatic and terrestrial ecosystems.