Chang Wan Seo, Shinnam Yoo, Yoonhee Cho, Ji Seon Kim, Martin Steinegger, Young Woon Lim
Journal of Microbiology, vol. 63, no. 4, e2411017. Published 2025-04-01 (Epub 2025-04-29). DOI: https://doi.org/10.71150/jm.2411017
FunVIP: Fungal Validation and Identification Pipeline based on phylogenetic analysis.
The growth of sequence data in public nucleotide databases has made DNA sequence-based identification an indispensable tool for fungal identification. However, the large proportion of mislabeled sequences in public databases leads to frequent misidentifications. Inaccurate identification causes severe problems, especially for industrial and clinical fungi and edible mushrooms. Existing species identification pipelines require separate validation of datasets obtained from public databases that contain mislabeled taxonomic identifications. To address this issue, we developed FunVIP, a fully automated phylogeny-based fungal validation and identification pipeline (https://github.com/Changwanseo/FunVIP). FunVIP performs phylogeny-based identification with validation, and requires only a query, a database, and a single command to produce a result. The FunVIP command runs a nine-step workflow: input management, sequence-set organization, alignment, trimming, concatenation, model selection, tree inference, tree interpretation, and report generation. Users obtain identification results, phylogenetic tree evidence, and reports of the conflicts and issues detected at multiple checkpoints during the analysis. We demonstrated FunVIP's ability to validate conflicting samples by reproducing the manual revision of Fuscoporia, a fungal genus whose database contains mislabeled sequences. We also compared the identification performance of FunVIP with that of BLAST and q2-feature-classifier on two mass double-revised fungal datasets, Sanghuangporus and Aspergillus section Terrei. With its automatic validation ability and high identification performance, FunVIP is a promising tool for easy and accurate fungal identification.
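The nine-step workflow named in the abstract can be pictured as an ordered list of stages that share one run state. The sketch below is only an illustration of that structure and is not FunVIP's code or API: the stage names come from the abstract, while `RunState`, `placeholder_stage`, `run_pipeline`, and the file names `query.fasta` and `database.fasta` are hypothetical, and each stage is a stand-in that merely records that it ran.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Stage names follow the nine steps listed in the abstract.
# Everything else in this sketch is hypothetical and not taken from FunVIP.
STAGES = [
    "input management",
    "sequence-set organization",
    "alignment",
    "trimming",
    "concatenation",
    "model selection",
    "tree inference",
    "tree interpretation",
    "report generation",
]


@dataclass
class RunState:
    """Shared state passed from stage to stage (hypothetical structure)."""
    query: str
    database: str
    completed: List[str] = field(default_factory=list)
    issues: List[str] = field(default_factory=list)  # conflicts found at checkpoints


def placeholder_stage(name: str) -> Callable[[RunState], None]:
    """Return a stand-in stage that only records that it ran."""
    def run(state: RunState) -> None:
        # A real stage would do its work here (e.g. align or infer a tree)
        # and append any detected conflicts to state.issues.
        state.completed.append(name)
    return run


def run_pipeline(query: str, database: str) -> RunState:
    """Run every stage in order against one shared state."""
    state = RunState(query=query, database=database)
    for name in STAGES:
        placeholder_stage(name)(state)
    return state


state = run_pipeline("query.fasta", "database.fasta")  # hypothetical file names
print(len(state.completed), state.completed[-1])
```

Keeping the issues list on the shared state mirrors the abstract's description of conflicts being reported from multiple checkpoints, since any stage can append to it without changing the stage interface.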
About the journal:
Publishes papers that deal with research on microorganisms, including archaea, bacteria, yeasts, fungi, microalgae, protozoa, and simple eukaryotic microorganisms. Topics considered for publication include Microbial Systematics, Evolutionary Microbiology, Microbial Ecology, Environmental Microbiology, Microbial Genetics, Genomics, Molecular Biology, Microbial Physiology, Biochemistry, Microbial Pathogenesis, Host-Microbe Interaction, Systems Microbiology, Synthetic Microbiology, Bioinformatics and Virology. Manuscripts dealing with simple identification of microorganism(s), cloning of a known gene and its expression in a microbial host, and clinical statistics will not be considered for publication by JM.