taxMyPhage: Automated Taxonomy of dsDNA Phage Genomes at the Genus and Species Level
Andrew Millard, Rémi Denise, Maria Lestido, Moi Taiga Thomas, Deven Webster, Dann Turner, Thomas Sicheritz-Pontén
PHAGE (New Rochelle, N.Y.), volume 6, issue 1, pages 5-11. Published 2025-03-17. DOI: 10.1089/phage.2024.0050
Abstract
Background: Bacteriophages are classified into genera and species based on genomic similarity, a process overseen by the International Committee on Taxonomy of Viruses (ICTV). With the rapid increase in phage genomic data, there is a growing need for automated classification systems that can handle large-scale genome analyses and place phages into new or existing genera and species.
Materials and methods: We developed taxMyPhage, a tool for the rapid automated classification of dsDNA bacteriophage genomes. The system first uses a MASH database built from ICTV-classified phage genomes to identify closely related phages, then uses BLASTn to calculate intergenomic similarity, conforming to ICTV guidelines for genus and species classification. taxMyPhage is available as a git repository at https://github.com/amillard/tax_myPHAGE, as a conda package, as a pip-installable tool, and as a web service at https://phagecompass.ku.dk.
Results: taxMyPhage enables rapid classification of bacteriophages to the genus and species level. Benchmarking on 705 genomes pending ICTV classification showed 96.7% accuracy at the genus level and 97.9% accuracy at the species level. The system also detected inconsistencies in current ICTV classifications, identifying cases where existing taxa did not adhere to the ICTV thresholds of 70% average nucleotide identity (ANI) for genera or 95% ANI for species. The command line version classified the 705 genomes within 48 h, demonstrating its scalability for large datasets.
Conclusions: taxMyPhage significantly enhances the speed and accuracy of bacteriophage genome classification at the genus and species levels, allowing classification to keep pace with current sequencing output. The tool facilitates the integration of bacteriophage classification into standard workflows, thereby accelerating research and promoting consistent taxonomy.
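
The genus and species thresholds quoted in the Results (70% and 95%) can be illustrated with a minimal Python sketch. This is not the taxMyPhage implementation: the names `ReferenceHit`, `Placement`, and `place_query` are hypothetical, the similarity values are assumed to have been computed already (for example from BLASTn comparisons), the decision is based only on the single best-matching reference, and treating the thresholds as inclusive (`>=`) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Thresholds as stated in the abstract (percent similarity).
GENUS_THRESHOLD = 70.0
SPECIES_THRESHOLD = 95.0


@dataclass(frozen=True)
class ReferenceHit:
    """Hypothetical record: one classified reference genome and its
    precomputed similarity (0-100) to the query genome."""
    genus: str
    species: str
    similarity: float


@dataclass(frozen=True)
class Placement:
    """Hypothetical result: None means the query would need a new taxon
    at that rank."""
    genus: Optional[str]
    species: Optional[str]


def place_query(hits: Sequence[ReferenceHit]) -> Placement:
    """Assign genus and species from the best-matching reference only.

    Assumption: inclusive thresholds and a best-hit rule; the real tool
    may apply additional checks across all members of a genus.
    """
    if not hits:
        return Placement(genus=None, species=None)

    best = max(hits, key=lambda h: h.similarity)
    if best.similarity >= SPECIES_THRESHOLD:
        # Similar enough to belong to the existing species.
        return Placement(genus=best.genus, species=best.species)
    if best.similarity >= GENUS_THRESHOLD:
        # Same genus, but distinct enough to represent a new species.
        return Placement(genus=best.genus, species=None)
    # Below the genus threshold: new genus and new species.
    return Placement(genus=None, species=None)


if __name__ == "__main__":
    # Made-up reference names and similarity values, for illustration only.
    example_hits = [
        ReferenceHit("Examplevirus", "Examplevirus alpha", 82.4),
        ReferenceHit("Examplevirus", "Examplevirus beta", 76.1),
        ReferenceHit("Othervirus", "Othervirus gamma", 41.0),
    ]
    print(place_query(example_hits))
```

In this made-up example the best hit falls between the two thresholds, so the query would be placed in the existing genus as a new species.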