Gabriel Montenegro de Campos, Luan Gaspar Clemente, Alex Ranieri Jerônimo Lima, Eleonora Cella, Vagner Fonseca, João Paulo Bianchi Ximenez, Milton Yutaka Nishiyama, Enéas de Carvalho, Sandra Coccuzzo Sampaio, Marta Giovanetti, Maria Carolina Elias, Svetoslav Nanev Slavov
Anellovirus abundance as an indicator for viral metagenomic classifier utility in plasma samples.
Virology Journal 22(1):88. Published 2025-03-28. Journal Article. Impact factor: 4.0 (JCR Q2, Virology).
DOI: 10.1186/s12985-025-02708-8 (https://doi.org/10.1186/s12985-025-02708-8)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11951539/pdf/
Citations: 0
Abstract
Background: Viral metagenomics has expanded significantly in recent years due to advancements in next-generation sequencing, establishing it as the leading method for identifying emerging viruses. A crucial step in metagenomics is taxonomic classification, where sequence data is assigned to specific taxa, thereby enabling the characterization of species composition within a sample. Various taxonomic classifiers have been developed in recent years, each employing distinct classification approaches that produce varying results and abundance profiles, even when analyzing the same sample.
Methods: In this study, we propose using the identification of Torque Teno Viruses (TTVs), from the Anelloviridae family, as indicators to evaluate the performance of four short-read-based metagenomic classifiers: Kraken2, Kaiju, CLARK and DIAMOND, when evaluating human plasma samples.
Results: Our results show that each classifier assigns TTV species at different abundance levels, potentially influencing the interpretation of diversity within samples. Specifically, nucleotide-based classifiers tend to detect a broader range of TTV species, indicating higher sensitivity, while DIAMOND (amino acid-based) and CLARK (nucleotide k-mer-based) display lower abundance indices. Interestingly, despite employing different algorithms and data types (protein-based vs. nucleotide-based), Kaiju and Kraken2 performed similarly.
Conclusion: Our study underscores the critical impact of classifier selection on diversity indices in metagenomic analyses. Kaiju effectively assigned a wide variety of TTV species, demonstrating it did not require a high volume of reads to capture diversity. Nucleotide-based classifiers like CLARK and Kraken2 showed superior sensitivity, which is valuable for detecting emerging or rare viruses. At the same time, protein-based approaches such as DIAMOND and Kaiju proved robust for identifying known species with low variability.
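To illustrate how classifier output can shift diversity indices, the following is a minimal sketch in Python. The abstract does not state which indices were computed, so species richness and the Shannon index are assumed here as common choices. The classifier names, species labels, and read counts are all hypothetical and are not data from the study.

```python
import math

def richness(counts):
    """Number of species with at least one assigned read."""
    return sum(1 for n in counts.values() if n > 0)

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species with reads."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h

# Hypothetical per-classifier read counts for the same sample
# (illustrative only; not taken from the paper).
profiles = {
    "classifier_A": {"TTV_sp1": 50, "TTV_sp2": 50, "TTV_sp3": 0},
    "classifier_B": {"TTV_sp1": 90, "TTV_sp2": 5, "TTV_sp3": 5},
}

# Map each classifier to (richness, Shannon index).
summary = {name: (richness(c), shannon_index(c)) for name, c in profiles.items()}

for name, (s, h) in summary.items():
    print(f"{name}: richness={s}, Shannon={h:.3f}")
```

In this toy case the second profile reports more species but a lower Shannon index, because its reads are concentrated in one species. The same sample can therefore look more or less diverse depending on both the classifier and the index chosen.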
About the journal:
Virology Journal is an open access, peer reviewed journal that considers articles on all aspects of virology, including research on the viruses of animals, plants and microbes. The journal welcomes basic research as well as pre-clinical and clinical studies of novel diagnostic tools, vaccines and anti-viral therapies.
The editorial policy of Virology Journal is to publish all research that peer reviewers assess to be a coherent and sound addition to the scientific literature, and to put less emphasis on interest levels or perceived impact.