PGIP: a web server for the rapid taxonomic identification of parasite genomes
Jiayao Zhang, Feng Tang, Bixian Ni, Qiang Zhang, Xinyi Gong, Fanzhen Mao, Jun Cao, Yaobao Liu
Parasites & Vectors 18(1): 365. Published 2025-08-28.
DOI: https://doi.org/10.1186/s13071-025-07007-3
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12392538/pdf/
Abstract
Background: Parasitic diseases remain a global health challenge, and traditional diagnostic methods are limited in sensitivity and scalability. Genome-based sequencing technologies have improved and are increasingly used to identify parasites; however, their clinical adoption remains hindered by the complexity of bioinformatics analysis, reliance on incomplete reference databases, and accessibility barriers for nonspecialists. Overcoming these challenges requires standardized analytical workflows and high-quality genomic resources tailored specifically to parasite identification.
Methods: We developed a user-friendly web server named the Parasite Genome Identification Platform (PGIP). The reference database was sourced from the National Center for Biotechnology Information (NCBI), WormBase, the European Nucleotide Archive (ENA), and VEuPathDB, rigorously filtered for quality, and deduplicated using Cluster Database at High Identity with Tolerance (CD-HIT) to ensure accuracy and nonredundancy. To streamline analysis, we integrated a standardized identification pipeline built on Nextflow, which encompasses host DNA depletion, quality control, parasite species identification via both read-mapping and assembly-based approaches, and automated report generation for comprehensive diagnostic insights.
Results: PGIP integrates a curated database of 280 parasite genomes, rigorously filtered for quality and taxonomic accuracy. Validation across diverse datasets demonstrated the precise species-level resolution of PGIP and its compatibility with clinical samples. The platform features an intuitive graphical interface, and its one-click analysis significantly reduces reliance on bioinformatics expertise, thus enabling rapid diagnosis.
Conclusions: PGIP offers an accurate, efficient, and user-friendly web server designed to simplify and accelerate the taxonomic identification of parasite genomes using data from metagenomic next-generation sequencing. Its automated framework reduces the need for specialized expertise, enabling rapid application in clinical and public health settings.
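As a rough illustration of the stage order named in the Methods paragraph (host DNA depletion, then quality control, then read-based species identification), the following Python sketch runs a toy dataset through three small functions. It is not PGIP's implementation: the names `Read`, `deplete_host`, `quality_filter`, and `call_species`, all thresholds, and the example reads are hypothetical. In a real workflow, host assignments and best reference hits would come from an aligner; here they are assumed to be precomputed and are passed in as plain Python collections.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    read_id: str
    sequence: str
    qualities: tuple[int, ...]  # per-base Phred scores


def deplete_host(reads: list[Read], host_read_ids: set[str]) -> list[Read]:
    """Drop reads that an upstream aligner (assumed) assigned to the host."""
    return [r for r in reads if r.read_id not in host_read_ids]


def quality_filter(reads: list[Read], min_mean_quality: float,
                   min_length: int) -> list[Read]:
    """Keep reads that are long enough and have a high enough mean Phred score."""
    kept = []
    for r in reads:
        if not r.qualities or len(r.sequence) < min_length:
            continue
        if sum(r.qualities) / len(r.qualities) < min_mean_quality:
            continue
        kept.append(r)
    return kept


def call_species(reads: list[Read], best_hit: dict[str, str],
                 min_reads: int, min_fraction: float):
    """Tally best reference hits per species and report those above both cutoffs.

    Returns (species, read_count, fraction_of_mapped_reads) tuples,
    most abundant first.
    """
    counts = Counter(best_hit[r.read_id] for r in reads if r.read_id in best_hit)
    total = sum(counts.values())
    calls = []
    for species, n in counts.most_common():
        fraction = n / total
        if n >= min_reads and fraction >= min_fraction:
            calls.append((species, n, round(fraction, 3)))
    return calls


# Toy input: seven reads, one host-derived, one too short, one low quality.
reads = [
    Read("r1", "ACGTACGT", (30,) * 8),
    Read("r2", "ACGTACGT", (30,) * 8),  # maps to host
    Read("r3", "ACGT", (30,) * 4),      # too short
    Read("r4", "ACGTACGT", (10,) * 8),  # low quality
    Read("r5", "ACGTACGT", (35,) * 8),
    Read("r6", "ACGTACGT", (32,) * 8),
    Read("r7", "ACGTACGT", (31,) * 8),
]
host_read_ids = {"r2"}  # hypothetical output of mapping against a host genome
best_hit = {            # hypothetical best hit per read in a parasite database
    "r1": "Plasmodium falciparum",
    "r5": "Plasmodium falciparum",
    "r6": "Plasmodium falciparum",
    "r7": "Plasmodium vivax",
}

non_host = deplete_host(reads, host_read_ids)
clean_reads = quality_filter(non_host, min_mean_quality=20.0, min_length=6)
calls = call_species(clean_reads, best_hit, min_reads=2, min_fraction=0.1)
print(calls)
```

Requiring both a minimum read count and a minimum fraction is one simple way to suppress calls supported by a single stray read, as with the lone `Plasmodium vivax` hit in the toy data; the cutoffs a production pipeline uses would need to be validated against real samples.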
About the journal:
Parasites & Vectors is an open access, peer-reviewed online journal dealing with the biology of parasites, parasitic diseases, intermediate hosts, vectors and vector-borne pathogens. Manuscripts published in this journal will be available to all worldwide, with no barriers to access, immediately following acceptance. However, authors retain the copyright of their material and may use it, or distribute it, as they wish.
Manuscripts on all aspects of the basic and applied biology of parasites, intermediate hosts, vectors and vector-borne pathogens will be considered. In addition to the traditional and well-established areas of science in these fields, we also aim to provide a vehicle for publication of the rapidly developing resources and technology in parasite, intermediate host and vector genomics and their impacts on biological research. We are able to publish large datasets and extensive results, frequently associated with genomic and post-genomic technologies, which are not readily accommodated in traditional journals. Manuscripts addressing broader issues, for example economics, social sciences and global climate change in relation to parasites, vectors and disease control, are also welcomed.