Bioinformatic challenges in metagenomic next generation sequencing data analysis while unravelling a case of uncommon campylobacteriosis
Rok Kogoj, Martin Bosilj, Andraž Celar Šturm, Misa Korva, Katja Strašek Smrdel, Eva Kvas, Mateja Pirš, Lidija Lepen, Tina Triglav
Journal of Biomedical Informatics, volume 166, Article 104841. Published 2025-05-02. DOI: 10.1016/j.jbi.2025.104841
https://www.sciencedirect.com/science/article/pii/S153204642500070X
Abstract
Objective
This study aimed to use advanced bioinformatics and modern sequencing approaches to resolve a diagnostic discrepancy: Campylobacter spp. were persistently detected by molecular methods, yet cultures remained negative, in four consecutive stool samples from a previously healthy patient with newly diagnosed selective IgA deficiency and prolonged diarrhoea.
Methods
Metagenomic next-generation sequencing (mNGS) based on short paired-end reads, with basic bioinformatic read classification, was used first. Because the results were ambiguous, advanced bioinformatics involving contig construction and classification, reference genome mapping and read filtering with BBSplit, coupled with metagenomic long-read sequencing and full-length 16S rRNA metabarcoding, were employed to further elucidate the results. Virulence factors were analysed using the Prokka genome annotation tool. Finally, modified classical bacteriology methods were used for further clarification.
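To make the read-filtering step more concrete, the idea of binning each read to the reference genome it matches best can be sketched as follows. This is a toy illustration only, not BBSplit's implementation and not the authors' pipeline: the function name `bin_reads_by_best_reference`, the `min_margin` parameter, the reference labels, and the alignment scores are all hypothetical.

```python
from collections import defaultdict


def bin_reads_by_best_reference(scores, min_margin=0):
    """Assign each read to the reference with the highest alignment score.

    scores: dict mapping read_id -> dict of reference_name -> alignment score
            (hypothetical input; a real aligner would produce these values).
    min_margin: a read is called "ambiguous" when the best score exceeds the
                second-best score by no more than this margin.
    """
    bins = defaultdict(list)
    for read_id, per_ref in scores.items():
        if not per_ref:
            # The read did not align to any reference.
            bins["unmapped"].append(read_id)
            continue
        ranked = sorted(per_ref.items(), key=lambda kv: kv[1], reverse=True)
        best_ref, best_score = ranked[0]
        if len(ranked) > 1 and best_score - ranked[1][1] <= min_margin:
            # Two closely related references explain the read equally well.
            bins["ambiguous"].append(read_id)
        else:
            bins[best_ref].append(read_id)
    return dict(bins)


# Hypothetical scores for four reads against two related references.
example_scores = {
    "read1": {"C_infans": 148, "C_ureolyticus": 120},
    "read2": {"C_infans": 140, "C_ureolyticus": 140},
    "read3": {"C_ureolyticus": 150},
    "read4": {},
}
binned = bin_reads_by_best_reference(example_scores)
print(binned)
```

Keeping tied reads in a separate bin, rather than assigning them arbitrarily, reflects the caution needed when closely related species share conserved genomic regions.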
Results
Short paired-end read analysis identified several Campylobacter species in all four samples. After advanced bioinformatic approaches were applied, candidatus C. infans was suspected as the putative pathogen. This result was further supported by metagenomic long-read sequencing and full-length 16S rRNA metabarcoding. Nevertheless, after the culture conditions were modified based on the mNGS results, a mixed culture of candidatus C. infans and C. ureolyticus was obtained. Sequencing of the mixed culture yielded genome coverage of 87.48% for candidatus C. infans and 73.47% for C. ureolyticus. More virulence factor hits were found in the candidatus C. infans genome than in the C. ureolyticus genome, supporting the former as the most probable cause of symptoms.
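The abstract does not state how the genome coverage percentages were computed. Assuming they refer to breadth of coverage (the share of reference positions covered by at least one aligned read), a minimal sketch of that calculation is shown below. The function name `breadth_of_coverage`, the half-open interval convention, and the example intervals are assumptions for illustration, not values from the study.

```python
def breadth_of_coverage(intervals, genome_length):
    """Return the percentage of a reference covered by at least one alignment.

    intervals: iterable of (start, end) alignment positions, 0-based and
               half-open, so (0, 400) covers positions 0..399.
    genome_length: length of the reference genome in bases.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")

    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        # Clip alignments to the reference boundaries.
        start = max(start, 0)
        end = min(end, genome_length)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent interval: extend the merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start

    return 100.0 * covered / genome_length


# Hypothetical alignments on a 1,000-base reference.
example_intervals = [(0, 400), (300, 700), (900, 1000)]
coverage_pct = breadth_of_coverage(example_intervals, 1000)
print(f"{coverage_pct:.2f}% of the reference is covered")
```

Overlapping alignments are merged before counting so that deeply sequenced regions are not counted more than once. Breadth therefore stays between 0% and 100% regardless of sequencing depth.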
Conclusion
This study shows the pivotal role and strengths of mNGS in unravelling an unusual case of diarrhoea and demonstrates how mNGS can guide established microbiological methods to overcome their current limitations. However, it also emphasises the need for careful interpretation of sequencing data, particularly for closely related bacterial species in clinical samples that are known to harbour complex microbial communities.
About the journal:
The Journal of Biomedical Informatics reflects a commitment to high-quality original research papers, reviews, and commentaries in the area of biomedical informatics methodology. Although we publish articles motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, and translational bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices; evaluations of implemented systems (including clinical trials of information technologies); or papers that provide insight into a biological process, a specific disease, or treatment options would generally be more suitable for publication in other venues. Papers on applications of signal processing and image analysis are often more suitable for biomedical engineering journals or other informatics journals, although we do publish papers that emphasize the information management and knowledge representation/modeling issues that arise in the storage and use of biological signals and images. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report and an effort is made to address the generalizability and/or range of application of that methodology. Note also that, given the international nature of JBI, papers that deal with specific languages other than English, or with country-specific health systems or approaches, are acceptable for JBI only if they offer generalizable lessons that are relevant to the broad JBI readership, regardless of their country, language, culture, or health system.