Using gut microbiota as a diagnostic tool for colorectal cancer: machine learning techniques reveal promising results
Fang Lu, Ting Lei, Jie Zhou, Hao Liang, Ping Cui, Taiping Zuo, Li Ye, Hui Chen, Jiegang Huang
Journal of Medical Microbiology, vol. 72, no. 6, published 2023-06-01. DOI: 10.1099/jmm.0.001699 (https://doi.org/10.1099/jmm.0.001699)
Impact factor: 2.4; JCR: Q3 (Microbiology)
Abstract
Introduction. Increasing evidence suggests a correlation between gut microbiota and colorectal cancer (CRC).

Hypothesis/Gap Statement. However, few studies have used gut microbiota as a diagnostic biomarker for CRC.

Aim. The objective of this study was to explore whether a machine learning (ML) model based on gut microbiota could be used to diagnose CRC and identify key biomarkers in the model.

Methodology. We sequenced the 16S rRNA gene from faecal samples of 38 participants, including 17 healthy subjects and 21 CRC patients. Eight supervised ML algorithms were used to diagnose CRC based on faecal microbiota operational taxonomic units (OTUs), and the models were evaluated in terms of identification, calibration and clinical practicality for optimal modelling parameters. Finally, the key gut microbiota was identified using the random forest (RF) algorithm.

Results. We found that CRC was associated with the dysregulation of gut microbiota. Through a comprehensive evaluation of supervised ML algorithms, we found that different algorithms had significantly different prediction performance using faecal microbiomes. Different data screening methods played an important role in optimization of the prediction models. We found that naïve Bayes algorithms [NB, accuracy=0.917, area under the curve (AUC)=0.926], RF (accuracy=0.750, AUC=0.926) and logistic regression (LR, accuracy=0.750, AUC=0.889) had high predictive potential for CRC. Furthermore, important features in the model, namely s__metagenome_g__Lachnospiraceae_ND3007_group (AUC=0.814), s__Escherichia_coli_g__Escherichia-Shigella (AUC=0.784) and s__unclassified_g__Prevotella (AUC=0.750), could each be used as diagnostic biomarkers of CRC.

Conclusions. Our results suggested an association between gut microbiota dysregulation and CRC, and demonstrated the feasibility of the gut microbiota to diagnose cancer. The bacteria s__metagenome_g__Lachnospiraceae_ND3007_group, s__Escherichia_coli_g__Escherichia-Shigella and s__unclassified_g__Prevotella were key biomarkers for CRC.
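
The abstract reports two metrics, accuracy and AUC, both for full models and for single taxa used as stand-alone biomarkers. As a rough illustration of how such numbers can be obtained, the sketch below computes the AUC of one taxon's relative abundance with the rank-based (Mann-Whitney) formulation and the accuracy of a simple abundance cut-off. It is not the authors' pipeline: the function names, the toy abundances, and the 0.10 threshold are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win, which makes this equal to the Mann-Whitney U
    statistic divided by (number of positives * number of negatives).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical toy data, NOT taken from the study: 1 = CRC, 0 = healthy.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Made-up relative abundances of a single taxon, one value per sample.
abundance = [0.30, 0.22, 0.15, 0.05, 0.12, 0.08, 0.04, 0.02]

# Threshold-free discrimination of the taxon on its own.
taxon_auc = auc_score(labels, abundance)

# Accuracy depends on a decision threshold; 0.10 is an arbitrary assumption.
threshold = 0.10
predictions = [1 if a > threshold else 0 for a in abundance]
acc = accuracy(labels, predictions)

print(f"AUC = {taxon_auc:.3f}, accuracy = {acc:.3f}")
```

On this toy data the two metrics differ, much as they do for RF and LR in the abstract: AUC summarises ranking quality over all possible thresholds, while accuracy reflects one chosen cut-off. A taxon that is depleted rather than enriched in CRC would give an AUC below 0.5 under this convention, so its scores would need to be negated before comparison.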
About the journal:
Journal of Medical Microbiology provides comprehensive coverage of medical, dental and veterinary microbiology, and infectious diseases. We welcome everything from laboratory research to clinical trials, including bacteriology, virology, mycology and parasitology. We publish articles under the following subject categories: Antimicrobial resistance; Clinical microbiology; Disease, diagnosis and diagnostics; Medical mycology; Molecular and microbial epidemiology; Microbiome and microbial ecology in health; One Health; Pathogenesis, virulence and host response; Prevention, therapy and therapeutics.