MicroAIbiome: Decoding Cancer Types from Microbial Profiles Using Explainable Machine Learning

IF 4.2 | Zone 2 (Biology) | JCR Q2 (Microbiology)
Md Motiur Rahman, Shiva Shokouhmand, Saeka Rahman, Nafisa Nawar Tamzi, Smriti Bhatt, Miad Faezipour
Journal: Microorganisms, vol. 13, no. 9. Published: 2025-09-21. Publication type: Journal Article.
DOI: https://doi.org/10.3390/microorganisms13092210
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12473104/pdf/
Citations: 0

Abstract

Microbial communities within human tissues are increasingly recognized as promising biomarkers for cancer detection. However, using microbiome data for multiclass cancer classification remains challenging because the data are compositional and high-dimensional and the resulting models are often hard to interpret. In this study, we address these challenges by introducing MicroAIbiome, a machine learning-based artificial intelligence (AI) pipeline that classifies five cancer types from genus-level microbial relative abundances: esophageal carcinoma (ESCA), head and neck squamous cell carcinoma (HNSC), stomach adenocarcinoma (STAD), colon adenocarcinoma (COAD), and rectum adenocarcinoma (READ). The pipeline combines zero replacement, centered log-ratio (CLR) transformation, correlation filtering, and recursive feature elimination (RFE) to enable robust learning from compositional data. Among the five classifiers evaluated, XGBoost achieved the highest accuracy, 78.23%, outperforming prior work. We further improve interpretability with SHapley Additive exPlanations (SHAP)-based feature attribution, which uncovers class-specific microbial signatures such as Corynebacterium in ESCA and Bacteroides in COAD. Our results highlight the importance of compositional preprocessing and explainable AI in advancing microbiome-based cancer diagnostics.
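
The preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the zero-replacement method, so a simple pseudocount with renormalization is assumed here, and the pseudocount value, the 0.9 correlation threshold, the greedy filtering order, and the toy abundance table are all hypothetical. RFE, XGBoost training, and SHAP attribution are left out.

```python
import math


def replace_zeros(sample, pseudocount=1e-6):
    """Swap zeros for a small pseudocount, then renormalize so parts sum to 1.

    Assumption: simple additive replacement; the abstract does not say
    which zero-replacement method the pipeline actually uses.
    """
    filled = [x if x > 0 else pseudocount for x in sample]
    total = sum(filled)
    return [x / total for x in filled]


def clr(sample):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(x) for x in sample]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def correlation_filter(rows, threshold=0.9):
    """Greedily keep columns whose |r| with every already-kept column is <= threshold.

    Returns the indices of the kept feature columns.
    """
    columns = list(zip(*rows))
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) <= threshold for k in kept):
            kept.append(j)
    return kept


# Toy relative abundances: 4 samples x 4 hypothetical genera (rows sum to 1).
abundances = [
    [0.50, 0.30, 0.20, 0.00],
    [0.10, 0.60, 0.25, 0.05],
    [0.40, 0.00, 0.35, 0.25],
    [0.25, 0.25, 0.10, 0.40],
]

clr_rows = [clr(replace_zeros(sample)) for sample in abundances]
kept_columns = correlation_filter(clr_rows, threshold=0.9)
print("kept feature columns:", kept_columns)
```

Each CLR-transformed row sums to zero, which removes the unit-sum constraint of relative abundances before a classifier sees the data. In a full pipeline the kept columns would then go to a feature-selection step such as RFE and on to the classifier.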

Source journal: Microorganisms
Subject area: Medicine - Microbiology (medical)
CiteScore: 7.40
Self-citation rate: 6.70%
Articles published: 2168
Review time: 20.03 days
Journal description: Microorganisms (ISSN 2076-2607) is an international, peer-reviewed, open access journal that provides an advanced forum for studies related to prokaryotic and eukaryotic microorganisms, viruses, and prions. It publishes reviews, research papers, and communications. The journal aims to encourage scientists to publish their experimental and theoretical results in as much detail as possible, and there is no restriction on the length of papers. Full experimental details must be provided so that the results can be reproduced. Electronic files and software describing the full details of a calculation or experimental procedure can be deposited as supplementary electronic material if they cannot be published in the normal way.