Binaya Dhakal, Lakshmi Sai Kishore Savarapu, Khaled Sayed
Journal of Microbiological Methods, volume 238, Article 107267. Published 2025-09-17.
DOI: 10.1016/j.mimet.2025.107267
https://www.sciencedirect.com/science/article/pii/S0167701225001836
Integrating multi-compartment microbiome data with clinical parameters enhances mortality prediction using autoencoder
The human microbiome, a complex ecosystem of microorganisms residing in different body compartments, significantly influences health outcomes and disease progression. However, leveraging these data to develop clinical prediction models remains challenging because of their high dimensionality, sparsity, and compositional nature. Traditional machine learning approaches often struggle to capture the intricate microbial interactions that contribute to mortality risk, particularly when analyzing data across multiple compartments with distinct microbial compositions. To address these limitations, we introduce a novel framework utilizing an autoencoder-based model trained on high-dimensional microbiome data collected from oral, lung, and gut compartments. Our approach encodes microbiome data into a low-dimensional latent space while preserving essential microbial community characteristics, enabling more effective feature extraction and pattern recognition than conventional dimensionality reduction techniques. Through systematic evaluation of three data configurations (microbiome taxa only, clinical data only, and an integrated model combining both), we demonstrated that the integrated approach consistently achieved superior prediction accuracy (98% in lung microbiome) compared to using either data source independently. Clinical data alone provided reasonable but inconsistent performance (70–90%), while microbiome taxa alone yielded the weakest results (53–65%). Furthermore, our investigation of preprocessing techniques revealed that applying z-score normalization to the taxa data significantly enhanced performance and substantially improved recall metrics across all compartments. By analyzing compartment-specific microbial contributions, our study reveals distinct predictive roles of the oral and lung microbiomes compared to the gut microbiome, underscoring the importance of body-site specificity in microbiome-based predictive modeling.
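The pipeline the abstract describes (z-score the taxa, compress them into a latent space with an autoencoder, then combine the latent features with clinical variables) can be sketched roughly as follows. This is a minimal illustration and not the authors' model: the abstract does not give the architecture, so a single-layer linear autoencoder trained by plain gradient descent is assumed here, the data are synthetic, and every name (`zscore`, `train_linear_autoencoder`, `build_integrated_features`) and hyperparameter is hypothetical.

```python
import numpy as np


def zscore(x, eps=1e-8):
    """Standardize each column (one taxon per column) to mean 0 and std 1."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + eps)


def train_linear_autoencoder(x, latent_dim, lr=0.2, epochs=500, seed=0):
    """Fit a linear encoder/decoder pair by gradient descent on the MSE.

    Assumption: a linear, single-layer autoencoder stands in for whatever
    architecture the paper actually uses.
    """
    rng = np.random.default_rng(seed)
    n_features = x.shape[1]
    w_enc = rng.normal(scale=0.1, size=(n_features, latent_dim))
    w_dec = rng.normal(scale=0.1, size=(latent_dim, n_features))
    losses = []
    for _ in range(epochs):
        z = x @ w_enc                # encode into the latent space
        err = z @ w_dec - x          # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the mean squared error w.r.t. both weight matrices.
        grad_out = 2.0 * err / err.size
        grad_dec = z.T @ grad_out
        grad_enc = x.T @ (grad_out @ w_dec.T)
        w_dec -= lr * grad_dec
        w_enc -= lr * grad_enc
    return w_enc, w_dec, losses


def build_integrated_features(taxa, clinical, latent_dim):
    """Concatenate latent taxa features with the clinical variables."""
    taxa_z = zscore(taxa)
    w_enc, _, losses = train_linear_autoencoder(taxa_z, latent_dim)
    latent = taxa_z @ w_enc
    return np.hstack([latent, clinical]), losses


if __name__ == "__main__":
    # Synthetic stand-in data: 40 samples, 20 taxa counts, 3 clinical values.
    rng = np.random.default_rng(1)
    taxa = rng.poisson(2.0, size=(40, 20)).astype(float)
    clinical = rng.normal(size=(40, 3))
    features, losses = build_integrated_features(taxa, clinical, latent_dim=4)
    print(features.shape, losses[0], losses[-1])
```

The resulting feature matrix could then be passed to any classifier for mortality prediction; the abstract does not say which classifier the authors used, so none is shown here.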
About the journal:
The Journal of Microbiological Methods publishes scholarly and original articles, notes and review articles. These articles must include novel and/or state-of-the-art methods, or significant improvements to existing methods. Novel and innovative applications of current methods that are validated and useful will also be published. JMM strives for scholarship, innovation and excellence. This demands scientific rigour, the best available methods and technologies, correctly replicated experiments/tests, the inclusion of proper controls, calibrations, and the correct statistical analysis. The presentation of the data must support the interpretation of the method/approach.
All aspects of microbiology are covered, except virology. These include agricultural microbiology, applied and environmental microbiology, bioassays, bioinformatics, biotechnology, biochemical microbiology, clinical microbiology, diagnostics, food monitoring and quality control microbiology, microbial genetics and genomics, geomicrobiology, microbiome methods regardless of habitat, high-throughput sequencing methods and analysis, microbial pathogenesis and host responses, metabolomics, metagenomics, metaproteomics, microbial ecology and diversity, microbial physiology, microbial ultra-structure, microscopic and imaging methods, molecular microbiology, mycology, novel mathematical microbiology and modelling, parasitology, plant-microbe interactions, protein markers/profiles, proteomics, pyrosequencing, public health microbiology, radioisotopes applied to microbiology, robotics applied to microbiological methods, rumen microbiology, microbiological methods for space missions and extreme environments, sampling methods and samplers, soil and sediment microbiology, transcriptomics, veterinary microbiology, sero-diagnostics and typing/identification.