Multiomics Analysis and Machine Learning-based Identification of Molecular Signatures for Diagnostic Classification in Liver Disease Types Along the Microbiota-gut-liver Axis
Authors: Betul Comertpay, Esra Gov
Journal: Journal of Clinical and Experimental Hepatology, vol. 15, no. 5, Article 102552
Published: 2025-03-24
DOI: 10.1016/j.jceh.2025.102552
URL: https://www.sciencedirect.com/science/article/pii/S0973688325000520
Citations: 0
Abstract
Background
Liver disease, responsible for around two million deaths annually, remains a pressing global health challenge. Microbial interactions within the microbiota–gut–liver axis play a substantial role in the pathogenesis of various liver conditions, including early chronic liver disease (eCLD), chronic liver disease (CLD), acute liver failure (ALF), acute-on-chronic liver failure (ACLF), non-alcoholic fatty liver disease (NAFLD), steatohepatitis, and cirrhosis. This study aimed to identify key molecular signatures involved in liver disease progression by analyzing transcriptomic and gut microbiome data, and to evaluate their diagnostic utility using machine learning models.
Methods
Transcriptomic analysis identified differentially expressed genes (DEGs) that, when integrated with regulatory elements (including microRNAs, transcription factors, and receptors) and with the gut microbiome, highlight disease-specific molecular interactions. To assess the diagnostic potential of these molecular signatures, a two-step analysis involving principal component analysis (PCA) and Random Forest classification was conducted, achieving accuracies of 75% for ALF and 89% for NAFLD. Additionally, machine learning algorithms, including K-neighbors, multi-layer perceptron (MLP), decision tree, Random Forest, logistic regression, gradient boosting, CatBoost, Extreme Gradient Boosting (XGB), and Light Gradient Boosting Machine (LGBM), were applied to gene expression data for ALF and NAFLD.
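As an illustration of the two-step PCA and Random Forest analysis, the following is a minimal sketch using scikit-learn on synthetic data. It is not the authors' pipeline: the abstract does not state the number of principal components, the validation scheme, or any hyperparameters, so `n_components=5`, the 5-fold stratified cross-validation, the 100 trees, and the helper names `build_pca_rf` and `make_synthetic_expression` are all assumptions made for this example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_pca_rf(n_components=5, seed=0):
    """Scale features, project onto principal components, then classify.

    All settings here are assumed values, not taken from the study.
    """
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=seed),
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )


def make_synthetic_expression(n_per_class=30, n_genes=50, n_signal=10, seed=0):
    """Hypothetical stand-in for a samples-by-genes expression matrix."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(2 * n_per_class, n_genes))
    y = np.repeat([0, 1], n_per_class)  # 0 = control, 1 = disease
    X[y == 1, :n_signal] += 2.0         # shift a few "signature" genes
    return X, y


X, y = make_synthetic_expression()

# Scaling and PCA sit inside the pipeline, so they are refit on each
# training fold and the held-out fold never influences the projection.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(build_pca_rf(), X, y, cv=cv, scoring="accuracy")
mean_accuracy = float(scores.mean())
print(f"mean cross-validated accuracy: {mean_accuracy:.2f}")
```

Wrapping the scaler and PCA in the same pipeline as the classifier is a design choice for this sketch: it keeps the dimensionality reduction from seeing test samples, which matters when sample counts are small relative to the number of genes.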
Results
Key genes, including CLDN14, EGFR, GSK3B, MYC, and TJP2, together with the regulatory miRNAs let-7a-5p, miR-124-3p, and miR-195-5p and the transcription factors NFKB1 and SP1, may be critical to liver disease progression. Additionally, the gut microbiota members Dictyostelium discoideum and Eikenella might be novel candidates associated with liver disease, highlighting the importance of the gut-liver axis. The Random Forest model reached 75% accuracy and an 83% area under the curve for ALF, while NAFLD classification achieved 100% accuracy, precision, recall, and area under the curve, underscoring robust diagnostic potential.
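For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, recall, and area under the ROC curve are defined for a binary classifier. The labels and scores are toy values invented for this example and do not reproduce any result from the study; the function names and the 0.5 decision threshold are likewise assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: five disease samples (1) and five controls (0).
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.95, 0.85, 0.70, 0.45, 0.30, 0.60, 0.40, 0.35, 0.20, 0.10]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)   # 7 of 10 correct -> 0.70
precision = tp / (tp + fp)           # 3 of 4 predicted positives -> 0.75
recall = tp / (tp + fn)              # 3 of 5 true positives -> 0.60
auc = roc_auc(y_true, scores)        # 21 of 25 pairs ranked correctly -> 0.84
```

Accuracy, precision, and recall depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why a model can report an AUC that differs from its accuracy.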
Conclusion
This study establishes a solid foundation for further research and therapeutic advancement by identifying key biomolecules and pathways critical to liver disease. Additional experimental validation is needed to confirm clinical applicability.