XGBoost- and Mass Spectrometry-Based Feature Selection for Identifying Metabolic Biomarkers Associated with HBV-Related Liver Disease Progression and Hepatocellular Carcinoma Treatment.
Shao-Hua Li, Ming Song, Peng Wang, Tian-Shun Kou, Xuan-Xian Peng, Hua Ye, Hui Li
Journal of Proteome Research. Published 2025-10-15. DOI: 10.1021/acs.jproteome.5c00540 (https://doi.org/10.1021/acs.jproteome.5c00540)
Abstract
XGBoost, a gradient boosting algorithm, is widely recognized for its efficiency and robustness in multiclass classification tasks. Metabolomics is a powerful tool for biomarker discovery; however, the metabolic biomarkers associated with progression from chronic hepatitis B (CHB) to liver cirrhosis (LC) and then to hepatocellular carcinoma (HCC), as well as those related to treatment effects in HCC (HCCAT), remain unclear. In this study, an XGBoost-based machine learning approach combined with mass spectrometry was used to analyze the metabolic profiles of 30 healthy controls (HC), 29 CHB patients, 30 LC patients, 30 HCC patients, and 30 HCCAT patients. Biomarker screening was conducted through three comparative analyses: (1) HC, CHB, LC, HCC, and HCCAT; (2) HC, CHB, LC, and HCC; and (3) HC, HCC, and HCCAT. A total of 17 metabolic biomarkers were identified, nine of which had not previously been associated with HBV-related liver diseases. Notably, a potential biomarker panel composed of eicosenoic acid, dihydromorphine, cysteine, acetic acid, sitosterol, and hypoxanthine showed promise for disease prognosis and therapeutic evaluation. These findings highlight the potential of integrating metabolomics with machine learning to identify novel metabolic biomarkers related to HBV-associated liver disease progression and treatment response.
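The three comparative analyses differ in both the number of classes and the number of samples available to the classifier. The following is a minimal sketch of that study design, using only the cohort sizes stated in the abstract. The names `COHORTS`, `COMPARISONS`, and `summarize` are hypothetical, and the majority-class baseline is an illustrative reference point added here, not a metric reported by the authors.

```python
# Cohort sizes as reported in the abstract.
COHORTS = {"HC": 30, "CHB": 29, "LC": 30, "HCC": 30, "HCCAT": 30}

# The three comparative analyses listed in the abstract.
COMPARISONS = {
    1: ["HC", "CHB", "LC", "HCC", "HCCAT"],
    2: ["HC", "CHB", "LC", "HCC"],
    3: ["HC", "HCC", "HCCAT"],
}


def summarize(groups):
    """Return class count, sample count, and majority-class baseline accuracy.

    The baseline is the accuracy obtained by always predicting the largest
    class; it is an assumption added for illustration, not from the paper.
    """
    sizes = [COHORTS[g] for g in groups]
    total = sum(sizes)
    return {
        "n_classes": len(groups),
        "n_samples": total,
        "majority_baseline": max(sizes) / total,
    }


SUMMARY = {k: summarize(v) for k, v in COMPARISONS.items()}

for k, s in SUMMARY.items():
    print(f"comparison {k}: {s['n_classes']} classes, "
          f"{s['n_samples']} samples, "
          f"baseline accuracy {s['majority_baseline']:.3f}")
```

Under these assumptions, the comparisons involve 5, 4, and 3 classes over 149, 119, and 90 samples respectively. In a multiclass XGBoost setup, the class count would typically be passed as the `num_class` parameter, and any reported accuracy would need to exceed the corresponding baseline to be informative.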
About the journal:
Journal of Proteome Research publishes content encompassing all aspects of global protein analysis and function, including the dynamic aspects of genomics, spatio-temporal proteomics, metabonomics and metabolomics, clinical and agricultural proteomics, as well as advances in methodology including bioinformatics. The theme and emphasis is on a multidisciplinary approach to the life sciences through the synergy between the different types of "omics".