Machine learning models integrating dietary data predict all-cause mortality in U.S. NAFLD patients: an NHANES-based study.

IF 3.8; Zone 2 (Medicine); JCR Q1 (NUTRITION & DIETETICS)
Pinchu Chen, Yao Li, Chenfenglin Yang, Qifan Zhang
Nutrition Journal, 24(1), 100. Published 2025-07-01. Journal Article.
DOI: 10.1186/s12937-025-01170-0 (https://doi.org/10.1186/s12937-025-01170-0)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12220616/pdf/

Abstract

Background: Non-alcoholic fatty liver disease (NAFLD) is a leading cause of chronic liver disease and is closely associated with metabolic abnormalities and unhealthy lifestyle habits. Although diet plays a critical role in disease progression, most existing prognostic models for NAFLD do not incorporate dietary factors. This study integrates demographic, serological, and nutritional data to develop machine learning models that predict all-cause mortality risk in NAFLD patients, with particular emphasis on dietary interventions.

Methods: Data from the National Health and Nutrition Examination Survey (NHANES) 2007-2018, comprising 2,589 NAFLD participants, were analyzed. Variables associated with survival outcomes were selected using LASSO-Cox regression. Five machine learning models were developed: Random Survival Forest (RSF), Gradient Boosting Machine (GBM), CoxBoost, Survival Support Vector Machine (SurvivalSVM), and eXtreme Gradient Boosting (XGBoost). Their performance was evaluated using time-dependent AUC, ROC curves, C-index, Brier score, and Kaplan-Meier analysis. SHAP values were used for model interpretability.
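
The abstract does not say which software or which estimator was used for the C-index. As an illustration only, the following is a minimal sketch of Harrell's concordance index for right-censored data in plain Python. The function name `harrell_c_index` and the toy data are hypothetical and are not taken from the study. A pair of subjects counts as comparable when the one with the shorter follow-up time had an observed death; the pair is concordant when that subject also received the higher risk score.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[int],
                    risk_scores: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored survival data.

    times       -- follow-up time for each subject
    events      -- 1 if death was observed, 0 if the subject was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:  # subject i died before subject j's last follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up in years, event flag, predicted risk.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.3, 0.2, 0.4, 0.1]
print(harrell_c_index(toy_times, toy_events, toy_risk))
```

In the toy data, six of the seven comparable pairs are concordant. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.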

Results: LASSO-Cox regression identified 13 significant variables, including age, household income, blood glucose, sedentary behavior, and dietary fiber intake. The GBM and RSF models demonstrated strong predictive performance, with AUC values around 0.8 for both 5- and 10-year survival predictions, and also performed well in terms of C-index and Brier score. SHAP analysis revealed that advanced age, low household income, hyperglycemia, and sedentary behavior were associated with poor prognosis, whereas higher dietary fiber intake was linked to improved survival.
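
To make the 5- and 10-year AUC figures concrete, here is a simplified sketch of an AUC evaluated at a fixed time horizon. It is not the estimator used in the study, which the abstract does not specify. It assumes a cumulative/dynamic definition (cases died on or before the horizon, controls were still under follow-up after it) and simply drops subjects censored before the horizon; published analyses usually correct for censoring with inverse-probability-of-censoring weights instead. The function name `naive_auc_at_horizon` and the toy data are hypothetical.

```python
from typing import Sequence


def naive_auc_at_horizon(times: Sequence[float],
                         events: Sequence[int],
                         risk_scores: Sequence[float],
                         horizon: float) -> float:
    """Unweighted cumulative/dynamic AUC at a single time horizon.

    Cases    -- subjects with an observed death at or before `horizon`
    Controls -- subjects whose follow-up extends beyond `horizon`
    Subjects censored at or before `horizon` are excluded (a simplification).
    """
    cases = [r for t, e, r in zip(times, events, risk_scores)
             if t <= horizon and e]
    controls = [r for t, r in zip(times, risk_scores) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                wins += 1.0
            elif case_risk == control_risk:
                wins += 0.5  # tied scores count as half
    return wins / (len(cases) * len(controls))


# Toy data (hypothetical): follow-up in years, event flag, predicted risk.
toy_times = [1, 3, 4, 6, 7, 9]
toy_events = [1, 0, 1, 1, 0, 0]
toy_risk = [0.8, 0.5, 0.3, 0.6, 0.4, 0.2]
print(naive_auc_at_horizon(toy_times, toy_events, toy_risk, horizon=5))
```

With the toy data, two subjects are cases at 5 years, three are controls, and one is excluded because it was censored at year 3. Dropping such subjects can bias the estimate when censoring is common, which is why the weighted estimators are preferred in practice.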

Conclusions: This study integrates dietary data into machine learning models, demonstrating the potential for predicting all-cause mortality in NAFLD patients. The models, particularly RSF and GBM, show robust predictive accuracy, with dietary fiber intake consistently exhibiting a protective effect on survival outcomes. These findings suggest that dietary interventions, such as increasing dietary fiber intake, could improve the long-term prognosis of NAFLD patients.

Clinical trial number: Not applicable.

Source journal: Nutrition Journal (NUTRITION & DIETETICS)
CiteScore: 9.80
Self-citation rate: 0.00%
Articles published: 68
Review time: 4-8 weeks
Journal description: Nutrition Journal publishes surveillance, epidemiologic, and intervention research that sheds light on i) influences (e.g., familial, environmental) on eating patterns; ii) associations between eating patterns and health; and iii) strategies to improve eating patterns among populations. The journal also welcomes manuscripts reporting on the psychometric properties (e.g., validity, reliability) and feasibility of methods (e.g., for assessing dietary intake) for human nutrition research. In addition, study protocols for controlled trials and cohort studies, with an emphasis on methods for assessing dietary exposures and outcomes as well as intervention components, will be considered. Manuscripts that consider eating patterns holistically, as opposed to solely reductionist approaches that focus on specific dietary components in isolation, are encouraged. Also encouraged are papers that take a holistic or systems perspective in attempting to understand possible compensatory and differential effects of nutrition interventions. The journal does not consider animal studies. In addition to the influence of eating patterns on human health, the journal invites research providing insights into the environmental sustainability of dietary practices. Again, a holistic perspective is encouraged, for example, through the consideration of how eating patterns might maximize both human and planetary health.