Predicting the onset of overweight in Chinese high school students: a machine-learning approach in a one-year prospective cohort study.

IF 3.7, Region 3 (Medicine), JCR Q2 (Medicine)
Endocrine Pub Date : 2024-11-01 Epub Date: 2024-06-10 DOI:10.1007/s12020-024-03902-4
Zikang Zhang, Wei Peng, Shaoming Sun, Jianguo Ma, Yining Sun, Fangwen Zhang
Endocrine, pp. 600-611. Publication type: Journal Article.
Citations: 0

Abstract

Objective: This study aimed to develop and evaluate machine-learning models for predicting the onset of overweight in adolescents aged 14‒17, utilizing easily collectible personal information.

Methods: This was a one-year prospective cohort study. Baseline data were collected through anthropometric measurements and questionnaires, and the incidence of overweight was determined one year later from repeat anthropometric measurements. Predictive factors were selected through univariate analysis. Six machine-learning models were developed to predict the onset of overweight. SHapley Additive exPlanations (SHAP) was used for global and local interpretation of the models.

Results: Out of 1,241 adolescents, 204 (16.4%) were identified as overweight after one year. Nineteen features were associated with the incidence of overweight in univariable analysis. Participants were randomly divided into a training group and a testing group in a 7:3 ratio. The Light Gradient Boosting Machine (LGBM) algorithm outperformed the other models, achieving the following metrics: Accuracy (0.956), Recall (0.812), Specificity (0.983), F1-score (0.855), AUC (0.961). Importance ranking showed that a minimal set of the top 11 features kept model performance stable.
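The threshold-based metrics reported above (Accuracy, Recall, Specificity, F1-score) all derive from the four cells of a confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The abstract does not report the confusion matrix, so the counts in the example are purely illustrative, and the names `Confusion` and `classification_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix cells for a binary classifier (positive = overweight)."""
    tp: int  # overweight, predicted overweight
    fp: int  # not overweight, predicted overweight
    tn: int  # not overweight, predicted not overweight
    fn: int  # overweight, predicted not overweight


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: Confusion) -> dict:
    """Compute the standard threshold-based metrics from confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    recall = _ratio(c.tp, c.tp + c.fn)        # sensitivity
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)
    f1 = _ratio(2 * precision * recall, precision + recall)
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Illustrative counts only; these are NOT taken from the study.
example = classification_metrics(Confusion(tp=8, fp=5, tn=85, fn=2))
```

With an outcome prevalence near 16%, accuracy alone can look high even for a weak model, which is one reason to read Recall and F1-score alongside it. AUC is not covered here because it requires the predicted scores rather than a single confusion matrix.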

Conclusions: The onset of overweight in adolescents was accurately predicted using easily collectible personal information. The LGBM-based model exhibited superior performance. The oversampling technique notably improved model performance. The model interpretation technique provided innovative strategies for managing adolescent overweight/obesity.
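The abstract does not name the oversampling technique that was used. As a hedged illustration of the general idea, the sketch below performs a 7:3 random split and then applies plain random oversampling (duplicating minority-class rows with replacement) to the training portion only, so the test set keeps its natural class balance. The cohort size (1,241) and number of overweight cases (204) come from the abstract; the row layout and the function names `split_train_test` and `random_oversample` are assumptions made for this example.

```python
import random


def split_train_test(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and testing groups."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def random_oversample(rows, label_of, seed=0):
    """Duplicate rows of smaller classes (with replacement) up to the largest class size."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw the shortfall from the same class; k is 0 for the largest class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Placeholder rows of the form (participant_id, label); label 1 = overweight at follow-up.
# Only the totals (1,241 participants, 204 cases) are taken from the abstract.
cohort = [(i, 1 if i < 204 else 0) for i in range(1241)]

train, test = split_train_test(cohort, train_fraction=0.7, seed=42)
balanced_train = random_oversample(train, label_of=lambda row: row[1], seed=42)
```

Oversampling before the split would let copies of the same participant appear in both groups and inflate the test metrics, which is why the sketch balances only `train`.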


Source journal: Endocrine (Medicine: Endocrinology and Metabolism)
CiteScore: 6.40
Self-citation rate: 5.40%
Articles published: 0
Journal description: Well-established as a major journal in today’s rapidly advancing experimental and clinical research areas, Endocrine publishes original articles devoted to basic (including molecular, cellular and physiological studies), translational and clinical research in all the different fields of endocrinology and metabolism. Articles will be accepted based on peer review, priority, and editorial decision. Invited reviews, mini-reviews and viewpoints on relevant pathophysiological and clinical topics, as well as Editorials on articles appearing in the Journal, are published. Unsolicited Editorials will be evaluated by the editorial team. Outcomes of scientific meetings, as well as guidelines and position statements, may be submitted. The Journal also considers special feature articles in the field of endocrine genetics and epigenetics, as well as articles devoted to novel methods and techniques in endocrinology. Endocrine covers controversial clinical endocrine issues. Meta-analyses on endocrine and metabolic topics are also accepted. Descriptions of single clinical cases and/or small patient studies are not published unless of exceptional interest. However, reports of novel imaging studies and endocrine side effects in single patients may be considered. Research letters and letters to the editor related or unrelated to recently published articles can be submitted. Endocrine covers leading topics in endocrinology such as neuroendocrinology, pituitary and hypothalamic peptides, thyroid physiological and clinical aspects, bone and mineral metabolism and osteoporosis, obesity, lipid and energy metabolism and food intake control, insulin, Type 1 and Type 2 diabetes, hormones of male and female reproduction, adrenal diseases, pediatric and geriatric endocrinology, endocrine hypertension and endocrine oncology.