Identifying individuals at risk for weight gain using machine learning in electronic medical records from the United States

Impact factor 5.4; Zone 2 (Medicine); JCR Q1 (Endocrinology & Metabolism)
Casey Choong PhD, Neena Xavier MD, Beverly Falcon PhD, Hong Kan PhD, Ilya Lipkovich PhD, Callie Nowak MPH, Margaret Hoyt PhD, Christy Houle PhD, Scott Kahan MD
Diabetes, Obesity & Metabolism, vol. 27, no. 6, pp. 3061-3071. Journal Article. Published 2025-03-11. DOI: 10.1111/dom.16311
https://onlinelibrary.wiley.com/doi/10.1111/dom.16311
Citations: 0

Abstract


Aims

Numerous risk factors for the development of obesity have been identified, yet the aetiology is not well understood. Traditional statistical methods for analysing observational data are limited by the volume and characteristics of large datasets. Machine learning (ML) methods can analyse large datasets to extract novel insights on risk factors for obesity. This study predicted adults at risk of a ≥10% increase in index body mass index (BMI) within 12 months using ML and a large electronic medical records (EMR) database.
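The outcome named above, a ≥10% increase over index BMI, can be sketched as a simple labelling rule. This is a minimal illustration only: the abstract does not say which follow-up measurement within the 12-month window was used, so the function names and the single `followup_bmi` argument are assumptions.

```python
def pct_bmi_change(index_bmi: float, followup_bmi: float) -> float:
    """Relative change in BMI from the index measurement (0.10 means +10%)."""
    if index_bmi <= 0:
        raise ValueError("index_bmi must be positive")
    return (followup_bmi - index_bmi) / index_bmi


def gained_at_least(index_bmi: float, followup_bmi: float,
                    threshold: float = 0.10) -> bool:
    """True when BMI rose by at least `threshold` relative to the index value.

    Hypothetical helper: how the study chose the follow-up value is not
    stated in the abstract.
    """
    return pct_bmi_change(index_bmi, followup_bmi) >= threshold
```

For example, a patient moving from a BMI of 25 to 28 (a 12% rise) would be labelled positive, while 25 to 26 (4%) would not.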

Materials and Methods

ML algorithms were used with EMR from Optum's de-identified Market Clarity Data, a US database. Models included extreme gradient boosting (XGBoost), random forest, simple logistic regression (no feature selection procedure) and two penalised logistic models (Elastic Net and Least Absolute Shrinkage and Selection Operator [LASSO]). Performance metrics included the area under the curve (AUC) of the receiver operating characteristic curve (used to determine the best-performing model), average precision, Brier score, accuracy, recall, positive predictive value, Youden index, F1 score, negative predictive value and specificity.
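Most of the listed metrics follow directly from a confusion matrix at a chosen probability cutoff, and the Brier score is the mean squared error of the predicted probabilities. The sketch below uses their standard definitions in plain Python. The function name and the 0.5 cutoff are assumptions and are not taken from the study; AUC and average precision are left out here.

```python
def threshold_metrics(y_true, y_prob, cutoff=0.5):
    """Confusion-matrix metrics at `cutoff`, plus the Brier score.

    y_true: iterable of 0/1 labels; y_prob: predicted probabilities.
    The cutoff is an illustrative choice, not the study's.
    """
    y_true, y_prob = list(y_true), list(y_prob)
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")

    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    recall = ratio(tp, tp + fn)          # sensitivity
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "recall": recall,
        "specificity": specificity,
        "ppv": ratio(tp, tp + fp),       # positive predictive value
        "npv": ratio(tn, tn + fn),       # negative predictive value
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "youden": recall + specificity - 1,
        "brier": sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true),
    }
```

Unlike the other entries, the Brier score does not depend on the cutoff, so it reflects calibration rather than classification at one threshold.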

Results

The XGBoost model performed best 12 months post-index, with an AUC of 0.75. Lower baseline BMI, having any emergency room visit during the study period, no diabetes mellitus, no lipid disorders and younger age were among the top predictors for ≥10% increase in index BMI.
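One way to read an AUC of 0.75: it is the probability that a randomly chosen person who gained weight receives a higher predicted score than a randomly chosen person who did not, with ties counted as half. The pairwise sketch below follows that definition directly. It is illustrative only, the function name is an assumption, and its quadratic loop is meant for small examples rather than EMR-scale data.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A model that ranks every positive above every negative scores 1.0, and one that assigns the same score to everyone scores 0.5.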

Conclusion

The current study demonstrates an ML approach applied to EMR to identify those at risk for weight gain over 12 months. Providers may use this risk stratification to prioritise prevention strategies or earlier obesity intervention.
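As a rough illustration of how such risk stratification might be used, predicted probabilities could be ranked and the highest-risk share of patients flagged for follow-up. The abstract does not describe any such rule, so the function name, the patient identifiers and the 10% default fraction below are all hypothetical.

```python
import math


def top_risk_patients(risk_by_patient: dict[str, float],
                      fraction: float = 0.10) -> list[str]:
    """Return patient IDs in the top `fraction` of predicted risk, highest first.

    Hypothetical prioritisation rule; the fraction is an arbitrary example.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(risk_by_patient, key=risk_by_patient.get, reverse=True)
    # Always flag at least one patient when any are present.
    k = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:k]
```

In practice the share flagged would depend on clinical capacity and on the precision the model achieves at that depth of the ranking.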

Source journal
Diabetes, Obesity & Metabolism (Medicine: Endocrinology & Metabolism)
CiteScore: 10.90
Self-citation rate: 6.90%
Articles published: 319
Review time: 3-8 weeks
About the journal: Diabetes, Obesity and Metabolism is primarily a journal of clinical and experimental pharmacology and therapeutics covering the interrelated areas of diabetes, obesity and metabolism. The journal prioritises high-quality original research that reports on the effects of new or existing therapies, including dietary, exercise and lifestyle (non-pharmacological) interventions, in any aspect of metabolic and endocrine disease, either in humans or animal and cellular systems. 'Metabolism' may relate to lipids, bone and drug metabolism, or broader aspects of endocrine dysfunction. Preclinical pharmacology, pharmacokinetic studies, meta-analyses and those addressing drug safety and tolerability are also highly suitable for publication in this journal. Original research may be published as a main paper or as a research letter.