Using Machine Learning to Predict Weight Gain in Adults: an Observational Analysis From the All of Us Research Program
Dawda Jawara MD, Kate V. Lauer MD, Manasa Venkatesh MS, Lily N. Stalter MS, Bret Hanlon PhD, Matthew M. Churpek MD, MPH, PhD, Luke M. Funk MD, MPH
Journal of Surgical Research, Volume 306, Pages 43-53. Publication date: 2025-02-01. DOI: 10.1016/j.jss.2024.11.042
https://www.sciencedirect.com/science/article/pii/S002248042400787X
Abstract
Introduction
Obesity, defined as a body mass index ≥30 kg/m², is a major public health concern in the United States. Preventative approaches are essential, but they are limited by an inability to accurately predict which individuals are at highest risk of weight gain. Our objective was to develop accurate weight gain prediction models using the National Institutes of Health All of Us dataset. We hypothesized that machine learning models using both electronic health record and behavioral survey data would outperform models using electronic health record data alone.
Methods
The All of Us dataset was used to identify adults between 18 and 70 y old with weight measurements 2 y apart between 2008 and 2022. Patients with a history of cancer, bariatric surgery, or pregnancy were excluded. Demographics, vital signs, laboratory results, comorbidities, and survey data (Alcohol Use Disorder Identification Test, Patient-Reported Outcomes Measurement Information System physical and mental health scores) were included as model parameters. Elastic net and XGBoost machine learning models were developed with and without survey data to predict ≥10% total body weight gain within 2 y. The data were split into a training sample (60%) and a testing sample (40%), and parameters were tuned using 10-fold cross-validation. Performance was compared using the area under the receiver operating characteristic curve (AUC).
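The outcome definition and data partitioning described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the simple shuffled (non-stratified) split, the fixed seed, and the interleaved fold assignment are all assumptions, and the model fitting itself (elastic net, XGBoost) is omitted.

```python
import random

def gained_10_percent(baseline_kg, followup_kg):
    """Binary outcome: True when weight rose by at least 10% of baseline."""
    return (followup_kg - baseline_kg) / baseline_kg >= 0.10

def split_train_test(n_rows, train_fraction=0.6, seed=0):
    """Shuffle row indices and cut them into training and testing samples.

    Assumption: a plain random split; the abstract does not say whether
    the split was stratified by outcome.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = round(n_rows * train_fraction)
    return indices[:cut], indices[cut:]

def k_folds(indices, k=10):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    for fold in range(k):
        validate = indices[fold::k]  # every k-th index, offset by fold
        held_out = set(validate)
        fit = [i for i in indices if i not in held_out]
        yield fit, validate

if __name__ == "__main__":
    train, test = split_train_test(1000)
    print(len(train), len(test))
    for fit, validate in k_folds(train):
        # A real pipeline would fit a candidate model on `fit`, score it
        # on `validate`, and keep the best-scoring hyperparameters.
        pass
```

Cross-validation folds are drawn from the training sample only, so the 40% testing sample stays untouched until the final AUC comparison.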
Results
Our cohort consisted of 34,715 patients (mean [SD] age 50.9 [13.4] y; 45.7% White; 55.3% female). Over a 2-y span, 10.4% of the cohort gained ≥10% total body weight. AUCs were 0.677 [95% DeLong confidence interval 0.665-0.688] for elastic net and 0.706 [0.695-0.717] for XGBoost. Incorporation of survey data did not improve predictability, with AUCs of 0.681 [0.669-0.692] and 0.705 [0.694-0.716], respectively.
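For readers unfamiliar with the reported intervals, the sketch below shows one way an AUC and a DeLong-style 95% confidence interval can be computed from predicted risk scores. It is an illustrative pure-Python version under stated assumptions: the function names are hypothetical, the pairwise loop is quadratic in sample size (a cohort of this size would need a rank-based implementation), the interval is clipped to [0, 1], and the toy scores are made up rather than taken from the study.

```python
from statistics import mean, variance

def _psi(pos_score, neg_score):
    """Pairwise kernel: 1 if the case outranks the control, 0.5 on ties."""
    if pos_score > neg_score:
        return 1.0
    if pos_score == neg_score:
        return 0.5
    return 0.0

def auc_with_delong_ci(pos_scores, neg_scores, z=1.96):
    """Return (auc, lower, upper) using DeLong's variance estimate.

    pos_scores: predicted risks for participants who gained >=10% weight.
    neg_scores: predicted risks for participants who did not.
    """
    m, n = len(pos_scores), len(neg_scores)
    # Structural components: each case's (and each control's) share of
    # favorable pairings.
    v10 = [mean(_psi(p, q) for q in neg_scores) for p in pos_scores]
    v01 = [mean(_psi(p, q) for p in pos_scores) for q in neg_scores]
    auc = mean(v10)
    se = (variance(v10) / m + variance(v01) / n) ** 0.5
    return auc, max(0.0, auc - z * se), min(1.0, auc + z * se)

if __name__ == "__main__":
    # Toy scores for illustration only.
    print(auc_with_delong_ci([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1]))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why values near 0.7 are described as modest performance.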
Conclusions
Our machine learning weight gain prediction models had modest performance that was not improved by survey data. The addition of other All of Us variables, including genomic data, may be informative in future studies.
About the journal:
The Journal of Surgical Research: Clinical and Laboratory Investigation publishes original articles concerned with clinical and laboratory investigations relevant to surgical practice and teaching. The journal emphasizes reports of clinical investigations or fundamental research bearing directly on surgical management that will be of general interest to a broad range of surgeons and surgical researchers. The articles presented need not have been the products of surgeons or of surgical laboratories.
The Journal of Surgical Research also features review articles and special articles relating to educational, research, or social issues of interest to the academic surgical community.