A Machine Learning Model for Predicting Sarcopenia Among Middle-Aged Adults: Development and External Validation
Author: Hye Jin Chong
Journal: JMIR Medical Informatics, volume 13, e75760 (Journal Article; Impact Factor 3.8; JCR Q2, Medical Informatics)
Published: 2025-08-27
DOI: https://doi.org/10.2196/75760
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12423610/pdf/
Citations: 0
Abstract
Background: Sarcopenia is a common muscle disorder in older adults, and identifying and managing it early, in middle age, is essential for a healthier later life. Detecting sarcopenia at an earlier stage may reduce the future burden on health care systems and improve quality of life in older adults. Machine learning (ML) models can evaluate large datasets, identify essential variables, and capture complex relationships among input variables. However, the use of ML models to detect sarcopenia remains an unmet need.
Objective: This study aimed to develop and externally validate an ML model to predict sarcopenia risk among middle-aged adults using a nationally representative dataset.
Methods: We analyzed data from 1926 participants aged 40 to 64 years who were enrolled in the 2022 Korea National Health and Nutrition Examination Survey (KNHANES). Sarcopenia was defined according to the 2019 Asian Working Group for Sarcopenia criteria, which require both low muscle mass and reduced muscle strength. Muscle mass was assessed using bioelectrical impedance analysis with cutoffs of <7.0 kg/m² for men and <5.7 kg/m² for women. Muscle strength was measured as handgrip strength using a digital dynamometer with thresholds of <28 kg for men and <18 kg for women. Participants meeting both criteria were classified as having sarcopenia. Four ML algorithms (random forest, support vector machine, extreme gradient boosting, and logistic regression) were used to identify risk factors for sarcopenia and predict its likelihood. The top-performing model was subsequently validated in an external cohort of 2247 middle-aged adults from the 2023 KNHANES. Model performance was assessed using the F2-score, the area under the receiver operating characteristic curve, and sensitivity. All analyses were conducted using Python 3.13.2 (Python Software Foundation).
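The two-part case definition in the Methods can be sketched as a small rule. This is an illustration only, not the study's code: the `Participant` record and its field names are hypothetical (actual KNHANES variable names differ), and only the cutoff values are taken from the abstract.

```python
from dataclasses import dataclass

# Cutoffs as quoted in the Methods paragraph (AWGS 2019 criteria).
MASS_CUTOFF = {"male": 7.0, "female": 5.7}    # kg/m^2, bioelectrical impedance
GRIP_CUTOFF = {"male": 28.0, "female": 18.0}  # kg, handgrip strength


@dataclass
class Participant:
    # Hypothetical record layout, assumed for this sketch.
    sex: str               # "male" or "female"
    muscle_index: float    # muscle mass index in kg/m^2
    grip_strength: float   # handgrip strength in kg


def has_sarcopenia(p: Participant) -> bool:
    """Return True only when BOTH muscle mass and grip strength are low."""
    low_mass = p.muscle_index < MASS_CUTOFF[p.sex]
    low_strength = p.grip_strength < GRIP_CUTOFF[p.sex]
    return low_mass and low_strength
```

Because both inequalities are strict, a value exactly at a cutoff (for example, a muscle index of 5.7 kg/m² in a woman) does not count as low under this sketch.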
Results: Among the 4 models, the logistic regression model demonstrated the strongest performance, yielding an area under the curve of 0.85, a sensitivity of 0.92, and an F2-score of 0.66. External validation using the 2023 KNHANES dataset confirmed the model's robust performance, indicating its potential for widespread applications.
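For readers unfamiliar with the F2-score, the sketch below shows the standard F-beta formula with beta = 2, which weights sensitivity (recall) more heavily than precision. The helper `implied_precision` is a hypothetical function introduced here; it simply rearranges the formula. The abstract does not report precision, so the figure it yields is a back-of-envelope estimate that assumes the reported sensitivity and F2-score come from the same decision threshold and dataset, and it is subject to rounding in the published values.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score; beta > 1 weights recall (sensitivity) above precision."""
    b2 = beta ** 2
    denom = b2 * precision + recall
    return 0.0 if denom == 0 else (1 + b2) * precision * recall / denom


def implied_precision(f_score: float, recall: float, beta: float = 2.0) -> float:
    """Solve the F-beta formula for precision, given the score and recall.

    Only meaningful when (1 + beta^2) * recall > beta^2 * f_score.
    """
    b2 = beta ** 2
    return f_score * recall / ((1 + b2) * recall - b2 * f_score)


# Values reported in the Results paragraph.
precision_estimate = implied_precision(f_score=0.66, recall=0.92)
print(round(precision_estimate, 2))  # prints 0.31
```

Under those assumptions, an F2-score of 0.66 alongside a sensitivity of 0.92 would correspond to a precision of roughly 0.31, the kind of trade-off a screening-oriented model makes when missing a case is considered costlier than a false alarm.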
Conclusions: This study developed and externally validated an ML model that accurately identified sarcopenia in middle-aged adults. Leveraging data from a comprehensive national survey, our findings underscore the significance of early detection and customized interventions in midlife to mitigate sarcopenia risk and optimize long-term health outcomes.
About the journal:
JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation. It emphasizes applied, translational research and has a broad readership that includes clinicians, CIOs, engineers, and industry and health informatics professionals.
JMIR Medical Informatics is published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175). JMIR Med Inform has a slightly different scope: it places more emphasis on applications for clinicians and health professionals than on consumers and citizens, who are the focus of JMIR. It also publishes faster and accepts papers that are more technical or more formative than those published in the Journal of Medical Internet Research.