Ensemble-learning approach improves fracture prediction using genomic and phenotypic data
Authors: Qing Wu, Jongyun Jung
Journal: Osteoporosis International (Impact Factor 4.2; JCR Q1, Endocrinology & Metabolism)
Published: 2025-03-07
DOI: 10.1007/s00198-025-07437-w (https://doi.org/10.1007/s00198-025-07437-w)
Abstract
This study presents an innovative ensemble machine learning model integrating genomic and clinical data to enhance the prediction of major osteoporotic fractures in older men. The Super Learner (SL) model achieved superior performance (AUC = 0.76, accuracy = 95.6%, sensitivity = 94.5%, specificity = 96.1%) compared to individual models. Ensemble machine learning improves fracture prediction accuracy, demonstrating the potential for personalized osteoporosis management.
Purpose: Existing fracture risk models have limitations in their accuracy and in integrating genomic data. This study developed and validated an innovative ensemble machine learning (ML) model that combines multiple algorithms and integrates clinical, lifestyle, skeletal, and genomic data to enhance prediction for major osteoporotic fractures (MOF) in older men.
Methods: This study analyzed data from 5130 participants in the Osteoporotic Fractures in Men (MrOS) cohort study. The model incorporated 1103 individual genome-wide significant variants and conventional risk factors for MOF. The participants were randomly divided into training (80%) and testing (20%) sets. Seven ML algorithms were combined using the SL ensemble method, with tenfold cross-validation, for MOF prediction. Model performance was evaluated on the testing set using the area under the curve (AUC), the area under the precision-recall curve, calibration, accuracy, sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and reclassification metrics. SL model performance was further evaluated by comparison with baseline models and by subgroup analyses by race.
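The workflow described in the Methods (80/20 split, cross-validated base learners, a meta-learner that combines them) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the seven algorithms or the meta-learner, so the three toy base learners (`fit_prevalence`, `make_stump`) and the convex-weight, squared-error meta-learner (`fit_convex_weights`) are hypothetical stand-ins, and the data are synthetic.

```python
import itertools
import random


def fit_prevalence(X, y):
    """Toy base learner: ignore the features, predict the training event rate."""
    rate = sum(y) / len(y)
    return lambda row: rate


def make_stump(j):
    """Toy base learner factory: one-feature stump on column j, split at the training mean."""
    def fit(X, y):
        cut = sum(r[j] for r in X) / len(X)
        hi = [t for r, t in zip(X, y) if r[j] > cut]
        lo = [t for r, t in zip(X, y) if r[j] <= cut]
        overall = sum(y) / len(y)
        p_hi = sum(hi) / len(hi) if hi else overall
        p_lo = sum(lo) / len(lo) if lo else overall
        return lambda row: p_hi if row[j] > cut else p_lo
    return fit


def kfold(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def out_of_fold(fit, X, y, folds):
    """Predict each row with a model that never saw that row during fitting."""
    preds = [0.0] * len(y)
    for held in folds:
        held_set = set(held)
        train = [i for i in range(len(y)) if i not in held_set]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in held:
            preds[i] = model(X[i])
    return preds


def fit_convex_weights(Z, y, steps=20):
    """Assumed meta-learner: grid-search non-negative weights summing to 1
    that minimise squared error on the out-of-fold predictions Z[m][i]."""
    best, best_loss = None, float("inf")
    for combo in itertools.product(range(steps + 1), repeat=len(Z)):
        if sum(combo) != steps:
            continue
        w = [c / steps for c in combo]
        loss = sum(
            (sum(wk * zk[i] for wk, zk in zip(w, Z)) - y[i]) ** 2
            for i in range(len(y))
        )
        if loss < best_loss:
            best, best_loss = w, loss
    return best


def super_learner(fits, X, y, k=10, seed=0):
    """Cross-validate each base learner, learn the weights, refit on all training rows."""
    folds = kfold(len(y), k, random.Random(seed))
    Z = [out_of_fold(f, X, y, folds) for f in fits]
    w = fit_convex_weights(Z, y)
    models = [f(X, y) for f in fits]
    return (lambda row: sum(wk * m(row) for wk, m in zip(w, models))), w


# Synthetic data only (not study data): the outcome depends on feature 0.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [1 if row[0] + 0.3 * rng.gauss(0, 1) > 0.5 else 0 for row in X]

# Random 80/20 split into training and testing sets.
order = list(range(200))
rng.shuffle(order)
train_idx, test_idx = order[:160], order[160:]
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]

predict, weights = super_learner(
    [fit_prevalence, make_stump(0), make_stump(1)], X_train, y_train, k=10
)
test_probs = [predict(row) for row in X_test]
```

The weights are learned from out-of-fold predictions rather than in-sample fits, so a base learner that overfits its training folds gains no advantage in the ensemble. The held-out 20% is touched only once, for the final evaluation.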
Results: The SL model demonstrated the best performance, with an AUC of 0.76, accuracy of 95.6%, sensitivity of 94.5%, specificity of 96.1%, NPV of 95.1%, and PPV of 94.7%. Among the individual ML algorithms, gradient boosting performed best. The SL model outperformed the baseline models. In subgroup analysis, it achieved accuracies of 93.1% for Whites and 91.6% for Minorities, outperforming the single ML models.
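For readers less familiar with the metrics reported above, the sketch below shows how they are conventionally computed from binary predictions and predicted scores. It assumes labels coded 1 = fracture and 0 = no fracture, a 0.5 classification threshold, and that "AUC" refers to the area under the ROC curve. The abstract states none of these explicitly. The example values are made up for illustration and are not study data, and the helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for labels coded 1 = event, 0 = no event."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Threshold-based metrics; assumes every denominator is non-zero."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def auc(y_true, scores):
    """ROC AUC as the probability that a random positive outscores
    a random negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 3 events and 5 non-events.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
print(auc(y_true, scores))
```

Accuracy, sensitivity, specificity, PPV, and NPV all depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, so the two kinds of metric can move independently of each other.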
Conclusion: The ensemble learning approach significantly improved fracture prediction accuracy and model performance compared to individual ML models. Integrating genomic and phenotypic data via the SL approach represents a promising advancement for personalized osteoporosis management.
About the journal:
An international multi-disciplinary journal which is a joint initiative between the International Osteoporosis Foundation and the National Osteoporosis Foundation of the USA, Osteoporosis International provides a forum for the communication and exchange of current ideas concerning the diagnosis, prevention, treatment and management of osteoporosis and other metabolic bone diseases.
It publishes: original papers - reporting progress and results in all areas of osteoporosis and its related fields; review articles - reflecting the present state of knowledge in special areas or summarizing limited themes in which discussion has led to clearly defined conclusions; educational articles - giving information on the progress of a topic of particular interest; case reports - of uncommon or interesting presentations of the condition.
While focusing on clinical research, the Journal will also accept submissions on more basic aspects of research, where they are considered by the editors to be relevant to the human disease spectrum.