Yonghan Cha, Sung Hyo Seo, Jung-Taek Kim, Jin-Woo Kim, Sang-Yeob Lee, Jun-Il Yoo
{"title":"使用横断面数据库的机器学习骨质疏松症特征选择和风险预测模型。","authors":"Yonghan Cha, Sung Hyo Seo, Jung-Taek Kim, Jin-Woo Kim, Sang-Yeob Lee, Jun-Il Yoo","doi":"10.11005/jbm.2023.30.3.263","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>The purpose of this study was to verify the accuracy and validity of using machine learning (ML) to select risk factors, to discriminate differences in feature selection by ML between men and women, and to develop predictive models for patients with osteoporosis in a big database.</p><p><strong>Methods: </strong>The data on 968 observed features from a total of 3,484 the Korea National Health and Nutrition Examination Survey participants were collected. To find preliminary features that were well-related to osteoporosis, logistic regression, random forest, gradient boosting, adaptive boosting, and support vector machine were used.</p><p><strong>Results: </strong>In osteoporosis feature selection by 5 ML models in this study, the most selected variables as risk factors in men and women were body mass index, monthly alcohol consumption, and dietary surveys. However, differences between men and women in osteoporosis feature selection by ML models were age, smoking, and blood glucose level. The receiver operating characteristic (ROC) analysis revealed that the area under the ROC curve for each ML model was not significantly different for either gender.</p><p><strong>Conclusions: </strong>ML performed a feature selection of osteoporosis, considering hidden differences between men and women. The present study considers the preprocessing of input data and the feature selection process as well as the ML technique to be important factors for the accuracy of the osteoporosis prediction model.</p>","PeriodicalId":15070,"journal":{"name":"Journal of Bone Metabolism","volume":"30 3","pages":"263-273"},"PeriodicalIF":0.0000,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/f3/12/jbm-2023-30-3-263.PMC10509024.pdf","citationCount":"0","resultStr":"{\"title\":\"Osteoporosis Feature Selection and Risk Prediction Model by Machine Learning Using a Cross-Sectional Database.\",\"authors\":\"Yonghan Cha, Sung Hyo Seo, Jung-Taek Kim, Jin-Woo Kim, Sang-Yeob Lee, Jun-Il Yoo\",\"doi\":\"10.11005/jbm.2023.30.3.263\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>The purpose of this study was to verify the accuracy and validity of using machine learning (ML) to select risk factors, to discriminate differences in feature selection by ML between men and women, and to develop predictive models for patients with osteoporosis in a big database.</p><p><strong>Methods: </strong>The data on 968 observed features from a total of 3,484 the Korea National Health and Nutrition Examination Survey participants were collected. To find preliminary features that were well-related to osteoporosis, logistic regression, random forest, gradient boosting, adaptive boosting, and support vector machine were used.</p><p><strong>Results: </strong>In osteoporosis feature selection by 5 ML models in this study, the most selected variables as risk factors in men and women were body mass index, monthly alcohol consumption, and dietary surveys. However, differences between men and women in osteoporosis feature selection by ML models were age, smoking, and blood glucose level. The receiver operating characteristic (ROC) analysis revealed that the area under the ROC curve for each ML model was not significantly different for either gender.</p><p><strong>Conclusions: </strong>ML performed a feature selection of osteoporosis, considering hidden differences between men and women. The present study considers the preprocessing of input data and the feature selection process as well as the ML technique to be important factors for the accuracy of the osteoporosis prediction model.</p>\",\"PeriodicalId\":15070,\"journal\":{\"name\":\"Journal of Bone Metabolism\",\"volume\":\"30 3\",\"pages\":\"263-273\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/f3/12/jbm-2023-30-3-263.PMC10509024.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Bone Metabolism\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.11005/jbm.2023.30.3.263\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2023/8/31 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"Medicine\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Bone Metabolism","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.11005/jbm.2023.30.3.263","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/8/31 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"Medicine","Score":null,"Total":0}
Osteoporosis Feature Selection and Risk Prediction Model by Machine Learning Using a Cross-Sectional Database.
Background: The purpose of this study was to verify the accuracy and validity of using machine learning (ML) to select risk factors, to discriminate differences in feature selection by ML between men and women, and to develop predictive models for patients with osteoporosis in a big database.
Methods: The data on 968 observed features from a total of 3,484 the Korea National Health and Nutrition Examination Survey participants were collected. To find preliminary features that were well-related to osteoporosis, logistic regression, random forest, gradient boosting, adaptive boosting, and support vector machine were used.
Results: In osteoporosis feature selection by 5 ML models in this study, the most selected variables as risk factors in men and women were body mass index, monthly alcohol consumption, and dietary surveys. However, differences between men and women in osteoporosis feature selection by ML models were age, smoking, and blood glucose level. The receiver operating characteristic (ROC) analysis revealed that the area under the ROC curve for each ML model was not significantly different for either gender.
Conclusions: ML performed a feature selection of osteoporosis, considering hidden differences between men and women. The present study considers the preprocessing of input data and the feature selection process as well as the ML technique to be important factors for the accuracy of the osteoporosis prediction model.