Research on Osteoporosis Risk Assessment Based on Semi-supervised Machine Learning
Lei Lu, Luo Tao, Wang Yining, Han Jiahui, Li Jianfeng
Proceedings of the 2020 Artificial Intelligence and Complex Systems Conference. Published 2020-08-20. DOI: 10.1145/3407703.3407725 (https://doi.org/10.1145/3407703.3407725)
Citation count: 2
Abstract
We propose a semi-supervised machine learning method for osteoporosis risk assessment. Existing osteoporosis risk assessment models suffer from low accuracy and cannot exploit large amounts of unlabeled data. To improve diagnostic accuracy, the method combines osteoporosis-related questionnaire data with bone image data and fuses the multi-modal features extracted from both. Feature engineering and Word2vec extract numerical and text features, respectively, from the questionnaires, and a CNN extracts image features from BMD images. Because labeled medical data are difficult to obtain, we build a self-training semi-supervised model based on XGBoost to classify and evaluate osteoporosis; it uses both labeled and unlabeled data to achieve better generalization. In addition, because the questionnaire data contain many outliers and missing values, we remove outliers with a DBSCAN-based algorithm and propose an improved PKNN algorithm to impute the missing values. Experimental results show that the proposed semi-supervised method achieves an accuracy of 0.78 in osteoporosis risk assessment and compares favorably with the other methods evaluated.
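The self-training idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the confidence threshold, stopping rule, or features, so `threshold`, `max_rounds`, the function name `self_train`, and the synthetic two-cluster data are all assumptions made for illustration. To keep the example dependent only on NumPy, a small nearest-centroid classifier (`CentroidClassifier`, hypothetical) stands in for XGBoost; any scikit-learn-style estimator exposing `fit`, `predict_proba`, and `classes_` should be usable in its place.

```python
import numpy as np


class CentroidClassifier:
    """Stand-in base learner (NOT XGBoost): softmax over negative
    squared distances to the class centroids."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack(
            [X[y == c].mean(axis=0) for c in self.classes_]
        )
        return self

    def predict_proba(self, X):
        # Squared distance from every sample to every class centroid.
        d2 = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        logits = -d2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        return p / p.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[self.predict_proba(X).argmax(axis=1)]


def self_train(model, X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Generic self-training loop (assumed form): fit on the labeled pool,
    then move confidently predicted unlabeled samples into the pool with
    their predicted labels, and repeat."""
    X_pool, y_pool = X_lab.copy(), y_lab.copy()
    remaining = X_unlab.copy()
    for _ in range(max_rounds):
        model.fit(X_pool, y_pool)
        if len(remaining) == 0:
            break
        proba = model.predict_proba(remaining)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break  # no prediction is confident enough; stop early
        pseudo = model.classes_[proba[confident].argmax(axis=1)]
        X_pool = np.vstack([X_pool, remaining[confident]])
        y_pool = np.concatenate([y_pool, pseudo])
        remaining = remaining[~confident]
    model.fit(X_pool, y_pool)  # final fit on labeled + pseudo-labeled data
    return model, len(y_pool) - len(y_lab)


# Synthetic data (illustrative only): two Gaussian clusters, 10 labels total.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=-2.0, scale=1.0, size=(200, 2)),
    rng.normal(loc=2.0, scale=1.0, size=(200, 2)),
])
y = np.array([0] * 200 + [1] * 200)
labeled = np.zeros(len(y), dtype=bool)
labeled[:5] = True
labeled[200:205] = True

model, n_pseudo = self_train(
    CentroidClassifier(), X[labeled], y[labeled], X[~labeled]
)
accuracy = float((model.predict(X) == y).mean())
print(f"pseudo-labeled samples: {n_pseudo}, accuracy: {accuracy:.3f}")
```

The threshold controls the usual trade-off in self-training: a higher value admits fewer pseudo-labels but reduces the risk of reinforcing early mistakes.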