Yasser Attiga, Shih-Yin Chen, J. LaGue, Anaelia Ovalle, Nathan Stott, T. Brander, Abdullah Khaled, Gaurika Tyagi, P. Francis-Lyon
Title: Applying Deep Learning to Public Health: Using Unbalanced Demographic Data to Predict Thyroid Disorder
Venue: 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON)
Published: 2018-11-01
DOI: 10.1109/IEMCON.2018.8614888 (https://doi.org/10.1109/IEMCON.2018.8614888)
Citation count: 2
Abstract
This study investigates the use of a deep neural network (DNN) to predict propensity for disease from demographic information alone, with thyroid disease as the test application. The imbalanced dataset of 747,301 samples contained 13 demographic predictor variables that were not known to be associated with the disease, and many values were missing. A TensorFlow feed-forward neural network was trained to predict thyroid disease. Different activation functions and a variety of up-sampling and down-sampling methods were employed. The lift statistic was used to evaluate success in identifying patients with a propensity for thyroid disease. The DNN model outperformed the Random Forest model, with a 36.63% improvement in the lift statistic. These results suggest that deep learning may be used to select candidates for early intervention, with the aim of improved health outcomes, from a large dataset with only minimal demographic variables, similar to the datasets held by the marketing arms of healthcare providers.
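The abstract does not say how the lift statistic was computed, so the following is only a minimal sketch of one common definition: the positive rate among the top-scored fraction of patients divided by the positive rate in the whole population. The function names `lift_at_fraction` and `relative_improvement`, the 10% default cutoff, and the toy data are all illustrative assumptions, not details taken from the paper.

```python
def lift_at_fraction(y_true, scores, fraction=0.1):
    """Lift of the top-scored `fraction` of samples over the base rate.

    Assumed definition (not confirmed by the paper):
        lift = positive rate in top-scored group / positive rate overall
    `y_true` holds 0/1 labels; `scores` holds model scores, higher = more likely.
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    n = len(y_true)
    n_top = max(1, round(fraction * n))  # size of the targeted group

    # Rank sample indices from highest to lowest score.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)

    top_rate = sum(y_true[i] for i in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / n
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive samples")
    return top_rate / base_rate


def relative_improvement(new_lift, baseline_lift):
    """Percentage improvement of one lift value over another (hypothetical helper)."""
    return (new_lift - baseline_lift) / baseline_lift * 100.0


# Toy data (made up for illustration): 2 positives among 10 samples.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
model_scores = [0.9, 0.1, 0.2, 0.3, 0.8, 0.05, 0.15, 0.25, 0.35, 0.4]

top20_lift = lift_at_fraction(labels, model_scores, fraction=0.2)
whole_population_lift = lift_at_fraction(labels, model_scores, fraction=1.0)
```

A lift above 1 means the model concentrates positive cases in the group it ranks highest, which is why the statistic suits an imbalanced screening problem where only a limited number of candidates can be selected for early intervention. Scoring the whole population always gives a lift of 1.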