Exploring the Performance of Stacking Classifier to Predict Depression Among the Elderly
E. Lee
2017 IEEE International Conference on Healthcare Informatics (ICHI), 2017-08-01
DOI: 10.1109/ICHI.2017.95 (https://doi.org/10.1109/ICHI.2017.95)
Citations: 13
Abstract
Geriatric depression is a disease prevalent among the elderly. Its typical symptoms are lower functioning, diminished interest in activities, insomnia or hypersomnia, fatigue or loss of energy, and observable psychomotor agitation or retardation. Many studies have tried to predict geriatric depression from a healthcare informatics perspective using data mining analytics. However, no study has focused on the performance of stacking, which is one type of ensemble classifier. This study therefore investigates how well a stacking approach predicts geriatric depression, using data from the Korea National Health and Nutrition Examination Survey (KNHANES) covering 2010 to 2015. The KNHANES is a large, publicly available dataset produced by a national surveillance system that has assessed the health and nutritional status of Koreans since 1998. It is a nationally representative cross-sectional survey that samples approximately 10,000 individuals each year. Using 9,089 records on geriatric depression in the Korean elderly (2010 to 2015), this study analyzed how stacking performance changes when five classifiers (LR, DT, NN, SVM, NBN) are combined as the base-level learner and the meta-level learner. Stacking performance, measured by accuracy and AUC, was more robust when the base-level learner was relatively simple (e.g., LR, DT) and the meta-level learner was more complex (e.g., NBN, NN, SVM). Specifically, before feature selection, stacking was highly competitive with an accuracy of 0.8624 for LR(SVM), where the notation means that the base-level learner is LR and the meta-level learner is SVM. After feature selection, the best accuracy was 0.8643 for DT(NN). AUC showed similar results: LR(NN) reached 0.8182 before feature selection, and LR(NBN) reached 0.8147 after feature selection.
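To make the base-level / meta-level terminology concrete, the following is a minimal pure-Python sketch of stacking with accuracy and AUC evaluation. It is not the paper's pipeline. The data are synthetic rather than KNHANES, and all function names are hypothetical. The sketch also assumes that LR stands for logistic regression, uses a one-split decision stump as a stand-in for DT, and keeps logistic regression as the meta-level learner only for brevity (the abstract reports better results with NN, SVM, or NBN at the meta level).

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, epochs=200, lr=0.1):
    # Batch gradient descent; returns weights with the bias stored last.
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_logistic(w, X):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) for xi in X]

def fit_stump(X, y):
    # One-split decision tree: choose the split with the best training accuracy.
    best = None
    for j in range(len(X[0])):
        for t in sorted({xi[j] for xi in X}):
            for sign in (1, -1):
                hits = sum((sign * (xi[j] - t) > 0) == bool(yi) for xi, yi in zip(X, y))
                if best is None or hits > best[0]:
                    best = (hits, j, t, sign)
    return best[1:]

def predict_stump(model, X):
    j, t, sign = model
    return [1.0 if sign * (xi[j] - t) > 0 else 0.0 for xi in X]

def out_of_fold(fit, predict, X, y, k=5):
    # Score every row with a model trained on the other k-1 folds, so the
    # meta-level learner never sees predictions made on a model's own training rows.
    oof = [0.0] * len(X)
    for fold in range(k):
        test = [i for i in range(len(X)) if i % k == fold]
        train = [i for i in range(len(X)) if i % k != fold]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(test, predict(model, [X[i] for i in test])):
            oof[i] = p
    return oof

def accuracy(y, scores, threshold=0.5):
    return sum((s >= threshold) == bool(yi) for yi, s in zip(y, scores)) / len(y)

def auc(y, scores):
    # Probability that a random positive outranks a random negative (ties count 0.5).
    pos = [s for yi, s in zip(y, scores) if yi == 1]
    neg = [s for yi, s in zip(y, scores) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Synthetic two-feature data (an assumption for illustration, not KNHANES).
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    label = 1 if rng.random() < 0.3 else 0
    X.append([rng.gauss(1.0 * label, 1.0), rng.gauss(-0.8 * label, 1.0)])
    y.append(label)
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

# Base level: one out-of-fold prediction column per base learner.
base_learners = [(fit_logistic, predict_logistic), (fit_stump, predict_stump)]
meta_train = list(zip(*[out_of_fold(f, p, X_train, y_train) for f, p in base_learners]))

# Meta level: a second model trained on the base-level predictions.
meta_model = fit_logistic(meta_train, y_train)

# At test time, base learners are refit on all training rows, then stacked.
fitted = [(p, f(X_train, y_train)) for f, p in base_learners]
meta_test = list(zip(*[p(m, X_test) for p, m in fitted]))
test_scores = predict_logistic(meta_model, meta_test)
print("accuracy:", accuracy(y_test, test_scores), "AUC:", auc(y_test, test_scores))
```

Swapping `fit_logistic` at the meta level for a more expressive model would correspond to the simple-base, complex-meta configurations that the abstract reports as more robust. The numbers printed here come from synthetic data and are not comparable to the reported 0.86 accuracy or 0.82 AUC.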