A novel integrated logistic regression model enhanced with recursive feature elimination and explainable artificial intelligence for dementia prediction
Rasel Ahmed, Nafiz Fahad, Md Saef Ullah Miah, Md. Jakir Hossen, Md. Kishor Morol, Mufti Mahmud, M. Mostafizur Rahman
Healthcare analytics (New York, N.Y.), volume 6, Article 100362. Published 2024-09-14. Journal Article.
DOI: 10.1016/j.health.2024.100362
URL: https://www.sciencedirect.com/science/article/pii/S2772442524000649
Citations: 0
Abstract
Dementia is a major global health issue that affects millions of individuals, families, and societies worldwide and places a substantial burden on healthcare systems. This study introduces a novel approach for predicting dementia that uses a Logistic Regression (LR) model enhanced with Recursive Feature Elimination (RFE), applied to a unique dataset of 1000 patients (49.60% male, 50.40% female). The LR model, recognized for its simplicity and effectiveness in binary classification tasks, is optimized through RFE, a technique that iteratively eliminates less significant features to improve model performance. The model’s effectiveness was assessed using accuracy, precision, recall, F1-score, Matthews Correlation Coefficient (MCC), and Kappa score. Furthermore, SHapley Additive exPlanations (SHAP) values were employed to increase the interpretability of the model, providing insights into the most influential features for dementia prediction. To address the issue of overfitting, a standardization technique was implemented, which enhanced the model’s predictive performance. The findings of this study have potential implications for detecting dementia early, informing intervention strategies, and optimizing healthcare resource allocation.
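The workflow the abstract describes (standardize, fit LR, recursively eliminate features, then score with accuracy, precision, recall, F1, MCC, and Kappa) can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation. The data are synthetic stand-ins, since the study's dataset is not reproduced here. The gradient-descent fit, the elimination criterion (drop the feature with the smallest absolute weight on standardized inputs), the number of features kept, and all function names are assumptions made for illustration. SHAP is omitted to keep the sketch dependency-free.

```python
import math
import random


def standardize(train, test):
    """Z-score both splits using the training split's column means and stds."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - target  # predicted prob minus label
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rfe(X, y, n_keep):
    """Refit, drop the feature with the smallest |weight|, repeat until n_keep remain."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        w, _ = fit_logistic([[r[j] for j in active] for r in X], y)
        del active[min(range(len(active)), key=lambda k: abs(w[k]))]
    return active


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, F1, MCC and Cohen's kappa from confusion counts."""
    n = tp + fp + fn + tn
    acc = (tp + tn) / n
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    mcc = (tp * tn - fp * fn) / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {"accuracy": acc, "precision": prec, "recall": rec,
            "f1": 2 * prec * rec / (prec + rec),
            "mcc": mcc, "kappa": (acc - p_chance) / (1 - p_chance)}


# Synthetic stand-in data (NOT the study's dataset): 5 features, only 0 and 2 informative.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [1 if 2 * r[0] - 2 * r[2] + rng.gauss(0, 0.5) > 0 else 0 for r in X]

# Hold out the last 50 rows; scaling statistics come from the training rows only.
X_train, X_test = standardize(X[:150], X[150:])
y_train, y_test = y[:150], y[150:]

selected = rfe(X_train, y_train, n_keep=2)
w, b = fit_logistic([[r[j] for j in selected] for r in X_train], y_train)

# Predict class 1 when the linear score is positive (probability above 0.5).
preds = [1 if b + sum(wi * r[j] for wi, j in zip(w, selected)) > 0 else 0 for r in X_test]
tp = sum(p == 1 and t == 1 for p, t in zip(preds, y_test))
fp = sum(p == 1 and t == 0 for p, t in zip(preds, y_test))
fn = sum(p == 0 and t == 1 for p, t in zip(preds, y_test))
tn = sum(p == 0 and t == 0 for p, t in zip(preds, y_test))
metrics = binary_metrics(tp, fp, fn, tn)

print("selected features:", selected)
print({k: round(v, 3) for k, v in metrics.items()})
```

Standardizing before elimination matters in this sketch because the ranking compares weight magnitudes, which are only comparable when features share a scale. In practice a library such as scikit-learn would typically replace the hand-written fitting and elimination loops.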