Afeez A. Soladoye, Nicholas Aderinto, Bolaji A. Omodunbi, Adebimpe O. Esan, Ibrahim A. Adeyanju, David B. Olawade
Current Research in Translational Medicine, volume 73, issue 4, Article 103526. Journal article, published 2025-07-01.
DOI: 10.1016/j.retram.2025.103526
https://www.sciencedirect.com/science/article/pii/S2452318625000352
Impact factor: 3.2. JCR: Q2 (Medicine, Research & Experimental).
Enhancing Alzheimer's disease prediction using random forest: A novel framework combining backward feature elimination and ant colony optimization
Background
Alzheimer's disease (AD) represents a significant global health challenge due to its increasing prevalence and the limitations of current diagnostic approaches. Early detection is crucial as pathological changes occur 10-15 years before clinical symptoms manifest, yet current diagnostic methods typically identify the disease at moderate to advanced stages. Machine learning techniques offer promising solutions for early prediction, but face challenges related to feature selection and hyperparameter optimization.
Objective
To develop an enhanced predictive model for Alzheimer's disease by integrating advanced feature selection techniques with nature-inspired hyperparameter optimization for Random Forest classifiers while ensuring robust validation and statistical significance testing.
Methods
This study employed three feature selection techniques (Whale Optimization Algorithm, Artificial Bee Colony, and Backward Elimination Feature Selection) and two hyperparameter optimization algorithms (Artificial Ant Colony Optimization and Bald Eagle Search) to improve Random Forest model performance. A dataset comprising 2,149 instances with 34 features was preprocessed using MinMax normalization; the Synthetic Minority Oversampling Technique (SMOTE) was applied only to the training data to prevent data leakage. McNemar's test was used to assess whether differences between models were statistically significant. Model performance was evaluated using accuracy, precision, recall, F1-score, and AUC, with confidence intervals calculated by bootstrap sampling.
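The two statistical checks named in the Methods can be illustrated with a minimal, standard-library sketch. It assumes paired class predictions from two models on the same test set, uses the exact (binomial) form of McNemar's test, and computes a percentile bootstrap interval for accuracy. The function names, the number of resamples, and the choice of the exact test variant are assumptions made for illustration; the abstract does not say which variants the authors used.

```python
import math
import random


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact (binomial) McNemar test on paired predictions.

    Returns (b, c, p_value): b counts samples only model A classified
    correctly, c counts samples only model B classified correctly.
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the models never disagree on correctness
    k = min(b, c)
    # Two-sided p-value under H0: discordant pairs split 50/50.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


def bootstrap_accuracy_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for accuracy.

    n_boot, alpha, and seed are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    n = len(y_true)
    correct = [int(t == p) for t, p in zip(y_true, y_pred)]
    stats = []
    for _ in range(n_boot):
        # Resample test cases with replacement and recompute accuracy.
        resample = [correct[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(resample) / n)
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return sum(correct) / n, lo, hi
```

Calling `mcnemar_exact(y_true, preds_model_a, preds_model_b)` on held-out predictions gives the discordant counts and a p-value, and `bootstrap_accuracy_ci(y_true, preds_model_a)` gives the point estimate with its interval. The same resampling loop could be reused for precision, recall, or F1 by swapping the per-resample statistic.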
Results
The combination of Backward Elimination Feature Selection with Artificial Ant Colony Optimization achieved the highest performance (95% accuracy ± 1.2%, 95% precision ± 1.1%, 94% recall ± 1.3%, 95% F1-score ± 1.0%, 98% AUC ± 0.8%), outperforming other methodological combinations and conventional machine learning algorithms with statistically significant improvements (p < 0.001). This approach identified 26 significant features associated with Alzheimer's disease. Additionally, nature-inspired optimization algorithms demonstrated substantial computational efficiency advantages over empirical approaches (18 minutes versus 133 minutes).
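Backward elimination, the selector behind the best-performing combination, can be sketched generically as a greedy loop that starts from all features and drops one at a time. The sketch below is an illustration under stated assumptions, not the authors' implementation: `score_fn` is a hypothetical callback (in a setting like this study it might return cross-validated Random Forest accuracy for a feature subset), the stopping rule with `tol` is an assumption, and the feature names in the toy example are placeholders.

```python
def backward_elimination(features, score_fn, tol=0.0):
    """Greedy backward elimination over a list of feature names.

    Starts from the full set and repeatedly removes the single feature
    whose removal yields the best score. Stops when even the best removal
    would lower the score by more than `tol`.
    """
    selected = list(features)
    best = score_fn(selected)
    while len(selected) > 1:
        candidates = []
        for f in selected:
            trial = [g for g in selected if g != f]
            candidates.append((score_fn(trial), trial))
        trial_score, trial = max(candidates, key=lambda c: c[0])
        if trial_score < best - tol:
            break  # every removal hurts; keep the current subset
        selected, best = trial, trial_score
    return selected, best


# Toy scoring function (hypothetical): rewards informative features and
# applies a small penalty per feature kept.
INFORMATIVE = {"f1", "f2", "f3"}


def toy_score(subset):
    return len(INFORMATIVE & set(subset)) - 0.01 * len(subset)


kept, score = backward_elimination(["f1", "f2", "f3", "f4", "f5"], toy_score)
print(kept)  # the two uninformative placeholders are dropped
```

Each pass costs one model evaluation per remaining feature, so with 34 features and a cross-validated classifier as `score_fn` the wrapper is considerably more expensive than a filter method; that cost is the usual trade-off for selecting features against the model's own performance.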
Conclusion
The integration of advanced feature selection with nature-inspired hyperparameter optimization enhances Alzheimer's disease prediction accuracy while improving computational efficiency. However, external validation on independent datasets and prospective clinical studies are needed to establish real-world utility. This methodological framework offers promising applications for early diagnosis and intervention planning, with potential extensions to other complex medical prediction tasks.
About the journal:
Current Research in Translational Medicine is a peer-reviewed journal publishing clinical and basic research from around the world in the fields of hematology, immunology, infectiology, hematopoietic cell transplantation, and cellular and gene therapy. The journal considers for publication English-language editorials, original articles, reviews, and short reports, including case reports. Contributions are intended to draw attention to experimental medicine and translational research. Current Research in Translational Medicine periodically publishes thematic issues and is indexed in all major international databases (2017 Impact Factor: 1.9).
Core areas covered in Current Research in Translational Medicine are:
Hematology,
Immunology,
Infectiology,
Hematopoietic Cell Transplantation,
Cellular and Gene Therapy.