Thibault Anani, Jean-François Pradat-Peyre, François Delbot, Claude Desnuelle, Anne Sophie Rolland, David Devos, Pierre-François Pradat
Feature selection using metaheuristics to predict annual amyotrophic lateral sclerosis progression.
Amyotrophic lateral sclerosis & frontotemporal degeneration, pp. 1-16. Journal Article. Published 2025-07-07.
DOI: 10.1080/21678421.2025.2522399 (https://doi.org/10.1080/21678421.2025.2522399)
Abstract
Objective: Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that affects motor neurons and has no curative treatment. It leads to motor weakness, atrophy, spasticity, and difficulties with speech, swallowing, and breathing. Accurately predicting disease progression and survival is crucial for optimizing patient care, intervention planning, and informed decision-making.
Methods: Data were drawn from three sources: the PRO-ACT database (4659 patients), clinical trial data from ExonHit Therapeutics (384 patients), and the PULSE multicenter cohort, which aims to identify predictive factors of disease progression (198 patients). Machine learning (ML) techniques, including logistic/linear regression (LR), K-nearest neighbors, decision tree, random forest, and light gradient boosting machine (LGBM), were applied to forecast ALS progression, measured by ALS Functional Rating Scale (ALSFRS) scores, and patient survival over one year. Models were validated using 10-fold cross-validation, and Kaplan-Meier estimates were employed to cluster patients according to their profiles. To enhance the predictive accuracy of our models, we performed feature selection using ANOVA and differential evolution (DE).
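A minimal sketch of how differential evolution can drive feature selection, under stated assumptions: the abstract does not give the DE variant, population size, or encoding, so the DE/rand/1/bin scheme, the 0.5 threshold that turns a continuous vector into a feature mask, and every name below (`de_feature_selection`, `toy_fitness`) are illustrative choices, not the authors' implementation. In a setting like the one described, the fitness would plausibly be the cross-validated balanced accuracy of a classifier trained on the selected columns; a toy fitness stands in for it here so the example stays self-contained.

```python
import random
from typing import Callable, List, Tuple


def de_feature_selection(
    n_features: int,
    fitness: Callable[[List[bool]], float],
    pop_size: int = 20,
    generations: int = 50,
    f_weight: float = 0.8,  # differential weight F (assumed value)
    cr: float = 0.9,        # crossover rate CR (assumed value)
    seed: int = 0,
) -> Tuple[List[bool], float]:
    """Maximize `fitness` over feature masks with DE/rand/1/bin."""
    rng = random.Random(seed)

    def to_mask(vector: List[float]) -> List[bool]:
        # A feature is selected when its gene exceeds 0.5 (assumed encoding).
        return [gene > 0.5 for gene in vector]

    # Each individual is a vector in [0, 1]^n_features.
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    scores = [fitness(to_mask(v)) for v in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # guarantees one mutated gene
            trial = []
            for j in range(n_features):
                if rng.random() < cr or j == j_rand:
                    gene = pop[a][j] + f_weight * (pop[b][j] - pop[c][j])
                    gene = min(1.0, max(0.0, gene))  # clip back into [0, 1]
                else:
                    gene = pop[i][j]
                trial.append(gene)
            trial_score = fitness(to_mask(trial))
            # Greedy replacement: keep the trial if it is at least as good.
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = max(range(pop_size), key=scores.__getitem__)
    return to_mask(pop[best]), scores[best]


def toy_fitness(mask: List[bool]) -> float:
    """Hypothetical stand-in for a cross-validated score.

    Rewards three 'informative' features and penalizes every other
    selected feature, so smaller, relevant subsets score higher.
    """
    informative = {0, 3, 5}
    hits = sum(1 for i in informative if mask[i])
    extras = sum(1 for i, on in enumerate(mask) if on and i not in informative)
    return hits - 0.1 * extras


if __name__ == "__main__":
    mask, score = de_feature_selection(n_features=8, fitness=toy_fitness)
    print("selected features:", [i for i, on in enumerate(mask) if on])
    print("fitness:", score)
```

Because the fitness is passed in as a plain function, replacing `toy_fitness` with a routine that trains a model on the masked columns and returns its validation score is the only change needed to apply the same loop to real data.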
Results: LR with DE achieved a balanced accuracy of 76.05% on validation (ranging from 68.6% to 79.8% per fold) and 76.33% on test data, with an AUC of 0.84. Using Kaplan-Meier estimates, we identified five distinct patient clusters (C-index = 0.8; log-rank test p value ≤ 0.0001). Additionally, LGBM predictions for ALSFRS progression at 3 months yielded an RMSE of 3.14 and an adjusted R² of 0.764.
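For readers less familiar with the reported metrics, the sketch below shows their standard textbook definitions in plain Python. The function names and the small input lists are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
import math
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recall, which is robust to class imbalance."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2: float, n_samples: int, n_predictors: int) -> float:
    """Penalize R^2 for the number of predictors used by the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


if __name__ == "__main__":
    # Illustrative labels only: class 1 recall = 3/4, class 0 recall = 1/2.
    print(balanced_accuracy([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1]))
    print(rmse([1.0, 2.0, 3.0], [3.0, 4.0, 5.0]))
    print(adjusted_r_squared(0.5, n_samples=11, n_predictors=5))
```

Balanced accuracy averages recall across classes, so a model cannot score well simply by predicting the majority class. Adjusted R² shrinks as predictors are added without a matching gain in fit, which makes it a natural companion to feature selection.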
Conclusion: This study showcases the potential of ML models to provide significant predictive insights in ALS, enhancing the understanding of disease dynamics and supporting patient care.