Predicting clinical events characterizing the progression of amyotrophic lateral sclerosis via machine learning approaches using routine visits data: a feasibility study.
Alessandro Guazzo, Michele Atzeni, Elena Idi, Isotta Trescato, Erica Tavazzi, Enrico Longato, Umberto Manera, Adriano Chió, Marta Gromicho, Inês Alves, Mamede de Carvalho, Martina Vettoretti, Barbara Di Camillo
DOI: 10.1186/s12911-024-02719-5 (https://doi.org/10.1186/s12911-024-02719-5)
Published: 2024-10-29
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11523576/pdf/
Abstract
Background: Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that results in death within a short time span (3-5 years). Two major challenges in treating ALS are its highly heterogeneous progression and the lack of effective prognostic tools to forecast it. This study therefore aimed to test the feasibility of predicting, over a two-year horizon, relevant clinical outcomes that characterize the progression of ALS, using artificial intelligence techniques applied to routine visit data.
Methods: Three classification problems were considered: predicting death (binary problem), predicting death or percutaneous endoscopic gastrostomy (PEG) (multiclass problem), and predicting death or non-invasive ventilation (NIV) (multiclass problem). Two supervised learning models, a logistic regression (LR) and a deep learning multilayer perceptron (MLP), were trained with attention to technical robustness and reproducibility. To support model explainability and the interpretation of results, the LR model coefficients and the Shapley values of both LR and MLP were used to characterize the relationship between each variable and the outcome.
Results: Predicting death was successful: both models yielded F1 scores and accuracy well above 0.7. The explainability analysis performed for this outcome showed how the two methodological approaches use the input variables when making a prediction. Predicting death alongside PEG or NIV, by contrast, proved much more challenging (F1 scores and accuracy in the 0.4-0.6 interval).
Conclusions: Predicting death due to ALS proved to be feasible. However, predicting PEG or NIV in a multiclass fashion proved to be unfeasible with these data, regardless of the complexity of the methodological approach. The observed results suggest a potential ceiling on the amount of information extractable from the database, e.g., due to the intrinsic difficulty of the prediction tasks at hand, or to the absence of crucial predictors that are not currently collected during routine practice.
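The Results are reported as F1 scores and accuracy for one binary and two multiclass problems. As a minimal sketch of how these two metrics are computed for a multiclass task, the Python below uses only the standard library. The class names (`none`, `PEG`, `death`), the example labels, and the choice of macro averaging for F1 are assumptions made for illustration; the abstract does not state the class encoding or the averaging scheme the authors used.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_per_class(y_true, y_pred):
    """F1 score of each class, treating that class as the positive one."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (an assumed averaging choice)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels, not data from the study.
y_true = ["none", "none", "PEG", "PEG", "death", "death"]
y_pred = ["none", "PEG", "PEG", "death", "death", "death"]

print("accuracy:", round(accuracy(y_true, y_pred), 3))
print("per-class F1:", f1_per_class(y_true, y_pred))
print("macro F1:", round(macro_f1(y_true, y_pred), 3))
```

Macro averaging weights every class equally, so a model that handles the majority class well but rarely identifies PEG or NIV correctly is penalized more than it would be by accuracy alone.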