Detecting label noise in longitudinal Alzheimer's data with explainable artificial intelligence.

Impact factor: 4.5 (JCR Q1, Computer Science)

Brain Informatics. Pub Date: 2025-06-10. DOI: 10.1186/s40708-025-00261-2

Paolo Sorino, Angela Lombardi, Domenico Lofù, Tommaso Colafiglio, Antonio Ferrara, Fedelucio Narducci, Eugenio Di Sciascio, Tommaso Di Noia

Brain Informatics 12(1):15. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12151964/pdf/

Citations: 0

Abstract

Reliable classification of cognitive states in longitudinal Alzheimer's Disease (AD) studies is critical for early diagnosis and intervention. However, inconsistencies in diagnostic labeling, arising from subjective assessments, evolving clinical criteria, and measurement variability, introduce noise that can impact machine learning (ML) model performance. This study explores the potential of explainable artificial intelligence to detect and characterize noisy labels in longitudinal datasets. A predictive model is trained using a Leave-One-Subject-Out validation strategy, ensuring robustness across subjects while enabling individual-level interpretability. By leveraging SHapley Additive exPlanations values, we analyze the temporal variations in feature importance across multiple patient visits, aiming to identify transitions that may reflect either genuine cognitive changes or inconsistencies in labeling. Using statistical thresholds derived from cognitively stable individuals, we propose an approach to flag potential misclassifications while preserving clinical labels. Rather than modifying diagnoses, this framework provides a structured way to highlight cases where diagnostic reassessment may be warranted. By integrating explainability into the assessment of cognitive state transitions, this approach enhances the reliability of longitudinal analyses and supports a more robust use of ML in AD research.
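The flagging step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance measure, the threshold statistic, or the flagging rule, so the Euclidean distance, the `mean + k * std` cutoff, the rule "label changed but the explanation stayed within the stable range", and all names and numbers below are assumptions chosen for illustration. The per-visit SHAP vectors are taken as given (in the paper they come from a model evaluated with Leave-One-Subject-Out validation); here they are small hand-written lists.

```python
import math
import statistics


def visit_shifts(shap_by_visit):
    """Euclidean distance between the SHAP vectors of consecutive visits."""
    return [math.dist(a, b) for a, b in zip(shap_by_visit, shap_by_visit[1:])]


def stable_threshold(stable_subjects, k=2.0):
    """Cutoff from cognitively stable subjects: mean + k * std of their shifts.

    Both the statistic and k=2.0 are hypothetical choices, not taken from the paper.
    """
    pooled = [s for visits in stable_subjects.values() for s in visit_shifts(visits)]
    return statistics.mean(pooled) + k * statistics.pstdev(pooled)


def flag_transitions(labels, shap_by_visit, threshold):
    """Indices i where the label changes between visit i and i+1 while the
    SHAP shift stays within the range seen in stable subjects.

    Labels are only flagged for review; they are never modified.
    """
    shifts = visit_shifts(shap_by_visit)
    return [
        i
        for i, shift in enumerate(shifts)
        if labels[i] != labels[i + 1] and shift <= threshold
    ]


# Toy SHAP vectors (two features per visit) for subjects whose diagnosis never changed.
stable = {
    "s1": [[0.10, -0.20], [0.12, -0.18], [0.11, -0.21]],
    "s2": [[0.30, 0.05], [0.28, 0.07]],
}
threshold = stable_threshold(stable)

# One subject with a recorded CN -> MCI transition between the first two visits.
patient_labels = ["CN", "MCI", "MCI"]
patient_shap = [[0.10, -0.20], [0.11, -0.19], [0.60, 0.40]]

flagged = flag_transitions(patient_labels, patient_shap, threshold)
print(flagged)  # [0]: the label changed, but the explanation barely moved
```

In this toy case the first transition is flagged because the diagnosis changed while the explanation shift was no larger than what stable subjects show. The opposite pattern (a large shift with an unchanged label, as between the second and third visits here) could be surfaced by a symmetric rule, which this sketch leaves out.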




Source journal

Brain Informatics (Computer Science: Computer Science Applications)

CiteScore: 9.50

Self-citation rate: 0.00%

Review time: 13 weeks

About the journal: Brain Informatics is an international, peer-reviewed, interdisciplinary open-access journal published under the SpringerOpen brand. It provides a platform for researchers and practitioners to disseminate original research on computational and informatics technologies related to the brain. The journal addresses the computational, cognitive, physiological, biological, physical, ecological and social perspectives of brain informatics. It also welcomes emerging information technologies and advanced neuro-imaging technologies, such as big data analytics and interactive knowledge discovery related to various large-scale brain studies and their applications. The journal publishes high-quality original research papers, brief reports and critical reviews in all theoretical, technological, clinical and interdisciplinary studies that make up the field of brain informatics and its applications in brain-machine intelligence, brain-inspired intelligent systems, mental health and brain disorders, etc. The scope of papers includes the following five tracks:

Track 1: Cognitive and Computational Foundations of Brain Science

Track 2: Human Information Processing Systems

Track 3: Brain Big Data Analytics, Curation and Management

Track 4: Informatics Paradigms for Brain and Mental Health Research

Track 5: Brain-Machine Intelligence and Brain-Inspired Computing