Detecting label noise in longitudinal Alzheimer's data with explainable artificial intelligence
Paolo Sorino, Angela Lombardi, Domenico Lofù, Tommaso Colafiglio, Antonio Ferrara, Fedelucio Narducci, Eugenio Di Sciascio, Tommaso Di Noia
Brain Informatics, vol. 12, no. 1, article 15. Published 2025-06-10. Journal Article. Impact factor: 4.5 (JCR Q1, Computer Science).
DOI: https://doi.org/10.1186/s40708-025-00261-2
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12151964/pdf/
Citations: 0
Abstract
Reliable classification of cognitive states in longitudinal Alzheimer's Disease (AD) studies is critical for early diagnosis and intervention. However, inconsistencies in diagnostic labeling, arising from subjective assessments, evolving clinical criteria, and measurement variability, introduce noise that can impact machine learning (ML) model performance. This study explores the potential of explainable artificial intelligence to detect and characterize noisy labels in longitudinal datasets. A predictive model is trained using a Leave-One-Subject-Out validation strategy, ensuring robustness across subjects while enabling individual-level interpretability. By leveraging SHapley Additive exPlanations (SHAP) values, we analyze the temporal variations in feature importance across multiple patient visits, aiming to identify transitions that may reflect either genuine cognitive changes or inconsistencies in labeling. Using statistical thresholds derived from cognitively stable individuals, we propose an approach to flag potential misclassifications while preserving clinical labels. Rather than modifying diagnoses, this framework provides a structured way to highlight cases where diagnostic reassessment may be warranted. By integrating explainability into the assessment of cognitive state transitions, this approach enhances the reliability of longitudinal analyses and supports a more robust use of ML in AD research.
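The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact statistic, distance, or flagging rule, so everything below is an assumption made for illustration: the function names are hypothetical, the per-visit attribution vectors stand in for SHAP values computed elsewhere, the shift between visits is measured as an L1 distance, the threshold is the mean plus k standard deviations of shifts observed in cognitively stable subjects, and a transition is flagged when the clinical label changes while the attribution shift stays within the stable range. Labels are never modified; the output is only a list of visits to review.

```python
from statistics import mean, stdev


def loso_splits(subject_ids):
    """Leave-One-Subject-Out: yield (held_out, train_idx, test_idx) per subject.

    All visits of the held-out subject go to the test fold together.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def visit_shifts(attributions):
    """L1 distance between attribution vectors of consecutive visits (assumed metric)."""
    return [
        sum(abs(a - b) for a, b in zip(prev, curr))
        for prev, curr in zip(attributions, attributions[1:])
    ]


def stable_threshold(stable_subjects, k=2.0):
    """Assumed threshold: mean + k * sample std of shifts in stable subjects."""
    shifts = [d for visits in stable_subjects.values() for d in visit_shifts(visits)]
    return mean(shifts) + k * stdev(shifts)


def flag_transitions(attributions, labels, threshold):
    """Return visit indices whose label changed although the attribution
    shift from the previous visit stayed within the stable range.

    Assumed rule for illustration only; the clinical labels are left untouched.
    """
    flags = []
    for t, shift in enumerate(visit_shifts(attributions)):
        if labels[t] != labels[t + 1] and shift <= threshold:
            flags.append(t + 1)
    return flags


# Hypothetical toy data: two stable subjects and one patient with three visits.
stable = {
    "s1": [[0.10, 0.20], [0.10, 0.25], [0.12, 0.20]],
    "s2": [[0.30, 0.10], [0.28, 0.10], [0.30, 0.12]],
}
threshold = stable_threshold(stable)
patient_attr = [[0.10, 0.20], [0.11, 0.21], [0.50, 0.60]]
patient_labels = ["CN", "MCI", "MCI"]
print(flag_transitions(patient_attr, patient_labels, threshold))
```

In this toy case the label changes at the second visit while the attributions barely move, so that visit is flagged for reassessment. A label change accompanied by a large attribution shift would not be flagged under this assumed rule, since it is more consistent with a genuine cognitive change.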
About the journal:
Brain Informatics is an international, peer-reviewed, interdisciplinary open-access journal published under the SpringerOpen brand. It provides a platform for researchers and practitioners to disseminate original research on computational and informatics technologies related to the brain. The journal addresses the computational, cognitive, physiological, biological, physical, ecological and social perspectives of brain informatics. It also welcomes emerging information technologies and advanced neuro-imaging technologies, such as big data analytics and interactive knowledge discovery related to various large-scale brain studies and their applications. The journal publishes high-quality original research papers, brief reports and critical reviews in all theoretical, technological, clinical and interdisciplinary studies that make up the field of brain informatics and its applications in brain-machine intelligence, brain-inspired intelligent systems, mental health and brain disorders, etc. The scope of papers includes the following five tracks:
Track 1: Cognitive and Computational Foundations of Brain Science
Track 2: Human Information Processing Systems
Track 3: Brain Big Data Analytics, Curation and Management
Track 4: Informatics Paradigms for Brain and Mental Health Research
Track 5: Brain-Machine Intelligence and Brain-Inspired Computing