A data-centric and interpretable EEG framework for depression severity grading using SHAP-based insights.

Impact factor 5.2 · Zone 2 (Medicine) · JCR Q1 (Engineering, Biomedical)
Anruo Shen, Jingnan Sun, Xiaogang Chen, Xiaorong Gao
Journal of NeuroEngineering and Rehabilitation, 22(1):116. Published 2025-05-25. Journal Article.
DOI: https://doi.org/10.1186/s12984-025-01645-5
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12103758/pdf/
Citation count: 0

Abstract

Background: Major Depressive Disorder (MDD) is a leading cause of disability worldwide. Accurate assessment of depression severity is critical for diagnosis, treatment planning, and monitoring, yet current clinical tools are largely subjective, relying on self-report and clinician judgment via traditional assessment scales. EEG has emerged as a promising, non-invasive modality for capturing neural correlates of depression. However, most EEG-based machine learning diagnostic studies focus on boosting classification accuracy with complex algorithms applied to small, homogeneous datasets. These black-box approaches often yield results that are difficult to interpret and generalize poorly, making clinical translation impractical. Therefore, there remains a critical need for models that are not only accurate but also transparent, robust, and grounded in the physiological properties of the data itself.

Methods: We propose a data-centric, interpretable framework for EEG-based depression severity grading. A hybrid feature selection method combines p-value and SHapley Additive exPlanations (SHAP) criteria to select features that are both independently significant and jointly informative. The system was trained and evaluated on a large-scale, multi-site resting-state EEG dataset, using random forests for both the classification and regression tasks. SHAP, an explainable artificial intelligence technique, was also applied post hoc to infer the key electrophysiological features and brain regions associated with MDD mechanisms, further increasing interpretability.
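
To make the hybrid selection idea concrete, the following is a minimal sketch and not the authors' implementation. It computes exact Shapley values for a toy three-feature model, then keeps only features that rank in the top k by absolute Shapley value and also pass a p-value threshold. The feature names, the toy model, the p-values, the threshold `alpha`, and the intersection rule in `hybrid_select` are all assumptions made for illustration. In practice the attributions would come from a trained random forest and be averaged over many recordings.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_value(subset):
    """Hypothetical model output when only features in `subset` are present.

    Present features take value 1, absent features take the baseline 0.
    The model has two main effects and one interaction; feature 2 is inert.
    """
    x = [1.0 if j in subset else 0.0 for j in range(3)]
    return 2.0 * x[0] + 1.0 * x[1] + 0.0 * x[2] + x[0] * x[1]


def hybrid_select(names, p_values, importances, alpha=0.05, top_k=2):
    """Assumed rule: keep top_k features by |importance| that also have p < alpha."""
    ranked = sorted(names, key=lambda n: abs(importances[n]), reverse=True)[:top_k]
    return [n for n in ranked if p_values[n] < alpha]


names = ["feat_a", "feat_b", "feat_c"]                        # hypothetical names
p_values = {"feat_a": 0.001, "feat_b": 0.20, "feat_c": 0.01}  # made-up p-values

phi = shapley_values(toy_value, n_features=3)
importances = dict(zip(names, phi))
selected = hybrid_select(names, p_values, importances)
print(importances, selected)
```

In this toy case `feat_c` is individually significant but contributes nothing to the model, while `feat_b` contributes to the model but is not individually significant, so only `feat_a` satisfies both criteria.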

Results: The proposed system achieved a ten-fold cross-validated classification accuracy of 74.5% (95% CI [70.97%, 78.80%], p < 0.001) and a correlation coefficient of 0.56 (95% CI [0.407, 0.683], p < 0.001) for severity estimation. SHAP analysis identified consistent, clinically meaningful EEG features, particularly in the left parieto-occipital region. The most important disease-related brain areas were the left occipital and parietal lobes, and the key features included relative beta power in the left parietal lobe, time-domain features at the parietal midline, the 1/f intercept, left occipital relative beta power, and global alpha energy.
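
The abstract does not state how the confidence interval for the correlation coefficient was obtained. One common approach, shown here purely as an illustrative sketch, is the Fisher z-transform. The scores below are made-up numbers, and `pearson_r` and `fisher_ci` are hypothetical helper names, not functions from the paper.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for r via the Fisher z-transform (needs n > 3)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    z = atanh(r)
    se = 1.0 / sqrt(n - 3)
    return tanh(z - z_crit * se), tanh(z + z_crit * se)


# Made-up clinical scores and model estimates, for illustration only.
true_scores = [1, 2, 3, 4, 5]
predicted_scores = [2, 1, 4, 3, 5]

r = pearson_r(true_scores, predicted_scores)
lo, hi = fisher_ci(r, n=len(true_scores))
print(f"r = {r:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

With only five points the interval is very wide. Larger samples shrink it roughly in proportion to 1/sqrt(n - 3).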

Conclusion: This study proposes a data-centric, interpretable depression grading system built on large-scale, multi-center EEG data, using simple models and hybrid feature selection to emphasize explainability, generalizability, and data fidelity. By shifting the focus from algorithmic complexity to data transparency and feature-level insight, the model offers a practical and trustworthy path toward real-world mental health assessment.

Source journal
Journal of NeuroEngineering and Rehabilitation (Engineering & Technology: Biomedical Engineering)
CiteScore: 9.60
Self-citation rate: 3.90%
Articles published: 122
Review time: 24 months
Journal description: Journal of NeuroEngineering and Rehabilitation considers manuscripts on all aspects of research that result from cross-fertilization of the fields of neuroscience, biomedical engineering, and physical medicine & rehabilitation.