RADIANCE: Reliable and interpretable depression detection from speech using transformer

IF 7.0 | Zone 2, Medicine | JCR Q1, Biology

Computers in Biology and Medicine | Pub Date: 2024-11-02 | DOI: 10.1016/j.compbiomed.2024.109325

Anup Kumar Gupta, Ashutosh Dhamaniya, Puneet Gupta

Computers in Biology and Medicine, vol. 183, Article 109325 (Journal Article). Full text: https://www.sciencedirect.com/science/article/pii/S0010482524014100

Citations: 0


RADIANCE: Reliable and interpretable depression detection from speech using transformer

Depression is a common but severe mental disorder that adversely impacts the ability of an individual to function normally in their day-to-day life. A majority of depressed individuals remain undiagnosed due to factors such as social stigma and a shortage of healthcare professionals. Consequently, several Machine Learning and Deep Learning (DL) models based on speech have been proposed for automatic depression detection, with the latter generally outperforming the former. However, DL models are blackbox and offer no transparency. In contrast, healthcare professionals prefer models that provide interpretability besides being accurate. In this direction, we propose a method RADIANCE (Reliable AnD InterpretAble depressioN deteCtion transformErs). RADIANCE incorporates a novel FilterBank VIsion Transformer (FBViT) network, which provides the symptoms of depression as interpretable features. Additionally, we employ a novel loss function that handles the class imbalance issue in the datasets. It also incorporates a penalty term that addresses the hierarchy of misclassification errors. We also propose a reliability predictor based on low-level descriptors that provides a reliability score to indicate the trustworthiness of the prediction by FBViT. Furthermore, in contrast to the conventional averaging and majority pooling, RADIANCE consolidates predictions from multiple clips of the input audio by intricately weighing each prediction based on its reliability score, ensuring a more accurate overall prediction. RADIANCE outperforms the state-of-the-art depression detection methods, achieving an accuracy of 89.36%, 80.36%, and 94.44% over the DAIC-WOZ, E-DAIC, and CMDC datasets, respectively. Further, RADIANCE achieves MAE scores of 3.27 and 5.04 on the DAIC-WOZ and E-DAIC datasets, respectively.
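The reliability-based consolidation of clip-level predictions described above can be illustrated with a minimal sketch. The abstract does not state the weighting formula, so the normalized weighted average below is an assumption, not the authors' method; the function names, the 0.5 decision threshold, and the sample scores are likewise hypothetical and chosen only for illustration.

```python
def mean_pool(scores):
    """Conventional averaging: every clip counts equally."""
    return sum(scores) / len(scores)


def majority_pool(scores, threshold=0.5):
    """Conventional majority pooling: each clip casts one binary vote."""
    votes = [1 if s >= threshold else 0 for s in scores]
    # Strict majority is required to return the positive class.
    return 1 if 2 * sum(votes) > len(votes) else 0


def reliability_weighted_pool(scores, reliabilities):
    """Assumed scheme: weight each clip score by its reliability score.

    scores        -- per-clip depression probabilities in [0, 1]
    reliabilities -- per-clip non-negative reliability scores
    """
    if len(scores) != len(reliabilities):
        raise ValueError("scores and reliabilities must have equal length")
    total = sum(reliabilities)
    if total <= 0:
        # No usable reliability information: fall back to plain averaging.
        return mean_pool(scores)
    return sum(s * r for s, r in zip(scores, reliabilities)) / total


# Hypothetical example: one confident, reliable clip and two unreliable ones.
clip_scores = [0.9, 0.2, 0.3]
clip_reliabilities = [0.9, 0.1, 0.1]

print("mean pooling:      ", mean_pool(clip_scores))
print("majority pooling:  ", majority_pool(clip_scores))
print("reliability pooled:", reliability_weighted_pool(clip_scores, clip_reliabilities))
```

In this made-up case, averaging and majority voting both fall below the decision threshold, while the reliability-weighted score is dominated by the single trustworthy clip. Whether that behaviour is desirable depends on how well the reliability predictor is calibrated.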


Source journal

Computers in Biology and Medicine (Engineering & Technology - Engineering: Biomedical)

CiteScore: 11.70

Self-citation rate: 10.40%

Articles published: 1086

Review time: 74 days

About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.