DPD (DePression Detection) Net: a deep neural network for multimodal depression detection.

IF 3.4, Region 3 (Medicine), JCR Q1, Medical Informatics

Health Information Science and Systems. Pub Date: 2024-11-12; eCollection Date: 2024-12-01; DOI: 10.1007/s13755-024-00311-9

Manlu He, Erwin M Bakker, Michael S Lew

Volume 12, issue 1, page 53. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11557813/pdf/

Citations: 0

Abstract

Depression is one of the most prevalent mental conditions; it can impair people's productivity and lead to severe consequences. Diagnosis is complex because it often relies on a physician's subjective, interview-based screening. The aim of our work is to propose deep learning models for automatic depression detection that use different data modalities and could assist in the diagnosis of depression. Existing work on automatic depression detection is mostly tested on a single dataset, so the resulting models might lack robustness, flexibility and scalability. To alleviate this problem, we design a novel Graph Neural Network-enhanced Transformer model named DePressionDetect Net (DPD Net) that leverages textual, audio and visual features and can work under two application settings: the clinical setting and the social media setting. The model consists of a unimodal encoder module that encodes each modality separately, a multimodal encoder module that integrates the multimodal information, and a detection module that produces the final prediction. We also propose a model named DePressionDetect-with-EEG Net (DPD-E Net), which combines electroencephalography (EEG) signals and speech data for depression detection. Experiments across four benchmark datasets show that DPD Net and DPD-E Net can outperform state-of-the-art models on three datasets (the E-DAIC dataset, the Twitter depression dataset and the MODMA dataset) and achieve competitive performance on the fourth (the D-vlog dataset). Ablation studies demonstrate the advantages of the proposed modules and the effectiveness of combining diverse modalities for automatic depression detection.
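The three-stage layout described in the abstract (unimodal encoders, a multimodal encoder, a detection module) can be illustrated with a toy sketch. This is not the DPD Net architecture: the abstract gives no layer details, so the sketch swaps the Graph Neural Network-enhanced Transformer for a plain dense layer per modality and a softmax-weighted average for fusion. The class name `ToyMultimodalDetector`, the feature dimensions, and the random untrained weights are all assumptions made for illustration only.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: `weights` holds one row of input weights per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_linear(in_dim, out_dim, rng):
    """Small random weights and zero biases (untrained, illustrative only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMultimodalDetector:
    """Hypothetical stand-in for the encode -> fuse -> detect pipeline."""

    def __init__(self, input_dims, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        # Unimodal encoder module: one dense projection per modality.
        self.encoders = {
            name: init_linear(dim, hidden_dim, rng) for name, dim in input_dims.items()
        }
        # Multimodal encoder module: a query vector that scores each modality.
        self.query = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
        # Detection module: a single logistic output unit.
        self.head = init_linear(hidden_dim, 1, rng)

    def encode(self, name, features):
        weights, bias = self.encoders[name]
        return [math.tanh(h) for h in linear(features, weights, bias)]

    def fuse(self, encoded):
        names = sorted(encoded)
        scores = [sum(q * h for q, h in zip(self.query, encoded[n])) for n in names]
        attention = softmax(scores)
        # Attention-weighted average of the per-modality embeddings.
        return [
            sum(a * encoded[n][i] for a, n in zip(attention, names))
            for i in range(len(self.query))
        ]

    def predict(self, sample):
        encoded = {name: self.encode(name, feats) for name, feats in sample.items()}
        weights, bias = self.head
        logit = linear(self.fuse(encoded), weights, bias)[0]
        return 1.0 / (1.0 + math.exp(-logit))  # score in (0, 1)


# Usage with made-up feature vectors for the three modalities named in the abstract.
model = ToyMultimodalDetector({"text": 4, "audio": 3, "visual": 2})
sample = {
    "text": [0.1, 0.2, 0.3, 0.4],
    "audio": [0.5, 0.1, 0.0],
    "visual": [0.9, 0.3],
}
score = model.predict(sample)
```

Because the weights are random and untrained, `score` carries no diagnostic meaning; the sketch only shows how per-modality encodings can be combined into one representation before a single prediction is made.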




Source journal

Health Information Science and Systems (Medical Informatics)

CiteScore

11.30

Self-citation rate

5.00%

Publication count

Journal description: Health Information Science and Systems is a multidisciplinary journal that integrates artificial intelligence, computer science and information technology with health science and services. It embraces information science research on the modeling, design, development, integration and management of health information systems, smart health, artificial intelligence in medicine, computer-aided diagnosis and medical expert systems. The scope includes: (i) smart health, artificial intelligence in medicine, computer-aided diagnosis, medical image processing and medical expert systems; (ii) medical big data and medical, health and biomedical information resources such as patient medical records, devices and equipment, and software and tools to capture, store, retrieve, process, analyze and optimize the use of information in the health domain; (iii) data management, data mining and knowledge discovery, all of which play a key role in decision making, management of public health, examination of standards, and privacy and security issues; (iv) development of new architectures and applications for health information systems.