Multi-level spatiotemporal graph attention fusion for multimodal depression detection

IF 4.9 · Region 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL
Yujie Yang, Wenbin Zheng
{"title":"多模态抑郁检测的多层次时空图注意融合","authors":"Yujie Yang,&nbsp;Wenbin Zheng","doi":"10.1016/j.bspc.2025.108123","DOIUrl":null,"url":null,"abstract":"<div><div>Depression is a severe mental illness that affects hundreds of millions of people worldwide. In recent years, depression detection methods that integrate multimodal information have achieved significant results. However, limited by the small sample size of depression datasets, previous studies primarily focus on the impact of heterogeneous information in multimodal fusion, while deep interactions within each modality are often overlooked. Moreover, previous multimodal fusion methods often employed concatenation operations, which only allow modal features to be statically combined in the vector space and do not explicitly model the cross-modal semantic relationships. To address these issues, we propose a novel method named Multi-level Spatiotemporal Graph Attention Fusion (MSGAF), which enhances information interaction and sharing through multi-step fusion both within and between modalities. Specifically, within each modality containing multiple features, we designed a Multi-feature Temporal Fusion (MTF) module. The MTF module can fuse various features during the same time period to discover interactions among these features. For multimodal fusion, we adopt a multi-level fusion strategy to integrate these modalities, with the fusion process is represented as a Bidirectional Fusion Graph (BiFG). The graph attention mechanism is utilized to aggregate node information across the spatial neighborhood of the BiFG, which allows the graph structure to dynamically and adaptively capture the asymmetric relationships between modalities. Extensive experiments and analyses demonstrate the effectiveness of MSGAF, which achieves state-of-the-art performance on both the DAIC-WOZ and E-DAIC datasets. The code is available at: <span><span>https://github.com/wenbin-zheng/MSGAF</span><svg><path></path></svg></span></div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"110 ","pages":"Article 108123"},"PeriodicalIF":4.9000,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-level spatiotemporal graph attention fusion for multimodal depression detection\",\"authors\":\"Yujie Yang,&nbsp;Wenbin Zheng\",\"doi\":\"10.1016/j.bspc.2025.108123\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Depression is a severe mental illness that affects hundreds of millions of people worldwide. In recent years, depression detection methods that integrate multimodal information have achieved significant results. However, limited by the small sample size of depression datasets, previous studies primarily focus on the impact of heterogeneous information in multimodal fusion, while deep interactions within each modality are often overlooked. Moreover, previous multimodal fusion methods often employed concatenation operations, which only allow modal features to be statically combined in the vector space and do not explicitly model the cross-modal semantic relationships. To address these issues, we propose a novel method named Multi-level Spatiotemporal Graph Attention Fusion (MSGAF), which enhances information interaction and sharing through multi-step fusion both within and between modalities. 
Specifically, within each modality containing multiple features, we designed a Multi-feature Temporal Fusion (MTF) module. The MTF module can fuse various features during the same time period to discover interactions among these features. For multimodal fusion, we adopt a multi-level fusion strategy to integrate these modalities, with the fusion process is represented as a Bidirectional Fusion Graph (BiFG). The graph attention mechanism is utilized to aggregate node information across the spatial neighborhood of the BiFG, which allows the graph structure to dynamically and adaptively capture the asymmetric relationships between modalities. Extensive experiments and analyses demonstrate the effectiveness of MSGAF, which achieves state-of-the-art performance on both the DAIC-WOZ and E-DAIC datasets. The code is available at: <span><span>https://github.com/wenbin-zheng/MSGAF</span><svg><path></path></svg></span></div></div>\",\"PeriodicalId\":55362,\"journal\":{\"name\":\"Biomedical Signal Processing and Control\",\"volume\":\"110 \",\"pages\":\"Article 108123\"},\"PeriodicalIF\":4.9000,\"publicationDate\":\"2025-06-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Biomedical Signal Processing and Control\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1746809425006342\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, BIOMEDICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biomedical Signal Processing and Control","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1746809425006342","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
Citations: 0

Abstract

Depression is a severe mental illness that affects hundreds of millions of people worldwide. In recent years, depression detection methods that integrate multimodal information have achieved significant results. However, limited by the small sample size of depression datasets, previous studies have primarily focused on the impact of heterogeneous information in multimodal fusion, while deep interactions within each modality are often overlooked. Moreover, previous multimodal fusion methods often employed concatenation operations, which only allow modal features to be statically combined in the vector space and do not explicitly model cross-modal semantic relationships. To address these issues, we propose a novel method named Multi-level Spatiotemporal Graph Attention Fusion (MSGAF), which enhances information interaction and sharing through multi-step fusion both within and between modalities. Specifically, within each modality containing multiple features, we designed a Multi-feature Temporal Fusion (MTF) module. The MTF module fuses the various features of a modality within the same time period to discover interactions among them. For multimodal fusion, we adopt a multi-level strategy to integrate the modalities, with the fusion process represented as a Bidirectional Fusion Graph (BiFG). A graph attention mechanism aggregates node information across the spatial neighborhood of the BiFG, which allows the graph structure to dynamically and adaptively capture the asymmetric relationships between modalities. Extensive experiments and analyses demonstrate the effectiveness of MSGAF, which achieves state-of-the-art performance on both the DAIC-WOZ and E-DAIC datasets. The code is available at: https://github.com/wenbin-zheng/MSGAF
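The abstract only outlines the MTF module at a high level, so the following is a minimal, hypothetical sketch of the idea it describes: several time-aligned feature streams from one modality attend to each other within each time step so that co-occurring features can interact. The module name, tensor shapes, and attention-based design are illustrative assumptions, not the paper's implementation; see the linked repository for the authors' actual code.

```python
# Hypothetical sketch of per-time-step multi-feature fusion (PyTorch assumed).
import torch
import torch.nn as nn

class MultiFeatureTemporalFusion(nn.Module):
    """Fuse K time-aligned feature streams of one modality, per time step."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # Self-attention across the K streams at a single time step.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, T, K, dim) -- K feature streams per time step.
        B, T, K, D = feats.shape
        x = feats.reshape(B * T, K, D)           # each time step is a tiny "sequence"
        out, _ = self.attn(x, x, x)              # streams attend to one another
        out = self.norm(x + out)                 # residual connection + layer norm
        return out.mean(dim=1).reshape(B, T, D)  # pool streams into one vector per step

# Toy usage: 2 clips, 50 time steps, 3 feature streams of width 64.
mtf = MultiFeatureTemporalFusion(dim=64)
fused = mtf(torch.randn(2, 50, 3, 64))
print(fused.shape)  # torch.Size([2, 50, 64])
```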
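Likewise, the BiFG itself is only described in prose. Under the same caveat, here is a minimal sketch of GAT-style aggregation over a small graph whose nodes are per-modality embeddings. Because the two directions of an edge receive independent attention logits, the layer can express the asymmetric cross-modal relationships the abstract mentions; all names, shapes, and the adjacency layout are assumptions.

```python
# Hypothetical sketch of graph attention fusion over modality nodes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionFusion(nn.Module):
    """Single-head GAT-style layer over a directed graph of modality nodes."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn_src = nn.Linear(dim, 1, bias=False)  # score for the sending node
        self.attn_dst = nn.Linear(dim, 1, bias=False)  # score for the receiving node

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (N, dim)  one embedding per modality node
        # adj: (N, N)    adj[i, j] = 1 if there is a directed edge j -> i
        h = self.proj(x)
        # Additive attention logit e[i, j] for edge j -> i; the two directions
        # of an edge get independent logits, so learned weights can be
        # asymmetric between modalities.
        e = F.leaky_relu(self.attn_dst(h) + self.attn_src(h).T, 0.2)  # (N, N)
        e = e.masked_fill(adj == 0, float("-inf"))   # restrict to graph edges
        alpha = torch.softmax(e, dim=-1)             # normalize over in-neighbors
        alpha = torch.nan_to_num(alpha)              # rows with no edges -> zeros
        return x + alpha @ h                         # residual aggregation

# Toy usage: three modality nodes (e.g. audio, video, text) linked both ways.
nodes = torch.randn(3, 128)
adj = torch.ones(3, 3) - torch.eye(3)  # every pair connected in both directions
fused = GraphAttentionFusion(128)(nodes, adj)
print(fused.shape)  # torch.Size([3, 128])
```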
Source journal

Biomedical Signal Processing and Control (Engineering & Technology · Engineering, Biomedical)

CiteScore: 9.80
Self-citation rate: 13.70%
Publications: 822
Review time: 4 months

Journal introduction: Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management. Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of both engineering and clinical science. The scope of the journal includes relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.