A dual-branch self-supervised contrastive learning framework for emotion recognition based on time-frequency fusion

Impact Factor 6.6 · CAS Tier 1, Computer Science · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Jie Ouyang, Yangfan Liang, Hong Sun, Xianchao Zhang, Jingxue Chen, Gao Liu, Zhiquan Liu, Yining Liu
{"title":"A dual-branch self-supervised contrastive learning framework for emotion recognition based on time-frequency fusion","authors":"Jie Ouyang ,&nbsp;Yangfan Liang ,&nbsp;Hong Sun ,&nbsp;Xianchao Zhang ,&nbsp;Jingxue Chen ,&nbsp;Gao Liu ,&nbsp;Zhiquan Liu ,&nbsp;Yining Liu","doi":"10.1016/j.asoc.2025.113958","DOIUrl":null,"url":null,"abstract":"<div><div>Emotion recognition based on electroencephalography(EEG) signals is becoming a prominent research hotspot due to its wide-ranging applications in brain–computer interfaces (BCIs), mental health assessment, and human-computer interaction. Traditional emotion recognition methods often rely on supervised learning, which requires large amounts of labeled data to effectively train deep models. However, EEG signals exhibit inherent complexity and substantial variability across individuals and sessions, making it challenging to obtain consistent and reliable labels. In this paper, we propose a novel pretraining framework for EEG-based emotion recognition that enables mutual learning between time-domain and time-frequency-domain representations, while requiring simple network architectures. Experimental results demonstrate that our method achieves 84.39 % accuracy on the SEED dataset, and 89.01 % valence accuracy and 79.75 % arousal accuracy on the DEAP dataset using only 10 % labeled data, indicating strong performance under limited label conditions. Furthermore, we evaluate the transfer learning capability of our framework by pretraining it on the SEED dataset and then fine-tuning it on SEED-V. This cross-dataset transfer leads to a 1.9 % absolute improvement in classification accuracy on SEED-V, demonstrating the effectiveness of the learned representations in generalizing across datasets.</div></div>","PeriodicalId":50737,"journal":{"name":"Applied Soft Computing","volume":"185 ","pages":"Article 113958"},"PeriodicalIF":6.6000,"publicationDate":"2025-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Soft Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1568494625012712","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Emotion recognition based on electroencephalography (EEG) signals has become a prominent research focus due to its wide-ranging applications in brain–computer interfaces (BCIs), mental health assessment, and human–computer interaction. Traditional emotion recognition methods often rely on supervised learning, which requires large amounts of labeled data to train deep models effectively. However, EEG signals exhibit inherent complexity and substantial variability across individuals and sessions, making it challenging to obtain consistent and reliable labels. In this paper, we propose a novel pretraining framework for EEG-based emotion recognition that enables mutual learning between time-domain and time-frequency-domain representations while requiring only simple network architectures. Experimental results demonstrate that our method achieves 84.39% accuracy on the SEED dataset, and 89.01% valence accuracy and 79.75% arousal accuracy on the DEAP dataset, using only 10% labeled data, indicating strong performance under limited-label conditions. Furthermore, we evaluate the transfer learning capability of our framework by pretraining it on the SEED dataset and then fine-tuning it on SEED-V. This cross-dataset transfer yields a 1.9% absolute improvement in classification accuracy on SEED-V, demonstrating that the learned representations generalize across datasets.
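The abstract does not specify the branch architectures or the exact contrastive objective, so the sketch below is only illustrative of the general idea: it pairs a hypothetical 1-D CNN time-domain encoder with a spectrogram-based time-frequency encoder (built with torch.stft) and aligns the two views of each unlabeled EEG window using a symmetric InfoNCE (NT-Xent-style) loss, a common way to realize this kind of cross-view mutual learning. All layer sizes, the FFT length, and the temperature are assumptions for the sketch, not values from the paper.

```python
# Minimal sketch of dual-branch time/time-frequency contrastive pretraining.
# NOT the authors' released code; architecture and loss are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TimeEncoder(nn.Module):
    """1-D CNN over the raw time-domain EEG window (hypothetical design)."""
    def __init__(self, in_channels: int, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.proj = nn.Linear(128, embed_dim)

    def forward(self, x):                     # x: (batch, channels, time)
        h = self.net(x).squeeze(-1)           # (batch, 128)
        return F.normalize(self.proj(h), dim=-1)


class TimeFreqEncoder(nn.Module):
    """2-D CNN over per-channel STFT magnitudes (hypothetical design)."""
    def __init__(self, in_channels: int, embed_dim: int = 128, n_fft: int = 64):
        super().__init__()
        self.n_fft = n_fft
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, embed_dim)

    def forward(self, x):                     # x: (batch, channels, time)
        B, C, T = x.shape
        spec = torch.stft(
            x.reshape(B * C, T), n_fft=self.n_fft,
            window=torch.hann_window(self.n_fft, device=x.device),
            return_complex=True,
        ).abs()                               # (B*C, freq_bins, frames)
        spec = spec.reshape(B, C, spec.shape[-2], spec.shape[-1])
        h = self.net(spec).flatten(1)         # (batch, 64)
        return F.normalize(self.proj(h), dim=-1)


def info_nce(z1, z2, temperature: float = 0.1):
    """Symmetric InfoNCE: the two views of the same window are positives;
    every other pairing in the batch serves as a negative."""
    logits = z1 @ z2.t() / temperature        # (batch, batch) similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))


# One unlabeled pretraining step on a fake batch of 62-channel EEG windows
# (62 channels matches SEED's montage; batch and window sizes are arbitrary).
x = torch.randn(8, 62, 256)
f_time, f_tf = TimeEncoder(62), TimeFreqEncoder(62)
loss = info_nce(f_time(x), f_tf(x))
loss.backward()
print(f"contrastive loss: {loss.item():.4f}")
```

After pretraining on unlabeled EEG, either encoder (or both) would be fine-tuned with a small labeled subset, which corresponds to the 10%-label evaluation and the SEED to SEED-V transfer reported above.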
Source Journal

Applied Soft Computing
Category: Engineering & Technology / Computer Science, Interdisciplinary Applications
CiteScore: 15.80
Self-citation rate: 6.90%
Annual articles: 874
Review time: 10.9 months
Journal description: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is on publishing the highest-quality research on the application and convergence of fuzzy logic, neural networks, evolutionary computing, rough sets, and similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them, so the website is continuously updated with new articles and publication times are short.