{"title":"Multi-scale spatiotemporal representation learning for EEG-based emotion recognition","authors":"Xin Zhou, Xiaojing Peng","doi":"arxiv-2409.07589","DOIUrl":null,"url":null,"abstract":"EEG-based emotion recognition holds significant potential in the field of\nbrain-computer interfaces. A key challenge lies in extracting discriminative\nspatiotemporal features from electroencephalogram (EEG) signals. Existing\nstudies often rely on domain-specific time-frequency features and analyze\ntemporal dependencies and spatial characteristics separately, neglecting the\ninteraction between local-global relationships and spatiotemporal dynamics. To\naddress this, we propose a novel network called Multi-Scale Inverted Mamba\n(MS-iMamba), which consists of Multi-Scale Temporal Blocks (MSTB) and\nTemporal-Spatial Fusion Blocks (TSFB). Specifically, MSTBs are designed to\ncapture both local details and global temporal dependencies across different\nscale subsequences. The TSFBs, implemented with an inverted Mamba structure,\nfocus on the interaction between dynamic temporal dependencies and spatial\ncharacteristics. The primary advantage of MS-iMamba lies in its ability to\nleverage reconstructed multi-scale EEG sequences, exploiting the interaction\nbetween temporal and spatial features without the need for domain-specific\ntime-frequency feature extraction. Experimental results on the DEAP, DREAMER,\nand SEED datasets demonstrate that MS-iMamba achieves classification accuracies\nof 94.86%, 94.94%, and 91.36%, respectively, using only four-channel EEG\nsignals, outperforming state-of-the-art methods.","PeriodicalId":501034,"journal":{"name":"arXiv - EE - Signal Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - EE - Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07589","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
EEG-based emotion recognition holds significant potential for brain-computer interfaces. A key challenge lies in extracting discriminative spatiotemporal features from electroencephalogram (EEG) signals. Existing studies often rely on domain-specific time-frequency features and analyze temporal dependencies and spatial characteristics separately, neglecting the interaction between local-global relationships and spatiotemporal dynamics. To address this, we propose a novel network called Multi-Scale Inverted Mamba (MS-iMamba), which consists of Multi-Scale Temporal Blocks (MSTB) and Temporal-Spatial Fusion Blocks (TSFB). The MSTBs capture both local details and global temporal dependencies across subsequences at different scales. The TSFBs, implemented with an inverted Mamba structure, model the interaction between dynamic temporal dependencies and spatial characteristics. The primary advantage of MS-iMamba is its ability to exploit the interaction between temporal and spatial features of reconstructed multi-scale EEG sequences without domain-specific time-frequency feature extraction. On the DEAP, DREAMER, and SEED datasets, MS-iMamba achieves classification accuracies of 94.86%, 94.94%, and 91.36%, respectively, using only four-channel EEG signals, outperforming state-of-the-art methods.
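The abstract does not include an implementation, but the two components it names can be sketched. Below is a minimal PyTorch sketch of the described pipeline under stated assumptions: multi-scale subsequences are formed by average-pooling the raw sequence at illustrative scales (1, 2, 4); the "inverted" layout is read in the iTransformer sense of treating each EEG channel as a token and mixing across channels; and, since the abstract gives no Mamba hyperparameters, a bidirectional GRU stands in for the Mamba state-space block. All module names, dimensions, and scales here are hypothetical, not the authors' configuration.

```python
# Hedged sketch of the MS-iMamba pipeline as described in the abstract.
# Assumptions are flagged in comments; this is not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleTemporalBlock(nn.Module):
    """MSTB sketch: each scale downsamples the raw sequence, extracts local
    detail with a depthwise conv, and summarizes global temporal context by
    pooling to a fixed length; scale features are averaged into one token
    per channel. Scales and pooled length are illustrative."""

    def __init__(self, n_channels, d_model, scales=(1, 2, 4), pooled_len=64):
        super().__init__()
        self.scales = scales
        self.local = nn.ModuleList([
            nn.Conv1d(n_channels, n_channels, kernel_size=5, padding=2,
                      groups=n_channels)          # per-channel local filtering
            for _ in scales
        ])
        self.pool = nn.AdaptiveAvgPool1d(pooled_len)   # global summary
        self.embed = nn.Linear(pooled_len, d_model)    # channel -> token

    def forward(self, x):                          # x: (B, C, T) raw EEG
        feats = []
        for s, conv in zip(self.scales, self.local):
            xs = F.avg_pool1d(x, kernel_size=s) if s > 1 else x
            h = conv(xs) + xs                      # local details + residual
            feats.append(self.embed(self.pool(h))) # (B, C, d_model)
        return torch.stack(feats).mean(dim=0)      # fuse scales: (B, C, d_model)


class TemporalSpatialFusionBlock(nn.Module):
    """TSFB sketch: the 'inverted' layout treats each EEG channel as a token
    and runs a sequence model across channels, so spatial mixing operates on
    temporal embeddings. The paper uses a Mamba block here; a bidirectional
    GRU is a stand-in since the abstract gives no state-space settings."""

    def __init__(self, d_model):
        super().__init__()
        self.mixer = nn.GRU(d_model, d_model // 2, batch_first=True,
                            bidirectional=True)    # d_model must be even
        self.norm = nn.LayerNorm(d_model)

    def forward(self, tokens):                     # tokens: (B, C, d_model)
        mixed, _ = self.mixer(tokens)
        return self.norm(tokens + mixed)           # residual fusion


class MSiMambaSketch(nn.Module):
    def __init__(self, n_channels=4, d_model=64, n_classes=2):
        super().__init__()
        self.mstb = MultiScaleTemporalBlock(n_channels, d_model)
        self.tsfb = TemporalSpatialFusionBlock(d_model)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                          # x: (B, C, T)
        tokens = self.tsfb(self.mstb(x))
        return self.head(tokens.mean(dim=1))       # pool channel tokens


# Usage: 4-channel EEG, 512 samples, binary valence labels (all illustrative).
model = MSiMambaSketch(n_channels=4, d_model=64, n_classes=2)
logits = model(torch.randn(8, 4, 512))             # -> (8, 2)
```

Note the design point the sketch illustrates: because the channel tokens already carry multi-scale temporal embeddings when spatial mixing happens, temporal and spatial features interact directly, without a separate hand-crafted time-frequency front end.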