Exploring EEG and eye movement fusion for multi-class target RSVP-BCI
Xujin Li, Wei Wei, Kun Zhao, Jiayu Mao, Yizhuo Lu, Shuang Qiu, Huiguang He
Information Fusion, volume 121, Article 103135. Published 2025-03-26.
DOI: 10.1016/j.inffus.2025.103135
URL: https://www.sciencedirect.com/science/article/pii/S1566253525002088
Impact factor: 14.7 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interfaces (BCIs) enable high-throughput target image detection by identifying event-related potentials (ERPs) in electroencephalography (EEG) signals. Traditional RSVP-BCI systems detect only single-class targets within image streams, limiting their ability to handle more complex tasks requiring multi-class target identification. Multi-class target RSVP-BCI systems are designed to detect multi-class targets in real-world scenarios. However, distinguishing between different target categories remains challenging due to the high similarity across ERPs evoked by different target categories. In this work, we incorporate the eye movement (EM) modality into traditional EEG-based RSVP decoding and develop an open-source multi-modal dataset comprising EM and EEG signals from 43 subjects in three multi-class target RSVP tasks. We further propose the Multi-class Target RSVP EEG and EM fusion Network (MTREE-Net) to enhance multi-class RSVP decoding. Specifically, a dual-complementary module is designed to strengthen the differentiation of uni-modal features across categories. To achieve more effective multi-modal fusion, we adopt a dynamic reweighting fusion strategy guided by theoretically derived modality contribution ratios for optimization. Furthermore, we propose a hierarchical self-distillation module to reduce the misclassification of non-target samples through knowledge transfer between two hierarchical classifiers. Extensive experiments demonstrate that MTREE-Net achieves significant performance improvements, including over 5.4% and 3.32% increases in balanced accuracy compared to existing EEG decoding and EEG-EM fusion methods, respectively. Our research offers a promising framework that can simultaneously detect target existence and identify their specific categories, enabling more robust and efficient applications in scenarios such as multi-class target detection.
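The abstract reports its gains in balanced accuracy without defining the metric. The sketch below assumes the usual definition, the mean of per-class recall, and uses made-up labels (0 for non-target, 1 and 2 for two hypothetical target classes) to show how it differs from plain accuracy when one class dominates. It is an illustration only and is not taken from the paper's evaluation code.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels (assumption): 0 = non-target, 1 and 2 = target classes.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 2, 2, 2]

# Plain accuracy counts every sample equally, so the majority class dominates.
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"plain accuracy:    {plain_accuracy:.3f}")
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.3f}")
```

With these toy labels, a single error on a rare target class lowers balanced accuracy more than plain accuracy, which is why the metric is a common choice when non-target samples far outnumber targets.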
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.