Exploring EEG and eye movement fusion for multi-class target RSVP-BCI

Impact Factor 14.7 · CAS Tier 1 (Computer Science) · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Xujin Li, Wei Wei, Kun Zhao, Jiayu Mao, Yizhuo Lu, Shuang Qiu, Huiguang He
{"title":"Exploring EEG and eye movement fusion for multi-class target RSVP-BCI","authors":"Xujin Li ,&nbsp;Wei Wei ,&nbsp;Kun Zhao ,&nbsp;Jiayu Mao ,&nbsp;Yizhuo Lu ,&nbsp;Shuang Qiu ,&nbsp;Huiguang He","doi":"10.1016/j.inffus.2025.103135","DOIUrl":null,"url":null,"abstract":"<div><div>Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interfaces (BCIs) enable high-throughput target image detection by identifying event-related potentials (ERPs) in electroencephalography (EEG) signals. Traditional RSVP-BCI systems detect only single-class targets within image streams, limiting their ability to handle more complex tasks requiring multi-class target identification. Multi-class target RSVP-BCI systems are designed to detect multi-class targets in real-world scenarios. However, distinguishing between different target categories remains challenging due to the high similarity across ERPs evoked by different target categories. In this work, we incorporate the eye movement (EM) modality into traditional EEG-based RSVP decoding and develop an open-source multi-modal dataset comprising EM and EEG signals from 43 subjects in three multi-class target RSVP tasks. We further propose the <strong>M</strong>ulti-class <strong>T</strong>arget <strong>R</strong>SVP <strong>E</strong>EG and <strong>E</strong>M fusion <strong>Net</strong>work (MTREE-Net) to enhance multi-class RSVP decoding. Specifically, a dual-complementary module is designed to strengthen the differentiation of uni-modal features across categories. To achieve more effective multi-modal fusion, we adopt a dynamic reweighting fusion strategy guided by theoretically derived modality contribution ratios for optimization. Furthermore, we propose a hierarchical self-distillation module to reduce the misclassification of non-target samples through knowledge transfer between two hierarchical classifiers. Extensive experiments demonstrate that MTREE-Net achieves significant performance improvements, including over 5.4% and 3.32% increases in balanced accuracy compared to existing EEG decoding and EEG-EM fusion methods, respectively. Our research offers a promising framework that can simultaneously detect target existence and identify their specific categories, enabling more robust and efficient applications in scenarios such as multi-class target detection.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"121 ","pages":"Article 103135"},"PeriodicalIF":14.7000,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253525002088","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interfaces (BCIs) enable high-throughput target image detection by identifying event-related potentials (ERPs) in electroencephalography (EEG) signals. Traditional RSVP-BCI systems detect only single-class targets within image streams, limiting their ability to handle more complex tasks requiring multi-class target identification. Multi-class target RSVP-BCI systems are designed to detect multi-class targets in real-world scenarios. However, distinguishing between different target categories remains challenging due to the high similarity across ERPs evoked by different target categories. In this work, we incorporate the eye movement (EM) modality into traditional EEG-based RSVP decoding and develop an open-source multi-modal dataset comprising EM and EEG signals from 43 subjects in three multi-class target RSVP tasks. We further propose the Multi-class Target RSVP EEG and EM fusion Network (MTREE-Net) to enhance multi-class RSVP decoding. Specifically, a dual-complementary module is designed to strengthen the differentiation of uni-modal features across categories. To achieve more effective multi-modal fusion, we adopt a dynamic reweighting fusion strategy guided by theoretically derived modality contribution ratios for optimization. Furthermore, we propose a hierarchical self-distillation module to reduce the misclassification of non-target samples through knowledge transfer between two hierarchical classifiers. Extensive experiments demonstrate that MTREE-Net achieves significant performance improvements, including over 5.4% and 3.32% increases in balanced accuracy compared to existing EEG decoding and EEG-EM fusion methods, respectively. Our research offers a promising framework that can simultaneously detect target existence and identify their specific categories, enabling more robust and efficient applications in scenarios such as multi-class target detection.
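
As a rough illustration of the dynamic reweighting fusion idea described in the abstract, the following Python/PyTorch sketch fuses EEG and eye-movement (EM) feature vectors using per-sample modality weights derived from the confidence of auxiliary uni-modal classifiers. This is not the authors' MTREE-Net implementation: the encoder layers, feature dimensions, and the confidence-based weighting rule are illustrative assumptions, since the abstract only states that fusion is guided by theoretically derived modality contribution ratios.

# Hypothetical sketch of confidence-guided dynamic reweighting fusion
# for EEG and eye-movement (EM) features. Names, dimensions, and the
# weighting rule are assumptions, not the published MTREE-Net design.
import torch
import torch.nn as nn


class DynamicReweightingFusion(nn.Module):
    def __init__(self, eeg_dim=128, em_dim=64, hidden_dim=128, n_classes=3):
        super().__init__()
        # Placeholder uni-modal encoders standing in for the real EEG/EM backbones.
        self.eeg_encoder = nn.Sequential(nn.Linear(eeg_dim, hidden_dim), nn.ReLU())
        self.em_encoder = nn.Sequential(nn.Linear(em_dim, hidden_dim), nn.ReLU())
        # Auxiliary per-modality heads used to estimate each modality's contribution.
        self.eeg_head = nn.Linear(hidden_dim, n_classes)
        self.em_head = nn.Linear(hidden_dim, n_classes)
        # Final classifier on the reweighted, concatenated features.
        self.fusion_head = nn.Linear(2 * hidden_dim, n_classes)

    def forward(self, eeg, em):
        f_eeg = self.eeg_encoder(eeg)   # (B, hidden_dim)
        f_em = self.em_encoder(em)      # (B, hidden_dim)
        logits_eeg = self.eeg_head(f_eeg)
        logits_em = self.em_head(f_em)
        # Use each uni-modal head's max softmax probability as a crude proxy
        # for that modality's contribution, then normalize across modalities.
        conf_eeg = torch.softmax(logits_eeg, dim=1).max(dim=1, keepdim=True).values
        conf_em = torch.softmax(logits_em, dim=1).max(dim=1, keepdim=True).values
        w = torch.softmax(torch.cat([conf_eeg, conf_em], dim=1), dim=1)  # (B, 2)
        fused = torch.cat([w[:, :1] * f_eeg, w[:, 1:] * f_em], dim=1)
        return self.fusion_head(fused), logits_eeg, logits_em


if __name__ == "__main__":
    model = DynamicReweightingFusion()
    eeg = torch.randn(8, 128)   # batch of flattened EEG features
    em = torch.randn(8, 64)     # batch of eye-movement features
    fused_logits, _, _ = model(eeg, em)
    print(fused_logits.shape)   # torch.Size([8, 3])

The auxiliary uni-modal heads also make it straightforward to attach the kind of hierarchical supervision the paper describes (for example, a target/non-target classifier distilling into the multi-class head), though that part is omitted here.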
Source Journal
Information Fusion (Engineering & Technology - Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles per year: 161
Review time: 7.9 months
About the journal: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems are welcome.