基于脑眼机的模糊目标检测自适应模态平衡在线知识提炼。

IF 8.9 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE transactions on neural networks and learning systems Pub Date : 2025-09-15 DOI:10.1109/TNNLS.2025.3605710

Zixing Li, Chao Yan, Zhen Lan, Xiaojia Xiang, Han Zhou, Jun Lai, Dengqing Tang

{"title":"基于脑眼机的模糊目标检测自适应模态平衡在线知识提炼。","authors":"Zixing Li, Chao Yan, Zhen Lan, Xiaojia Xiang, Han Zhou, Jun Lai, Dengqing Tang","doi":"10.1109/TNNLS.2025.3605710","DOIUrl":null,"url":null,"abstract":"Advanced cognition can be measured from the human brain using brain-computer interfaces (BCIs). Integrating these interfaces with computer vision techniques, which possess efficient feature extraction capabilities, can achieve more robust and accurate detection of dim targets in aerial images. However, existing target detection methods primarily concentrate on homogeneous data, lacking efficient and versatile processing capabilities for heterogeneous multimodal data. In this article, we first build a brain-eye-computer-based object detection system for aerial images under few-shot conditions. This system detects suspicious targets using region proposal networks (RPNs), evokes the event-related potential (ERP) signal in electroencephalogram (EEG) through the eye-tracking-based slow serial visual presentation (ESSVP) paradigm, and constructs the EEG-image data pairs with eye movement data. Then, an adaptive modality balanced online knowledge distillation (AMBOKD) method is proposed to recognize dim objects with the EEG-image data. AMBOKD fuses EEG and image features using a multihead attention module, establishing a new modality with comprehensive features. To enhance the performance and robust capability of the fusion modality, simultaneous training and mutual learning between modalities are enabled by end-to-end online KD (OKD). During the learning process, an adaptive modality balancing module is proposed to ensure multimodal equilibrium by dynamically adjusting the weights of the importance and the training gradients across various modalities. The effectiveness and superiority of our method are demonstrated by comparing it with existing state-of-the-art methods. Additionally, experiments conducted on public datasets and real-world scenarios demonstrate the reliability and practicality of the proposed system and the designed method. The dataset and the source code can be found at: https://github.com/lizixing23/AMBOKD.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":8.9000,"publicationDate":"2025-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer-Based Dim Object Detection.\",\"authors\":\"Zixing Li, Chao Yan, Zhen Lan, Xiaojia Xiang, Han Zhou, Jun Lai, Dengqing Tang\",\"doi\":\"10.1109/TNNLS.2025.3605710\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Advanced cognition can be measured from the human brain using brain-computer interfaces (BCIs). Integrating these interfaces with computer vision techniques, which possess efficient feature extraction capabilities, can achieve more robust and accurate detection of dim targets in aerial images. However, existing target detection methods primarily concentrate on homogeneous data, lacking efficient and versatile processing capabilities for heterogeneous multimodal data. In this article, we first build a brain-eye-computer-based object detection system for aerial images under few-shot conditions. This system detects suspicious targets using region proposal networks (RPNs), evokes the event-related potential (ERP) signal in electroencephalogram (EEG) through the eye-tracking-based slow serial visual presentation (ESSVP) paradigm, and constructs the EEG-image data pairs with eye movement data. Then, an adaptive modality balanced online knowledge distillation (AMBOKD) method is proposed to recognize dim objects with the EEG-image data. AMBOKD fuses EEG and image features using a multihead attention module, establishing a new modality with comprehensive features. To enhance the performance and robust capability of the fusion modality, simultaneous training and mutual learning between modalities are enabled by end-to-end online KD (OKD). During the learning process, an adaptive modality balancing module is proposed to ensure multimodal equilibrium by dynamically adjusting the weights of the importance and the training gradients across various modalities. The effectiveness and superiority of our method are demonstrated by comparing it with existing state-of-the-art methods. Additionally, experiments conducted on public datasets and real-world scenarios demonstrate the reliability and practicality of the proposed system and the designed method. The dataset and the source code can be found at: https://github.com/lizixing23/AMBOKD.\",\"PeriodicalId\":13303,\"journal\":{\"name\":\"IEEE transactions on neural networks and learning systems\",\"volume\":\"PP \",\"pages\":\"\"},\"PeriodicalIF\":8.9000,\"publicationDate\":\"2025-09-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on neural networks and learning systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/TNNLS.2025.3605710\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/TNNLS.2025.3605710","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

高级认知可以通过脑机接口（bci）从人脑中测量出来。将这些接口与具有高效特征提取能力的计算机视觉技术相结合，可以实现对航空图像中弱小目标更加鲁棒和准确的检测。然而，现有的目标检测方法主要集中在同构数据上，缺乏对异构多模态数据的高效通用处理能力。在本文中，我们首先构建了一个基于脑眼机的少镜头条件下航拍图像目标检测系统。该系统利用区域建议网络（RPNs）检测可疑目标，通过基于眼动追踪的慢速串行视觉呈现（ESSVP）范式在脑电图（EEG）中唤起事件相关电位（ERP）信号，并利用眼动数据构建脑电图像数据对。在此基础上，提出了一种基于自适应模态平衡在线知识蒸馏（AMBOKD）的脑电图像模糊目标识别方法。AMBOKD利用多头注意模块融合脑电和图像特征，建立了一种特征综合的新模态。为了提高融合模式的性能和鲁棒性，通过端到端在线KD （OKD）实现模式之间的同步训练和相互学习。在学习过程中，提出了一种自适应模态平衡模块，通过动态调整各模态的重要度和训练梯度的权重来保证多模态平衡。通过与现有方法的比较，证明了该方法的有效性和优越性。此外，在公共数据集和真实场景上进行的实验证明了所提出的系统和设计方法的可靠性和实用性。数据集和源代码可以在https://github.com/lizixing23/AMBOKD上找到。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer-Based Dim Object Detection.

Advanced cognition can be measured from the human brain using brain-computer interfaces (BCIs). Integrating these interfaces with computer vision techniques, which possess efficient feature extraction capabilities, can achieve more robust and accurate detection of dim targets in aerial images. However, existing target detection methods primarily concentrate on homogeneous data, lacking efficient and versatile processing capabilities for heterogeneous multimodal data. In this article, we first build a brain-eye-computer-based object detection system for aerial images under few-shot conditions. This system detects suspicious targets using region proposal networks (RPNs), evokes the event-related potential (ERP) signal in electroencephalogram (EEG) through the eye-tracking-based slow serial visual presentation (ESSVP) paradigm, and constructs the EEG-image data pairs with eye movement data. Then, an adaptive modality balanced online knowledge distillation (AMBOKD) method is proposed to recognize dim objects with the EEG-image data. AMBOKD fuses EEG and image features using a multihead attention module, establishing a new modality with comprehensive features. To enhance the performance and robust capability of the fusion modality, simultaneous training and mutual learning between modalities are enabled by end-to-end online KD (OKD). During the learning process, an adaptive modality balancing module is proposed to ensure multimodal equilibrium by dynamically adjusting the weights of the importance and the training gradients across various modalities. The effectiveness and superiority of our method are demonstrated by comparing it with existing state-of-the-art methods. Additionally, experiments conducted on public datasets and real-world scenarios demonstrate the reliability and practicality of the proposed system and the designed method. The dataset and the source code can be found at: https://github.com/lizixing23/AMBOKD.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE transactions on neural networks and learning systems COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

CiteScore

23.80

自引率

9.60%

发文量

2102

审稿时长

3-8 weeks

期刊介绍： The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.