Gai Lu, Yi-Feng Zhang, Xingxing Chu, Yingxin Liu, Yang Yu
Title: Combining multi-scale convolutional neural network and Transformers for EEG-Based RSVP detection
Venue: 2022 37th Youth Academic Annual Conference of Chinese Association of Automation (YAC)
Publication date: 2022-11-19
DOI: 10.1109/YAC57282.2022.10023688 (https://doi.org/10.1109/YAC57282.2022.10023688)
Citations: 0
Abstract
Rapid serial visual presentation (RSVP) is an effective brain-computer interface (BCI) technique for recognizing target objects. The key to RSVP-based BCI is decoding the subject's intention from single-trial electroencephalogram (EEG) signals. Unavoidable noise and trial-to-trial variability in EEG signals lead to low detection accuracy and poor model generalization, so an EEG decoding algorithm with robust generalization ability and high recognition accuracy is needed. In this study, we propose a novel end-to-end architecture that combines a multi-scale spatiotemporal convolutional neural network (CNN) with Transformers. Specifically, the multi-scale CNN captures spatiotemporal features at different scales, while the Transformers extract the most discriminative global information. Experimental results on RSVP benchmark datasets show that the proposed method achieves higher recognition accuracy than three other advanced methods in both cross-subject and within-subject experiments. Fine-tuning experiments with pre-trained models on a new subject show that better single-subject results can be obtained using only a small amount of data. These results validate the effectiveness of our method and suggest a new approach to building feature extraction methods with better generalization capability for RSVP-based BCI.
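The two-stage idea described in the abstract (multi-scale convolution for local spatiotemporal features, then attention for global context) can be illustrated with a minimal, dependency-free sketch. This is not the authors' model: the abstract gives no layer sizes or hyperparameters, so the kernel sizes, channel count, token count, fixed averaging kernels (standing in for learned filters), and identity attention projections below are all assumptions chosen only to keep the example small and runnable.

```python
import math
import random


def temporal_conv(signal, kernel):
    """Valid 1-D cross-correlation of one channel with one kernel."""
    k = len(kernel)
    return [sum(signal[t + i] * kernel[i] for i in range(k))
            for t in range(len(signal) - k + 1)]


def mean_pool(series, n_tokens):
    """Average-pool a series into n_tokens contiguous segments."""
    n = len(series)
    pooled = []
    for i in range(n_tokens):
        start, end = i * n // n_tokens, (i + 1) * n // n_tokens
        pooled.append(sum(series[start:end]) / (end - start))
    return pooled


def multi_scale_tokens(trial, kernel_sizes, n_tokens):
    """Turn one trial (list of channels) into n_tokens feature vectors.

    Each kernel size contributes one feature per token. Averaging kernels
    and a plain channel mean are hypothetical stand-ins for the learned
    temporal and spatial filters a real CNN would use.
    """
    n_channels = len(trial)
    per_scale = []
    for k in kernel_sizes:
        kernel = [1.0 / k] * k
        filtered = [temporal_conv(channel, kernel) for channel in trial]
        # Spatial step: combine channels at each time point.
        spatial = [sum(col) / n_channels for col in zip(*filtered)]
        per_scale.append(mean_pool(spatial, n_tokens))
    # Transpose so each token holds one feature per scale.
    return [list(token) for token in zip(*per_scale)]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d)
                  for key in tokens]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * tokens[j][f] for j in range(len(tokens)))
                        for f in range(d)])
    return outputs, weights


# Synthetic trial: 4 channels x 64 samples of Gaussian noise (assumed sizes).
random.seed(0)
trial = [[random.gauss(0.0, 1.0) for _ in range(64)] for _ in range(4)]
tokens = multi_scale_tokens(trial, kernel_sizes=(4, 8, 16), n_tokens=8)
attended, weights = self_attention(tokens)
print(len(tokens), len(tokens[0]), len(attended))
```

In a trained model the kernels, spatial weights, and attention projections would be learned, and a classification head would map the attended tokens to a target/non-target decision. The sketch only shows how features from several temporal scales can be stacked into tokens over which attention then mixes information globally.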