Combining multi-scale convolutional neural network and Transformers for EEG-Based RSVP detection

Gai Lu, Yi-Feng Zhang, Xingxing Chu, Yingxin Liu, Yang Yu

2022 37th Youth Academic Annual Conference of Chinese Association of Automation (YAC), 19 November 2022. DOI: 10.1109/YAC57282.2022.10023688
Rapid serial visual presentation (RSVP) is an effective brain-computer interface (BCI) technique for recognizing target objects. Decoding the subject's intention from the single-trial electroencephalogram (EEG) signal is the key to RSVP-based BCI. The unavoidable noise and trial-to-trial variability in EEG signals lead to low accuracy of EEG-based RSVP detection and limited model generalization, so an EEG decoding algorithm with robust generalization ability and high recognition accuracy is needed. In this study, we propose a novel end-to-end model architecture that combines a multi-scale spatiotemporal convolutional neural network (CNN) with a Transformer. Specifically, the multi-scale CNN captures spatiotemporal features at different scales, while the Transformer extracts the most discriminative global information. Experimental results on the RSVP benchmark datasets show that the proposed method achieves higher recognition accuracy than three state-of-the-art methods in both cross-subject and within-subject experiments. Fine-tuning experiments with pre-trained models on a new subject show that good single-subject performance can be obtained with only a small amount of data. These results validate the effectiveness of our method and provide a new idea for constructing feature extraction methods with better generalization capability for RSVP-based BCI.
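To make the described architecture concrete, the following is a minimal PyTorch sketch of a multi-scale spatiotemporal CNN combined with a Transformer encoder for single-trial RSVP classification. It is not the authors' published implementation; the channel counts, kernel scales, embedding width, and the MultiScaleCNNTransformer name are illustrative assumptions. Each branch applies a temporal convolution at a different scale followed by a spatial convolution across electrodes; the concatenated multi-scale features are fed as a token sequence to a Transformer encoder, and a pooled representation is classified as target versus non-target.

```python
# Minimal sketch (not the authors' code) of a multi-scale spatiotemporal CNN
# followed by a Transformer encoder for single-trial RSVP EEG classification.
# Electrode/sample counts, kernel scales, and embedding sizes are assumptions.
import torch
import torch.nn as nn


class MultiScaleCNNTransformer(nn.Module):
    def __init__(self, n_channels=64, n_samples=256, n_classes=2,
                 scales=(7, 15, 31), n_filters=16, d_model=48,
                 n_heads=4, n_layers=2):
        super().__init__()
        # One branch per temporal scale: temporal conv, then a spatial conv
        # that collapses the electrode dimension.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(1, n_filters, (1, k), padding=(0, k // 2), bias=False),
                nn.BatchNorm2d(n_filters),
                nn.Conv2d(n_filters, n_filters, (n_channels, 1), bias=False),
                nn.BatchNorm2d(n_filters),
                nn.ELU(),
                nn.AvgPool2d((1, 4)),
            )
            for k in scales
        ])
        # Project concatenated multi-scale features to the Transformer width.
        self.proj = nn.Linear(n_filters * len(scales), d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads,
            dim_feedforward=4 * d_model, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, x):
        # x: (batch, n_channels, n_samples) single-trial EEG epochs
        x = x.unsqueeze(1)                           # (batch, 1, channels, samples)
        feats = [branch(x).squeeze(2) for branch in self.branches]
        feats = torch.cat(feats, dim=1)              # (batch, filters*scales, time)
        tokens = self.proj(feats.transpose(1, 2))    # (batch, time, d_model)
        tokens = self.transformer(tokens)            # global context over time tokens
        return self.classifier(tokens.mean(dim=1))   # pooled features -> class logits


if __name__ == "__main__":
    model = MultiScaleCNNTransformer()
    dummy = torch.randn(8, 64, 256)   # 8 trials, 64 electrodes, 256 time samples
    print(model(dummy).shape)         # torch.Size([8, 2])
```

In a cross-subject setting, such a model could be pre-trained on pooled data from existing subjects and then fine-tuned on a small number of trials from a new subject, in the spirit of the fine-tuning experiments reported in the abstract.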