A multi-day and high-quality EEG dataset for motor imagery brain-computer interface
Banghua Yang, Fenqi Rong, Yunlong Xie, Du Li, Jiayang Zhang, Fu Li, Guangming Shi, Xiaorong Gao
Scientific Data, 12(1), 488. Published 2025-03-23. DOI: 10.1038/s41597-025-04826-y (https://doi.org/10.1038/s41597-025-04826-y)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11930978/pdf/
Abstract
A key challenge in developing a robust electroencephalography (EEG)-based brain-computer interface (BCI) is obtaining reliable classification performance across multiple days. In particular, EEG-based motor imagery (MI) BCI faces large variability and a low signal-to-noise ratio. To address these issues, collecting a large and reliable dataset is critical for learning cross-session and cross-subject patterns while mitigating the inherent instability of EEG signals. In this study, we obtained a comprehensive MI dataset from the 2019 World Robot Conference Contest-BCI Robot Contest. We collected EEG data from 62 healthy participants across three recording sessions. The experiment includes two paradigms: (1) a two-class task: left and right hand-grasping; and (2) a three-class task: left and right hand-grasping and foot-hooking. The dataset comprises raw and preprocessed data. For the two-class data, an average classification accuracy of 85.32% was achieved using EEGNet, while the three-class data achieved an accuracy of 76.90% using deepConvNet. Researchers can reuse the dataset according to their needs. We hope that this dataset will significantly advance MI-BCI research, particularly in addressing cross-session and cross-subject challenges.
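The abstract stresses cross-session reliability but does not spell out an evaluation protocol. As an illustration only, one common way to probe cross-session generalization is a leave-one-session-out split, sketched below. The three sessions come from the abstract; the function name, the number of trials per session, and the session labels are hypothetical and are not taken from the dataset itself.

```python
N_SESSIONS = 3  # from the abstract: three recording sessions


def leave_one_session_out(trial_sessions):
    """Yield (held_out_session, train_indices, test_indices) per session.

    trial_sessions: list giving the session id of every trial.
    """
    for held_out in sorted(set(trial_sessions)):
        train = [i for i, s in enumerate(trial_sessions) if s != held_out]
        test = [i for i, s in enumerate(trial_sessions) if s == held_out]
        yield held_out, train, test


# Hypothetical layout: 4 trials per session for a single participant.
trial_sessions = [s for s in range(1, N_SESSIONS + 1) for _ in range(4)]

folds = list(leave_one_session_out(trial_sessions))
for held_out, train, test in folds:
    print(f"test on session {held_out}: "
          f"{len(train)} train trials, {len(test)} test trials")
```

Each fold trains on two sessions and tests on the remaining one, so no trial from the test day leaks into training. A cross-subject variant would group trials by participant id instead of session id.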
About the journal:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than testing hypotheses or presenting new interpretations, methods, or in-depth analyses.