Merging Brain-Computer Interface P300 speller datasets: Perspectives and pitfalls

Frontiers in Neuroergonomics. Publication date: 2022-12-21. DOI: 10.3389/fnrgo.2022.1045653

L. Bianchi, R. Ferrante, Yaoping Hu, Guillermo Sahonero-Alvarez, N. Z. Zenia




Merging Brain-Computer Interface P300 speller datasets: Perspectives and pitfalls

Background: Over the last decades, the P300 Speller paradigm has been replicated in many experiments, and the collected data have been released into the public domain so that research groups, particularly those working in machine learning, can test and improve their algorithms and raise the performance of brain-computer interface (BCI) systems. Training data are needed to learn to identify brain activity, and the more training data are available, the better the algorithms perform. Larger datasets are therefore highly desirable and could be obtained by merging datasets from different repositories. The main obstacle to such merging is that public datasets are released in a variety of file formats, because no standard way of sharing these data has been established. In addition, every dataset requires reading documents or scientific papers to retrieve the relevant information, which prevents automated processing. In this study, we therefore adopted a single file format to demonstrate the importance of having a standard and to propose which information should be stored and why.

Methods: We describe the process of converting a dozen P300 Speller datasets into the same file format and report the main problems encountered along the way. All the datasets share the same 6 × 6 matrix of alphanumeric symbols (characters and numbers or symbols) and the same subset of acquired signals (8 EEG sensors at the same recording sites).

Results and discussion: Nearly a million stimuli were converted, corresponding to about 7,000 spelled characters from 127 subjects. The converted stimuli represent the most extensive platform available for training and testing new algorithms on one specific paradigm, the P300 Speller. The platform could allow transfer learning procedures to be explored, with the aim of reducing or eliminating the time needed to train a classifier and of improving the performance and accuracy of such BCI systems.
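To make the merging step more concrete, the following is a minimal Python sketch of what a shared record layout and a channel-harmonization helper might look like. It is not the authors' file format. The channel names in `COMMON_CHANNELS`, the `Stimulus` fields, the row/column coding convention (codes 1 to 6 for rows, 7 to 12 for columns), and the function names are all assumptions made for illustration; the abstract only states that the datasets share a 6 × 6 matrix and 8 EEG sensors at the same recording sites.

```python
from dataclasses import dataclass
from typing import List, Sequence

# ASSUMPTION: an illustrative 8-channel montage. The abstract does not
# name the recording sites, only that there are 8 shared ones.
COMMON_CHANNELS = ("Fz", "Cz", "Pz", "Oz", "P3", "P4", "PO7", "PO8")
N_ROWS = N_COLS = 6  # the 6 x 6 speller matrix described in the abstract


@dataclass
class Stimulus:
    """One row or column flash in a hypothetical common format."""
    dataset: str               # source repository label
    subject: str               # subject identifier within that dataset
    code: int                  # ASSUMPTION: 1-6 = rows, 7-12 = columns
    is_target: bool            # True if the flash contained the attended symbol
    epoch: List[List[float]]   # channels x samples, ordered as COMMON_CHANNELS


def select_common_channels(
    epoch: Sequence[Sequence[float]],
    channel_names: Sequence[str],
    wanted: Sequence[str] = COMMON_CHANNELS,
) -> List[List[float]]:
    """Reorder and subset an epoch so every dataset uses the same channels."""
    # Case-insensitive lookup, since datasets label channels inconsistently.
    index = {name.upper(): i for i, name in enumerate(channel_names)}
    missing = [name for name in wanted if name.upper() not in index]
    if missing:
        raise ValueError(f"missing channels: {missing}")
    return [list(epoch[index[name.upper()]]) for name in wanted]


def is_target_flash(code: int, target_row: int, target_col: int) -> bool:
    """Decide whether a flash code hits the attended cell (0-based row/col)."""
    if not 1 <= code <= N_ROWS + N_COLS:
        raise ValueError(f"flash code out of range: {code}")
    if code <= N_ROWS:
        return code - 1 == target_row
    return code - N_ROWS - 1 == target_col


def merge(*datasets: Sequence[Stimulus]) -> List[Stimulus]:
    """Concatenate already-harmonized datasets into one list of stimuli."""
    merged: List[Stimulus] = []
    for dataset in datasets:
        merged.extend(dataset)
    return merged
```

Storing the target label, flash code, subject, and source dataset inside each record reflects the abstract's point that information otherwise found only in accompanying papers has to live in the file itself before processing can be automated.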
