Continuous and discrete decoding of overt speech with scalp electroencephalography (EEG)

Alexander Craik, Heather R Dial, Jose L Contreras-Vidal

Journal of Neural Engineering, published 2024-10-30. DOI: 10.1088/1741-2552/ad8d0a

Abstract
Neurological disorders affecting speech production adversely impact quality of life for over 7 million individuals in the US. Traditional speech interfaces such as eye-tracking devices and P300 spellers are slow and unnatural for these patients. Speech brain-computer interfaces (BCIs) offer an alternative: they decode speech characteristics directly, providing a more natural communication mechanism. This research explores the feasibility of decoding speech features using non-invasive EEG.

Nine neurologically intact participants were fitted with a 63-channel EEG system, with additional sensors used to remove eye artifacts. Participants read aloud sentences displayed on a screen, selected for their phonetic similarity to the English language.
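The abstract does not state how the extra eye sensors were used to clean the EEG. As one common approach (not necessarily the authors' method), EOG-guided ICA with MNE-Python might look like the following minimal sketch; the file name, component count, and filter cutoff are illustrative assumptions.

```python
# Hypothetical sketch: EOG-guided ICA cleanup with MNE-Python.
# Assumes the recording includes dedicated EOG channels; settings
# below are illustrative, not the authors' pipeline.
import mne

raw = mne.io.read_raw_fif("sub-01_task-overtspeech_raw.fif", preload=True)  # hypothetical file
raw.filter(l_freq=1.0, h_freq=None)  # high-pass filtering improves ICA stability

ica = mne.preprocessing.ICA(n_components=20, random_state=97)
ica.fit(raw, picks="eeg")

# Flag components whose time courses correlate with the EOG channels,
# then reconstruct the EEG signal with those components removed.
eog_indices, eog_scores = ica.find_bads_eog(raw)
ica.exclude = eog_indices
ica.apply(raw)
```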
Deep learning models, including convolutional neural networks and recurrent neural networks with and without attention modules, were optimized with a focus on minimizing trainable parameters and using small input windows suited to real-time application. These models were employed for discrete and continuous speech decoding tasks, achieving statistically significant participant-independent decoding performance for both discrete classes and continuous characteristics of the produced audio signal.
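To make the parameter-budget constraint concrete, the following is a minimal PyTorch sketch of a compact temporal-then-spatial CNN operating on short 63-channel windows. The layer sizes, the 125-sample window, and the single linear head are illustrative assumptions, not the paper's architectures; the same trunk could feed a cross-entropy loss for the discrete task or a mean-squared-error loss for the continuous one.

```python
# Hypothetical sketch of a parameter-lean EEG decoder in PyTorch.
# The 63-channel x 125-sample input window and all layer sizes are
# illustrative assumptions.
import torch
import torch.nn as nn

class CompactEEGDecoder(nn.Module):
    def __init__(self, n_channels=63, n_samples=125, n_outputs=4):
        super().__init__()
        self.features = nn.Sequential(
            # Temporal convolution applied to each channel.
            nn.Conv2d(1, 8, kernel_size=(1, 25), padding=(0, 12)),
            nn.BatchNorm2d(8),
            nn.ELU(),
            # Spatial convolution mixing all electrodes.
            nn.Conv2d(8, 16, kernel_size=(n_channels, 1)),
            nn.BatchNorm2d(16),
            nn.ELU(),
            nn.AvgPool2d((1, 4)),
            nn.Dropout(0.5),
        )
        self.head = nn.Linear(16 * (n_samples // 4), n_outputs)

    def forward(self, x):                 # x: (batch, 1, channels, samples)
        z = self.features(x).flatten(1)
        return self.head(z)               # logits (discrete) or estimates (continuous)

# Discrete task: n_outputs = number of speech classes, with CrossEntropyLoss.
# Continuous task: n_outputs = audio-feature dimension, with MSELoss.
model = CompactEEGDecoder()
print(sum(p.numel() for p in model.parameters()))  # check the trainable-parameter budget
```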
A frequency sub-band analysis highlighted the significance of certain frequency bands (delta, theta, and gamma) for decoding performance, and a perturbation analysis was used to identify crucial channels.
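A minimal sketch of both analyses follows, assuming conventional band edges, a 250 Hz sampling rate, and a shuffle-based channel perturbation; `model` and `score` are placeholders for a trained decoder and its evaluation metric, and none of these specifics come from the paper.

```python
# Hypothetical sketch of the sub-band and perturbation analyses.
# Band edges, sampling rate, and the shuffle perturbation are assumptions.
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 250  # sampling rate in Hz (assumed)
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "gamma": (30, 100)}

def bandpass(eeg, low, high, fs=FS):
    """Zero-phase band-pass filter; eeg is (channels, samples)."""
    sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, eeg, axis=-1)

def channel_importance(eeg, model, score):
    """Perturbation analysis: shuffle one channel at a time and
    measure how much the decoding score drops."""
    baseline = score(model, eeg)
    drops = np.zeros(eeg.shape[0])
    rng = np.random.default_rng(0)
    for ch in range(eeg.shape[0]):
        perturbed = eeg.copy()
        rng.shuffle(perturbed[ch])  # destroy this channel's temporal structure
        drops[ch] = baseline - score(model, perturbed)
    return drops  # larger drop => more crucial channel
```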
The channel selection methods assessed did not significantly improve performance, suggesting a distributed representation of speech information encoded in the EEG signals. Leave-one-out training demonstrated the feasibility of exploiting speech neural correlates shared across participants, reducing the amount of data that must be collected from each individual.
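Read as leave-one-participant-out evaluation, the training scheme can be sketched with scikit-learn's LeaveOneGroupOut; the file names and the logistic-regression baseline standing in for the deep models are hypothetical.

```python
# Hypothetical sketch of leave-one-participant-out (LOSO) evaluation.
# A logistic-regression baseline stands in for the deep models above;
# file names and array shapes are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

X = np.load("windows.npy")        # (n_windows, n_channels * n_samples), flattened EEG windows
y = np.load("labels.npy")         # discrete speech class per window
groups = np.load("subjects.npy")  # participant ID per window (9 unique values)

scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))  # accuracy on the held-out participant

print(f"mean LOSO accuracy: {np.mean(scores):.3f}")
```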