Qikai Fan, Lurong Jiang, Amira El Gohary, Fang Dong, Duanpo Wu, Tiejia Jiang, Chen Wang, Junbiao Liu
{"title":"A multi-domain feature fusion epilepsy seizure detection method based on spike matching and PLV functional networks.","authors":"Qikai Fan, Lurong Jiang, Amira El Gohary, Fang Dong, Duanpo Wu, Tiejia Jiang, Chen Wang, Junbiao Liu","doi":"10.1088/1741-2552/adaef3","DOIUrl":"10.1088/1741-2552/adaef3","url":null,"abstract":"<p><p><i>Objective.</i>The identification of spikes, as a typical characteristic wave of epilepsy, is crucial for diagnosing and locating the epileptogenic region. The traditional seizure detection methods lack spike features and have low sample richness. This paper proposes a seizure detection method with spike-based phase locking value (PLV) functional brain networks and multi-domain fused features.<i>Approach.</i>In the spike detection part, brain functional networks based on PLV are constructed to explore the changes in brain functional states during spiking discharge, from the perspective of microscopic neuronal activity to macroscopic brain region interactions. Then, in the epilepsy seizure detection task, multi-domain fused feature sequences are constructed using time-domain, frequency-domain, inter-channel correlation, and the spike detection features. Finally, Bi-LSTM and Transformer encoders and their optimized models are used to verify the effectiveness of the proposed method.<i>Main results.</i>Experimental results achieve the best seizure detection metrics on Bi-LSTM-Attention, with accuracy, sensitivity, and specificity reaching 98.40%, 98.94%, and 97.86%, respectively.<i>Significance.</i>The method is significant as it innovatively applies multi-channel spike network features to seizure detection. 
It can potentially improve the diagnosis and location of the epileptogenic region by accurately detecting seizures through the identification of spikes, which is a crucial characteristic wave of epilepsy.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143054694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reliable quantification of neural entrainment to rhythmic auditory stimulation: simulation and experimental validation.","authors":"Yiwen Xu, Xiaodan Tan, Minmin Luo, Qiuyou Xie, Feng Yang, Chang'an A Zhan","doi":"10.1088/1741-2552/adaeec","DOIUrl":"10.1088/1741-2552/adaeec","url":null,"abstract":"<p><p><i>Objective.</i>Entrainment has been considered as a potential mechanism underlying the facilitatory effect of rhythmic neural stimulation on neurorehabilitation. The inconsistent effects of brain stimulation on neurorehabilitation found in the literature may be caused by the variability in neural entrainment. To dissect the underlying mechanisms and optimize brain stimulation for improved effectiveness, it is critical to reliably assess the occurrence and the strength of neural entrainment. However, the factors influencing entrainment assessment are not yet fully understood. This study aims to investigate whether and how the relevant factors (i.e. data length, frequency bandwidth, signal-to-noise ratio (SNR), center frequency, and the constant component of stimulus-response phase-difference) influence the assessment reliability of neural entrainment.<i>Approach.</i>We simulated data for 28 scenarios to answer the above questions. We also recorded experimental data to verify the findings from our simulation study.<i>Main results.</i>A minimal data length is required to achieve reliable neural entrainment assessment, and this requirement critically depends on the bandwidth and SNR, but is independent of the center frequency and the constant component of stimulus-response phase-difference. Furthermore, a change in bandwidth is accompanied by a change in SNR.<i>Significance.</i>The present study has revealed how data length, bandwidth, and SNR critically affect the assessment reliability of neural entrainment. The findings provide a foundation for the parameter setting in experiment design and data analysis in neural entrainment studies. 
While this study is within the context of rhythmic auditory stimulation, the conclusions may be applicable for neural entrainment to other rhythmic stimulations.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143054704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatio-temporal transformers for decoding neural movement control.","authors":"Benedetta Candelori, Giampiero Bardella, Indro Spinelli, Surabhi Ramawat, Pierpaolo Pani, Stefano Ferraina, Simone Scardapane","doi":"10.1088/1741-2552/adaef0","DOIUrl":"10.1088/1741-2552/adaef0","url":null,"abstract":"<p><p><i>Objective</i>. Deep learning tools applied to high-resolution neurophysiological data have significantly progressed, offering enhanced decoding, real-time processing, and readability for practical applications. However, the design of artificial neural networks to analyze neural activity <i>in vivo</i> remains a challenge, requiring a delicate balance between efficiency in low-data regimes and the interpretability of the results.<i>Approach</i>. To address this challenge, we introduce a novel specialized transformer architecture to analyze single-neuron spiking activity. The model is tested on multi-electrode recordings from the dorsal premotor cortex of non-human primates performing a motor inhibition task.<i>Main results</i>. The proposed architecture provides an early prediction of the correct movement direction, achieving accurate results no later than 230 ms after the Go signal presentation across animals. Additionally, the model can forecast whether the movement will be generated or withheld before a stop signal, unattended, is actually presented. To further understand the internal dynamics of the model, we compute the predicted correlations between time steps and between neurons at successive layers of the architecture, and the evolution of these correlations mirrors findings from previous theoretical analyses.<i>Significance</i>. 
Overall, our framework provides a comprehensive use case for the practical implementation of deep learning tools in motor control research, highlighting both the predictive capabilities and interpretability of the proposed architecture.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143054218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Junkongshuai Wang, Yangjie Luo, Haoran Wang, Lu Wang, Lihua Zhang, Zhongxue Gan, Xiaoyang Kang
{"title":"FLANet: A multiscale temporal convolution and spatial-spectral attention network for EEG artifact removal with adversarial training.","authors":"Junkongshuai Wang, Yangjie Luo, Haoran Wang, Lu Wang, Lihua Zhang, Zhongxue Gan, Xiaoyang Kang","doi":"10.1088/1741-2552/adae34","DOIUrl":"https://doi.org/10.1088/1741-2552/adae34","url":null,"abstract":"<p><p><i>Objective.</i>Denoising artifacts, such as noise from muscle or cardiac activity, is a crucial and ubiquitous concern in neurophysiological signal processing, particularly for enhancing the signal-to-noise ratio in electroencephalograph (EEG) analysis. Novel methods based on deep learning demonstrate a notably prominent effect compared to traditional denoising approaches. However, those still suffer from certain limitations. Some methods often neglect the multi-domain characteristics of the artifact signal. Even among those that do consider these, there are deficiencies in terms of efficiency, effectiveness and computation cost.<i>Approach.</i>In this study, we propose a multiscale temporal convolution and spatial-spectral attention network with adversarial training for automatically filtering artifacts, named filter artifacts network (FLANet). The multiscale convolution module can extract sufficient temporal information and the spatial-spectral attention network can extract not only non-local similarity but also spectral dependencies. To make data denoising more efficient and accurate, we adopt adversarial training with novel loss functions to generate outputs that are closer to pure signals.<i>Main results.</i>The results show that the method proposed in this paper achieves great performance in artifact removal and valid information preservation on EEG signals contaminated by different types of artifacts. 
This approach enables a more optimal trade-off between denoising efficacy and computational overhead.<i>Significance.</i>The proposed artifact removal framework facilitates the implementation of an efficient denoising method, contributing to the advancement of neural analysis and neural engineering, and can be expected to be applied to clinical research and to realize novel human-computer interaction systems.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143191445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Amparo Güemes, Tiago da Silva Costa, Tamar R Makin
{"title":"Foundational guidelines for enhancing neurotechnology research and development through end-user involvement.","authors":"Amparo Güemes, Tiago da Silva Costa, Tamar R Makin","doi":"10.1088/1741-2552/adac0d","DOIUrl":"10.1088/1741-2552/adac0d","url":null,"abstract":"<p><p>Neurotechnologies are increasingly becoming integrated with our everyday lives, our bodies and our mental states. As the popularity and impact of neurotechnology grows, so does our responsibility to ensure we understand its particular implications for its end users, as well as broader ethical and societal implications. There are many different terms and frameworks to articulate the concept of involving end users in the technology development lifecycle, for example: 'Public and Patient Involvement and Engagement' (PPIE), 'lived experience', 'co-design' or 'co-production'. The objective of this tutorial is to utilise the PPIE framework to develop clear guidelines for implementing a robust involvement process of current and future end-users in neurotechnology, with emphasis on patient involvement. After an introduction that conveys the tangible and conceptual benefits of user involvement, we first guide the reader to develop a general strategy towards setting up their own PPIE process. We then help the reader map out their relevant stakeholders and provide advice on how to consider user diversity and representation. We also provide advice and tools on how to quantify the outcomes of the engagement. We consolidate advice from various online sources to orient individual teams (and their funders) to carve out their own approach to meaningful involvement. Key outputs include a stakeholder mapping tool, methods to measure the impact of engagement, and a structured checklist for transparent reporting. 
Enabling end-users and other stakeholders to participate in the development of neurotechnology, even at its earliest stages of conception, will help us better navigate our design around ethical, social, and usability considerations, and deliver more impactful technologies. The overall aim is the establishment of gold-standard methodologies for ensuring that patient and public insights are at the forefront of our scientific inquiry and product development.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143018955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Betts Peters, Basak Celik, Dylan Gaines, Deirdre Galvin-McLaughlin, Tales Imbiriba, Michelle Kinsella, Daniel Klee, Matthew Lawhead, Tab Memmott, Niklas Smedemark-Margulies, Jack Wiedrick, Deniz Erdogmus, Barry Oken, Keith Vertanen, Melanie Fried-Oken
{"title":"RSVP keyboard with inquiry preview: mixed performance and user experience with an adaptive, multimodal typing interface combining EEG and switch input.","authors":"Betts Peters, Basak Celik, Dylan Gaines, Deirdre Galvin-McLaughlin, Tales Imbiriba, Michelle Kinsella, Daniel Klee, Matthew Lawhead, Tab Memmott, Niklas Smedemark-Margulies, Jack Wiedrick, Deniz Erdogmus, Barry Oken, Keith Vertanen, Melanie Fried-Oken","doi":"10.1088/1741-2552/ada8e0","DOIUrl":"10.1088/1741-2552/ada8e0","url":null,"abstract":"<p><p><i>Objective.</i>The RSVP Keyboard is a non-implantable, event-related potential-based brain-computer interface (BCI) system designed to support communication access for people with severe speech and physical impairments. Here we introduce inquiry preview (IP), a new RSVP Keyboard interface incorporating switch input for users with some voluntary motor function, and describe its effects on typing performance and other outcomes.<i>Approach.</i>Four individuals with disabilities participated in the collaborative design of possible switch input applications for the RSVP Keyboard, leading to the development of IP and a method of fusing switch input with language model and electroencephalography (EEG) evidence for typing. Twenty-four participants without disabilities and one potential end user with incomplete locked-in syndrome took part in two experiments investigating the effects of IP and two modes of switch input on typing accuracy and speed during a copy-spelling task.<i>Main results.</i>For participants without disabilities, IP and switch input tended to worsen typing performance compared to the standard RSVP Keyboard condition, with more consistent effects across participants for speed than for accuracy. However, there was considerable variability, with some participants demonstrating improved typing performance and better user experience (UX) with IP and switch input. 
Typing performance for the potential end user was comparable to that of participants without disabilities. He typed most quickly and accurately with IP and switch input and gave favorable UX ratings to those conditions, but preferred standard RSVP Keyboard.<i>Significance.</i>IP is a novel multimodal interface for the RSVP Keyboard BCI, incorporating switch input as an additional control signal. Typing performance and UX and preference varied widely across participants, reinforcing the need for flexible, customizable BCI systems that can adapt to individual users. ClinicalTrials.gov Identifier: NCT04468919.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11921818/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142966651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jacob Tobias Gusman, Tommy Hosman, Rekha Crawford, Tyler Singer-Clark, Anastasia Kapitonava, Jessica Kelemen, Nick Hahn, Jaimie M Henderson, Leigh Hochberg, John Simeral, Carlos Vargas-Irwin
{"title":"Multi-gesture drag-and-drop decoding in a 2D iBCI control task.","authors":"Jacob Tobias Gusman, Tommy Hosman, Rekha Crawford, Tyler Singer-Clark, Anastasia Kapitonava, Jessica Kelemen, Nick Hahn, Jaimie M Henderson, Leigh Hochberg, John Simeral, Carlos Vargas-Irwin","doi":"10.1088/1741-2552/adb180","DOIUrl":"https://doi.org/10.1088/1741-2552/adb180","url":null,"abstract":"<p><strong>Objective: </strong>Intracortical brain-computer interfaces (iBCIs) have demonstrated the ability to enable point and click as well as reach and grasp control for people with tetraplegia. However, few studies have investigated iBCIs during long-duration discrete movements that would enable common computer interactions such as \"click-and-hold\" or \"drag-and-drop\".</p><p><strong>Approach: </strong>Here, we examined the performance of multi-class and binary (attempt/no-attempt) classification of neural activity in the left precentral gyrus of two BrainGate2 clinical trial participants performing hand gestures for 1, 2, and 4 seconds in duration. We then designed a novel \"latch decoder\" that utilizes parallel multi-class and binary decoding processes and evaluated its performance on data from isolated sustained gesture attempts and a multi-gesture drag-and-drop task.</p><p><strong>Main results: </strong>Neural activity during sustained gestures revealed a marked decrease in the discriminability of hand gestures sustained beyond 1 second. 
Compared to standard direct decoding methods, the latch decoder demonstrated substantial improvement in decoding accuracy for gestures performed independently or in conjunction with simultaneous 2D cursor control.</p><p><strong>Significance: </strong>This work highlights the unique neurophysiologic response patterns of sustained gesture attempts in human motor cortex and demonstrates a promising decoding approach that could enable individuals with tetraplegia to intuitively control a wider range of consumer electronics using an iBCI.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143124187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ali K Zadeh, Oula Puonti, Björn Sigurðsson, Axel Thielscher, Oury Monchi, Samuel Pichardo
{"title":"Enhancing transcranial ultrasound stimulation planning with MRI-derived skull masks: a comparative analysis with CT-based processing.","authors":"Ali K Zadeh, Oula Puonti, Björn Sigurðsson, Axel Thielscher, Oury Monchi, Samuel Pichardo","doi":"10.1088/1741-2552/adab22","DOIUrl":"https://doi.org/10.1088/1741-2552/adab22","url":null,"abstract":"<p><p><i>Objective.</i>Transcranial ultrasound stimulation (TUS) presents challenges in ultrasound wave transmission through the skull, affecting study outcomes due to aberration and attenuation. While planning strategies incorporating 3D computed tomography (CT) scans help mitigate these issues, they expose participants to radiation, which can raise ethical concerns. A solution involves generating skull masks from participants' anatomical magnetic resonance imaging (MRI). This study aims to compare ultrasound field predictions between CT-derived and MRI-derived skull masks in TUS planning.<i>Approach.</i>Five participants with a range of skull density ratios (SDRs: 0.31, 0.42, 0.55, 0.67, and 0.79) were selected, each having both CT and T1/T2-weighted MRI scans. Ultrasound simulations were performed using BabelBrain software with a single-element transducer (diameter = 50 mm,<i>F</i># = 1) at 250, 500, and 750 kHz frequencies. CT scans were used to generate maps of the skull's acoustic properties. The MRI scans were processed using the Charm segmentation tool from the SimNIBS tool suite using default and custom settings adapted for better skull segmentation. Ultrasound was adjusted to target 30 mm below the skull's surface at 54 electroencephalogram (EEG) locations.<i>Main Results.</i>The custom setting in Charm significantly improved the Dice coefficient between MRI- and CT-derived masks when compared to the default setting (<i>p</i>< 0.001). The maximum pressure error significantly decreased in the custom setting compared to the default setting (<i>p</i>< 0.001). 
Additionally, the focus location error median across different SDRs averaged 2.32, 1.45, and 1.57 mm in default and 2.08, 1.38, and 1.44 mm in custom conditions for 250 kHz, 500 kHz, and 750 kHz respectively.<i>Significance.</i>MRI-derived skull masks offer satisfactory accuracy at many EEG sites, and using custom settings can further enhance this accuracy. However, significant errors at specific locations highlight the importance of carefully considering stimulation location when choosing between CT- and MRI-derived skull modeling.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143070414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xuan Ma, Fabio Rizzoglio, Kevin L Bodkin, Lee E Miller
{"title":"Unsupervised, piecewise linear decoding enables an accurate prediction of muscle activity in a multi-task brain computer interface.","authors":"Xuan Ma, Fabio Rizzoglio, Kevin L Bodkin, Lee E Miller","doi":"10.1088/1741-2552/adab93","DOIUrl":"10.1088/1741-2552/adab93","url":null,"abstract":"<p><p><i>Objective.</i>Creating an intracortical brain computer interface (iBCI) capable of seamless transitions between tasks and contexts would greatly enhance user experience. However, the nonlinearity in neural activity presents challenges to computing a global iBCI decoder. We aimed to develop a method that differs from a globally optimized decoder to address this issue.<i>Approach.</i>We devised an unsupervised approach that relies on the structure of a low-dimensional neural manifold to implement a piecewise linear decoder. We created a distinctive dataset in which monkeys performed a diverse set of tasks, some trained, others innate, while we recorded neural signals from the motor cortex (M1) and electromyographs (EMGs) from upper limb muscles. We used both linear and nonlinear dimensionality reduction techniques to discover neural manifolds and applied unsupervised algorithms to identify clusters within those spaces. Finally, we fit a linear decoder of EMG for each cluster. A specific decoder was activated corresponding to the cluster each new neural data point belonged to.<i>Main results.</i>We found clusters in the neural manifolds corresponding with the different tasks or task sub-phases. The performance of piecewise decoding improved as the number of clusters increased and plateaued gradually. With only two clusters it already outperformed a global linear decoder, and unexpectedly, it outperformed even a global recurrent neural network decoder with 10-12 clusters.<i>Significance.</i>This study introduced a computationally lightweight solution for creating iBCI decoders that can function effectively across a broad range of tasks. 
EMG decoding is particularly challenging, as muscle activity is used, under varying contexts, to control interaction forces and limb stiffness, as well as motion. The results suggest that a piecewise linear decoder can provide a good approximation to the nonlinearity between neural activity and motor outputs, a result of our increased understanding of the structure of neural manifolds in motor cortex.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11775726/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143019015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Junbo Chen, Xupeng Chen, Ran Wang, Chenqian Le, Amirhossein Khalilian-Gourtani, Erika Jensen, Patricia Dugan, Werner Doyle, Orrin Devinsky, Daniel Friedman, Adeen Flinker, Yao Wang
{"title":"Transformer-based neural speech decoding from surface and depth electrode signals.","authors":"Junbo Chen, Xupeng Chen, Ran Wang, Chenqian Le, Amirhossein Khalilian-Gourtani, Erika Jensen, Patricia Dugan, Werner Doyle, Orrin Devinsky, Daniel Friedman, Adeen Flinker, Yao Wang","doi":"10.1088/1741-2552/adab21","DOIUrl":"10.1088/1741-2552/adab21","url":null,"abstract":"<p><p><i>Objective.</i>This study investigates speech decoding from neural signals captured by intracranial electrodes. Most prior works can only work with electrodes on a 2D grid (i.e. Electrocorticographic (ECoG) or ECoG array) and data from a single patient. We aim to design a deep-learning model architecture that can accommodate both surface ECoG and depth (stereotactic EEG or sEEG) electrodes. The architecture should allow training on data from multiple participants with large variability in electrode placements. The model should not have subject-specific layers and the trained model should perform well on participants unseen during training.<i>Approach.</i>We propose a novel transformer-based model architecture named SwinTW that can work with arbitrarily positioned electrodes by leveraging their 3D locations on the cortex rather than their positions on a 2D grid. We train subject-specific models using data from a single participant and multi-subject models exploiting data from multiple participants.<i>Main results.</i>The subject-specific models using only low-density 8 × 8 ECoG data achieved high decoding Pearson Correlation Coefficient with ground truth spectrogram (PCC = 0.817), over<i>N</i>= 43 participants, significantly outperforming our prior convolutional ResNet model and the 3D Swin transformer model. Incorporating additional strip, depth, and grid electrodes available in each participant (<i>N</i>= 39) led to further improvement (PCC = 0.838). 
For participants with only sEEG electrodes (<i>N</i>= 9), subject-specific models still enjoy comparable performance with an average PCC = 0.798. A single multi-subject model trained on ECoG data from 15 participants yielded comparable results (PCC = 0.837) as 15 models trained individually for these participants (PCC = 0.831). Furthermore, the multi-subject models achieved high performance on unseen participants, with an average PCC = 0.765 in leave-one-out cross-validation.<i>Significance.</i>The proposed SwinTW decoder enables future speech decoding approaches to utilize any electrode placement that is clinically optimal or feasible for a particular participant, including using only depth electrodes, which are more routinely implanted in chronic neurosurgical procedures. The success of the single multi-subject model when tested on participants within the training cohort demonstrates that the model architecture is capable of exploiting data from multiple participants with diverse electrode placements. The architecture's flexibility in training with both single-subject and multi-subject data, as well as grid and non-grid electrodes, ensures its broad applicability. 
Importantly, the generalizability of the multi-subject models in our study population suggests that a model trained using paired acoustic and neural data from multiple patients can potentially be applied to new patients with speech disability where acoustic-neural training data is not feasible.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11773629/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143019010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}