{"title":"Commonality and individuality based graph learning network for EEG emotion recognition.","authors":"Tengxu Zhang, Haiyan Zhou","doi":"10.1088/1741-2552/add466","DOIUrl":"10.1088/1741-2552/add466","url":null,"abstract":"<p><p><i>Objective.</i>The interplay between individual differences and shared human characteristics significantly impacts electroencephalogram (EEG) emotion recognition models, yet remains underexplored. To address this, we propose a commonality and individuality-based EEG graph learning network (CI-Graph), which captures both shared patterns and unique features to improve recognition accuracy.<i>Approach.</i>The proposed model integrates two key components, namely C-Graph and I-Graph, to synthesize a comprehensive graph representation. The C-Graph learns a commonality-based graph applied uniformly to all EEG samples, capturing shared emotional patterns across individuals. The Bootstrap method ensures stable updates while integrating complementary information from the I-Graph. In contrast, the I-Graph dynamically constructs individualized graphs for each sample using a dedicated graph learning module, capturing unique individual features. To enhance representation learning, the model employs a tokenized graph Transformer for robust data encoding and global context modeling, alongside graph diffusion convolution to refine graph connectivity and spatial convolution layer to strengthen local feature extraction. Finally, to reinforce feature learning constraints and accelerate model convergence, we employ a multi-task joint optimization strategy by integrating a self-supervised regression task and a contrastive learning task with the downstream classification task.<i>Main results.</i>We rigorously evaluate our CI-Graph model on three benchmark datasets: SEED, SEED-IV, and DEAP (both Arousal and Valence). Experimental results demonstrate consistent improvements in classification accuracy across all datasets, regardless of the classifier used.<i>Significance.</i>This study demonstrates the critical role of combining signal commonality and individuality in EEG-based emotion recognition. The proposed approach achieves cross-data and cross-model generalization, highlighting its broad applicability and potential to advance the field.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144033360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Taylor G Hobbs, Charles M Greenspon, Ceci Verbaarschot, Giacomo Valle, Christopher L Hughes, Michael L Boninger, Sliman J Bensmaia, Robert A Gaunt
{"title":"Biomimetic stimulation patterns drive natural artificial touch percepts using intracortical microstimulation in humans.","authors":"Taylor G Hobbs, Charles M Greenspon, Ceci Verbaarschot, Giacomo Valle, Christopher L Hughes, Michael L Boninger, Sliman J Bensmaia, Robert A Gaunt","doi":"10.1088/1741-2552/adc2d4","DOIUrl":"10.1088/1741-2552/adc2d4","url":null,"abstract":"<p><p><i>Objective.</i>Intracortical microstimulation (ICMS) of human somatosensory cortex evokes tactile percepts that people describe as originating from their own body, but are not always described as feeling natural. It remains unclear whether stimulation parameters such as amplitude, frequency, and spatiotemporal patterns across electrodes can be chosen to increase the naturalness of these artificial tactile percepts.<i>Approach.</i>In this study, we investigated whether biomimetic stimulation patterns-ICMS patterns that reproduce essential features of natural neural activity-increased the perceived naturalness of ICMS-evoked sensations compared to a non-biomimetic pattern in three people with cervical spinal cord injuries. All participants had electrode arrays implanted in their somatosensory cortices. Rather than qualitatively asking which pattern felt more natural, participants directly compared natural residual percepts, delivered by mechanical indentation on a sensate region of their hand, to artificial percepts evoked by ICMS and were asked whether linear non-biomimetic or biomimetic stimulation felt most like the mechanical indentation.<i>Main results.</i>We show that simple biomimetic ICMS, which modulated the stimulation amplitude on a single electrode, was perceived as being more like a mechanical indentation reference on 32% of the electrodes. We also tested an advanced biomimetic stimulation scheme that captured more of the spatiotemporal dynamics of cortical activity using co-modulated stimulation amplitudes and frequencies across four electrodes. Here, ICMS felt more like the mechanical reference for 75% of the electrode groups. Finally, biomimetic stimulus trains required less charge than their non-biomimetic counterparts to create an intensity-matched sensation.<i>Significance.</i>We conclude that ICMS encoding schemes that mimic naturally occurring neural spatiotemporal activation patterns in the somatosensory cortex feel more like an actual touch than non-biomimetic encoding schemes. This also suggests that using key elements of neuronal activity can be a useful conceptual guide to constrain the large stimulus parameter space when designing future stimulation strategies. This work is a part of Clinical Trial NCT01894802.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143665740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mehdi Khantan, James Lim, Alessandro Napoli, Iyad Obeid, Mijail Demian Serruya
{"title":"Virtual white matter: a novel system for cross-dish neural interaction and modulation.","authors":"Mehdi Khantan, James Lim, Alessandro Napoli, Iyad Obeid, Mijail Demian Serruya","doi":"10.1088/1741-2552/add49c","DOIUrl":"https://doi.org/10.1088/1741-2552/add49c","url":null,"abstract":"<p><p><i>Objective</i>. Biological neural networks (BNNs) are characterized by complex interregional connectivity, allowing for seamless communication between different brain regions.<i>In vitro</i>models traditionally consist of single-dish neural cultures that cannot recapitulate the dynamics of interregional interactions and little effort has been made to interconnect multiple BNNs to process information through a hybrid interconnection of the biological and digital systems.<i>Approach</i>. We introduce virtual white matter (VWM), a novel platform enabling real-time functional digital connectivity between neural cultures in separate microelectrode array dishes. By detecting neural activity in one dish and providing precisely timed electrical stimulation to another, VWM recreates bidirectional inter-regional neural communication. The study introduces the conceptual framework, technical implementation, and proof-of-concept validation of the VWM system.<i>Main Results.</i>VWM represents a significant advancement<i>in vitro</i>modeling and data processing by enabling controlled interactions between heterogeneous neural cultures, such as different brain regions or cell types. The platform successfully enables the investigation of dynamic network behaviors and integration with both biological and artificial neural networks.<i>Significance</i>. VWM will push forward biocomputing, wetware computing, and organic intelligence by establishing a reliable form of interconnection between these systems. Furthermore, VWM has the potential to be applied in fields like therapeutic interventions that use directed neural plasticity to promote recovery from brain injury or disease responses. VWM enables complex<i>in vitro</i>models to be built with the same neural connectivity as in the human brain. VWM is versatile, placing it at the core of a transformational tool for experimental neuroscience, biocomputing, and translational research to bridge biological and digital systems.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":"22 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144061343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tyler Singer-Clark, Xianda Hou, Nicholas S Card, Maitreyee Wairagkar, Carrina Iacobacci, Hamza Peracha, Leigh R Hochberg, Sergey D Stavisky, David M Brandman
{"title":"Speech motor cortex enables BCI cursor control and click.","authors":"Tyler Singer-Clark, Xianda Hou, Nicholas S Card, Maitreyee Wairagkar, Carrina Iacobacci, Hamza Peracha, Leigh R Hochberg, Sergey D Stavisky, David M Brandman","doi":"10.1088/1741-2552/add0e5","DOIUrl":"10.1088/1741-2552/add0e5","url":null,"abstract":"<p><p><i>Objective.</i>Decoding neural activity from ventral (speech) motor cortex is known to enable high-performance speech brain-computer interface (BCI) control. It was previously unknown whether this brain area could also enable computer control via neural cursor and click, as is typically associated with dorsal (arm and hand) motor cortex.<i>Approach.</i>We recruited a clinical trial participant with amyotrophic lateral sclerosis and implanted intracortical microelectrode arrays in ventral precentral gyrus (vPCG), which the participant used to operate a speech BCI in a prior study. We developed a cursor BCI driven by the participant's vPCG neural activity, and evaluated performance on a series of target selection tasks.<i>Main results.</i>The reported vPCG cursor BCI enabled rapidly-calibrating (40 s), accurate (2.90 bits per second) cursor control and click. The participant also used the BCI to control his own personal computer independently.<i>Significance.</i>These results suggest that placing electrodes in vPCG to optimize for speech decoding may also be a viable strategy for building a multi-modal BCI which enables both speech-based communication and computer control via cursor and click. (BrainGate2 ClinicalTrials.gov ID NCT00912041).</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":"22 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12208300/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144055789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tran Hiep Dinh, Avinash Kumar Singh, Quang Manh Doan, Nguyen Linh Trung, Diep N Nguyen, Chin-Teng Lin
{"title":"An EEG signal smoothing algorithm using upscale and downscale representation<sup />.","authors":"Tran Hiep Dinh, Avinash Kumar Singh, Quang Manh Doan, Nguyen Linh Trung, Diep N Nguyen, Chin-Teng Lin","doi":"10.1088/1741-2552/add297","DOIUrl":"https://doi.org/10.1088/1741-2552/add297","url":null,"abstract":"<p><p><i>Objective.</i>Effective smoothing of electroencephalogram (EEG) signals while maintaining the original signal's features is important in EEG signal analysis and brain-computer interface. This paper proposes a novel EEG signal-smoothing algorithm and its potential application in cognitive conflict (CC) processing.<i>Approach.</i>Instead of being processed in the time domain, the input signal is visualized in increasing line width, the representation frame of which is converted into a binary image. An effective thinning algorithm is employed to obtain a unit-width skeleton as the smoothed signal.<i>Main results.</i>Experimental results on data fitting have verified the effectiveness of the proposed approach on different levels of signal-to-noise (SNR) ratio, especially on high noise levels (SNR⩽5 dB), where our fitting error is only 86.4%-90.4% compared to that of its best counterpart. The potential application of the proposed algorithm in EEG-based CC processing is comprehensively evaluated in a classification and a visual inspection task. The employment of the proposed approach in pre-processing the input data has significantly boosted the<i>F</i><sub>1</sub>score of state-of-the-art models by more than 1%. The robustness of our algorithm is also evaluated via a visual inspection task, where specific CC peaks, i.e. the prediction error negativity and error-related positive potential (Pe), can be easily observed at multiple line-width levels, while the insignificant ones are eliminated.<i>Significance.</i>These results demonstrated not only the advance of the proposed approach but also its impact on classification accuracy enhancement.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":"22 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144060697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Phase-amplitude coupling within MEG data can identify eloquent cortex.","authors":"Srijita Das, Kevin Tyner, Stephen V Gliske","doi":"10.1088/1741-2552/add37c","DOIUrl":"https://doi.org/10.1088/1741-2552/add37c","url":null,"abstract":"<p><p><i>Objective.</i>Proper identification of eloquent cortices is essential to minimize post-surgical deficits in patients undergoing resection for epilepsy and tumors. Current methods are subjective, vary across centers, and require significant expertise, underscoring the need for more objective pre-surgical mapping. Phase-amplitude coupling (PAC), the interaction between the phase of low-frequency oscillations and the amplitude of high-frequency activity, has been implicated in task-induced brain activity and may serve as a biomarker for functional mapping. Our objective was to develop a novel PAC-based algorithm to non-invasively identify somatosensory eloquent cortex using magnetoencephalography (MEG) data in epilepsy patients.<i>Approach.</i>We analyzed somatosensory and spontaneous MEG recordings from 30 subjects with drug-resistant epilepsy. PAC was calculated on source-reconstructed data (5-12 Hz for low frequencies and 30-300 Hz for high frequencies), followed by rank-2 tensor decomposition. Density-based clustering compared active brain regions during somatosensory task and spontaneous data at a population level. We employed a linear mixed-effects model to quantify changes in PAC between somatosensory and resting-state data. We developed a patient-specific support vector machine (SVM) classifier to identify active brain regions based on PAC values during the somatosensory task.<i>Main results.</i>Five of six expected brain regions were active during left and right-sided stimulation (<i>p</i>=1.08×10-8, hypergeometric probability test). The mixed-effects model confirmed task-specific PAC in anatomically relevant brain regions (<i>p</i> < 0.01). The SVM classifier gave a specificity of 99.46% and a precision of 66.9%. These results demonstrate that the PAC algorithm reliably identifies somatosensory cortex activation at both individual and population levels with statistical significance.<i>Significance.</i>This study demonstrates the feasibility of using PAC as a non-invasive marker for identifying functionally relevant brain regions during somatosensory task in epilepsy patients. Future work will evaluate its applicability for mapping other eloquent cortices, including language, motor, and auditory areas.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":"22 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144036021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Simon Kojima, Benjamin Eren Kortenbach, Crispijn Aalberts, Sara Miloševska, Kim de Wit, Rosie Zheng, Shin'ichiro Kanoh, Mariacristina Musso, Michael Tangermann
{"title":"Influence of pitch modulation on event-related potentials elicited by Dutch word stimuli in a brain-computer interface language rehabilitation task.","authors":"Simon Kojima, Benjamin Eren Kortenbach, Crispijn Aalberts, Sara Miloševska, Kim de Wit, Rosie Zheng, Shin'ichiro Kanoh, Mariacristina Musso, Michael Tangermann","doi":"10.1088/1741-2552/adc83d","DOIUrl":"10.1088/1741-2552/adc83d","url":null,"abstract":"<p><p><i>Objective.</i>Recently, a novel language training using an auditory brain-computer interface (BCI) based on electroencephalogram recordings has been proposed for chronic stroke patients with aphasia. Tested with native German patients, it has shown significant and medium to large effect sizes in improving multiple aspects of language. During the training, the auditory BCI system delivers word stimuli using six spatially arranged loudspeakers. As delivering the word stimuli via headphones reduces spatial cues and makes the attention to target words more difficult, we investigate the influence of added pitch information. While pitch modulations have shown benefits for tone stimuli, they have not yet been investigated in the context of language stimuli.<i>Approach.</i>The study translated the German experimental setup into Dutch. Seventeen native Dutch speakers participated in a single session of an exploratory study. An incomplete Dutch sentence cued them to listen to a target word embedded into a sequence of comparable non-target words while an electroencephalogram was recorded. Four conditions were compared within-subject to investigate the influence of pitch modulation: presenting the words spatially from six loudspeakers without (<i>6D</i>) and with pitch modulation (<i>6D-Pitch</i>), via stereo headphones with simulated spatial cues and pitch modulation (<i>Stereo-Pitch</i>), and via headphones without spatial cues or pitch modulation (<i>Mono</i>).<i>Main results.</i>Comparing the 6D conditions of both language setups, the Dutch setup could be validated. For the Dutch setup, the binary AUC classification score in the 6D and the 6D-Pitch condition were 0.75 and 0.76, respectively, and adding pitch information did not significantly alter the binary classification accuracy of the event-related potential responses. The classification scores in the 6D condition and the Stereo-Pitch condition were on the same level.<i>Significance.</i>The competitive performance of pitch-modulated word stimuli suggests that the complex hardware setup of the 6D condition could be replaced by a headphone condition. If future studies with aphasia patients confirm the effectiveness and higher usability of a headphone-based language rehabilitation training, a simplified setup could be implemented more easily outside of clinics to deliver frequent training sessions to patients in need.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143775110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andrea Rozo, Shafiul Hasan, Zhe Zhang, Carlo Iorio, Carolina Varon, Xiao Hu
{"title":"Exploring neurovascular coupling in stroke patients: insights on linear and nonlinear dynamics using transfer entropy.","authors":"Andrea Rozo, Shafiul Hasan, Zhe Zhang, Carlo Iorio, Carolina Varon, Xiao Hu","doi":"10.1088/1741-2552/adce34","DOIUrl":"https://doi.org/10.1088/1741-2552/adce34","url":null,"abstract":"<p><p><i>Objective.</i>The study of neurovascular coupling (NVC), the relationship between neuronal activity and cerebral blood flow, is essential for understanding brain physiology in both healthy and pathological states. Current methods to study NVC include neuroimaging techniques with limited temporal resolution and indirect neuronal activity measures. Methods including electroencephalographic (EEG) data are predominantly linear and display limitations that nonlinear methods address. Transfer entropy (TE) explores linear and nonlinear relationships simultaneously. This study hypothesizes that complex NVC interactions in stroke patients, both linear and nonlinear, can be detected using TE.<i>Approach.</i>TE between simultaneously recorded EEG and cerebral blood flow velocity (CBFV) signals was computed and analyzed in three settings: ipsilateral (EEG and CBFV from same hemisphere) stroke and nonstroke, and contralateral (EEG from stroke hemisphere, CBFV from nonstroke hemisphere). A surrogate analysis was performed to evaluate the significance of TE values and to identify the nature of the interactions.<i>Main results.</i>The results showed that EEG generally influenced CBFV. There were more linear+nonlinear interactions in the ipsilateral nonstroke setting and in the delta band in ipsilateral stroke and contralateral settings. Interactions between EEG and CBFV were stronger on the nonstroke side for linear+nonlinear dynamics. The strength and nature of the interactions were weakly correlated with clinical outcomes (e.g. delta band (p<0.05): infarct growth linear = -0.448, linear+nonlinear = -0.339; NIHSS linear = -0.473, linear+nonlinear = -0.457).<i>Significance.</i>This study exemplifies the benefits of using TE in linear and nonlinear NVC analysis to better understand the implications of these dynamics in stroke severity.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":"22 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144033309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ramped kilohertz-frequency signals produce nerve conduction block without onset response.","authors":"Edgar Peña, Nicole A Pelot, Warren M Grill","doi":"10.1088/1741-2552/add20e","DOIUrl":"10.1088/1741-2552/add20e","url":null,"abstract":"<p><p><i>Objective.</i>Reversible block of peripheral nerve conduction using kilohertz-frequency (KHF) electrical signals has substantial potential for treating diseases. However, onset response, i.e. KHF-induced excitation en route to producing nerve block, is an undesired outcome of neural block protocols. Previous studies of KHF nerve block observed increased onset responses when KHF signal amplitude was linearly ramped for up to 60 s at frequencies up to 30 kHz. Here, we evaluated the onset response across a broad range of ramp durations and frequencies.<i>Approach</i>. In experiments on the rat tibial nerve and biophysical axon models, we quantified nerve responses to linearly ramped KHF signals applied for durations from 16 to 512 s and at frequencies from 10 to 83.3 kHz. We also investigated the role of slow inactivation on onset response during linear ramps by using lacosamide to enhance slow inactivation pharmacologically and by introducing a slow inactivation gating variable in computational models.<i>Main results</i>. In experiments, sufficiently high frequencies (⩾20.8 kHz) with amplitudes that were ramped sufficiently slowly (4.4-570<i>μ</i>A s<sup>-1</sup>) generated conduction block without onset response, and increasing frequency enabled shorter ramps to block without onset response. Experimental use of lacosamide to enhance slow inactivation also eliminated onset response. In computational models, the effects of ramp duration/ramp rate on onset response only occurred after introducing a slow inactivation gating variable, and the models did not account for frequency effects.<i>Significance</i>. The results reveal, for the first time, the ability to use charge-balanced linearly ramped KHF signals to block without onset response. This novel approach enhances the precision of neural blocking protocols and enables coordinated neural control to restore organ function, such as in urinary control after spinal cord injury.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":"22 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12139514/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144028425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M Asjid Tanveer, Jesper Jensen, Zheng-Hua Tan, Jan Østergaard
{"title":"Single-microphone deep envelope separation based auditory attention decoding for competing speech and music.","authors":"M Asjid Tanveer, Jesper Jensen, Zheng-Hua Tan, Jan Østergaard","doi":"10.1088/1741-2552/add0e7","DOIUrl":"https://doi.org/10.1088/1741-2552/add0e7","url":null,"abstract":"<p><p><i>Objective.</i>In this study, we introduce an end-to-end single microphone deep learning system for source separation and auditory attention decoding (AAD) in a competing speech and music setup. Deep source separation is applied directly on the envelope of the observed mixed audio signal. The resulting separated envelopes are compared to the envelope obtained from the electroencephalography (EEG) signals via deep stimulus reconstruction, where Pearson correlation is used as a loss function for training and evaluation.<i>Approach.</i>Deep learning models for source envelope separation and AAD are trained on target/distractor pairs from speech and music, covering four cases: speech vs. speech, speech vs. music, music vs. speech, and music vs. music. We convolve 10 different HRTFs with our audio signals to simulate the effects of head, torso and outer ear, and evaluate our model's ability to generalize. The models are trained (and evaluated) on 20 s time windows extracted from 60 s EEG trials.<i>Main results.</i>We achieve a target Pearson correlation and accuracy of 0.122% and 82.4% on the original dataset and an average target Pearson correlation and accuracy of 0.106% and 75.4% across the 10 HRTF variants. For the distractor, we achieve an average Pearson correlation of 0.004. Additionally, our model gives an accuracy of 82.8%, 85.8%, 79.7% and 81.5% across the four aforementioned cases for speech and music. With perfectly separated envelopes, we can achieve an accuracy of 83.0%, which is comparable to the case of source separated envelopes.<i>Significance.</i>We conclude that the deep learning models for source envelope separation and AAD generalize well across the set of speech and music signals and HRTFs tested in this study. We notice that source separation performs worse for a mixed music and speech signal, but the resulting AAD performance is not impacted.</p>","PeriodicalId":94096,"journal":{"name":"Journal of neural engineering","volume":"22 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144012883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}