Improving auditory attention decoding in noisy environments for listeners with hearing impairment through contrastive learning.

Gautam Sridhar, Sofía Boselli, Martin A Skoglund, Bo Bernhardsson, Emina Alickovic
Journal of Neural Engineering, published 2025-06-18 (Journal Article).
DOI: 10.1088/1741-2552/ade28a (https://doi.org/10.1088/1741-2552/ade28a)
Citations: 0

Abstract

Objective. This study investigated the potential of contrastive learning to improve auditory attention decoding (AAD) from electroencephalography (EEG) data in challenging cocktail-party scenarios with competing speech and background noise.

Approach. Three models were compared: a baseline linear model (LM), a nonlinear model without contrastive learning (NLM), and a nonlinear model with contrastive learning (NLMwCL). The models were trained on EEG data and speech envelopes. The NLMwCL model used SigLIP, a sigmoid-based variant of the CLIP loss, to embed the data. Speech envelopes were reconstructed from each model and compared with the attended and ignored speech envelopes; reconstruction accuracy was measured as the correlation between the reconstructed and actual envelopes. These reconstruction accuracies were then compared to classify attention. All models were evaluated on 34 listeners with hearing impairment.

Results. Reconstruction accuracy for attended and ignored speech, along with attention classification accuracy, was calculated for each model across various time windows. The NLMwCL consistently outperformed the other models in both speech reconstruction and attention classification. For a 3-second time window, the NLMwCL model achieved a mean attended-speech reconstruction accuracy of 0.105 and a mean attention classification accuracy of 68.0%, versus 0.096 and 64.4% for the NLM and 0.084 and 62.6% for the LM.

Significance. These findings demonstrate the promise of contrastive learning for improving AAD and highlight the potential of EEG-based tools for clinical applications and advances in hearing technology, particularly the design of new neuro-steered signal-processing algorithms.
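The two ingredients of the pipeline described in the abstract can be sketched compactly: a SigLIP-style sigmoid pairwise loss over matched EEG/envelope embedding pairs, and correlation-based attention classification, where the window is labeled by whichever speech envelope correlates more strongly with the reconstruction. The sketch below is illustrative only, not the authors' implementation; the 64 Hz envelope rate, embedding dimensions, and SigLIP temperature/bias values are assumptions.

```python
import numpy as np

def siglip_loss(eeg_emb, aud_emb, temperature=10.0, bias=-10.0):
    """SigLIP-style sigmoid pairwise loss over a batch of paired EEG and
    audio-envelope embeddings (row i of each matrix is a matched pair)."""
    e = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    a = aud_emb / np.linalg.norm(aud_emb, axis=1, keepdims=True)
    logits = temperature * (e @ a.T) + bias
    signs = 2.0 * np.eye(logits.shape[0]) - 1.0   # +1 matched, -1 mismatched
    # -log sigmoid(x) == log(1 + exp(-x)), computed stably via logaddexp
    return float(np.mean(np.logaddexp(0.0, -signs * logits)))

def pearson_r(x, y):
    """Pearson correlation between two 1-D signals."""
    x = x - x.mean()
    y = y - y.mean()
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def classify_attention(reconstructed, attended_env, ignored_env):
    """Decode attention for one window: the stream whose envelope
    correlates more strongly with the reconstruction wins."""
    r_att = pearson_r(reconstructed, attended_env)
    r_ign = pearson_r(reconstructed, ignored_env)
    return ("attended" if r_att > r_ign else "ignored"), r_att, r_ign

# Toy 3 s window at an assumed 64 Hz envelope rate: the "reconstruction"
# tracks the attended envelope with additive noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 3.0, 192)
attended = np.abs(np.sin(2 * np.pi * 4 * t))
ignored = np.abs(np.sin(2 * np.pi * 7 * t + 1.0))
recon = attended + 0.5 * rng.standard_normal(t.size)
label, r_att, r_ign = classify_attention(recon, attended, ignored)

# SigLIP loss on random paired embeddings (batch of 8, dimension 16)
loss = siglip_loss(rng.standard_normal((8, 16)), rng.standard_normal((8, 16)))
```

In this toy setup the noisy reconstruction correlates far more strongly with the attended envelope than with the ignored one, so the window is classified as attended; in the paper the per-window correlations play the same role, with reconstructions produced by the trained models rather than simulated.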
