Vector-Quantized Zero-Delay Deep Autoencoders for the Compression of Electrical Stimulation Patterns of Cochlear Implants using STOI

Reemt Hinrichs, Felix Ortmann, Jörn Ostermann
2022 IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES), published 2022-12-07
DOI: 10.1109/IECBES54088.2022.10079466 (https://doi.org/10.1109/IECBES54088.2022.10079466)
Citations: 1

Abstract

Cochlear implants (CIs) are battery-powered, surgically implanted hearing aids capable of restoring a sense of hearing in people suffering from moderate to profound hearing loss. Wireless transmission of audio from or to the signal processors of cochlear implants can be used to improve speech understanding and localization for CI users. Data compression algorithms can be used to conserve battery power in this wireless transmission. However, very low latency is a strict requirement, which severely limits the available source coding algorithms. Previously, coding of the electrical stimulation patterns of CIs, instead of coding of the audio, was proposed to optimize the trade-off between bit rate, latency, and quality. In this work, a zero-delay deep autoencoder (DAE) for the coding of the electrical stimulation patterns of CIs is proposed. By combining, for the first time, Bayesian optimization with numerically approximated gradients of a non-differentiable speech intelligibility measure for CIs, the short-time objective intelligibility (STOI) measure, an optimized DAE architecture was found and trained that achieved equal or superior speech understanding at zero delay, outperforming well-known audio codecs. The DAE achieved reference vocoder STOI scores at 13.5 kbit/s, compared to 33.6 kbit/s for Opus and 24.5 kbit/s for AMR-WB.
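The abstract does not state how the gradients of the intelligibility measure were approximated numerically. As a minimal sketch of the general idea, assuming a central finite-difference scheme, the Python example below estimates the gradient of a black-box score with respect to a decoded frame. The names `finite_difference_gradient` and `toy_measure`, the step size `eps`, and the target values are all hypothetical; `toy_measure` is a simple quadratic stand-in and is not STOI.

```python
from typing import Callable, List, Sequence


def finite_difference_gradient(
    measure: Callable[[Sequence[float]], float],
    x: Sequence[float],
    eps: float = 1e-3,
) -> List[float]:
    """Central-difference estimate of the gradient of a black-box measure at x."""
    grad = []
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += eps   # perturb one coordinate upwards
        minus[i] -= eps  # and the same coordinate downwards
        grad.append((measure(plus) - measure(minus)) / (2.0 * eps))
    return grad


def toy_measure(decoded: Sequence[float]) -> float:
    """Hypothetical stand-in for an intelligibility score (NOT STOI).

    Higher is better; the score peaks when the decoded frame equals
    an arbitrary, made-up target frame.
    """
    target = [0.2, 0.5, 0.1, 0.7]
    return -sum((d - t) ** 2 for d, t in zip(decoded, target))


# One gradient-ascent step on a made-up decoded frame.
decoded_frame = [0.0, 0.0, 0.0, 0.0]
grad = finite_difference_gradient(toy_measure, decoded_frame)
step = 0.1  # assumed step size, chosen only for illustration
updated = [d + step * g for d, g in zip(decoded_frame, grad)]

print("estimated gradient:", grad)
print("score before:", toy_measure(decoded_frame))
print("score after: ", toy_measure(updated))
```

Each gradient estimate costs two evaluations of the measure per coordinate, which is one reason such schemes are usually applied to a low-dimensional quantity rather than to every network weight. Whether the authors used this or a different approximation is not stated in the abstract.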