Spatial Coding for Microphone Arrays Using IPNLMS-Based RTF Estimation

Daniel T. Jones, D. Sharma, S. Kruchinin, P. Naylor
2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), published 2021-10-17
DOI: 10.1109/WASPAA52581.2021.9632747 (https://doi.org/10.1109/WASPAA52581.2021.9632747)
Citations: 1

Abstract

We propose a method for encoding multichannel microphone array signals and show that it operates effectively at very low bitrates. Our approach exploits the high interchannel correlations that arise from the close proximity of microphones in an array to represent the signals compactly. An $M$-channel microphone array signal is encoded as one reference signal and $M-1$ Relative Transfer Functions (RTFs). When the RTFs require updating only infrequently, a significant reduction in data-rate is obtained. Applications of interest include cloud-based beamforming and End-to-End Automatic Speech Recognition (ASR) systems, where the efficiency of this encoding enables multichannel audio to be transmitted to the cloud at very low bitrates. We have developed a system that estimates, and periodically updates, the RTFs between each channel of the array and a chosen reference channel using an Improved Proportionate Normalized Least Mean Squares (IPNLMS) adaptive filter. The proposed system is evaluated experimentally against the Opus codec. It achieves equal ΔPESQ performance with a data-rate reduction of up to 90%, and undegraded Word Error Rate (WER) at bitrates as low as 3.3 kbps.
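
The encoding idea in the abstract (one reference channel plus $M-1$ RTFs, each estimated with IPNLMS) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each RTF is modelled as a short time-domain FIR filter, and the function names (`ipnlms_rtf`, `decode_channel`), the filter length, the step size `mu`, and the proportionality parameter `alpha` are all hypothetical choices that do not come from the paper.

```python
import numpy as np

def ipnlms_rtf(ref, target, taps=64, mu=0.5, alpha=0.0, delta=1e-6, eps=1e-8):
    """Estimate an FIR approximation of the RTF from `ref` to `target`.

    Standard IPNLMS update; all parameter defaults are illustrative
    assumptions, not values taken from the paper.
    """
    h = np.zeros(taps)       # current RTF estimate
    x_buf = np.zeros(taps)   # most recent reference samples, newest first
    for n in range(len(ref)):
        # Shift the newest reference sample into the delay line.
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = ref[n]
        # A-priori error between the target channel and its prediction.
        e = target[n] - h @ x_buf
        # Per-tap gains: a uniform (NLMS-like) part plus a part
        # proportional to the magnitude of each coefficient.
        k = (1 - alpha) / (2 * taps) \
            + (1 + alpha) * np.abs(h) / (2 * np.sum(np.abs(h)) + eps)
        # Normalized, proportionate coefficient update.
        h += mu * e * k * x_buf / (x_buf @ (k * x_buf) + delta)
    return h

def decode_channel(ref, h):
    """Reconstruct a non-reference channel from the reference and its RTF."""
    return np.convolve(ref, h)[:len(ref)]
```

Under these assumptions, an encoder would transmit the (separately coded) reference signal together with the `h` vectors, and resend an `h` only when it has changed appreciably; the decoder would call `decode_channel` once per non-reference microphone. How the reference signal and the RTF coefficients are actually quantized and scheduled is not stated in the abstract and is left out here.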