Regularization for Partial Multichannel Equalization for Speech Dereverberation

IEEE Transactions on Audio Speech and Language Processing. Pub Date: 2013-09-01. DOI: 10.1109/TASL.2013.2260743

I. Kodrasi, Stefan Goetze, S. Doclo

Volume 21, pages 1879-1890

Citations: 67

Abstract

Acoustic multichannel equalization techniques such as the multiple-input/output inverse theorem (MINT), which aim to equalize the room impulse responses (RIRs) between the source and the microphone array, are known to be highly sensitive to RIR estimation errors. To increase robustness, it has been proposed to incorporate regularization in order to decrease the energy of the equalization filters. In addition, more robust partial multichannel equalization techniques such as relaxed multichannel least-squares (RMCLS) and channel shortening (CS) have recently been proposed. In this paper, we propose a partial multichannel equalization technique based on MINT (P-MINT) which aims to shorten the RIR. Furthermore, we investigate the effectiveness of incorporating regularization to further increase the robustness of P-MINT and the aforementioned partial multichannel equalization techniques, i.e., RMCLS and CS. In addition, we introduce an automatic non-intrusive procedure for determining the regularization parameter based on the L-curve. Simulation results using measured RIRs show that incorporating regularization in P-MINT yields a significant performance improvement in the presence of RIR estimation errors, whereas a smaller performance improvement is observed when incorporating regularization in RMCLS and CS. Furthermore, it is shown that the intrusively regularized P-MINT technique outperforms all other investigated intrusively regularized multichannel equalization techniques in terms of perceptual speech quality (PESQ). Finally, it is shown that the automatic non-intrusive regularization parameter in regularized P-MINT leads to a very similar performance as the intrusively determined optimal regularization parameter, making regularized P-MINT a robust, perceptually advantageous, and practically applicable multichannel equalization technique for speech dereverberation.
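The abstract names the ingredients (a least-squares equalizer design, a penalty on filter energy, a shortened target response, and an L-curve rule for the regularization parameter) but gives no formulas. The following is only a minimal sketch of how these pieces could fit together. It assumes a standard Tikhonov-regularized cost, ||H g - d||^2 + delta ||g||^2, a target d that keeps the early taps of the first estimated RIR and zeros the tail, and a maximum-curvature rule on the log-log L-curve. All function names, the synthetic RIRs, the error level, and the grid of delta values are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def convolution_matrix(h, filter_len):
    """Build the (len(h) + filter_len - 1) x filter_len convolution matrix of h."""
    h = np.asarray(h, dtype=float)
    H = np.zeros((len(h) + filter_len - 1, filter_len))
    for col in range(filter_len):
        H[col:col + len(h), col] = h
    return H


def stacked_convolution_matrix(rirs, filter_len):
    """Place the per-channel convolution matrices side by side: H = [H_1 ... H_M]."""
    return np.hstack([convolution_matrix(h, filter_len) for h in rirs])


def regularized_filters(H, target, delta):
    """Tikhonov-regularized least squares: argmin ||H g - target||^2 + delta ||g||^2."""
    A = H.T @ H + delta * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ target)


def partial_target(rir, total_len, num_early_taps, delay=0):
    """Assumed partial-equalization target: early taps of one RIR, zero tail."""
    target = np.zeros(total_len)
    target[delay:delay + num_early_taps] = rir[:num_early_taps]
    return target


def l_curve_corner(H, target, deltas):
    """Return the delta at the maximum-curvature point of the log-log L-curve."""
    tiny = np.finfo(float).tiny
    log_res, log_norm = [], []
    for delta in deltas:
        g = regularized_filters(H, target, delta)
        log_res.append(np.log(np.linalg.norm(H @ g - target) + tiny))
        log_norm.append(np.log(np.linalg.norm(g) + tiny))
    t = np.log(deltas)  # parameterize the curve by log(delta)
    dx, dy = np.gradient(log_res, t), np.gradient(log_norm, t)
    ddx, ddy = np.gradient(dx, t), np.gradient(dy, t)
    curvature = (dx * ddy - dy * ddx) / ((dx**2 + dy**2) ** 1.5 + 1e-12)
    return deltas[int(np.argmax(curvature))]


# Synthetic two-microphone example (not the measured RIRs used in the paper).
rng = np.random.default_rng(0)
num_mics, rir_len = 2, 32
filter_len = rir_len - 1  # makes H square for two channels
decay = np.exp(-0.15 * np.arange(rir_len))
true_rirs = [rng.standard_normal(rir_len) * decay for _ in range(num_mics)]

# Perturb the true RIRs to imitate estimation errors (assumed 1% error level).
est_rirs = []
for h in true_rirs:
    noise = rng.standard_normal(rir_len)
    est_rirs.append(h + 0.01 * np.linalg.norm(h) / np.sqrt(rir_len) * noise)

H_est = stacked_convolution_matrix(est_rirs, filter_len)
H_true = stacked_convolution_matrix(true_rirs, filter_len)
target = partial_target(est_rirs[0], H_est.shape[0], num_early_taps=8)

deltas = np.logspace(-9, 0, 40)
delta_auto = l_curve_corner(H_est, target, deltas)
g_weak = regularized_filters(H_est, target, deltas[0])
g_auto = regularized_filters(H_est, target, delta_auto)
equalized = H_true @ g_auto  # equalized response under the true RIRs

print("selected delta:", delta_auto)
print("filter norm (weak reg.):", np.linalg.norm(g_weak))
print("filter norm (L-curve reg.):", np.linalg.norm(g_auto))
```

The L-curve rule here needs only the estimated RIRs, which matches the non-intrusive setting described in the abstract. An intrusive choice would instead pick delta by scoring `equalized` against the true RIRs. Because the Tikhonov solution norm does not grow as delta increases, the selected filters never have more energy than the weakly regularized ones.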


Source journal: IEEE Transactions on Audio Speech and Language Processing (Engineering and Technology; Engineering: Electrical and Electronic)

Self-citation rate: 0.00%

Publication count: not given

Review time: 24.0 months

Journal description: The IEEE Transactions on Audio, Speech and Language Processing covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language. In particular, audio processing also covers auditory modeling, acoustic modeling and source separation. Speech processing also covers speech production and perception, adaptation, lexical modeling and speaker recognition. Language processing also covers spoken language understanding, translation, summarization, mining, general language modeling, as well as spoken dialog systems.