Room impulse response reshaping-based expectation–maximization in an underdetermined reverberant environment

IF 3.1 · Zone 3 (Computer Science) · JCR Q2 (Computer Science, Artificial Intelligence)
Yuan Xie, Tao Zou, Junjie Yang, Weijun Sun, Shengli Xie
Journal: Computer Speech and Language
Published: 2024-05-14 (journal article)
DOI: 10.1016/j.csl.2024.101664
URL: https://www.sciencedirect.com/science/article/pii/S0885230824000470
Citations: 0

Room impulse response reshaping-based expectation–maximization in an underdetermined reverberant environment

Source separation in an underdetermined reverberant environment is a challenging problem. The classical approach is based on the expectation–maximization (EM) algorithm, but its performance is limited in high-reverberation environments, where separation becomes poor or even invalid. To remove this restriction, a room impulse response (RIR) reshaping-based EM method is designed for source separation in underdetermined reverberant environments. First, an RIR reshaping technique is designed to eliminate the influence of audible echoes in the reverberant environment, improving the quality of the received signals. Then, a new mathematical model of the time-frequency mixture signals is established to reduce the approximation error of the model transformation caused by high reverberation. Finally, an improved EM method with real-time update rules for learning the model parameters is proposed, and the sources are separated using the estimators it provides. Experiments on speech and music mixtures show that the proposed algorithm achieves better separation performance and much better robustness than popular EM methods.
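
The abstract does not give the update equations. As background only, one common formulation of the classical EM family it refers to models each time-frequency bin as a zero-mean Gaussian mixture of sources and, in the E-step, estimates the sources with a Wiener-style posterior mean. The sketch below assumes that model; the function name `wiener_posterior_mean`, its parameters, and the toy values are hypothetical, and it is not the authors' improved method.

```python
import numpy as np

def wiener_posterior_mean(x, A, v, noise_var):
    """Posterior mean of J source coefficients given I < J mixture channels.

    Hypothetical helper for one time-frequency bin under a zero-mean
    Gaussian model: x = A @ s + b, s ~ N(0, diag(v)), b ~ N(0, noise_var * I).
    """
    n_channels = A.shape[0]
    # Mixture covariance implied by the model: A diag(v) A^H + noise_var * I
    sigma_x = (A * v) @ A.conj().T + noise_var * np.eye(n_channels)
    # Wiener estimate diag(v) A^H sigma_x^{-1} x, computed without an explicit inverse
    return v * (A.conj().T @ np.linalg.solve(sigma_x, x))

# Toy bin: 2 microphones, 3 sources (underdetermined); all values are made up.
A = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0j]])
v = np.array([1.0, 2.0, 0.5])
x = A @ np.array([1.0 + 1.0j, -2.0, 0.5j])
s_hat = wiener_posterior_mean(x, A, v, noise_var=1e-10)
```

With fewer channels than sources the true coefficients cannot be recovered exactly; the estimate only distributes the observed mixture among the sources according to their assumed variances, which is why the quality of the mixing model matters under strong reverberation.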

Source journal: Computer Speech and Language
Category: Engineering & Technology - Computer Science: Artificial Intelligence
CiteScore: 11.30
Self-citation rate: 4.70%
Articles published: 80
Review time: 22.9 weeks
About the journal: Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language. The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.