Variational Bayesian multi-channel robust NMF for human-voice enhancement with a deformable and partially-occluded microphone array

Yoshiaki Bando, Katsutoshi Itoyama, M. Konyo, S. Tadokoro, K. Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno
Published in: 2016 24th European Signal Processing Conference (EUSIPCO), 2016-11-28
DOI: 10.1109/EUSIPCO.2016.7760402 (https://doi.org/10.1109/EUSIPCO.2016.7760402)
Citations: 10

Abstract

This paper presents a human-voice enhancement method for a deformable and partially occluded microphone array. Microphone arrays distributed along the long bodies of hose-shaped rescue robots are crucial for finding victims under collapsed buildings, but the human voices they capture are contaminated by non-stationary actuator and friction noise. Standard blind source separation methods cannot be used because the relative microphone positions change over time and some microphones are occasionally shaded by rubble. To solve these problems, we develop a Bayesian model that separates multichannel amplitude spectrograms into sparse and low-rank components (human voice and noise) without using phase information, which depends on the array layout. The voice level at each microphone is estimated in a time-varying manner to reduce the influence of the shaded microphones. Experiments using a 3-m hose-shaped robot with eight microphones show that our method outperforms conventional methods by 2.7 dB in signal-to-noise ratio.
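The sparse plus low-rank idea in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the authors' variational Bayesian model: it is a single-channel, point-estimate robust NMF that splits one nonnegative amplitude spectrogram into a low-rank part (noise-like) and a sparse part (voice-like) using multiplicative updates on a squared-error cost with an L1 penalty. It omits the Bayesian priors, the coupling across microphones, and the time-varying per-microphone voice level. The function name `robust_nmf` and the parameters `rank`, `sparsity`, and `n_iter` are hypothetical choices made for this illustration.

```python
import numpy as np


def robust_nmf(Y, rank=4, sparsity=0.5, n_iter=200, eps=1e-12, seed=0):
    """Split a nonnegative matrix Y (freq x time) into low-rank W @ H plus sparse S.

    Illustrative stand-in only (assumed cost, not taken from the paper):
        0.5 * ||Y - (W @ H + S)||_F^2 + sparsity * sum(S)
    Multiplicative updates keep W, H and S nonnegative.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    W = rng.random((n_freq, rank)) + eps    # spectral bases of the low-rank part
    H = rng.random((rank, n_time)) + eps    # activations of the low-rank part
    S = rng.random((n_freq, n_time)) + eps  # sparse part
    costs = []
    for _ in range(n_iter):
        X = W @ H + S
        W *= (Y @ H.T) / (X @ H.T + eps)
        X = W @ H + S
        H *= (W.T @ Y) / (W.T @ X + eps)
        X = W @ H + S
        # The L1 penalty appears as the extra `sparsity` term in the denominator,
        # which shrinks S wherever the low-rank part already explains Y.
        S *= Y / (X + sparsity + eps)
        X = W @ H + S
        costs.append(0.5 * np.sum((Y - X) ** 2) + sparsity * np.sum(S))
    return W, H, S, costs


# Synthetic example: rank-3 "noise" plus a few isolated "voice" peaks.
rng = np.random.default_rng(1)
noise = rng.random((64, 3)) @ rng.random((3, 100))
voice = np.zeros((64, 100))
mask = rng.random((64, 100)) < 0.05
voice[mask] = 5.0 * rng.random(mask.sum())
W, H, S, costs = robust_nmf(noise + voice, rank=3)
print("cost after first and last iteration:", costs[0], costs[-1])
```

Because only amplitudes are used, a decomposition of this kind does not rely on inter-microphone phase, which is the property the abstract highlights for an array whose layout keeps changing.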