Voice source localization for automatic camera pointing system in videoconferencing

H. Wang, P. Chu
Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics
Published: 1997-10-19
DOI: 10.1109/ASPAA.1997.625639 (https://doi.org/10.1109/ASPAA.1997.625639)
Citations: 74

Abstract

This paper describes the voice source localization algorithm used in the PictureTel automatic camera pointing system (LimeLight™, dynamic speech locating technology). The system uses a microphone array 46 cm wide and 30 cm high that contains 4 microphones and is mounted on top of the monitor. The three-dimensional position of a sound source is calculated from the time delays of 4 pairs of microphones. In time delay estimation, averaging of the signal onsets in each frequency band is combined with phase correlation to reduce the influence of noise and reverberation. With this approach, a small microphone array can provide reliable three-dimensional voice source localization. Post-processing based on a priori knowledge is also introduced to eliminate the influence of reflections from furniture such as tables. Results of speech source localization under real conference room conditions are given. Some system-related issues are also discussed.
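The abstract names phase correlation as one ingredient of the time delay estimate. The following is a minimal sketch of that one ingredient only, for a single microphone pair, followed by a far-field conversion from delay to bearing angle. It is not the authors' implementation: the per-band onset averaging, the 3D solution from 4 microphone pairs, and the reflection post-processing are not shown. The function names, the 16 kHz sample rate, and the 343 m/s speed of sound are assumptions chosen for illustration; only the 46 cm array width comes from the abstract.

```python
import cmath
import math
import random


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def phase_correlation_delay(a, b):
    """Delay of b relative to a in samples (positive means b lags a).

    The cross-spectrum is normalized to unit magnitude so that only phase
    information contributes to the correlation peak.
    """
    n = len(a)
    spec_a, spec_b = fft(a), fft(b)
    cross = []
    for fa, fb in zip(spec_a, spec_b):
        c = fb * fa.conjugate()
        mag = abs(c)
        cross.append(c / mag if mag > 1e-12 else 0j)
    corr = [v.real / n for v in fft(cross, inverse=True)]
    peak = max(range(n), key=lambda i: corr[i])
    # Indices above n/2 wrap around to negative delays.
    return peak if peak <= n // 2 else peak - n


def delay_to_angle_deg(delay_samples, sample_rate_hz, spacing_m, speed_of_sound=343.0):
    """Far-field bearing (degrees from broadside) for one microphone pair.

    Assumes a plane wave; sample_rate_hz and speed_of_sound are illustrative.
    """
    path_difference = speed_of_sound * delay_samples / sample_rate_hz
    ratio = max(-1.0, min(1.0, path_difference / spacing_m))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    random.seed(0)
    a = [random.gauss(0.0, 1.0) for _ in range(256)]
    b = a[-5:] + a[:-5]  # b is a delayed by 5 samples (circular shift)
    d = phase_correlation_delay(a, b)
    print(d, delay_to_angle_deg(d, 16000.0, 0.46))
```

Discarding the magnitude of the cross-spectrum sharpens the correlation peak, which is the usual motivation for phase correlation in reverberant rooms. A real system would also need sub-sample interpolation of the peak, since at the assumed 16 kHz a one-sample step already corresponds to roughly 2 cm of path difference.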