2008 Hands-Free Speech Communication and Microphone Arrays: Latest Papers

A Comparative Study of Adaptation-Mode Control for Generalized Sidelobe Cancellers in Human-Robot Communication
2008 Hands-Free Speech Communication and Microphone Arrays Pub Date: 2008-05-06 DOI: 10.1109/HSCMA.2008.4538676
A. Sugiyama, Thanh Phong Hua, R. Le Bouquin Jeanne, G. Faucon
Abstract: This paper presents a comparative study of adaptation-mode control (AMC) for generalized sidelobe cancellers in human-robot communication. The performance of two recently proposed AMC structures, namely NBM-SLBM (nested blocking matrix-symmetric leaky blocking matrix) and M-SLBM (multiple symmetric leaky blocking matrix), is evaluated by computer simulations and in a real environment. The computer simulations show that M-SLBM outperforms NBM-SLBM. In the real environment, however, the performance of M-SLBM degrades. This degradation comes from unexpected tonal interference in a frequency range covered by an SLBM, which leads to errors. The choice between NBM-SLBM and M-SLBM therefore needs to be made according to the environment.
Citations: 0
A DOA Based Speaker Diarization System for Real Meetings
2008 Hands-Free Speech Communication and Microphone Arrays Pub Date: 2008-05-06 DOI: 10.1109/HSCMA.2008.4538680
S. Araki, M. Fujimoto, K. Ishizuka, H. Sawada, S. Makino
Abstract: This paper presents a speaker diarization system that estimates who spoke when in a meeting. The proposed system combines a noise-robust voice activity detector (VAD), a direction of arrival (DOA) estimator, and a DOA classifier. Our previous system used the generalized cross correlation method with the phase transform (GCC-PHAT) for DOA estimation. Because GCC-PHAT can estimate just one DOA per frame, it was difficult to handle speaker overlaps. This paper addresses the issue by employing a DOA at each time-frequency slot (TFDOA), and reports how this improves diarization performance for real meetings and conversations recorded in a room with a reverberation time of 350 ms.
Citations: 33
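The GCC-PHAT step named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the naive circular DFT (chosen only to keep the example dependency-free; a real system would use an FFT library), and the two-microphone far-field geometry are all assumptions made for illustration. It also shows why the method yields a single delay, and hence a single DOA, per frame: only the strongest correlation peak is kept.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (illustrative only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT matching dft() above."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def gcc_phat_delay(x1, x2, eps=1e-12):
    """Delay of x2 relative to x1, in samples, from one frame (circular GCC-PHAT)."""
    n = len(x1)
    s1, s2 = dft(x1), dft(x2)
    # Cross-power spectrum, then phase transform: keep phase, discard magnitude.
    cross = [b * a.conjugate() for a, b in zip(s1, s2)]
    whitened = [c / (abs(c) + eps) for c in cross]
    correlation = [v.real for v in idft(whitened)]
    # Only the single strongest peak is kept, so one delay per frame.
    peak = max(range(n), key=lambda i: correlation[i])
    return peak if peak <= n // 2 else peak - n


def delay_to_doa_degrees(delay_samples, sample_rate_hz, mic_spacing_m,
                         speed_of_sound=343.0):
    """Far-field DOA for a two-microphone pair (assumed geometry)."""
    sin_theta = speed_of_sound * delay_samples / (sample_rate_hz * mic_spacing_m)
    sin_theta = max(-1.0, min(1.0, sin_theta))  # guard against rounding overshoot
    return math.degrees(math.asin(sin_theta))
```

Calling `gcc_phat_delay` on a frame and its circularly shifted copy returns the shift in samples; `delay_to_doa_degrees` then maps that delay to an angle under the assumed microphone spacing and sample rate.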
Directional Audio Coding Using Planar Microphone Arrays
2008 Hands-Free Speech Communication and Microphone Arrays Pub Date: 2008-05-06 DOI: 10.1109/HSCMA.2008.4538682
F. Kuech, M. Kallinger, R. Schultz-Amling, G. del Galdo, J. Ahonen, V. Pulkki
Abstract: Multichannel sound systems are becoming more and more established in modern audio applications. Consequently, the recording and reproduction of spatial audio is gaining increasing attention. Directional Audio Coding (DirAC) represents an efficient approach to analyze spatial sound and to reproduce it using arbitrary loudspeaker configurations. In DirAC, the direction-of-arrival and the diffuseness of sound within frequency subbands are used to encode the spatial properties of the observed sound field. The estimation of these parameters is based on an energetic sound field analysis using three-dimensional microphone arrays. In practice, however, physical design constraints often make three-dimensional microphone configurations unacceptable. In this paper, we consider a new approach to microphone array processing that allows for an estimation of both the direction-of-arrival of sound and the diffuseness based on planar microphone configurations. The performance of the proposed method is evaluated via simulations and real measured data.
Citations: 11
Fast Dereverberation for Hands-Free Speech Recognition
2008 Hands-Free Speech Communication and Microphone Arrays Pub Date: 2008-05-06 DOI: 10.1109/HSCMA.2008.4538706
R. Gomez, J. Even, H. Saruwatari, K. Shikano
Abstract: A robust dereverberation technique for real-time hands-free speech recognition applications is proposed. Real-time implementation is made possible by avoiding time-consuming blind estimation. Instead, we use the impulse response by effectively identifying its late reflection components. Using this information, together with the concept of Spectral Subtraction (SS), we were able to remove the effects of the late reflection of the reverberant signal. After dereverberation, only the effects of the early component are left and used as input to the recognizer. In this method, multi-band SS is used in order to compensate for the error arising from approximation. We also introduce a training strategy that optimizes the values of the multi-band coefficients to minimize the error.
Citations: 18
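As a rough illustration of the multi-band spectral subtraction idea in the abstract above, the sketch below subtracts a scaled late-reflection power estimate from the reverberant power spectrum of one frame, using one coefficient per frequency band. The function name, the band layout, the coefficient values, and the spectral floor are hypothetical; the abstract does not specify them, and the paper's trained coefficients are not reproduced here.

```python
def multiband_spectral_subtraction(power, late_power, band_edges, alphas, floor=0.01):
    """Subtract late-reflection power per band for a single frame.

    power       -- per-bin power of the reverberant frame
    late_power  -- per-bin estimate of the late-reflection power
    band_edges  -- list of (start, stop) bin index pairs, stop exclusive
    alphas      -- one subtraction coefficient per band (assumed values)
    floor       -- fraction of the input power kept as a lower bound
    """
    cleaned = list(power)  # bins outside every band pass through unchanged
    for (start, stop), alpha in zip(band_edges, alphas):
        for k in range(start, stop):
            residual = power[k] - alpha * late_power[k]
            # Flooring avoids negative power after over-subtraction.
            cleaned[k] = max(residual, floor * power[k])
    return cleaned
```

For example, with two bands of two bins each and coefficients `[1.0, 0.5]`, the second band is subtracted half as strongly as the first. In a trained system the coefficients would be chosen per band to minimize the approximation error the abstract mentions.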
Maximum Likelihood Detector of Reliable Direction-of-Arrival Estimate
2008 Hands-Free Speech Communication and Microphone Arrays Pub Date: 2008-05-06 DOI: 10.1109/HSCMA.2008.4538691
Seungil Kim, G. Song, Hyejeong Jeon, Lag-Yong Kim
Abstract: In this paper, we propose a maximum likelihood detector for a reliable sound source localization system. It is based on a measure of the reliability of estimation results. The reliability can be reduced by the waterbed effect of the source localization algorithm. If the calculated reliability measure is lower than a predefined threshold, the estimated direction-of-arrival (DOA) is regarded as a wrong result and subsequently discarded. We determine the threshold for reliable estimate selection using the maximum likelihood rule. Experiments show that the proposed method can reject perturbed results of the estimated DOA.
Citations: 0
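The thresholding idea in the abstract above can be sketched under strong simplifying assumptions: the reliability measure is modelled as Gaussian under each of two classes ("reliable" and "unreliable"), and an estimate is kept when the "reliable" likelihood is larger. The class means and standard deviations below are made-up placeholders, and the Gaussian model itself is an assumption, not something the abstract states. With equal standard deviations the rule reduces to a single threshold at the midpoint of the two means.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def is_reliable(measure, reliable=(0.8, 0.1), unreliable=(0.3, 0.1)):
    """Maximum likelihood decision between two assumed Gaussian classes.

    Each class is a (mean, std) pair; the defaults are illustrative only.
    """
    return gaussian_pdf(measure, *reliable) > gaussian_pdf(measure, *unreliable)


def filter_doas(estimates):
    """Keep DOA values whose reliability measure passes the ML test.

    estimates -- iterable of (doa_degrees, reliability_measure) pairs
    """
    return [doa for doa, measure in estimates if is_reliable(measure)]
```

With the default placeholders the implied threshold is 0.55, so `filter_doas` drops any estimate whose reliability measure falls below that value.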
Enhanced Direction Estimation Using Microphone Arrays for Directional Audio Coding
2008 Hands-Free Speech Communication and Microphone Arrays Pub Date: 2008-05-06 DOI: 10.1109/HSCMA.2008.4538684
M. Kallinger, F. Kuech, R. Schultz-Amling, G. del Galdo, J. Ahonen, V. Pulkki
Abstract: Modern home entertainment systems offer surround sound audio playback. This progress over known mono and stereo devices is also intended for high-quality hands-free telephony to enhance the intelligibility of speech in group conversation. Directional Audio Coding (DirAC) provides an efficient and well-established way to record and encode spatial sound and to render it at an arbitrary loudspeaker setup. On the recording side, DirAC is based on B-format microphone signals. These signals can be obtained by one omnidirectional and three figure-of-eight microphones pointing along the axes of a three-dimensional Cartesian coordinate system. However, a grid of omnidirectional microphones is more appropriate for consumer applications for economic reasons. Arrays can provide the required figure-of-eight directionality only for a certain frequency range. In this contribution we show that a straightforward direction estimator is biased. After formulating the bias analytically, we propose an unbiased estimator and derive the theoretical limits for unique direction estimation. The results are illustrated by means of simulations and measurements.
Citations: 13
Speech Separation Using an Adaptive Sparse Dictionary Algorithm
2008 Hands-Free Speech Communication and Microphone Arrays Pub Date: 2008-05-06 DOI: 10.1109/HSCMA.2008.4538679
M. Jafari, Mark D. Plumbley, M. Davies
Abstract: We present a greedy adaptive algorithm that builds a sparse orthogonal dictionary from the observed data. In this paper, the algorithm is used to separate stereo speech signals, and the phase information that is inherent to the extracted atom pairs is used for clustering and identification of the original sources. The performance of the algorithm is compared to that of the adaptive stereo basis algorithm when the sources are mixed in echoic and anechoic environments. We find that the algorithm correctly separates the sources, and can do this even with a relatively small number of atoms.
Citations: 9
System Identification for Multi-Channel Listening-Room Compensation Using an Acoustic Echo Canceller
2008 Hands-Free Speech Communication and Microphone Arrays Pub Date: 2008-05-06 DOI: 10.1109/HSCMA.2008.4538727
Stefan Goetze, M. Kallinger, A. Mertins, K. Kammeyer
Abstract: Modern hands-free telecommunication devices jointly apply several subsystems, e.g. for noise reduction (NR), acoustic echo cancellation (AEC) and listening-room compensation (LRC). In this contribution, the combination of an equalizer for listening-room compensation and an acoustic echo canceller is analyzed. Inverse filtering of room impulse responses (RIRs) is a challenging task since they are, in general, mixed-phase systems having hundreds of zeros close to the unit circle, both inside and outside it, in the z-domain. Furthermore, a reliable estimate of the RIR to be inverted is important. Since RIRs are time-variant due to possible changes of the acoustic environment, they have to be identified adaptively. If an AEC (or any other adaptive method) is used to identify the time-variant room impulse responses, the distance between the estimate and the real RIRs may be too large for a satisfying equalization, especially in periods of initial convergence of the AEC or after RIR changes. Therefore, we propose to estimate the convergence state of the AEC and to incorporate this knowledge into the equalizer design.
Citations: 11
Blind Estimation and Suppression of Late Reverberation Utilising Auditory Masking
2008 Hands-Free Speech Communication and Microphone Arrays Pub Date: 2008-05-06 DOI: 10.1109/HSCMA.2008.4538723
A. Tsilfidis, J. Mourjopoulos, D. Tsoukalas
Abstract: A new method for blind estimation and suppression of late reverberation of speech signals is presented. The proposed algorithm consists of two steps. In the first step, the reverberation time is blindly determined from the reverberant signal. Then, an approximation of the power spectrum of late reverberation is subtracted from the power spectrum of the reverberant signal, which gives a preliminary estimate of the anechoic speech spectrum. In the second step, the auditory masking threshold of the clean spectrum estimate is calculated and used to define the coefficients of a nonlinear filter for the reverberant signal, which produces the final enhanced speech signal. The performance of the algorithm is demonstrated on artificially generated signals. Subjective tests are conducted, and their results indicate that the quality of the speech signals obtained by the proposed method is superior to that of previous methods.
Citations: 6
Integration of Phoneme-Subspaces Using ICA for Speech Feature Extraction and Recognition
2008 Hands-Free Speech Communication and Microphone Arrays Pub Date: 2008-05-06 DOI: 10.1109/HSCMA.2008.4538708
Hyunsin Park, T. Takiguchi, Y. Ariki
Abstract: In our previous work, the use of PCA instead of DCT showed robustness in distorted speech recognition because the main speech element is projected onto low-order features, while the noise or distortion element is projected onto high-order features [1]. This paper introduces a new feature extraction technique that collects the correlation information among phoneme subspaces so that their elements are statistically mutually independent. The proposed speech feature vector is generated by projecting the observed vector onto an integrated space obtained by PCA and ICA. The performance evaluation shows that the proposed method provides higher isolated word recognition accuracy than conventional methods in some reverberant conditions.
Citations: 6