Robust digit recognition using phase-dependent time-frequency masking

2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). Pub Date: 2003-07-06. DOI: 10.1109/ICASSP.2003.1198873

Guangji Shi, P. Aarabi

Citations: 21

Abstract

A technique is proposed that uses the time-frequency phase information from two microphones, together with the time delay of arrival (TDOA) of the signal of interest, to estimate an ideal time-frequency mask. At a signal-to-noise ratio (SNR) of 0 dB, the proposed two-microphone technique achieves a digit recognition rate of 71% (averaged over 5 speakers, each speaking 20-30 digits). In contrast, delay-and-sum beamforming achieves only a 40% recognition rate with two microphones and 60% with four microphones. Superdirective beamforming achieves a 44% recognition rate with two microphones and 65% with four microphones.
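The abstract does not state the mask formula, so the following is only a minimal sketch of how a phase-dependent time-frequency mask could be computed from two microphone spectrograms. The function name `phase_error_mask`, the sharpness parameter `gamma`, the soft-mask form `1 / (1 + gamma * error^2)`, and the synthetic test signals are all assumptions made for illustration and are not taken from the paper. The sketch assumes the TDOA of the target is already known.

```python
import numpy as np


def phase_error_mask(X1, X2, freqs_hz, tdoa_s, gamma=5.0):
    """Soft time-frequency mask from inter-microphone phase error (illustrative).

    X1, X2   : complex STFTs of the two microphones, shape (n_freqs, n_frames)
    freqs_hz : centre frequency of each STFT bin, shape (n_freqs,)
    tdoa_s   : assumed delay of the target at microphone 2 relative to microphone 1
    gamma    : hypothetical sharpness parameter (larger means harsher masking)
    """
    # Phase rotation that a source with this TDOA produces in X1 * conj(X2).
    expected = np.exp(1j * 2.0 * np.pi * freqs_hz * tdoa_s)[:, None]
    # Remove the expected rotation; the remaining angle is the phase error,
    # which np.angle returns wrapped to [-pi, pi].
    phase_error = np.angle(X1 * np.conj(X2) * np.conj(expected))
    # Bins whose phase matches the target keep a gain near 1; others are attenuated.
    return 1.0 / (1.0 + gamma * phase_error ** 2)


# Synthetic check: one source reaching microphone 2 with a 0.25 ms delay.
rng = np.random.default_rng(0)
freqs = np.linspace(0.0, 8000.0, 257)
target_tdoa = 2.5e-4
S = rng.standard_normal((257, 10)) + 1j * rng.standard_normal((257, 10))
X1 = S
X2 = S * np.exp(-1j * 2.0 * np.pi * freqs * target_tdoa)[:, None]

mask_on = phase_error_mask(X1, X2, freqs, target_tdoa)    # steered at the source
mask_off = phase_error_mask(X1, X2, freqs, -target_tdoa)  # steered elsewhere
enhanced = mask_on * X1                                   # masked spectrogram
```

When the assumed TDOA matches the source, the phase error vanishes and the mask passes every bin; when it is steered elsewhere, most bins are attenuated. In a real system the masked spectrogram would be inverted back to a waveform, or converted to features, before being passed to the recognizer.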
