Speaker Recognition using Speaker-independent Universal Acoustic Model and Synchronous Sensing for Business Microscope

2009 4th International Symposium on Wireless Pervasive Computing. Pub Date: 2009-02-11. DOI: 10.1109/ISWPC.2009.4800609

J. Nishimura, T. Kuroda

Citations: 3

Abstract

"Business Microscope" visualizes interactions among knowledge workers in organization by sensing their face-to-face communication using sensornet. To analyze the workers communication in detail, speaker recognition for each node is needed. In the conventional studies, specific speaker-dependent training samples and acoustic model are required to recognize each speaker. In this work, speaker recognition using speaker-independent universal acoustic model is proposed. This method utilizes synchronous sensing of sensornet to extract the cepstral difference in acoustic channel and allows all speakers in the system to use same single acoustic model. The universal acoustic model constructed from 41 channel filterbank MFCC and large-sized LBG codebook achieved speaker recognition accuracy of 97.32% on test inputs of 0.2s for four speakers. With the synchronization error (≪ 120ms) among sensor nodes, the drop in recognition accuracy of less than 2 pts is observed.


