Speaker Recognition using Speaker-independent Universal Acoustic Model and Synchronous Sensing for Business Microscope

J. Nishimura, T. Kuroda
Published in: 2009 4th International Symposium on Wireless Pervasive Computing (2009-02-11)
DOI: 10.1109/ISWPC.2009.4800609 (https://doi.org/10.1109/ISWPC.2009.4800609)
Citations: 3

Abstract

"Business Microscope" visualizes interactions among knowledge workers in an organization by sensing their face-to-face communication with a sensornet. To analyze the workers' communication in detail, speaker recognition is needed at each node. Conventional approaches require speaker-dependent training samples and a separate acoustic model to recognize each speaker. This work proposes speaker recognition using a speaker-independent universal acoustic model. The method uses the synchronous sensing of the sensornet to extract the cepstral difference between acoustic channels, which allows all speakers in the system to share a single acoustic model. The universal acoustic model, constructed from 41-channel filterbank MFCCs and a large LBG codebook, achieved a speaker recognition accuracy of 97.32% on test inputs of 0.2 s for four speakers. With a synchronization error (≪ 120 ms) among sensor nodes, the observed drop in recognition accuracy is less than 2 points.