{"title":"Speaker Recognition using Speaker-independent Universal Acoustic Model and Synchronous Sensing for Business Microscope","authors":"J. Nishimura, T. Kuroda","doi":"10.1109/ISWPC.2009.4800609","DOIUrl":null,"url":null,"abstract":"\"Business Microscope\" visualizes interactions among knowledge workers in organization by sensing their face-to-face communication using sensornet. To analyze the workers communication in detail, speaker recognition for each node is needed. In the conventional studies, specific speaker-dependent training samples and acoustic model are required to recognize each speaker. In this work, speaker recognition using speaker-independent universal acoustic model is proposed. This method utilizes synchronous sensing of sensornet to extract the cepstral difference in acoustic channel and allows all speakers in the system to use same single acoustic model. The universal acoustic model constructed from 41 channel filterbank MFCC and large-sized LBG codebook achieved speaker recognition accuracy of 97.32% on test inputs of 0.2s for four speakers. With the synchronization error (≪ 120ms) among sensor nodes, the drop in recognition accuracy of less than 2 pts is observed.","PeriodicalId":383593,"journal":{"name":"2009 4th International Symposium on Wireless Pervasive Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 4th International Symposium on Wireless Pervasive Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISWPC.2009.4800609","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
"Business Microscope" visualizes interactions among knowledge workers in organization by sensing their face-to-face communication using sensornet. To analyze the workers communication in detail, speaker recognition for each node is needed. In the conventional studies, specific speaker-dependent training samples and acoustic model are required to recognize each speaker. In this work, speaker recognition using speaker-independent universal acoustic model is proposed. This method utilizes synchronous sensing of sensornet to extract the cepstral difference in acoustic channel and allows all speakers in the system to use same single acoustic model. The universal acoustic model constructed from 41 channel filterbank MFCC and large-sized LBG codebook achieved speaker recognition accuracy of 97.32% on test inputs of 0.2s for four speakers. With the synchronization error (≪ 120ms) among sensor nodes, the drop in recognition accuracy of less than 2 pts is observed.