2006 IEEE Odyssey - The Speaker and Language Recognition Workshop: Latest Papers

NIST Speaker Recognition Evaluation Chronicles - Part 2
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248120
Mark A. Przybocki, Alvin F. Martin, Audrey N. Le
{"title":"NIST Speaker Recognition Evaluation Chronicles - Part 2","authors":"Mark A. Przybocki, Alvin F. Martin, Audrey N. Le","doi":"10.1109/ODYSSEY.2006.248120","DOIUrl":"https://doi.org/10.1109/ODYSSEY.2006.248120","url":null,"abstract":"NIST has coordinated annual evaluations of text-independent speaker recognition since 1996. This update to an Odyssey 2004 paper concentrates on the past two years of the NIST evaluations. We discuss in particular the results of the 2004 and 2005 evaluations, and how they compare to earlier evaluation results. We also discuss the preparation and planning for the 2006 evaluation, which concludes with the evaluation workshop in San Juan, Puerto Rico, in June 2006","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131020430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 69
An Evaluation of "Commercial Off-The-Shelf" Speaker Verification Systems
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248085
M. Wagner, C. Summerfield, T. Dunstone, R. Summerfield, J. Moss
{"title":"An Evaluation of \"Commercial Off-The-Shelf\" Speaker Verification Systems","authors":"M. Wagner, C. Summerfield, T. Dunstone, R. Summerfield, J. Moss","doi":"10.1109/ODYSSEY.2006.248085","DOIUrl":"https://doi.org/10.1109/ODYSSEY.2006.248085","url":null,"abstract":"An evaluation of commercial off-the-shelf speaker verification systems is reported. The performance of several systems, which were offered for testing, is analyzed against criteria designed to identify strengths and weaknesses that would determine their suitability for the use by government service agencies. Results for three text-dependent systems by Nuance, Persay and Scansoft are presented in this paper","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127840003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
The 2005 AFRL/HEC One-Speaker Detection Systems
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248119
Raymond E. Slyh, Eric G. Hansen, Brian M. Ore
{"title":"The 2005 AFRL/HEC One-Speaker Detection Systems","authors":"Raymond E. Slyh, Eric G. Hansen, Brian M. Ore","doi":"10.1109/ODYSSEY.2006.248119","DOIUrl":"https://doi.org/10.1109/ODYSSEY.2006.248119","url":null,"abstract":"This paper describes the one-speaker detection systems submitted by AFRL/HEC for several of the training and testing conditions in the 2005 NIST speaker recognition evaluation. For each condition, the overall system score was the weighted combination of scores from several component systems. The component systems were based on (1) mel-frequency cepstral coefficients (MFCCs) and Gaussian mixture models (GMMs); (2) MFCCs and phoneme-specific GMMs (PS-GMMs); (3) linear-prediction-based cepstral coefficients (LPCCs) from closed-phase analysis; (4) formant center frequencies, formant bandwidths, and fundamental frequency (FMBWF0); and (5) word language modeling (WLM). The score combination was done using single-layer perceptrons, with the grouping of the component systems depending on the lengths of the training and testing files. For some of the testing and/or training conditions involving ten-second speech files, the system performance improved from the inclusion of the FMBWF0 and LPCC systems, while the MFCC/PS-GMM system provided additional benefits in the one-conversation testing conditions involving larger amounts of training data","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126146240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
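The AFRL/HEC abstract describes combining component-system scores through a single-layer perceptron. A minimal sketch of that idea follows, assuming a logistic output unit; the function name `fuse_scores`, the weights, the bias, and the example scores are all hypothetical illustrations and are not taken from the paper.

```python
import math


def fuse_scores(component_scores, weights, bias=0.0):
    """Fuse per-system scores with a single-layer perceptron.

    Assumption: a logistic (sigmoid) output unit; the paper's abstract does
    not state the activation function or how the weights were trained.
    """
    if len(component_scores) != len(weights):
        raise ValueError("one weight per component score is required")
    # Weighted combination of the component scores plus a bias term.
    activation = bias + sum(w * s for w, s in zip(weights, component_scores))
    # Squash the activation into (0, 1) so it can be thresholded.
    return 1.0 / (1.0 + math.exp(-activation))


# Hypothetical scores for one trial from five component systems
# (for example MFCC/GMM, PS-GMM, LPCC, FMBWF0, WLM) and made-up weights.
scores = [1.2, 0.4, -0.3, 0.1, 0.8]
weights = [0.5, 0.2, 0.1, 0.1, 0.1]
fused = fuse_scores(scores, weights)
```

In practice the weights would be learned from held-out trials, and a separate perceptron could be kept for each grouping of training and testing lengths, as the abstract suggests.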
A Weighted Measure of Similarity for Speaker Tracking
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248124
Mikaël Collet, Delphine Charlet, F. Bimbot
{"title":"A Weighted Measure of Similarity for Speaker Tracking","authors":"Mikaël Collet, Delphine Charlet, F. Bimbot","doi":"10.1109/ODYSSEY.2006.248124","DOIUrl":"https://doi.org/10.1109/ODYSSEY.2006.248124","url":null,"abstract":"In this paper, we present a speaker tracking system entirely based on anchor models approach. This speaker tracking system is combined with a speaker clustering module which improves speaker detection performances. However, speaker clustering errors generate new speaker detection errors due to segments grouped into wrong clusters. The aim of this article is to introduce a weighted measure of similarity between target speakers and segments based on a measure of similarity between segments belonging to a same cluster. Evaluation is done on the audio database of the ESTER evaluation campaign for the rich transcription of French broadcast news. Results show that the weighted measure of similarity improved the speaker tracking performances. This improvement manifests itself as an improved precision rate on segments grouped into wrong clusters","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124572748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
UBM-GMM Driven Discriminative Approach for Speaker Verification
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248127
N. Scheffer, J. Bonastre
{"title":"UBM-GMM Driven Discriminative Approach for Speaker Verification","authors":"N. Scheffer, J. Bonastre","doi":"10.1109/ODYSSEY.2006.248127","DOIUrl":"https://doi.org/10.1109/ODYSSEY.2006.248127","url":null,"abstract":"In the past few years, discriminative approaches to perform speaker detection have shown good results and an increasing interest. Among these methods, SVM based systems have lots of advantages, especially their ability to deal with a high dimension feature space. Generative systems such as UBM-GMM systems show the greatest performance among other systems in speaker verification tasks. Combination of generative and discriminative approaches is not a new idea and has been studied several times by mapping a whole speech utterance onto a fixed length vector. This paper presents a straight-forward, cost friendly method to combine the two approaches with the use of a UBM model only to drive the experiment. We show that the use of the TFLLR kernel, while closely related to a reduced form of the Fisher mapping, implies a performance that is close to a standard GMM/UBM based speaker detection system. Moreover, we show that a combination of both outperforms the systems taken independently","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"269 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121334180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Addressing Channel Mismatch through Speaker Discriminative Transforms
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248111
Jason W. Pelecanos, Jirí Navrátil, G. Ramaswamy
{"title":"Addressing Channel Mismatch through Speaker Discriminative Transforms","authors":"Jason W. Pelecanos, Jirí Navrátil, G. Ramaswamy","doi":"10.1109/ODYSSEY.2006.248111","DOIUrl":"https://doi.org/10.1109/ODYSSEY.2006.248111","url":null,"abstract":"This paper presents a discriminative criterion applied to Gaussian mixture models (GMMs) to reduce handset mismatch. The criterion is related to the log-likelihood-ratio (LLR) scoring approach commonly used in GMMs for speaker recognition. The algorithm attempts to perform a direct mapping of features from one channel type to an assumed undistorted target channel but with the goal of maximizing speaker discrimination using the transform. The transform attempts to maximize the posterior probability of a group of speaker models given their corresponding speech observations recorded on a different channel","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126726755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Improved Multi-Modal Recognition Interface for Intelligent HCI Based on Speech and the KSSL Recognition
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248108
Jung-hyun Kim, Kwang-seok Hong
{"title":"Improved Multi-Modal Recognition Interface for Intelligent HCI Based on Speech and the KSSL Recognition","authors":"Jung-hyun Kim, Kwang-seok Hong","doi":"10.1109/ODYSSEY.2006.248108","DOIUrl":"https://doi.org/10.1109/ODYSSEY.2006.248108","url":null,"abstract":"Desktop PC and wire communications net-based traditional studies on pattern recognition and multimodal interaction have some restrictions (e.g. limitation of motion, conditionality in space and so on) and general problems according to using of the vision technologies for recognition and representation of the haptic-gesture information. In this paper, we propose and implement multi-modal recognition interface (hereinafter, MMRI) integrating speech using voice-XML based on WWW and the post wearable PC-based gesture, it have purposes that recognizes and represents the Korean Standard Sign Language (hereinafter, KSSL) which is a dialogue system and interactive elements in the Korean deaf communities. The advantages of our approach are as follows: 1) it improves efficiency of the MMRI input module according to the technology of wireless communication, 2) it shows higher recognition performance than uni-modal recognition system (using gesture or speech), 3) it recognizes and represents continuous sign language of users with flexibility in real time and can offer to user a wider range of personalized and differentiated information using the MMRI more effectively. Experimental results, the MMRI deduces an average recognition rate of 96.1% about significant, dynamic and continuous the KSSL and speech of various users","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134610216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Speaker Characterization with MLSFs
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248113
Hugo Cordeiro, C. Ribeiro
{"title":"Speaker Characterization with MLSFs","authors":"Hugo Cordeiro, C. Ribeiro","doi":"10.1109/ODYSSEY.2006.248113","DOIUrl":"https://doi.org/10.1109/ODYSSEY.2006.248113","url":null,"abstract":"The work described in this paper concerns the analysis of an alternative feature for speaker characterization, in the context of speaker recognition: line spectrum frequencies (LSF), but derived from mel-filter bank energies. This new feature, that we denominate mel-LSFs (MLSFs), shows similar performance comparing to MFCCs for male speakers, one of the most common feature found in speaker recognition, but for female speakers MLSFs performs better than MFCCs. When combined with mel-LSFs differences, MLSFs feature overcomes the performance of the MFCCs for male and female speakers, even with temporal delta, ΔMFCCs, included. Performance is measured in the context of speaker verification, using EER and minimum HTER. Detection error threshold (DET) curves are also presented, as well as HTER curves. The main objective of this study is to compare different features performances with a common framework, from what a standard support vector machine recogniser was developed. Tests are based on the cellular component of the \"2002 NIST Speaker Recognition Evaluation Corpus\"","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116124508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
How to Deal with Multiple-Targets in Speaker Identification Systems?
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248116
Y. Zigel, M. Wasserblat
{"title":"How to Deal with Multiple-Targets in Speaker Identification Systems?","authors":"Y. Zigel, M. Wasserblat","doi":"10.1109/ODYSSEY.2006.248116","DOIUrl":"https://doi.org/10.1109/ODYSSEY.2006.248116","url":null,"abstract":"In open-set speaker identification systems a known phenomenon is that the false alarm (accept) error rate increases dramatically when increasing the number of registered speakers (models). In this paper, we demonstrate this phenomenon and suggest a solution using a new model-dependent score-normalization technique, called top-norm. The top-norm method was specifically developed to improve results of open-set speaker identification systems. Also, we suggest a score-normalization parameter adaptation technique. Experiments performed using speaker recognition corpora are described and demonstrate that the new method outperforms other normalization methods","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"148 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116865344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
Experiments in Speaker Adaptation for Factor Analysis Based Speaker Verification
2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248130
Shou-Chun Yin, P. Kenny, R. Rose
{"title":"Experiments in Speaker Adaptation for Factor Analysis Based Speaker Verification","authors":"Shou-Chun Yin, P. Kenny, R. Rose","doi":"10.1109/ODYSSEY.2006.248130","DOIUrl":"https://doi.org/10.1109/ODYSSEY.2006.248130","url":null,"abstract":"This paper presents methods for supervised and unsupervised speaker adaptation of Gaussian mixture speaker models in text-independent speaker verification. The methods are based on an approach which is able to decompose speaker and channel variability so that progressive updating of speaker models can be performed while minimizing the influence of the channel variability associated with the adaptation utterances. This approach relies on a joint factor analysis model of intrinsic speaker variability and session variability where inter-session variation is assumed to result primarily from the effects of the channel. These adaptation methods have been evaluated under the adaptation paradigm defined under the NIST 2005 speaker recognition evaluation plan which is based on conversational telephone speech. It was found that when both target speaker model training and speaker verification trials were performed using a five minute excerpt from a single conversation, an equal error rate (EER) of 4.5% and minimum detection cost function (DCF) of 0.013 were obtained when performing unsupervised speaker adaptation during evaluation. It will be shown that this performance is comparable to that obtained by state of the art speaker verification systems that rely on a larger set of features and are trained from as many as eight conversations from the target speaker","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114693445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
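Several abstracts in this listing report an equal error rate (EER) and a minimum detection cost function (DCF). The sketch below shows one simple way to compute both from target and non-target trial scores. The helper names and the example scores are hypothetical, and the cost parameters (`c_miss=10`, `c_fa=1`, `p_target=0.01`) are assumed values commonly quoted for NIST evaluations of that period, not figures taken from any entry here.

```python
def error_rates(target_scores, nontarget_scores):
    """Return (threshold, p_miss, p_fa) for every candidate threshold.

    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(thresholds[-1] + 1.0)  # extra point that rejects everything
    points = []
    for t in thresholds:
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        points.append((t, p_miss, p_fa))
    return points


def equal_error_rate(points):
    """Approximate the EER at the threshold where miss and false-alarm rates are closest."""
    _, p_miss, p_fa = min(points, key=lambda p: abs(p[1] - p[2]))
    return (p_miss + p_fa) / 2.0


def min_dcf(points, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Minimum of the (unnormalized) detection cost over all thresholds.

    The default cost parameters are assumptions, see the note above.
    """
    return min(
        c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
        for _, p_miss, p_fa in points
    )


# Hypothetical scores for a handful of trials.
targets = [2.0, 1.5, 0.9, 0.2]
nontargets = [1.0, 0.3, -0.5, -1.2]
points = error_rates(targets, nontargets)
eer = equal_error_rate(points)
dcf = min_dcf(points)
```

With so few trials the EER is only a coarse estimate; evaluation tooling typically interpolates along the DET curve instead of taking the nearest threshold.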