2006 IEEE Odyssey - The Speaker and Language Recognition Workshop: Latest Papers (Page 2)

NIST Speaker Recognition Evaluation Chronicles - Part 2

2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248120

Mark A. Przybocki, Alvin F. Martin, Audrey N. Le

Citations: 69

An Evaluation of "Commercial Off-The-Shelf" Speaker Verification Systems

2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248085

M. Wagner, C. Summerfield, T. Dunstone, R. Summerfield, J. Moss

Citations: 6

The 2005 AFRL/HEC One-Speaker Detection Systems

2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248119

Raymond E. Slyh, Eric G. Hansen, Brian M. Ore

Abstract: This paper describes the one-speaker detection systems submitted by AFRL/HEC for several of the training and testing conditions in the 2005 NIST speaker recognition evaluation. For each condition, the overall system score was the weighted combination of scores from several component systems. The component systems were based on (1) mel-frequency cepstral coefficients (MFCCs) and Gaussian mixture models (GMMs); (2) MFCCs and phoneme-specific GMMs (PS-GMMs); (3) linear-prediction-based cepstral coefficients (LPCCs) from closed-phase analysis; (4) formant center frequencies, formant bandwidths, and fundamental frequency (FMBWF0); and (5) word language modeling (WLM). The score combination was done using single-layer perceptrons, with the grouping of the component systems depending on the lengths of the training and testing files. For some of the testing and/or training conditions involving ten-second speech files, the system performance improved from the inclusion of the FMBWF0 and LPCC systems, while the MFCC/PS-GMM system provided additional benefits in the one-conversation testing conditions involving larger amounts of training data.

Citations: 0
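
The score combination mentioned in the AFRL/HEC abstract above (a weighted combination of component-system scores produced by a single-layer perceptron) can be illustrated with a minimal sketch. The abstract does not give the activation function, training rule, or data, so the sigmoid output, the gradient-descent update, the names `fuse` and `train`, and the toy scores below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def fuse(scores, weights, bias):
    """Single-layer perceptron: sigmoid of a weighted sum of component scores."""
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))


def train(trials, labels, lr=0.5, epochs=500):
    """Fit the fusion weights by gradient descent on the logistic loss.

    The training rule is an assumption; the abstract only says that
    single-layer perceptrons were used.
    """
    n = len(trials[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(trials, labels):
            err = fuse(x, weights, bias) - y  # gradient of the loss w.r.t. z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Hypothetical toy data: two component-system scores per trial,
# label 1 = target-speaker trial, label 0 = impostor trial.
trials = [[2.0, 1.5], [1.0, 2.5], [-1.0, -0.5], [-2.0, 0.0]]
labels = [1, 1, 0, 0]

weights, bias = train(trials, labels)
fused = [fuse(x, weights, bias) for x in trials]
```

In this sketch each fused score lies between 0 and 1, and a separate perceptron could be trained for each grouping of component systems, in the spirit of the abstract's remark that the grouping depended on the training and testing file lengths.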

A Weighted Measure of Similarity for Speaker Tracking

2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248124

Mikaël Collet, Delphine Charlet, F. Bimbot

Citations: 1

UBM-GMM Driven Discriminative Approach for Speaker Verification

2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248127

N. Scheffer, J. Bonastre

Citations: 14

Addressing Channel Mismatch through Speaker Discriminative Transforms

2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248111

Jason W. Pelecanos, Jiří Navrátil, G. Ramaswamy

Citations: 3

Improved Multi-Modal Recognition Interface for Intelligent HCI Based on Speech and the KSSL Recognition

2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248108

Jung-hyun Kim, Kwang-seok Hong

Abstract: Traditional studies of pattern recognition and multimodal interaction that rely on desktop PCs and wired communication networks have some restrictions (e.g., limited motion and spatial constraints) and share general problems that arise from using vision technologies to recognize and represent haptic-gesture information. In this paper, we propose and implement a multi-modal recognition interface (hereinafter, MMRI) that integrates speech, using VoiceXML over the WWW, with post-wearable-PC-based gesture. Its purpose is to recognize and represent the Korean Standard Sign Language (hereinafter, KSSL), which serves as a dialogue system and set of interactive elements in the Korean deaf communities. The advantages of our approach are as follows: 1) it improves the efficiency of the MMRI input module by using wireless communication technology; 2) it shows higher recognition performance than a uni-modal recognition system (using gesture or speech alone); and 3) it recognizes and represents users' continuous sign language flexibly and in real time, and can offer users a wider range of personalized and differentiated information by using the MMRI more effectively. In experiments, the MMRI achieved an average recognition rate of 96.1% on significant, dynamic, and continuous KSSL and speech from various users.

Citations: 0

Speaker Characterization with MLSFs

2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248113

Hugo Cordeiro, C. Ribeiro

Citations: 17

How to Deal with Multiple-Targets in Speaker Identification Systems?

2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248116

Y. Zigel, M. Wasserblat

Citations: 29

Experiments in Speaker Adaptation for Factor Analysis Based Speaker Verification

2006 IEEE Odyssey - The Speaker and Language Recognition Workshop Pub Date : 2006-06-28 DOI: 10.1109/ODYSSEY.2006.248130

Shou-Chun Yin, P. Kenny, R. Rose

Abstract: This paper presents methods for supervised and unsupervised speaker adaptation of Gaussian mixture speaker models in text-independent speaker verification. The methods are based on an approach which is able to decompose speaker and channel variability so that progressive updating of speaker models can be performed while minimizing the influence of the channel variability associated with the adaptation utterances. This approach relies on a joint factor analysis model of intrinsic speaker variability and session variability where inter-session variation is assumed to result primarily from the effects of the channel. These adaptation methods have been evaluated under the adaptation paradigm defined under the NIST 2005 speaker recognition evaluation plan, which is based on conversational telephone speech. It was found that when both target speaker model training and speaker verification trials were performed using a five-minute excerpt from a single conversation, an equal error rate (EER) of 4.5% and minimum detection cost function (DCF) of 0.013 were obtained when performing unsupervised speaker adaptation during evaluation. It will be shown that this performance is comparable to that obtained by state-of-the-art speaker verification systems that rely on a larger set of features and are trained from as many as eight conversations from the target speaker.

Citations: 6
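
The two figures of merit quoted in the abstract above, equal error rate (EER) and minimum detection cost function (DCF), can be computed from lists of target and non-target trial scores by sweeping a decision threshold. The sketch below is illustrative only: the function names and toy scores are hypothetical, the cost parameters (`c_miss`, `c_fa`, `p_target`) are assumed defaults rather than values taken from the abstract, and the cost is left unnormalized, so it is not intended to reproduce the reported 4.5% or 0.013.

```python
def error_rates(target_scores, nontarget_scores):
    """Yield (threshold, miss_rate, false_alarm_rate) for each candidate threshold.

    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(thresholds[-1] + 1.0)  # extra point that rejects every trial
    for t in thresholds:
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        yield t, p_miss, p_fa


def eer(target_scores, nontarget_scores):
    """Approximate EER: mean of miss and false-alarm rates where they are closest."""
    _, p_miss, p_fa = min(
        error_rates(target_scores, nontarget_scores),
        key=lambda r: abs(r[1] - r[2]),
    )
    return (p_miss + p_fa) / 2.0


def min_dcf(target_scores, nontarget_scores, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Minimum over thresholds of the (unnormalized) detection cost.

    The default cost parameters are assumptions chosen for illustration.
    """
    return min(
        c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
        for _, p_miss, p_fa in error_rates(target_scores, nontarget_scores)
    )


# Hypothetical toy scores (higher means "more likely the target speaker").
targets = [2.0, 1.5, 0.5, 3.0]
nontargets = [-1.0, 0.0, 1.0, -2.0]

toy_eer = eer(targets, nontargets)
toy_min_dcf = min_dcf(targets, nontargets)
```

Because the sweep only visits observed scores, the EER here is a coarse approximation; with many trials, interpolating between adjacent operating points gives a smoother estimate.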