Combination of system and source characteristics for speaker verification under limited data condition

2016 IEEE 12th International Colloquium on Signal Processing & Its Applications (CSPA). Pub Date: 2016-03-04. DOI: 10.1109/CSPA.2016.7515823

T. R. Jayanthi Kumari, H. S. Jayanna


Citations: 1

Abstract

Speaker verification under limited data conditions has considerable potential in several real-life applications. This paper shows how combining dissimilar characteristics of voice data improves speaker verification performance when the training and testing data are short (less than 15 s). Four features are considered: Mel-Frequency Cepstral Coefficients (MFCC), Linear Prediction Cepstral Coefficients (LPCC), Linear Prediction Residual (LPR), and Linear Prediction Residual Phase (LPRP). The features produced by these extraction techniques are studied individually and then pooled to obtain better verification performance. The experiments use two classifiers, the Gaussian mixture model (GMM) and the GMM-universal background model (GMM-UBM), and are conducted on the NIST-2003 dataset. The combined features perform better than any individual feature, and the GMM-UBM classifier yields a lower equal error rate (EER) than the GMM.
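The abstract reports results as equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate are equal. As a minimal sketch of how that metric can be computed from verification scores, the following Python function sweeps candidate thresholds and returns the point where the two rates are closest. The function name, the threshold sweep, and the toy scores are assumptions made for illustration; they are not taken from the paper, and they are not NIST-2003 results.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER from two lists of verification scores.

    Assumption: a higher score means the trial looks more like the
    claimed speaker, and a trial is accepted when score >= threshold.
    Returns (eer, threshold) at the threshold where FAR and FRR are closest.
    """
    genuine = list(genuine_scores)
    impostor = list(impostor_scores)
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap between FAR and FRR, eer estimate, threshold)
    for threshold in sorted(set(genuine + impostor)):
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        # False acceptance rate: impostor trials scored at or above it.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, threshold)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores, invented purely to exercise the function.
    genuine = [2.0, 3.0, 4.0, 5.0]
    impostor = [0.0, 1.0, 1.5, 2.5]
    eer, threshold = equal_error_rate(genuine, impostor)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

With only a few trials the FAR and FRR curves may not cross exactly, so the sketch reports the average of the two rates at the closest threshold; evaluation toolkits often interpolate between thresholds instead.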

