Speaker identification evaluation based on the speech biometric and i-vector model using the TIMIT and NTIMIT databases

2017 5th International Workshop on Biometrics and Forensics (IWBF). Pub Date: 2017-04-01. DOI: 10.1109/IWBF.2017.7935102

Musab T. S. Al-Kaltakchi, W. L. Woo, S. Dlay, J. Chambers


Citations: 10

Abstract



Speaker identification evaluation based on the speech biometric and i-vector model using the TIMIT and NTIMIT databases

Biometrics exploits physiological and behavioural human characteristics, and performance metrics are used to measure some characteristic of an individual. The measurement may be used for a one-to-one match, which is called authentication, or for a one-from-N match, which is called identification. In this paper, we use a speech biometric i-vector with a low, fixed dimension of 100 to identify speakers. The main structure of the system consists of an i-vector with three fusion methods. It has low complexity and is efficient because it uses an Extreme Learning Machine (ELM) classifier. The system is evaluated with 120 speakers from dialect regions one and four of both the TIMIT and NTIMIT databases, which allows a fair comparison with our previous study based on the traditional Gaussian Mixture Model-Universal Background Model (GMM-UBM) with a Maximum Likelihood (ML) classifier. The system shows an improved identification rate compared with the classical GMM-UBM.
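To make the classification stage described in the abstract more concrete, the following is a minimal sketch of an Extreme Learning Machine used for one-from-N identification on fixed-length 100-dimensional vectors. It is not the authors' implementation. The vectors are synthetic stand-ins for i-vectors (no i-vector extraction, fusion, or TIMIT/NTIMIT data is involved), and the class name `ELMClassifier`, the helper `make_synthetic_vectors`, the hidden-layer size, the `tanh` activation, and the ridge term are all assumptions chosen for illustration.

```python
import numpy as np


class ELMClassifier:
    """Single-hidden-layer ELM: random fixed hidden layer, closed-form readout."""

    def __init__(self, n_hidden=256, reg=1e-2, seed=0):
        self.n_hidden = n_hidden  # hypothetical hidden-layer size
        self.reg = reg            # hypothetical ridge regularisation strength
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer activations; W and b are never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_features = X.shape[1]
        self.classes_ = np.unique(y)
        # Random input weights, scaled so pre-activations have roughly unit variance.
        self.W = self.rng.standard_normal((n_features, self.n_hidden)) / np.sqrt(n_features)
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        # One-hot targets, one column per speaker.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Output weights from the regularised least-squares solution.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        # One-from-N decision: pick the speaker with the highest output score.
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


def make_synthetic_vectors(n_speakers=120, per_speaker=8, dim=100, noise=0.3, seed=0):
    """Synthetic stand-in for i-vectors: one random centre per speaker plus noise."""
    rng = np.random.default_rng(seed)
    centres = rng.standard_normal((n_speakers, dim))
    X = np.repeat(centres, per_speaker, axis=0)
    X = X + noise * rng.standard_normal(X.shape)
    y = np.repeat(np.arange(n_speakers), per_speaker)
    return X, y


X, y = make_synthetic_vectors()
# Hold out the last two vectors of every speaker for testing.
test_mask = (np.arange(len(y)) % 8) >= 6
model = ELMClassifier().fit(X[~test_mask], y[~test_mask])
pred = model.predict(X[test_mask])
identification_rate = float(np.mean(pred == y[test_mask]))
print(f"identification rate on synthetic data: {identification_rate:.3f}")
```

Because only the output weights are solved for, and in closed form, training reduces to one linear solve, which is the usual reason an ELM is considered cheap to train. The printed rate reflects the synthetic data only and says nothing about the results reported in the paper.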

