Speaker identification evaluation based on the speech biometric and i-vector model using the TIMIT and NTIMIT databases

Musab T. S. Al-Kaltakchi, W. L. Woo, S. Dlay, J. Chambers
Published in: 2017 5th International Workshop on Biometrics and Forensics (IWBF)
Publication date: 2017-04-01
DOI: 10.1109/IWBF.2017.7935102 (https://doi.org/10.1109/IWBF.2017.7935102)
Citations: 10

Abstract

Biometrics exploits physiological and behavioural human characteristics, using performance metrics to measure some characteristic of an individual. The measurement may lead to a one-to-one match, which is called authentication, or a one-from-N match, which is called identification. In this paper, we exploit a speech biometric, the I-vector, with a low and fixed dimension of 100 to identify speakers. The main structure of the system consists of an I-vector with three fusion methods. It has low complexity and is efficient because it uses an Extreme Learning Machine (ELM) classifier. The system is evaluated with 120 speakers from dialect regions one and four of both the TIMIT and NTIMIT databases, in order to provide a fair comparison with our previous study based on the traditional Gaussian Mixture Model-Universal Background Model (GMM-UBM) with a Maximum Likelihood (ML) classifier. The system shows an improved identification rate compared with the classical GMM-UBM.
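As a rough illustration of the classification stage only, the sketch below trains a generic ELM (a fixed random hidden layer with output weights solved in closed form by ridge regression) on 100-dimensional vectors for closed-set, one-from-N identification of 120 speakers. It is not the authors' implementation: the vectors are synthetic stand-ins for real I-vectors, the I-vector extraction and the three fusion methods are not reproduced, and every function name and parameter value (`make_synthetic_ivectors`, `train_elm`, `identify`, the hidden-layer size, the regularisation constant) is an assumption made for this example.

```python
import numpy as np


def make_synthetic_ivectors(n_speakers=120, dim=100, per_speaker=10,
                            noise=0.3, seed=0):
    """Hypothetical stand-in for real I-vectors: one random centre per
    speaker plus Gaussian noise for each utterance."""
    rng = np.random.default_rng(seed)
    centres = rng.standard_normal((n_speakers, dim))
    labels = np.repeat(np.arange(n_speakers), per_speaker)
    vectors = centres[labels] + noise * rng.standard_normal((labels.size, dim))
    return vectors, labels


def train_elm(X, y, n_classes, n_hidden=512, reg=1e-1, seed=1):
    """Generic ELM training: the hidden layer is random and never updated;
    only the output weights are fitted, via regularised least squares."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden)) / np.sqrt(X.shape[1])
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                      # hidden-layer activations
    T = np.eye(n_classes)[y]                    # one-hot speaker targets
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta


def identify(model, X):
    """One-from-N identification: pick the speaker with the highest score."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)


# 120 speakers, 10 synthetic utterances each: 8 for training, 2 for testing.
X, y = make_synthetic_ivectors()
is_test = np.tile(np.arange(10) >= 8, 120)

model = train_elm(X[~is_test], y[~is_test], n_classes=120)
predicted = identify(model, X[is_test])
accuracy = float(np.mean(predicted == y[is_test]))
print(f"identification rate on synthetic data: {accuracy:.3f}")
```

Because the hidden weights stay fixed, training reduces to a single linear solve, which is the usual reason ELM classifiers are described as low in complexity. The identification rate printed here reflects only the synthetic data and says nothing about results on TIMIT or NTIMIT.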