A confidence measure based score fusion technique to integrate MFCC and Pitch for speaker verification

2011 3rd International Conference on Electronics Computer Technology. Pub Date: 2011-04-08. DOI: 10.1109/ICECTECH.2011.5941763

S. Pandiaraj, H. N. R. Keziah, D. Vinothini, L. Gloria, K. R. S. Kumar


Citations: 5

Abstract


This paper evaluates the effectiveness of complementary speech features extracted from a speaker for verification. Traditionally, speaker verification systems use a single feature to represent speaker-specific information. This work proposes extracting both segmental and suprasegmental features, which improves verification performance. The size and shape assumed by the vocal tract while producing various sound units are captured by Mel Frequency Cepstral Coefficients (MFCC), a segmental feature. Pitch information contributes to the uniqueness of the speaker's voice at the suprasegmental level, which spans a longer duration than the frames used for short-term spectral analysis. The scores obtained from the MFCC-based and Pitch-based systems are fused using a confidence measure. Speaker verification experiments were carried out on the CHAINS corpus. The equal error rate (EER) obtained for the MFCC system is 12.8%, and the MFCC system outperforms the system based on Pitch alone. Integrating MFCC and Pitch for speaker verification using a confidence measure gives an EER of 11.2%.
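The abstract does not say how the confidence measure is defined or how the scores are normalised, so the following is only a minimal sketch under stated assumptions. It assumes both subsystems emit z-normalised scores with a decision threshold at 0, and it takes each score's distance from that threshold as its confidence. The function names `equal_error_rate` and `fuse_scores` are hypothetical and do not come from the paper.

```python
import numpy as np


def equal_error_rate(genuine, impostor):
    """Estimate the EER: the point where false-accept and false-reject rates meet."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Every observed score is a candidate threshold; accept when score >= threshold.
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor >= t).mean() for t in thresholds])
    frr = np.array([(genuine < t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))
    return (far[i] + frr[i]) / 2.0


def fuse_scores(mfcc_scores, pitch_scores, eps=1e-9):
    """Confidence-weighted sum of two score arrays (one score per trial).

    ASSUMPTION: confidence is |score|, the distance from a threshold at 0.
    The paper's actual confidence measure may differ.
    """
    mfcc_scores = np.asarray(mfcc_scores, dtype=float)
    pitch_scores = np.asarray(pitch_scores, dtype=float)
    conf_mfcc = np.abs(mfcc_scores)
    conf_pitch = np.abs(pitch_scores)
    # Per-trial weight in [0, 1]; eps avoids division by zero when both are 0.
    w = (conf_mfcc + eps) / (conf_mfcc + conf_pitch + 2.0 * eps)
    return w * mfcc_scores + (1.0 - w) * pitch_scores


if __name__ == "__main__":
    # Toy scores, invented purely for illustration (not data from the paper).
    mfcc_gen, mfcc_imp = [1.2, 0.4, -0.1], [-0.9, 0.2, -1.5]
    pitch_gen, pitch_imp = [0.3, 0.8, 0.6], [-0.2, -0.7, 0.1]
    fused_gen = fuse_scores(mfcc_gen, pitch_gen)
    fused_imp = fuse_scores(mfcc_imp, pitch_imp)
    print("MFCC EER: ", equal_error_rate(mfcc_gen, mfcc_imp))
    print("Fused EER:", equal_error_rate(fused_gen, fused_imp))
```

Weighting per trial lets the subsystem that is further from its decision boundary dominate that trial, which is one plausible reading of confidence-based fusion. A fixed global weight tuned on held-out data would be an equally reasonable alternative.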
