Pitch envelope based frame level score reweighed algorithm for emotion robust speaker recognition

2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops. Pub Date: 2009-12-08. DOI: 10.1109/ACII.2009.5349589

Dongdong Li, Yingchun Yang, Ting Huang

Citations: 4

Abstract

Speech with various emotions degrades the performance of speaker recognition systems. This paper introduces a novel score normalization approach, the pitch envelope based frame level score reweighted (PFLSR) algorithm, to compensate for the influence of affective speech on speaker recognition. The approach assumes that, for most frames, the maximum likelihood model does not easily change under expressive corruption. The test frames are therefore divided according to F0 into two parts: the heavily affected frames and the slightly affected frames. The confidences of the slightly affected frames are reweighted into new scores to strengthen their confidence and to optimize the final frame scores accumulated over the whole test utterance. The experiments are conducted on the Mandarin Affective Speech Corpus. An improvement of 15.1% in identification rate over traditional speaker recognition is achieved.
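
The frame split and score reweighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the reweighting formula or the F0 decision rule, so everything specific here is an assumption made for illustration: the function name `reweighted_identification`, the fixed threshold `f0_threshold`, the multiplicative weight `slight_weight`, and the rule that frames with F0 above the threshold count as heavily affected. It is not the PFLSR formula from the paper.

```python
import numpy as np


def reweighted_identification(frame_scores, f0, f0_threshold=250.0, slight_weight=2.0):
    """Pick a speaker from frame-level scores after F0-based reweighting.

    frame_scores: (T, S) per-frame log-likelihoods for S speaker models.
    f0:           (T,) pitch estimate per frame, in Hz.
    f0_threshold and slight_weight are hypothetical parameters, not values
    taken from the paper.
    """
    frame_scores = np.asarray(frame_scores, dtype=float)
    f0 = np.asarray(f0, dtype=float)
    if frame_scores.ndim != 2 or frame_scores.shape[0] != f0.shape[0]:
        raise ValueError("frame_scores must be (T, S) and f0 must be (T,)")

    # Assumed rule: frames with F0 above the threshold are "heavily affected".
    heavily_affected = f0 > f0_threshold

    # Slightly affected frames get a larger weight, so they contribute more
    # to the accumulated utterance score; heavily affected frames keep weight 1.
    weights = np.where(heavily_affected, 1.0, slight_weight)

    # Weighted sum over frames gives one accumulated score per speaker.
    totals = weights @ frame_scores
    return int(np.argmax(totals)), totals


# Toy data: two low-F0 frames favour speaker 0, two high-F0 frames favour speaker 1.
scores = [[-1.0, -4.0], [-1.0, -4.0], [-5.0, -1.0], [-5.0, -1.0]]
pitch = [120.0, 130.0, 300.0, 310.0]
best, totals = reweighted_identification(scores, pitch)
```

With `slight_weight=1.0` the function reduces to the plain sum of frame scores, which gives a convenient baseline for comparing decisions with and without reweighting on the same toy data.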

Source publication

2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops

Self-citation rate

0.00%

Publication count