Analysis of Emotional Speech Using an Adaptive Sinusoidal Model

George P. Kafentzis, Theodora Yakoumaki, A. Mouchtaris, Y. Stylianou
2014 22nd European Signal Processing Conference (EUSIPCO)
DOI: 10.5281/ZENODO.44181
Published: 2014-11-13
Citations: 7

Abstract

Processing of emotional (or expressive) speech has gained attention in recent years in the speech community due to its numerous applications. In this paper, an adaptive sinusoidal model (aSM), dubbed the extended adaptive Quasi-Harmonic Model (eaQHM), is employed to analyze emotional speech with accurate, robust, continuous, time-varying parameters (amplitude, frequency, and phase). It is shown that these parameters can adequately and accurately represent emotional speech content. Using a well-known database of narrowband expressive speech (SUSAS), we show that very high Signal-to-Reconstruction-Error Ratio (SRER) values can be obtained compared to the standard sinusoidal model (SM). Formal listening tests on a smaller wideband speech database show that the eaQHM outperforms the SM in terms of perceptual resynthesis quality. Finally, preliminary emotion classification tests show that the parameters obtained from the adaptive model lead to a higher classification score than the standard SM parameters.
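The SRER figure of merit reported in the abstract can be sketched as follows. This is a minimal illustration, assuming the common definition SRER = 20·log10(RMS(signal) / RMS(signal − reconstruction)); the function and variable names are illustrative and not taken from the paper.

```python
import math

def srer_db(signal, reconstruction):
    """Signal-to-Reconstruction-Error Ratio in dB:
    20 * log10(RMS(signal) / RMS(signal - reconstruction))."""
    error = [s - r for s, r in zip(signal, reconstruction)]
    rms = lambda x: math.sqrt(sum(v * v for v in x) / len(x))
    return 20.0 * math.log10(rms(signal) / rms(error))

# Toy example: a sinusoid "reconstructed" with a 1% amplitude error,
# so the error RMS is exactly 1/100 of the signal RMS.
n = 1000
sig = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
rec = [0.99 * s for s in sig]
print(round(srer_db(sig, rec), 1))  # 40.0
```

A closer reconstruction shrinks the error term and raises the SRER, which is why the more flexible time-varying parameters of the eaQHM translate into higher SRER values than the standard SM.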