Analysis and Modeling of Affective Audio Visual Speech Based on PAD Emotion Space

2008 6th International Symposium on Chinese Spoken Language Processing. Pub Date: 2008-12-30. DOI: 10.1109/CHINSL.2008.ECP.82

Shen Zhang, Yingjin Xu, Jia Jia, Lianhong Cai

Citations: 5

Abstract

This paper analyzes acoustic and visual features of affective audio-visual speech based on the PAD (Pleasure-Arousal-Dominance) emotion space. The selected acoustic features include F0 maximum, F0 minimum, duration, and energy. A set of Partial Expression Parameters (PEP) is proposed as visual features to describe affective facial movement on a talking face. The paper explores the connection between the PAD emotion space and the acoustic and visual features in turn. The variation of the acoustic features is predicted from PAD values, and a PAD-PEP mapping function is built for facial expression synthesis. Experimental results show that PAD can be applied both to describe emotional state and to predict acoustic and visual features for affective audio-visual speech synthesis.
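The abstract does not state what form the PAD-to-feature mapping takes. Purely as an illustration, the sketch below assumes a linear relation between a PAD triple and one feature value (for example, a relative change in F0 maximum or a single PEP) and fits it by least squares. The function names, the linear form, and all numbers are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: fit a linear map from PAD values to one feature.
# The linear form and every number here are illustrative assumptions,
# not the paper's method or data.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; PAD samples are not varied enough")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_pad_mapping(pad_samples, targets):
    """Least-squares fit of target ~ w_p*P + w_a*A + w_d*D + bias."""
    rows = [[p, a, d, 1.0] for p, a, d in pad_samples]
    k = 4
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, pad):
    """Apply the fitted linear map to one (P, A, D) triple."""
    p, a, d = pad
    return weights[0] * p + weights[1] * a + weights[2] * d + weights[3]


# Made-up toy data: targets are generated from known weights so the fit
# can be checked; they are not measurements of any kind.
true_weights = [0.10, 0.30, -0.05, 0.02]
pad_samples = [
    (-1.0, -1.0, -1.0),
    (1.0, -1.0, 0.5),
    (-0.5, 1.0, -1.0),
    (1.0, 1.0, 1.0),
    (0.0, 0.0, 0.0),
    (0.5, -0.5, 1.0),
]
targets = [predict(true_weights, s) for s in pad_samples]
weights = fit_pad_mapping(pad_samples, targets)
```

With real annotations, `pad_samples` would hold the PAD ratings of each utterance and `targets` the measured feature values; a separate set of weights would be fitted per acoustic feature or per PEP. A nonlinear mapping may fit better, which this sketch does not explore.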


