Emotional speech classification with prosodic parameters by using neural networks

H. Sato, Y. Mitsukura, M. Fukumi, N. Akamatsu
{"title":"Emotional speech classification with prosodic prameters by using neural networks","authors":"H. Sato, Y. Mitsukura, M. Fukumi, N. Akamatsu","doi":"10.1109/ANZIIS.2001.974111","DOIUrl":null,"url":null,"abstract":"Interestingly, in order to achieve a new Human Interface such that digital computers can deal with the KASEI information, the study of the KANSEI information processing recently has been approached. In this paper, we propose a new classification method of emotional speech by analyzing feature parameters obtained from the emotional speech and by learning them using neural networks, which is regarded as a KANSEI information processing. In the present research, KANSEI information is usually human emotion. The emotion is classified broadly into four patterns such as neutral, anger, sad and joy. The pitch as one of feature parameters governs voice modulation, and can be sensitive to change of emotion. The pitch is extracted from each emotional speech by the cepstrum method. Input values of neural networks (NNs) are then emotional pitch patterns, which are time-varying. It is shown that NNs can achieve classification of emotion by learning each emotional pitch pattern by means of computer simulations.","PeriodicalId":383878,"journal":{"name":"The Seventh Australian and New Zealand Intelligent Information Systems Conference, 2001","volume":"51 6","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Seventh Australian and New Zealand Intelligent Information Systems Conference, 2001","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ANZIIS.2001.974111","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16

Abstract

Interestingly, the study of KANSEI information processing has recently been pursued in order to realize a new human interface in which digital computers can handle KANSEI information. In this paper, we propose a new method for classifying emotional speech by analyzing feature parameters obtained from the speech and learning them with neural networks, which we regard as a form of KANSEI information processing. In the present research, the KANSEI information is human emotion. Emotion is classified broadly into four patterns: neutral, anger, sadness, and joy. Pitch, one of the feature parameters, governs voice modulation and is sensitive to changes of emotion. The pitch is extracted from each emotional utterance by the cepstrum method. The inputs to the neural networks (NNs) are then the time-varying emotional pitch patterns. Computer simulations show that NNs can classify emotion by learning each emotional pitch pattern.
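The cepstrum method mentioned in the abstract estimates pitch by taking the inverse FFT of the log magnitude spectrum of a windowed frame; voiced speech then shows a peak at the quefrency 1/F0. The paper does not give implementation details, so the following is only a minimal sketch of that general technique; the function name, frame length, and pitch search range (50–400 Hz) are assumptions, and the input here is a synthetic harmonic signal rather than real emotional speech.

```python
import numpy as np

def cepstral_pitch(frame, fs, fmin=50.0, fmax=400.0):
    """Estimate F0 of one voiced frame via the real cepstrum.

    The real cepstrum is the inverse FFT of the log magnitude
    spectrum; for voiced speech it peaks at quefrency 1/F0.
    The search range [fmin, fmax] is an assumed typical range
    for human pitch, not a value taken from the paper.
    """
    windowed = frame * np.hamming(len(frame))
    log_mag = np.log(np.abs(np.fft.rfft(windowed)) + 1e-10)
    cepstrum = np.fft.irfft(log_mag)
    # Dominant rahmonic peak within the plausible pitch lag range.
    q_lo = int(fs / fmax)   # smallest lag of interest (highest pitch)
    q_hi = int(fs / fmin)   # largest lag of interest (lowest pitch)
    peak = q_lo + np.argmax(cepstrum[q_lo:q_hi])
    return fs / peak

# Synthetic "voiced" frame: 150 Hz fundamental with five harmonics,
# plus a little noise so the log spectrum has a stable floor.
fs = 16000
t = np.arange(1024) / fs
frame = sum(np.sin(2 * np.pi * 150 * k * t) for k in range(1, 6))
frame += 0.01 * np.random.default_rng(0).standard_normal(len(t))
print(cepstral_pitch(frame, fs))  # close to 150 Hz
```

In the paper's pipeline, such per-frame pitch estimates taken over the course of an utterance would form the time-varying pitch pattern fed to the neural network classifier.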