Context-Aware Attention Mechanism for Speech Emotion Recognition

2018 IEEE Spoken Language Technology Workshop (SLT). Pub Date: 2018-12-01. DOI: 10.1109/SLT.2018.8639633

Gaetan Ramet, Philip N. Garner, Michael Baeriswyl, Alexandros Lazaridis

Citations: 37

Abstract

In this work, we study the use of attention mechanisms to enhance the performance of the state-of-the-art deep learning model in Speech Emotion Recognition (SER). We introduce a new Long Short-Term Memory (LSTM)-based neural network attention model that takes the temporal information in speech into account when computing the attention vector. The proposed LSTM-based model is evaluated on the IEMOCAP dataset using a 5-fold cross-validation scheme and achieves 68.8% weighted accuracy on 4 classes, which outperforms the state-of-the-art models.
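The idea of letting an LSTM inform the attention weights can be illustrated with a small sketch. This is not the authors' architecture: the abstract gives no layer sizes, features, or scoring function, so everything below is an assumption made for illustration, including the hypothetical names `TinyLSTM` and `context_aware_attention_pool`, the untrained random weights, the omitted biases, and the projection vector `v`. The sketch only shows the general pattern: each frame's attention score is read from an LSTM state that has seen the preceding frames, rather than from the frame alone.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class TinyLSTM:
    """Hypothetical single-layer LSTM with random, untrained weights (biases omitted)."""

    def __init__(self, in_dim, hid_dim, seed=0):
        rng = random.Random(seed)
        self.hid_dim = hid_dim
        # One weight matrix per gate (input, forget, candidate, output),
        # each acting on the concatenation [x; h].
        self.W = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(in_dim + hid_dim)]
                   for _ in range(hid_dim)]
            for gate in "ifgo"
        }

    def run(self, xs):
        h = [0.0] * self.hid_dim
        c = [0.0] * self.hid_dim
        states = []
        for x in xs:
            z = list(x) + h
            i = [sigmoid(a) for a in matvec(self.W["i"], z)]
            f = [sigmoid(a) for a in matvec(self.W["f"], z)]
            g = [math.tanh(a) for a in matvec(self.W["g"], z)]
            o = [sigmoid(a) for a in matvec(self.W["o"], z)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
            states.append(h)
        return states


def context_aware_attention_pool(frames, scorer, v):
    """Pool frame features with weights derived from LSTM states over the frames."""
    states = scorer.run(frames)
    # Scalar score per frame: dot product of the LSTM state with v (an assumed choice).
    scores = [sum(vk * hk for vk, hk in zip(v, h)) for h in states]
    alphas = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(a * f[d] for a, f in zip(alphas, frames)) for d in range(dim)]
    return pooled, alphas


# Toy input: 6 frames of 4-dimensional features (random stand-ins for acoustic features).
rng = random.Random(1)
frames = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
scorer = TinyLSTM(in_dim=4, hid_dim=3)
v = [0.3, -0.2, 0.5]
pooled, alphas = context_aware_attention_pool(frames, scorer, v)
```

Here `alphas` holds one non-negative weight per frame, summing to 1, and `pooled` is the resulting utterance-level vector that a classifier over the 4 emotion classes could consume. In a trained model the LSTM weights and `v` would be learned jointly with the rest of the network.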




