Le Yang, Yan Li, Haifeng Chen, D. Jiang, Meshia Cédric Oveneke, H. Sahli
Title: Bipolar Disorder Recognition with Histogram Features of Arousal and Body Gestures
Published in: Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop
Publication date: 2018-10-15
DOI: 10.1145/3266302.3266308 (https://doi.org/10.1145/3266302.3266308)
Citations: 22
Abstract
This paper targets the Bipolar Disorder Challenge (BDC) task of the Audio/Visual Emotion Challenge (AVEC) 2018. First, two novel features are proposed: 1) a histogram-based arousal feature, in which continuous arousal values are estimated from audio cues by a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) model; 2) a Histogram of Displacement (HDR) based upper body posture feature, which characterizes the displacement and velocity of key body points in the video segment. In addition, we propose a multi-stream bipolar disorder classification framework with Deep Neural Networks (DNNs) and a Random Forest, and adopt an ensemble learning strategy to alleviate possible over-fitting due to the limited training data. Experimental results show that the proposed arousal and upper body posture features are discriminative for different bipolar episodes, and that the proposed framework achieves promising classification results on the development set, with an unweighted average recall (UAR) of 0.714, higher than the baseline result of 0.635. On the test set, our system obtains the same UAR (0.574) as the challenge baseline.
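To make two quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows how a sequence of continuous per-frame arousal estimates could be summarized as a fixed-length normalized histogram, and how UAR is computed as the mean of per-class recalls. The function names, the bin count, the assumed arousal range of [-1, 1], and the example labels are illustrative assumptions; the abstract does not specify them.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence


def histogram_feature(values: Sequence[float], n_bins: int = 10,
                      lo: float = -1.0, hi: float = 1.0) -> List[float]:
    """Summarize a variable-length sequence as a normalized histogram.

    Assumption: values lie roughly in [lo, hi]; out-of-range values are
    clipped into the first or last bin. The range and bin count are
    illustrative and are not taken from the paper.
    """
    if n_bins <= 0 or hi <= lo:
        raise ValueError("need n_bins > 0 and hi > lo")
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        clipped = min(max(v, lo), hi)
        # The upper edge (v == hi) is assigned to the last bin.
        idx = min(int((clipped - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


def unweighted_average_recall(y_true: Sequence[Hashable],
                              y_pred: Sequence[Hashable]) -> float:
    """UAR: recall computed per class, then averaged with equal class weight."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    support = defaultdict(int)  # number of true samples per class
    hits = defaultdict(int)     # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        support[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / support[c] for c in support]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Hypothetical per-frame arousal estimates for one recording.
    arousal = [-1.0, -0.9, 0.0, 0.95, 1.0]
    print(histogram_feature(arousal, n_bins=4))

    # Hypothetical labels for six recordings.
    truth = ["remission", "remission", "hypomania", "mania", "mania", "mania"]
    preds = ["remission", "hypomania", "hypomania", "mania", "mania", "remission"]
    print(unweighted_average_recall(truth, preds))
```

Because each class contributes equally to the average regardless of how many recordings it has, UAR does not reward a classifier for favoring the majority class, which is presumably why it is reported on a small, possibly imbalanced dataset.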