Learning Music Emotion Primitives via Supervised Dynamic Clustering
Yang Liu, Yan Liu, Xiang Zhang, Gong Chen, Ke-jun Zhang
Proceedings of the 24th ACM International Conference on Multimedia (MM '16), October 2016
DOI: 10.1145/2964284.2967215
Citations: 1
Abstract
This paper explores a fundamental problem in music emotion analysis: how to segment a music sequence into a set of basic emotive units, which we call emotion primitives. Existing work on music emotion analysis is mainly based on fixed-length music segments, which often makes accurate emotion recognition difficult. A short segment, such as an individual music frame, may fail to evoke an emotional response; a long segment, such as an entire song, may convey several emotions over time. Moreover, the minimum segment length required varies with the type of emotion. To address these problems, we propose a novel method, dubbed supervised dynamic clustering (SDC), that automatically decomposes a music sequence into meaningful segments of varying lengths. First, the music sequence is represented as a set of music frames. The frames are then clustered according to their valence-arousal values in the emotion space, and the clustering result is used to initialize the segmentation. Finally, a dynamic programming scheme jointly optimizes the subsequent segmentation and grouping in the music feature space. Experimental results on a standard dataset show both the effectiveness and the rationality of the proposed method.
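The abstract does not give implementation details, but the two-stage idea (cluster frames in valence-arousal space to initialize labels, then use dynamic programming to pick segment boundaries) can be sketched. The following is a minimal Python sketch under stated assumptions, not the authors' actual formulation: the inputs `va` (per-frame valence-arousal estimates) and `feats` (per-frame acoustic features), the segment-cost definition, the penalty weight `lam`, and the cluster count are all illustrative choices.

```python
import numpy as np
from sklearn.cluster import KMeans


def cluster_valence_arousal(va, n_clusters=4, seed=0):
    # va: (n_frames, 2) array of per-frame valence-arousal values.
    # K-means here is an assumed stand-in for the paper's clustering step.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(va)


def segment_cost(feats, labels, i, j, lam):
    # Within-segment scatter in the acoustic feature space ...
    seg = feats[i:j]
    scatter = np.sum((seg - seg.mean(axis=0)) ** 2)
    # ... plus a penalty for frames disagreeing with the segment's
    # majority emotion cluster (an assumed coupling of the two spaces).
    counts = np.bincount(labels[i:j])
    disagreement = (j - i) - counts.max()
    return scatter + lam * disagreement


def dp_segment(feats, labels, n_segments, lam=1.0):
    """Split n frames into n_segments contiguous segments of varying
    lengths, minimizing the total segment cost via dynamic programming."""
    n = len(feats)
    cost = np.full((n + 1, n + 1), np.inf)
    for i in range(n):
        for j in range(i + 1, n + 1):
            cost[i, j] = segment_cost(feats, labels, i, j, lam)

    # D[k, j]: best cost of covering the first j frames with k segments.
    D = np.full((n_segments + 1, n + 1), np.inf)
    back = np.zeros((n_segments + 1, n + 1), dtype=int)
    D[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = D[k - 1, i] + cost[i, j]
                if c < D[k, j]:
                    D[k, j] = c
                    back[k, j] = i

    # Recover boundary indices by walking back from the full sequence.
    bounds, j = [n], n
    for k in range(n_segments, 0, -1):
        j = back[k, j]
        bounds.append(j)
    return bounds[::-1]


# Toy usage: 200 frames, 2-D valence-arousal, 12-D acoustic features.
rng = np.random.default_rng(0)
va = rng.normal(size=(200, 2))
feats = rng.normal(size=(200, 12))
labels = cluster_valence_arousal(va)
print(dp_segment(feats, labels, n_segments=5))
```

Note that this sketch fixes the number of segments in advance for simplicity; the paper's joint optimization over segmentation and grouping may determine segment counts and lengths differently.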