Prediction of Time-Varying Musical Mood Distributions Using Kalman Filtering

2010 Ninth International Conference on Machine Learning and Applications
Pub Date: 2010-12-12
DOI: 10.1109/ICMLA.2010.101

Erik M. Schmidt, Youngmoo E. Kim

Citations: 60

Abstract

The medium of music has evolved specifically for the expression of emotions, and it is natural for us to organize music in terms of its emotional associations. In previous work, we have modeled human response labels to music in the arousal-valence (A-V) representation of affect as a time-varying, stochastic distribution reflecting the ambiguous nature of the perception of mood. These distributions are used to predict A-V responses from acoustic features of the music alone via multi-variate regression. In this paper, we extend our framework to account for multiple regression mappings contingent upon a general location in A-V space. Furthermore, we model A-V state as the latent variable of a linear dynamical system, more explicitly capturing the dynamics of musical mood. We validate this extension using a "genie-bounded" approach, in which we assume that a piece of music is correctly clustered in A-V space a priori, demonstrating significantly higher theoretical performance than the previous single-regressor approach.
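The linear dynamical system mentioned in the abstract can be illustrated with a standard Kalman filter. The following is a minimal sketch, not the authors' implementation: it assumes a 2-D latent A-V state `x_t = A x_{t-1} + w_t` and an acoustic feature vector `y_t = C x_t + v_t`, with Gaussian noise covariances `Q` and `R`. The function name `kalman_filter`, the 4-D feature dimension, and all matrix values are hypothetical toy choices, and the observations are synthetic rather than real audio features.

```python
import numpy as np


def kalman_filter(observations, A, C, Q, R, mu0, P0):
    """Standard Kalman filter: return filtered state means and covariances.

    Assumed model (illustrative, not taken from the paper):
        x_t = A x_{t-1} + w_t,  w_t ~ N(0, Q)   (latent A-V state)
        y_t = C x_t     + v_t,  v_t ~ N(0, R)   (observed features)
    """
    mu, P = mu0, P0
    identity = np.eye(len(mu0))
    means, covs = [], []
    for y in observations:
        # Predict step: propagate the state estimate through the dynamics.
        mu_pred = A @ mu
        P_pred = A @ P @ A.T + Q
        # Update step: correct the prediction with the new observation.
        S = C @ P_pred @ C.T + R             # innovation covariance
        K = P_pred @ C.T @ np.linalg.inv(S)  # Kalman gain
        mu = mu_pred + K @ (y - C @ mu_pred)
        P = (identity - K @ C) @ P_pred
        means.append(mu)
        covs.append(P)
    return np.array(means), np.array(covs)


# Toy setup: every value below is a made-up assumption for illustration.
rng = np.random.default_rng(0)
A = np.eye(2)                      # random-walk dynamics in A-V space
C = rng.normal(size=(4, 2))        # hypothetical 4-D feature mapping
Q = 0.01 * np.eye(2)               # process noise covariance
R = 0.1 * np.eye(4)                # observation noise covariance
true_state = np.array([0.5, -0.2])  # fixed synthetic A-V point

# Synthetic feature frames generated from the fixed A-V point.
observations = [
    C @ true_state + rng.normal(scale=np.sqrt(0.1), size=4)
    for _ in range(50)
]

means, covs = kalman_filter(
    observations, A, C, Q, R, mu0=np.zeros(2), P0=np.eye(2)
)
```

Each row of `means` is the filtered A-V estimate at one time step, and the matching entry of `covs` is its 2x2 covariance, so the filter output is itself a time-varying Gaussian distribution over A-V space. In practice the matrices would be estimated from labeled data rather than fixed by hand as they are here.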
