Using quadratic programming to estimate feature relevance in structural analyses of music

Proceedings of the 21st ACM international conference on Multimedia. Pub Date: 2013-10-21. DOI: 10.1145/2502081.2502124

Jordan B. L. Smith, E. Chew


Citations: 10

Abstract

To identify repeated patterns and contrasting sections in music, it is common to use self-similarity matrices (SSMs) to visualize and estimate structure. We introduce a novel application for SSMs derived from audio recordings: using them to learn about the potential reasoning behind a listener's annotation. We use SSMs generated by musically motivated audio features at various timescales to represent contributions to a structural annotation. Since a listener's attention can shift among musical features (e.g., rhythm, timbre, and harmony) throughout a piece, we further break down the SSMs into section-wise components and use quadratic programming (QP) to minimize the distance between a linear sum of these components and the annotated description. We posit that the optimal section-wise weights on the feature components may indicate the features to which a listener attended when annotating a piece, and thus may help us to understand why two listeners disagreed about a piece's structure. We discuss some examples that substantiate the claim that feature relevance varies throughout a piece, use our method to investigate differences between listeners' interpretations, and lastly propose some variations on our method.
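
As a rough illustration of the optimization described in the abstract, the sketch below splits each feature SSM into section-wise components and finds non-negative weights that minimize the squared distance between their weighted sum and a target matrix. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`section_components`, `solve_nonneg_qp`), the row-strip masking used to form components, the non-negativity constraint, the projected-gradient solver standing in for a QP solver, and the toy 4-frame matrices.

```python
# Illustrative sketch only: names, masking scheme, constraint, and data are
# assumptions, not the authors' implementation.

def section_components(ssm, bounds):
    """Split one feature SSM into one component per section.

    Component j keeps the rows of `ssm` that fall in section j
    (bounds[j] <= row < bounds[j + 1]) and zeroes every other row.
    """
    n = len(ssm)
    comps = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        comps.append([
            [ssm[r][c] if start <= r < end else 0.0 for c in range(n)]
            for r in range(n)
        ])
    return comps


def flatten(matrix):
    """Turn a square matrix (list of rows) into one flat list."""
    return [v for row in matrix for v in row]


def solve_nonneg_qp(components, target, iters=2000):
    """Minimise 0.5 * x^T Q x - c^T x subject to x >= 0.

    With Q[i][j] = <M_i, M_j> and c[i] = <M_i, N>, this is equivalent to
    minimising ||N - sum_k x_k M_k||^2 over non-negative weights x.
    Solved by projected gradient descent (a simple stand-in for a QP solver).
    """
    A = [flatten(m) for m in components]
    b = flatten(target)
    k = len(A)
    Q = [[sum(p * q for p, q in zip(A[i], A[j])) for j in range(k)]
         for i in range(k)]
    c = [sum(p * q for p, q in zip(A[i], b)) for i in range(k)]
    # Q is positive semidefinite, so its trace bounds its largest eigenvalue;
    # 1 / trace is therefore a safe step size.
    step = 1.0 / max(sum(Q[i][i] for i in range(k)), 1e-12)
    x = [0.0] * k
    for _ in range(iters):
        grad = [sum(Q[i][j] * x[j] for j in range(k)) - c[i] for i in range(k)]
        # Gradient step followed by projection onto x >= 0.
        x = [max(0.0, xi - step * g) for xi, g in zip(x, grad)]
    return x


# Toy 4-frame piece with two sections: frames 0-1 and frames 2-3.
bounds = [0, 2, 4]
timbre = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
harmony = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]]

# Synthetic target: the first section follows timbre, the second harmony.
target = timbre[:2] + harmony[2:]

components = section_components(timbre, bounds) + section_components(harmony, bounds)
labels = ["timbre/s1", "timbre/s2", "harmony/s1", "harmony/s2"]
weights = solve_nonneg_qp(components, target)
for name, w in zip(labels, weights):
    print(f"{name}: {w:.3f}")
```

Because the synthetic target was built from the timbre component of the first section and the harmony component of the second, the recovered weights should be close to 1 for those two components and close to 0 for the other two. A real analysis would replace the toy matrices with SSMs computed from audio features and the target with an SSM derived from a listener's annotation.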


