End-to-End Bayesian Segmentation and Similarity Assessment of Performed Music Tempo and Dynamics without Score Information

Music & Science | Pub Date: 2024-01-01 | DOI: 10.1177/20592043241233411

Corentin Guichaoua, Paul Lascabettes, Elaine Chew


Citations: 0

Abstract

Segmenting continuous sensory input into coherent segments and subsegments is an important part of perception. Music is no exception. By shaping the acoustic properties of music during performance, musicians can strongly influence the perceived segmentation. Two main techniques musicians employ are the modulation of tempo and dynamics. Such variations carry important information for segmentation and lend themselves well to numerical analysis methods. In this article, based on tempo or loudness modulations alone, we propose a novel end-to-end Bayesian framework using dynamic programming to retrieve a musician's expressed segmentation. The method computes the credence of all possible segmentations of the recorded performance. The output is summarized in two forms: as a beat-by-beat profile revealing the posterior credence of plausible boundaries, and as expanded credence segment maps, a novel representation that converts readily to a segmentation lattice but retains information about the posterior uncertainty on the exact position of segments’ endpoints. To compare any two segmentation profiles, we introduce a method based on unbalanced optimal transport. Experimental results on the MazurkaBL dataset show that despite the drastic dimension reduction from the input data, the segmentation recovery is sufficient for deriving musical insights from comparative examination of recorded performances. This Bayesian segmentation method thus offers an alternative to binary boundary detection and finds multiple hypotheses fitting information from recorded music performances.
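The abstract describes computing the credence of every possible segmentation by dynamic programming and summarizing it as a beat-by-beat boundary profile. The following is a minimal sketch of that general idea, not the authors' implementation. The abstract does not state the segment model or the prior, so everything below is an assumption made for illustration: each segment is modelled as a constant level with Gaussian noise (known `sigma`, Gaussian prior on the level with mean `mu0` and spread `tau`), each interior beat is a boundary independently with probability `p_boundary`, and the names `boundary_credence`, `seg_loglik`, and `seg_weight` are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-weights."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def boundary_credence(x, p_boundary=0.2, sigma=0.5, mu0=0.0, tau=10.0):
    """Posterior probability of a segment boundary before each beat.

    Assumed model (illustrative only): each segment has a constant level
    drawn from N(mu0, tau^2), observed with N(0, sigma^2) noise; every
    interior position is a boundary with prior probability p_boundary.
    Returns a list of length len(x) + 1; entries 0 and len(x) are the
    trivial start and end boundaries.
    """
    n = len(x)
    # Prefix sums of centred values give each segment's statistics in O(1).
    s1, s2 = [0.0], [0.0]
    for v in x:
        d = v - mu0
        s1.append(s1[-1] + d)
        s2.append(s2[-1] + d * d)

    def seg_loglik(i, j):
        # Marginal log-likelihood of x[i:j] with the level integrated out.
        m = j - i
        a, b = s1[j] - s1[i], s2[j] - s2[i]
        var = sigma ** 2
        denom = var + m * tau ** 2
        quad = (b - tau ** 2 * a * a / denom) / var
        logdet = (m - 1) * math.log(var) + math.log(denom)
        return -0.5 * (m * math.log(2 * math.pi) + logdet + quad)

    log_p, log_q = math.log(p_boundary), math.log1p(-p_boundary)

    def seg_weight(i, j):
        # Likelihood times prior: j - i - 1 interior non-boundaries,
        # plus one boundary at j unless j is the end of the sequence.
        w = seg_loglik(i, j) + (j - i - 1) * log_q
        return w + (log_p if j < n else 0.0)

    # Forward pass: alpha[j] sums over all segmentations of x[0:j].
    alpha = [0.0] + [-math.inf] * n
    for j in range(1, n + 1):
        alpha[j] = logsumexp([alpha[i] + seg_weight(i, j) for i in range(j)])

    # Backward pass: beta[i] sums over all segmentations of x[i:n].
    beta = [-math.inf] * n + [0.0]
    for i in range(n - 1, -1, -1):
        beta[i] = logsumexp(
            [seg_weight(i, j) + beta[j] for j in range(i + 1, n + 1)]
        )

    # A boundary at k splits the sequence into independent halves.
    return [math.exp(alpha[k] + beta[k] - alpha[n]) for k in range(n + 1)]


# Toy tempo curve with one abrupt change after the sixth beat.
tempo = [1.0] * 6 + [5.0] * 6
profile = boundary_credence(tempo, mu0=3.0)
print([round(c, 3) for c in profile])
```

The forward and backward tables together cover all 2^(n-1) segmentations in O(n^2) segment evaluations, which is what makes a full posterior profile feasible instead of a single best segmentation. Swapping in a different segment model only requires replacing `seg_loglik`.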

Source journal

Music & Science

Self-citation rate

0.00%

Articles published