Learning, Probability and Logic: Toward a Unified Approach for Content-Based Music Information Retrieval

Frontiers Digit. Humanit. Pub Date: 2019-04-16. DOI: 10.3389/fdigh.2019.00006

H. Crayencour, Carmine-Emanuele Cella


Citations: 1

Abstract


Within the last fifteen years, the field of Music Information Retrieval (MIR) has made tremendous progress in developing algorithms for organizing and analyzing the ever-growing and increasingly varied body of music and music-related data available digitally. However, developing content-based methods to enable or improve multimedia retrieval remains a central challenge. In this perspective paper, we critically examine the problem of automatic chord estimation from audio recordings as a case study of content-based algorithms, and point out several bottlenecks in current approaches: expressiveness and flexibility are obtained at the expense of robustness, and vice versa; available multimodal sources of information are rarely exploited; current architectures are limited in their ability to model multi-faceted and strongly interrelated musical information; and models are typically restricted to short-term analysis that does not account for the hierarchical temporal structure of musical signals. Dealing with music data requires the ability to handle both uncertainty and complex relational structure at multiple levels of representation. Traditional approaches have generally treated these two aspects separately, with probability and learning being the standard way to represent uncertainty in knowledge, and logical representation being the standard way to represent knowledge and complex relational information. We argue that the identified hurdles of current approaches could be overcome by recent developments in the area of Statistical Relational Artificial Intelligence (StarAI), which unifies probability, logic and (deep) learning. We show that existing approaches used in MIR find powerful extensions and unifications in StarAI, and we explain why we think it is time to consider the new perspectives offered by this promising research field.

