Audio Content-Based Music Retrieval

Peter Grosche, Meinard Müller, J. Serrà
DOI: 10.4230/DFU.Vol3.11041.157 (https://doi.org/10.4230/DFU.Vol3.11041.157)
Published in: Multimodal Music Processing
Publication date: 2012-04-27
Citations: 54

Abstract

The rapidly growing corpus of digital audio material requires novel retrieval strategies for exploring large music collections. Traditional retrieval strategies rely on metadata that describe the actual audio content in words. When such textual descriptions are not available, one requires content-based retrieval strategies that utilize only the raw audio material. In this contribution, we discuss content-based retrieval strategies that follow the query-by-example paradigm: given an audio query, the task is to retrieve from a music collection all documents that are somehow similar or related to the query. Such strategies can be loosely classified according to their "specificity", which refers to the degree of similarity between the query and the database documents. Here, high specificity refers to a strict notion of similarity, whereas low specificity refers to a rather vague one. Furthermore, we introduce a second classification principle based on "granularity", where one distinguishes between fragment-level and document-level retrieval. Using a classification scheme based on specificity and granularity, we identify various classes of retrieval scenarios, which comprise "audio identification", "audio matching", and "version identification". For these three important classes, we give an overview of representative state-of-the-art approaches, which also illustrate the sometimes subtle but crucial differences between the retrieval scenarios. Finally, we give an outlook on a user-oriented retrieval system, which combines the various retrieval strategies in a unified framework.
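The query-by-example paradigm and the notion of specificity can be illustrated with a minimal sketch. It is not taken from the paper: it assumes that each document is already summarized by a single hypothetical feature vector (that is, document-level granularity), uses cosine similarity as a stand-in similarity measure, and treats a similarity threshold (`min_similarity`, an illustrative parameter) as a rough proxy for specificity. The document names and vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_by_example(query, collection, min_similarity):
    """Return (doc_id, score) pairs with score >= min_similarity, best first.

    A high threshold mimics high specificity (strict similarity);
    a low threshold mimics low specificity (vague similarity).
    """
    hits = []
    for doc_id, features in collection.items():
        score = cosine_similarity(query, features)
        if score >= min_similarity:
            hits.append((doc_id, score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


# Hypothetical toy collection: one feature vector per document.
collection = {
    "original": [1.0, 0.0, 0.0],
    "cover": [0.8, 0.6, 0.0],
    "unrelated": [0.0, 0.0, 1.0],
}
query = [1.0, 0.0, 0.0]

strict = query_by_example(query, collection, min_similarity=0.99)
loose = query_by_example(query, collection, min_similarity=0.5)
```

With the strict threshold only the exact match is returned, while the looser threshold also admits the related document. Real systems for the scenarios named above rely on dedicated audio features and matching procedures rather than a single vector per document, so this sketch only conveys the general idea.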