Musical query-by-description as a multiclass learning problem

2002 IEEE Workshop on Multimedia Signal Processing. Pub Date: 2002-12-09. DOI: 10.1109/MMSP.2002.1203270

B. Whitman, R. Rifkin


Citations: 51

Abstract

We present the query-by-description (QBD) component of "Kandem", a time-aware music retrieval system. The QBD system we describe learns a relation between descriptive text concerning a musical artist and their actual acoustic output, making such queries as "Play me something loud with an electronic beat" possible by merely analyzing the audio content of a database. We show a novel machine learning technique based on regularized least-squares classification (RLSC) that can quickly and efficiently learn the non-linear relation between descriptive language and audio features by treating the problem as a large number of possible output classes linked to the same set of input features. We show how the RLSC training can easily eliminate irrelevant labels.
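The idea of many output classes sharing one set of input features can be illustrated with a minimal sketch. It assumes a Gaussian kernel, a regularizer of the form K + λI, toy two-dimensional features, and two hypothetical term labels ("loud" and "quiet"); none of these specifics come from the abstract, and all function names are illustrative. The point it tries to show is that the regularized kernel system is eliminated once and the result is reused for every label column.

```python
import math


def gaussian_kernel(x, z, sigma):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_shared(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting.

    B holds one column per output label. The elimination on A is carried
    out once, and every label column is updated in the same pass.
    """
    n = len(A)
    # Augmented matrix [A | B]; rows are copied so the inputs stay intact.
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [rv - f * cv for rv, cv in zip(M[r], M[col])]
    return [row[n:] for row in M]


def train_rlsc(X, Y, lam, sigma):
    """Return coefficients C solving (K + lam * I) C = Y for all labels."""
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], sigma) for j in range(n)]
         for i in range(n)]
    for i in range(n):
        K[i][i] += lam  # regularization term on the diagonal
    return solve_shared(K, Y)


def predict(X_train, C, x, sigma):
    """Score a new feature vector against every label at once."""
    k = [gaussian_kernel(xi, x, sigma) for xi in X_train]
    n_labels = len(C[0])
    return [sum(k[i] * C[i][t] for i in range(len(k)))
            for t in range(n_labels)]


# Toy data (hypothetical): each row is a feature vector for one audio frame.
X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
TERMS = ["loud", "quiet"]
# One column per term: +1 if the term applies to the frame, -1 otherwise.
Y = [[1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, 1.0]]

SIGMA, LAM = 0.5, 0.1
C = train_rlsc(X, Y, LAM, SIGMA)
scores = dict(zip(TERMS, predict(X, C, [0.85, 0.85], SIGMA)))
print(scores)
```

A positive score suggests the term applies to the query frame and a negative score suggests it does not. Adding more terms only adds columns to Y; the kernel matrix and its elimination are unchanged, which is why a large label vocabulary stays cheap in this formulation.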
