{"title":"Searching for dominant high-level features for Music Information Retrieval","authors":"M. Zanoni, Daniele Ciminieri, A. Sarti, S. Tubaro","doi":"10.5281/ZENODO.52248","DOIUrl":null,"url":null,"abstract":"Music Information Retrieval systems are often based on the analysis of a large number of low-level audio features. When dealing with problems of musical genre description and visualization, however, it would be desirable to work with a very limited number of highly informative and discriminant macro-descriptors. In this paper we focus on a specific class of training-based descriptors, which are obtained as the log-likelihood of a Gaussian Mixture Model trained with short musical excerpts that selectively exhibit a certain semantic homogeneity. As these descriptors are critically dependent on the training sets, we approach the problem of how to automatically generate suitable training sets and optimize the associated macro-features in terms of discriminant power and informative impact. We then show the application of a set of three identified macro-features to genre visualization, tracking and classification.","PeriodicalId":201182,"journal":{"name":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","volume":"61 4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5281/ZENODO.52248","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13
Abstract
Music Information Retrieval systems are often based on the analysis of a large number of low-level audio features. When dealing with problems of musical genre description and visualization, however, it would be desirable to work with a very limited number of highly informative and discriminant macro-descriptors. In this paper we focus on a specific class of training-based descriptors, which are obtained as the log-likelihood of a Gaussian Mixture Model trained with short musical excerpts that selectively exhibit a certain semantic homogeneity. As these descriptors are critically dependent on the training sets, we approach the problem of how to automatically generate suitable training sets and optimize the associated macro-features in terms of discriminant power and informative impact. We then show the application of a set of three identified macro-features to genre visualization, tracking and classification.