Deploying Deep Belief Nets for content based audio music similarity
Aggelos Gkiokas, V. Katsouros, G. Carayannis
IISA 2014, The 5th International Conference on Information, Intelligence, Systems and Applications, 2014-07-07
DOI: 10.1109/IISA.2014.6878797 (https://doi.org/10.1109/IISA.2014.6878797)
Citation count: 1
Abstract
This paper presents a method for computing audio-based similarity between music excerpts. The method has three main parts. The first is feature extraction, which computes three feature sets corresponding to musical timbre, rhythm and harmony. Next, for each feature set, a Deep Belief Network is trained without supervision on a large music collection. Finally, for a pair of music excerpts, the distances between the output units of the corresponding Deep Belief Networks are computed, normalized and combined to form the distance measure. The proposed method was evaluated on the MIREX 2013 Audio Music Similarity task. The results are encouraging; however, they indicate that the harmonic similarity component degrades performance.
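The final step (per-feature distances that are normalized and then combined) can be illustrated with a minimal sketch. The abstract does not state which distance metric, normalization or combination rule the authors use, so everything concrete here is an assumption made for illustration: Euclidean distance, scaling each distance matrix by its mean off-diagonal value, and a weighted sum. The function names and the toy vectors are hypothetical and stand in for real Deep Belief Network outputs.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(embeddings):
    """Return an n x n matrix of distances between per-excerpt vectors."""
    n = len(embeddings)
    return [[euclidean(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]


def normalize(dist):
    """Scale a distance matrix so its mean off-diagonal entry is 1.

    This normalization scheme is an assumption, not taken from the paper.
    """
    n = len(dist)
    off_diag = [dist[i][j] for i in range(n) for j in range(n) if i != j]
    mean = sum(off_diag) / len(off_diag) if off_diag else 0.0
    if mean == 0.0:
        return [row[:] for row in dist]
    return [[d / mean for d in row] for row in dist]


def combined_distance(outputs, weights=None):
    """Combine normalized per-feature-set distances with a weighted sum.

    outputs maps a feature-set name to a list of per-excerpt output vectors.
    weights maps the same names to non-negative weights (default: all 1.0).
    """
    weights = weights or {name: 1.0 for name in outputs}
    n = len(next(iter(outputs.values())))
    total = [[0.0] * n for _ in range(n)]
    for name, vectors in outputs.items():
        norm = normalize(pairwise_distances(vectors))
        for i in range(n):
            for j in range(n):
                total[i][j] += weights[name] * norm[i][j]
    return total


# Made-up output vectors for three excerpts (not data from the paper).
toy_outputs = {
    "timbre": [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]],
    "rhythm": [[1.0], [1.0], [2.0]],
    "harmony": [[0.0], [1.0], [0.0]],
}

D = combined_distance(toy_outputs)

# Setting a weight to zero drops that component from the measure.
D_no_harmony = combined_distance(
    toy_outputs, {"timbre": 1.0, "rhythm": 1.0, "harmony": 0.0}
)
```

Keeping the weights explicit makes it easy to test the effect of a single component, for example the harmonic one that the reported results suggest hurts performance.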