Music Genre Recognition Using Residual Neural Networks
Dipjyoti Bisharad, R. Laskar
DOI: 10.1109/TENCON.2019.8929406 (https://doi.org/10.1109/TENCON.2019.8929406). Published 2019-10-01, pp. 2063-2068.
Genre is an abstract yet characteristic feature of music. Existing approaches to automatic genre classification compute a set of features from the audio and build a classifier on top of them. Such models generally compute these features over a relatively long stretch of audio. In this paper, we propose a residual neural network model for genre classification that is trained on short clips of just 3 seconds. Traditional genre classification algorithms also assign a single genre to each audio clip, even though it is well established that different genres have overlapping characteristics. To account for this ambiguity, the proposed model can assign three genre labels to a music clip, each with an associated probability. The model achieves error rates of 18%, 9%, and 5.5% for top-1, top-2, and top-3 genre prediction, respectively. We show that the classifier's predictions align with the broadly understood meaning of genre in a realistic setting.
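The top-1, top-2, and top-3 error rates quoted above can be illustrated with a minimal sketch. It assumes the classifier emits one score per genre for each clip and that a softmax turns those scores into probabilities. The genre names, the scores, and the helper functions `softmax`, `top_k_labels`, and `top_k_error` are hypothetical and chosen only for illustration; they are not taken from the paper, and the residual network itself is not reproduced here.

```python
import numpy as np

# Hypothetical genre names; the abstract does not list the label set.
GENRES = ["blues", "classical", "jazz", "metal", "pop"]


def softmax(logits):
    """Convert raw scores to probabilities along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def top_k_labels(probs, k=3):
    """Return the k most probable (genre, probability) pairs for one clip."""
    order = np.argsort(probs)[::-1][:k]
    return [(GENRES[i], float(probs[i])) for i in order]


def top_k_error(probs, true_idx, k):
    """Fraction of clips whose true genre is not among the k most probable."""
    top_k = np.argsort(probs, axis=1)[:, ::-1][:, :k]
    hits = (top_k == true_idx[:, None]).any(axis=1)
    return 1.0 - hits.mean()


# Made-up classifier scores for four 3-second clips (not data from the paper).
logits = np.array([
    [2.0, 0.1, 1.5, 0.3, 0.2],
    [0.2, 3.0, 0.4, 0.1, 0.5],
    [0.3, 0.2, 1.0, 2.5, 1.2],
    [1.0, 0.5, 0.2, 0.1, 0.9],
])
true_idx = np.array([2, 1, 2, 4])  # assumed ground-truth genre index per clip
probs = softmax(logits)

if __name__ == "__main__":
    print(top_k_labels(probs[0], k=3))
    for k in (1, 2, 3):
        print(f"top-{k} error: {top_k_error(probs, true_idx, k):.2f}")
```

Because a clip counts as correct whenever its true genre appears anywhere in the top k, the error can only stay the same or fall as k grows, which matches the decreasing 18%, 9%, and 5.5% figures reported in the abstract.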