JUXTAPOSITION BETWEEN TWO CONVOLUTIONAL NEURAL NETWORK FOR MUSIC GENRE CLASSIFICATION
J. Shana, N. Priyadharshini Jayadurga, Pratiba K R, P. R
YMER Digital, Vol. 18, No. 1 (published 2022-07-28)
DOI: https://doi.org/10.37896/ymer21.07/89
Citations: 0
Abstract
Music genre classification is a fundamental step in building a strong recommendation system. If classification were carried out manually, one would have to listen to numerous songs and then assign a genre to each; this is not only time-consuming but also tedious. The music industry has seen a steady flow of new channels for browsing and distributing music, and this growth does not come without drawbacks: as the volume of data increases, manual curation becomes impractical. Audio files contain a plethora of features that could make parts of this process much easier, and advances in technology have made it possible to extract them. However, the most effective way to use these features for various tasks remains unknown. This paper compares two convolutional neural network architectures, Alex-net and Res-net, for music genre classification, using mel-spectrogram images for training. Both models were evaluated on the GTZAN dataset. The Res-net model achieved 56.0% accuracy and was outperformed by Alex-net, which reached 80.5%.

Keywords: GTZAN Dataset, Alex-net, Res-net, Convolutional Neural Network