Juxtaposition Between Two Convolutional Neural Networks for Music Genre Classification

YMER Digital | Pub Date: 2022-07-28 | DOI: 10.37896/ymer21.07/89

J. Shana, N. Priyadharshini Jayadurga, Pratiba K R, P. R


Citations: 0

Abstract



Music genre classification is a fundamental step in building a strong recommendation system. Classifying music manually means listening to a large number of songs and assigning a genre to each one, which is both time-consuming and tedious. The music industry has seen a steady stream of new channels for browsing and distributing music, and this does not come without drawbacks: as the volume of data grows, manual curation becomes increasingly difficult. Audio files carry a wealth of features that could make parts of this process much easier, and advances in technology have made it possible to extract them. However, the most effective way to use these features for a given task is still unknown. This paper compares two convolutional neural network architectures, AlexNet and ResNet, for music genre classification, using mel-spectrogram images for training. Both models were tested on the GTZAN dataset. ResNet reached an accuracy of 56.0% and was outperformed by AlexNet, which reached 80.5%.

Keywords: GTZAN Dataset, Alex-net, Res-net, Convolutional Neural Network
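The abstract does not say how the mel-spectrogram images were produced, so the following is only a minimal sketch of the general technique, not the authors' pipeline. It uses the Python standard library alone; the frame size, hop length, number of mel bands, the HTK-style mel formula, and every function name are assumptions chosen for illustration. In practice an audio library such as librosa would normally do this step.

```python
import cmath
import math


def hz_to_mel(hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels: int, n_fft: int, sample_rate: int) -> list[list[float]]:
    """Build n_mels triangular filters over the n_fft // 2 + 1 DFT bins."""
    n_bins = n_fft // 2 + 1
    top_mel = hz_to_mel(sample_rate / 2)
    # n_mels + 2 points evenly spaced on the mel scale give the filter edges.
    edges_hz = [mel_to_hz(top_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame: list[float]) -> list[float]:
    """Naive O(n^2) DFT power spectrum of one Hann-windowed frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2)
    return spectrum


def mel_spectrogram(signal: list[float], sample_rate: int,
                    n_fft: int = 256, hop: int = 128,
                    n_mels: int = 16) -> list[list[float]]:
    """Return one list of n_mels log-mel energies (in dB) per frame."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        mel_energy = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small offset avoids log10(0) for silent bands.
        frames.append([10.0 * math.log10(e + 1e-10) for e in mel_energy])
    return frames


# Toy input (hypothetical): a 1 kHz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(1024)]
spec = mel_spectrogram(tone, SAMPLE_RATE)
```

Each row of `spec` is one time frame and each column one mel band; rendering that grid as an image gives the kind of input a CNN such as AlexNet or ResNet can be trained on. The naive DFT keeps the sketch dependency-free but is far too slow for real audio clips.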

