MusicNeXt: Addressing category bias in fused music using musical features and genre-sensitive adjustment layer

IF 0.8; Zone 4, Computer Science; JCR Q4, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Intelligent Data Analysis Pub Date : 2023-11-16 DOI:10.3233/ida-230428

Shiting Meng, Qingbo Hao, Yingyuan Xiao, Wenguang Zheng


Citations: 0

Abstract

Convolutional neural networks (CNNs) have been successfully applied to music genre classification. As music has diversified, genre fusion has become common. Fused music exhibits multiple similar musical features, such as rhythm, timbre, and structure, which typically arise from the temporal information in the spectrum. However, traditional CNNs cannot effectively capture temporal information, which makes fused music difficult to distinguish. To address this issue, this study proposes MusicNeXt, a CNN model for music genre classification. Its goal is to improve feature extraction so that the model focuses more on musical features, and to increase the distinctiveness between genres, thereby reducing bias in the classification results. Specifically, we construct a feature extraction module that makes full use of temporal information, which strengthens the model's focus on musical features and improves its understanding of the complexity of fused music. Additionally, we introduce a genre-sensitive adjustment layer that strengthens the learning of differences between genres through within-class angle constraints. This increases the distinctiveness between genres and provides interpretability for the classification results. Experimental results demonstrate that MusicNeXt outperforms baseline networks and other state-of-the-art methods on music genre classification tasks without generating category bias in the classification results.
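The abstract does not give the formulation of the genre-sensitive adjustment layer, so the following is only an illustrative sketch of how an angle-based constraint on class logits can work in general. It assumes an additive angular margin on the target class (in the style of ArcFace-type losses), which may differ from what MusicNeXt actually does. The function names and the `margin` and `scale` parameters are hypothetical and chosen for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosines."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def angular_margin_logits(embedding, class_weights, target, margin=0.3, scale=16.0):
    """Return scaled cosine logits with an angular margin on the target class.

    ASSUMPTION: an additive angular margin, used here only to illustrate an
    angle constraint; it is not taken from the MusicNeXt paper.
    """
    e = l2_normalize(embedding)
    logits = []
    for k, w in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(e, l2_normalize(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
        theta = math.acos(cos_theta)
        if k == target:
            # Penalize the true class by enlarging its angle; training must
            # then pull same-genre embeddings closer to the class direction.
            theta = min(theta + margin, math.pi)
        logits.append(scale * math.cos(theta))
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single example."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


if __name__ == "__main__":
    emb = [0.9, 0.1]                      # toy 2-D embedding of one clip
    weights = [[1.0, 0.0], [0.0, 1.0]]    # toy class directions for two genres
    plain = angular_margin_logits(emb, weights, target=0, margin=0.0)
    constrained = angular_margin_logits(emb, weights, target=0, margin=0.3)
    print("loss without margin:", cross_entropy(plain, 0))
    print("loss with margin:   ", cross_entropy(constrained, 0))
```

In this sketch the margin lowers only the target-class logit, so the loss stays higher until the embedding sits at a smaller angle to its own class direction. That is one common way to tighten within-class angles and widen the separation between classes.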


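Source journal

Intelligent Data Analysis (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 2.20

Self-citation rate: 5.90%

Articles published: (no value given)

Review time: 3.3 months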

About the journal: Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, the journal prefers papers that discuss the development of new AI-related data analysis architectures, methodologies, and techniques and their applications to various domains.