Large Model-Enhanced CNN–Transformer Architecture for Adaptive Music Quality Classification in Wireless Communication Networks

Author: Tianyu Chen
DOI: 10.1002/itl2.70156 (https://onlinelibrary.wiley.com/doi/10.1002/itl2.70156)
Internet Technology Letters, vol. 8, no. 6, published 2025-10-04 (Journal Article; JCR Q4, Telecommunications; impact factor 0.5)

Abstract: This letter presents an enhanced convolutional neural network (CNN)–transformer architecture integrated with large model (LM) capabilities for adaptive music quality classification in wireless communication networks (WCNs). The proposed approach combines the global feature learning strength of transformer encoders with the local pattern recognition abilities of CNNs, while leveraging LM knowledge for improved audio signal understanding. To enhance classification accuracy, we first preprocess the music signal data through channel-aware normalization and feature standardization. Subsequently, we employ a multi-head attention mechanism from transformer networks to capture long-range dependencies in music features affected by wireless transmission, while utilizing CNN layers to extract localized audio patterns. Finally, we incorporate inception modules to achieve multi-scale feature fusion and complete the music quality classification task. Experimental validation on the MusicCaps dataset demonstrates that our model achieves 97.8% classification accuracy, with precision, recall, and F1-score all exceeding 97.5%, outperforming existing approaches for music quality assessment in wireless environments.
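The letter does not include source code, so the preprocessing step can only be illustrated. Below is a minimal numpy sketch of per-channel feature standardization, one plausible reading of the abstract's "channel-aware normalization and feature standardization"; the function name and the (frames × channels) feature layout are assumptions, not the paper's published method.

```python
import numpy as np

def channel_aware_normalize(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Standardize each feature channel to zero mean and unit variance.

    features: (n_frames, n_channels) matrix of per-frame audio features
    (e.g., spectral descriptors). Hypothetical sketch -- the paper's exact
    channel-aware scheme is not published in the abstract.
    """
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)  # eps guards flat channels

# Example: 4 frames, 3 feature channels at very different scales
x = np.array([[1.0, 10.0, 100.0],
              [2.0, 20.0, 200.0],
              [3.0, 30.0, 300.0],
              [4.0, 40.0, 400.0]])
z = channel_aware_normalize(x)
print(z.mean(axis=0))  # each channel is now centered near 0
```

Standardizing per channel (rather than globally) keeps channels with large raw magnitudes, such as energy features, from dominating distance-based attention scores downstream.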
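To make the transformer branch concrete, the following numpy sketch shows scaled dot-product multi-head self-attention over a sequence of feature frames, which is the mechanism the abstract relies on to capture long-range dependencies. The weights here are random placeholders, and the dimensions are illustrative; this is a generic attention implementation, not the paper's trained model.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stabilized exponent
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, n_heads):
    """Scaled dot-product multi-head self-attention.

    x: (seq_len, d_model) sequence of feature frames.
    wq/wk/wv/wo: (d_model, d_model) projection matrices (random here).
    """
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    def split(h):  # (seq, d_model) -> (heads, seq, d_head)
        return h.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    attn = softmax(scores, axis=-1)                      # rows sum to 1
    out = (attn @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ wo

rng = np.random.default_rng(0)
d_model, seq_len, n_heads = 8, 5, 2
x = rng.standard_normal((seq_len, d_model))
wq, wk, wv, wo = (rng.standard_normal((d_model, d_model)) for _ in range(4))
y = multi_head_self_attention(x, wq, wk, wv, wo, n_heads)
print(y.shape)  # (5, 8): one d_model-sized vector per frame
```

Because every frame attends to every other frame, this branch can relate distortions at distant time positions, which is why the abstract pairs it with CNN layers that handle the complementary, strictly local patterns.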