Large Model-Enhanced CNN–Transformer Architecture for Adaptive Music Quality Classification in Wireless Communication Networks

Author: Tianyu Chen
DOI: 10.1002/itl2.70156 (https://onlinelibrary.wiley.com/doi/10.1002/itl2.70156)
Internet Technology Letters, vol. 8, no. 6, published 2025-10-04 (Journal Article; JCR Q4, Telecommunications; impact factor 0.5)

Abstract: This letter presents an enhanced convolutional neural network (CNN)–transformer architecture integrated with large model (LM) capabilities for adaptive music quality classification in wireless communication networks (WCNs). The proposed approach combines the global feature learning strength of transformer encoders with the local pattern recognition abilities of CNNs, while leveraging LM knowledge for improved audio signal understanding. To enhance classification accuracy, we first preprocess the music signal data through channel-aware normalization and feature standardization. Subsequently, we employ a multi-head attention mechanism from transformer networks to capture long-range dependencies in music features affected by wireless transmission, while utilizing CNN layers to extract localized audio patterns. Finally, we incorporate inception modules to achieve multi-scale feature fusion and complete the music quality classification task. Experimental validation on the MusicCaps dataset demonstrates that our model achieves 97.8% classification accuracy, with precision, recall, and F1-score all exceeding 97.5%, outperforming existing approaches for music quality assessment in wireless environments.
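The letter does not include source code, so the preprocessing step can only be illustrated. Below is a minimal numpy sketch of per-channel feature standardization, one plausible reading of the abstract's "channel-aware normalization and feature standardization"; the function name and the (frames × channels) feature layout are assumptions, not the paper's published method.

```python
import numpy as np

def channel_aware_normalize(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Standardize each feature channel to zero mean and unit variance.

    features: (n_frames, n_channels) matrix of per-frame audio features
    (e.g., spectral descriptors). Hypothetical sketch -- the paper's exact
    channel-aware scheme is not published in the abstract.
    """
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)  # eps guards flat channels

# Example: 4 frames, 3 feature channels at very different scales
x = np.array([[1.0, 10.0, 100.0],
              [2.0, 20.0, 200.0],
              [3.0, 30.0, 300.0],
              [4.0, 40.0, 400.0]])
z = channel_aware_normalize(x)
print(z.mean(axis=0))  # each channel is now centered near 0
```

Standardizing per channel (rather than globally) keeps channels with large raw magnitudes, such as energy features, from dominating distance-based attention scores downstream.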
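To make the transformer branch concrete, the following numpy sketch shows scaled dot-product multi-head self-attention over a sequence of feature frames, which is the mechanism the abstract relies on to capture long-range dependencies. The weights here are random placeholders, and the dimensions are illustrative; this is a generic attention implementation, not the paper's trained model.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stabilized exponent
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, n_heads):
    """Scaled dot-product multi-head self-attention.

    x: (seq_len, d_model) sequence of feature frames.
    wq/wk/wv/wo: (d_model, d_model) projection matrices (random here).
    """
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    def split(h):  # (seq, d_model) -> (heads, seq, d_head)
        return h.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    attn = softmax(scores, axis=-1)                      # rows sum to 1
    out = (attn @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ wo

rng = np.random.default_rng(0)
d_model, seq_len, n_heads = 8, 5, 2
x = rng.standard_normal((seq_len, d_model))
wq, wk, wv, wo = (rng.standard_normal((d_model, d_model)) for _ in range(4))
y = multi_head_self_attention(x, wq, wk, wv, wo, n_heads)
print(y.shape)  # (5, 8): one d_model-sized vector per frame
```

Because every frame attends to every other frame, this branch can relate distortions at distant time positions, which is why the abstract pairs it with CNN layers that handle the complementary, strictly local patterns.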