{"title":"基于聚类量化知识蒸馏的资源约束环境下深度学习模型优化","authors":"Niaz Ashraf Khan, A. M. Saadman Rafat","doi":"10.1002/eng2.70187","DOIUrl":null,"url":null,"abstract":"<p>Deep convolutional neural networks (CNNs) are highly effective in computer vision tasks but remain challenging to deploy in resource-constrained environments due to their high computational and memory requirements. Conventional model compression techniques, such as pruning and post-training quantization, often compromise model accuracy by decoupling compression from training. Furthermore, traditional knowledge distillation approaches rely on full-precision teacher models, limiting their effectiveness in compressed settings. To address these issues, we propose Cluster-Quantized Knowledge Distillation (CQKD), a novel framework that integrates structured pruning with knowledge distillation, incorporating cluster-based weight quantization directly into the training loop. Unlike existing methods, CQKD applies quantization to both the teacher and student models, ensuring a more effective transfer of compressed knowledge. By leveraging layer-wise K-means clustering, our approach achieves extreme model compression while maintaining high accuracy. Experimental results on CIFAR-10 and CIFAR-100 demonstrate the effectiveness of CQKD, achieving compression ratios of 34,000× while preserving competitive accuracy—97.9% on CIFAR-10 and 91.2% on CIFAR-100. These results highlight the feasibility of CQKD for efficient deep learning model deployment in low-resource environments.</p>","PeriodicalId":72922,"journal":{"name":"Engineering reports : open access","volume":"7 5","pages":""},"PeriodicalIF":1.8000,"publicationDate":"2025-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/eng2.70187","citationCount":"0","resultStr":"{\"title\":\"Optimizing Deep Learning Models for Resource-Constrained Environments With Cluster-Quantized Knowledge Distillation\",\"authors\":\"Niaz Ashraf Khan, A. M. Saadman Rafat\",\"doi\":\"10.1002/eng2.70187\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Deep convolutional neural networks (CNNs) are highly effective in computer vision tasks but remain challenging to deploy in resource-constrained environments due to their high computational and memory requirements. Conventional model compression techniques, such as pruning and post-training quantization, often compromise model accuracy by decoupling compression from training. Furthermore, traditional knowledge distillation approaches rely on full-precision teacher models, limiting their effectiveness in compressed settings. To address these issues, we propose Cluster-Quantized Knowledge Distillation (CQKD), a novel framework that integrates structured pruning with knowledge distillation, incorporating cluster-based weight quantization directly into the training loop. Unlike existing methods, CQKD applies quantization to both the teacher and student models, ensuring a more effective transfer of compressed knowledge. By leveraging layer-wise K-means clustering, our approach achieves extreme model compression while maintaining high accuracy. Experimental results on CIFAR-10 and CIFAR-100 demonstrate the effectiveness of CQKD, achieving compression ratios of 34,000× while preserving competitive accuracy—97.9% on CIFAR-10 and 91.2% on CIFAR-100. 
These results highlight the feasibility of CQKD for efficient deep learning model deployment in low-resource environments.</p>\",\"PeriodicalId\":72922,\"journal\":{\"name\":\"Engineering reports : open access\",\"volume\":\"7 5\",\"pages\":\"\"},\"PeriodicalIF\":1.8000,\"publicationDate\":\"2025-05-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1002/eng2.70187\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Engineering reports : open access\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/eng2.70187\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering reports : open access","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/eng2.70187","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Optimizing Deep Learning Models for Resource-Constrained Environments With Cluster-Quantized Knowledge Distillation
Niaz Ashraf Khan, A. M. Saadman Rafat
Engineering Reports, 7(5), 2025. DOI: 10.1002/eng2.70187. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/eng2.70187
Deep convolutional neural networks (CNNs) are highly effective in computer vision tasks but remain challenging to deploy in resource-constrained environments due to their high computational and memory requirements. Conventional model compression techniques, such as pruning and post-training quantization, often compromise model accuracy by decoupling compression from training. Furthermore, traditional knowledge distillation approaches rely on full-precision teacher models, limiting their effectiveness in compressed settings. To address these issues, we propose Cluster-Quantized Knowledge Distillation (CQKD), a novel framework that integrates structured pruning with knowledge distillation, incorporating cluster-based weight quantization directly into the training loop. Unlike existing methods, CQKD applies quantization to both the teacher and student models, ensuring a more effective transfer of compressed knowledge. By leveraging layer-wise K-means clustering, our approach achieves extreme model compression while maintaining high accuracy. Experimental results on CIFAR-10 and CIFAR-100 demonstrate the effectiveness of CQKD, achieving compression ratios of 34,000× while preserving competitive accuracy—97.9% on CIFAR-10 and 91.2% on CIFAR-100. These results highlight the feasibility of CQKD for efficient deep learning model deployment in low-resource environments.
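The abstract describes two ingredients: layer-wise K-means clustering of weights, and a distillation loss computed between a quantized teacher and a quantized student inside the training loop. The minimal Python/PyTorch sketch below illustrates what those two pieces could look like; it is not the paper's implementation, and the cluster count per layer, temperature T, loss weight alpha, and the use of scikit-learn's KMeans are illustrative assumptions of our own.

```python
# Illustrative sketch only; CQKD's exact procedure and hyperparameters are defined in the paper.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans


def cluster_quantize_(module: torch.nn.Module, n_clusters: int = 16) -> None:
    """Layer-wise K-means weight quantization: replace each weight in every
    Conv2d/Linear layer with its nearest cluster centroid, in place."""
    for layer in module.modules():
        if isinstance(layer, (torch.nn.Conv2d, torch.nn.Linear)):
            w = layer.weight.data
            flat = w.reshape(-1, 1).cpu().numpy()           # cluster scalar weights per layer
            km = KMeans(n_clusters=n_clusters, n_init=10).fit(flat)
            quantized = km.cluster_centers_[km.labels_].reshape(w.shape)
            layer.weight.data = torch.as_tensor(quantized, dtype=w.dtype, device=w.device)


def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.7):
    """Standard KD objective: temperature-softened KL term (scaled by T^2)
    plus hard-label cross-entropy, weighted by alpha."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard
```

In a CQKD-style loop one would presumably apply `cluster_quantize_` to both the teacher and the student so that the transferred knowledge is itself compressed, rather than distilling from a full-precision teacher; how often the quantization step is re-applied during training is a design choice specified in the paper, not in this sketch.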