{"title":"利用多粒剪枝改进神经网络结构压缩","authors":"Kevin Kollek, M. Aguilar, Marco Braun, A. Kummert","doi":"10.1109/PIC53636.2021.9687071","DOIUrl":null,"url":null,"abstract":"Pruning techniques for neural networks are applied to achieve superior model compression while maintaining accuracy. Common pruning approaches rely on single granularity (e.g., weights, channels, or layers) compression techniques and miss valuable optimization potential. This major limitation results in a sequence of obsolete layers with a small number of channels or highly sparse weights. In this paper, we present a novel pruning approach to address this issue. More precisely, in this work, a Multi-Grain Pruning (MGP) framework is proposed to optimize neural network architectures from coarse to fine in up to four different granularities. Besides the traditional pruning granularities, a new granularity is introduced on so-called blocks, which consist of multiple layers. By combining multiple pruning granularities, models can be optimized even further. We evaluated the proposed framework with VGG-19 on CIFAR-10 and CIFAR-100 as well as ResNet-56 on CIFAR-10 and ResNet-50 on ImageNet. The results show that our technique achieves from 31.9x up to 185.3x model compression rates with an accuracy drop from 0.08% up to 1.73% with VGG-19 on CIFAR-10.","PeriodicalId":297239,"journal":{"name":"2021 IEEE International Conference on Progress in Informatics and Computing (PIC)","volume":"323 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Improving Neural Network Architecture Compression by Multi-Grain Pruning\",\"authors\":\"Kevin Kollek, M. Aguilar, Marco Braun, A. Kummert\",\"doi\":\"10.1109/PIC53636.2021.9687071\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Pruning techniques for neural networks are applied to achieve superior model compression while maintaining accuracy. Common pruning approaches rely on single granularity (e.g., weights, channels, or layers) compression techniques and miss valuable optimization potential. This major limitation results in a sequence of obsolete layers with a small number of channels or highly sparse weights. In this paper, we present a novel pruning approach to address this issue. More precisely, in this work, a Multi-Grain Pruning (MGP) framework is proposed to optimize neural network architectures from coarse to fine in up to four different granularities. Besides the traditional pruning granularities, a new granularity is introduced on so-called blocks, which consist of multiple layers. By combining multiple pruning granularities, models can be optimized even further. We evaluated the proposed framework with VGG-19 on CIFAR-10 and CIFAR-100 as well as ResNet-56 on CIFAR-10 and ResNet-50 on ImageNet. 
The results show that our technique achieves from 31.9x up to 185.3x model compression rates with an accuracy drop from 0.08% up to 1.73% with VGG-19 on CIFAR-10.\",\"PeriodicalId\":297239,\"journal\":{\"name\":\"2021 IEEE International Conference on Progress in Informatics and Computing (PIC)\",\"volume\":\"323 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-12-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Progress in Informatics and Computing (PIC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PIC53636.2021.9687071\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Progress in Informatics and Computing (PIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PIC53636.2021.9687071","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Improving Neural Network Architecture Compression by Multi-Grain Pruning
Pruning techniques for neural networks are applied to achieve superior model compression while maintaining accuracy. Common pruning approaches rely on single-granularity compression techniques (e.g., weights, channels, or layers) and thus miss valuable optimization potential. This limitation results in sequences of obsolete layers with few channels or highly sparse weights. In this paper, we present a novel pruning approach to address this issue. More precisely, a Multi-Grain Pruning (MGP) framework is proposed that optimizes neural network architectures from coarse to fine across up to four granularities. Besides the traditional pruning granularities, a new granularity is introduced that operates on so-called blocks, each consisting of multiple layers. By combining multiple pruning granularities, models can be optimized even further. We evaluated the proposed framework with VGG-19 on CIFAR-10 and CIFAR-100, ResNet-56 on CIFAR-10, and ResNet-50 on ImageNet. The results show that our technique achieves model compression rates from 31.9x to 185.3x with an accuracy drop between 0.08% and 1.73% for VGG-19 on CIFAR-10.
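The abstract describes pruning at up to four granularities, applied coarse to fine. As a concrete illustration, below is a minimal PyTorch sketch of the two finest stages, masking whole filters (channel granularity) before individual weights (weight granularity). This is an assumption-laden sketch, not the paper's MGP algorithm: the L1 importance scores, the keep/sparsity ratios, and all function names are illustrative, and the coarser block- and layer-level stages are omitted because removing whole blocks or layers requires rewiring the architecture.

import torch
import torch.nn as nn

def l1_filter_scores(conv: nn.Conv2d) -> torch.Tensor:
    # One L1 importance score per output channel (filter); a common
    # magnitude-based proxy for channel importance (hypothetical choice here).
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def mask_channels(conv: nn.Conv2d, keep_ratio: float) -> None:
    # Channel granularity: zero out entire filters with the smallest L1 norm.
    scores = l1_filter_scores(conv)
    n_keep = max(1, int(keep_ratio * scores.numel()))
    keep = torch.topk(scores, n_keep).indices
    mask = torch.zeros_like(scores)
    mask[keep] = 1.0
    conv.weight.data.mul_(mask.view(-1, 1, 1, 1))

def mask_weights(conv: nn.Conv2d, sparsity: float) -> None:
    # Weight granularity: zero individual small-magnitude weights.
    flat = conv.weight.detach().abs().flatten()
    k = int(sparsity * flat.numel())
    if k > 0:
        threshold = torch.kthvalue(flat, k).values
        conv.weight.data.mul_((conv.weight.detach().abs() > threshold).float())

def multigrain_prune(model: nn.Module,
                     channel_keep: float = 0.5,
                     weight_sparsity: float = 0.5) -> None:
    # Coarse to fine: prune channels first, then the surviving weights.
    # Block- and layer-level pruning are omitted from this sketch.
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            mask_channels(module, channel_keep)
            mask_weights(module, weight_sparsity)

For example, multigrain_prune(torchvision.models.vgg19(), 0.5, 0.5) would mask half the filters of every convolution and then half of the remaining weights; in practice, each pruning stage is typically followed by fine-tuning to recover accuracy.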