Soyoung Lee, Kyungho Kim, Jonghoon Kwak, Eunchong Lee, Sang-Seol Lee
2023 International Conference on Electronics, Information, and Communication (ICEIC), published 2023-02-05. DOI: 10.1109/ICEIC57457.2023.10049954 (https://doi.org/10.1109/ICEIC57457.2023.10049954)
An Efficient NPU-Aware Filter Pruning in Convolutional Neural Network
The neural processing unit (NPU) is a high-performance, low-power accelerator specialized for artificial intelligence (AI) workloads such as training and inference. Because an NPU must process a convolutional neural network (CNN) at low power and low latency, it needs a compressed network. In this paper, we therefore propose an efficient NPU-aware filter pruning method for CNNs that increases NPU efficiency. NPU-aware filter pruning removes filters in multiples of the channel unit size, which is the operation unit of the NPU, to reduce unnecessary computation and save memory storage space. In experiments with VGGNet-16 and ResNet-18 on the CIFAR10 dataset, the proposed method reduced hardware-inefficient space and unnecessary computation by 1.86% to 6.78% compared with a general pruning method, without loss of accuracy.
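
The idea of pruning in multiples of the channel unit size can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the filter ranking criterion, the rounding rule, or the unit size, so the score-based ranking, the round-to-nearest rule, the unit of 16, and all function names below are assumptions made for illustration only.

```python
import math


def padded_channels(kept: int, unit: int) -> int:
    """Channels an NPU would process if it works in blocks of `unit` (assumed behavior)."""
    return math.ceil(kept / unit) * unit


def align_keep_count(keep_target: int, unit: int, num_filters: int) -> int:
    """Snap the number of kept filters to the nearest multiple of `unit`.

    Round-to-nearest is an assumption; the abstract only says that pruning
    is performed in multiples of the channel unit size.
    """
    max_aligned = (num_filters // unit) * unit
    if max_aligned == 0:
        # Layer is smaller than one unit, so nothing can be aligned.
        return num_filters
    aligned = ((keep_target + unit // 2) // unit) * unit
    # Keep at least one unit and never more filters than the layer has.
    return min(max(aligned, unit), max_aligned)


def select_filters(scores: list[float], keep: int) -> list[int]:
    """Indices of the `keep` highest-scoring filters, in ascending index order.

    `scores` stands in for any per-filter importance measure (hypothetical here).
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


if __name__ == "__main__":
    unit = 16          # assumed channel unit size
    num_filters = 64   # filters in one convolution layer
    general_keep = 45  # what an unaligned pruning ratio might leave

    # General pruning: 45 filters still occupy 3 full blocks of 16.
    wasted = padded_channels(general_keep, unit) - general_keep
    print("general pruning wasted slots:", wasted)

    # NPU-aware pruning: the kept count is itself a multiple of the unit.
    aware_keep = align_keep_count(general_keep, unit, num_filters)
    print("NPU-aware kept filters:", aware_keep)
    print("NPU-aware wasted slots:", padded_channels(aware_keep, unit) - aware_keep)
```

In this toy layer, the unaligned count leaves padded slots that are computed and stored but carry no useful filter, while the aligned count fills every block. That gap is the kind of hardware-inefficient space the abstract refers to.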