Seongwoo Kim, Yongjun Kim, Gwang-Jun Byeon, Seokin Hong
2023 International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC), published 2023-06-25. DOI: 10.1109/ITC-CSCC58803.2023.10212679 (https://doi.org/10.1109/ITC-CSCC58803.2023.10212679)
CAESAR: A CNN Accelerator Exploiting Sparsity and Redundancy Pattern
Convolutional Neural Networks (CNNs) have shown outstanding performance in many computer vision applications. However, CNN inference on mobile and edge devices is challenging because of its high computation demands. Many prior studies have addressed this challenge by reducing data precision with quantization techniques, which leads to abundant redundancy in CNN models. This paper proposes CAESAR, a CNN accelerator that eliminates redundant computations to reduce the computation demands of CNN inference. By analyzing the computation pattern of the convolution layer, CAESAR predicts where redundant computations occur and removes them during execution. CAESAR then remaps the remaining effectual computations onto the processing elements that were originally assigned to the redundant computations, so that all processing elements are fully utilized. In our evaluation with a cycle-level microarchitecture simulator, CAESAR achieves an overall speedup of up to 2.13x and reduces energy consumption by 78% compared with a TPU-like baseline accelerator.
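The abstract does not say which redundancy pattern CAESAR detects or how its remapping hardware works. As a rough illustration only, the sketch below assumes that redundancy comes from repeated weight values within a quantized filter and that sparsity means zero weights. The function names, the example filter, and the simple cycle model (multiplications packed densely onto a fixed number of processing elements) are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import ceil


def dense_dot(weights, activations):
    """Baseline: one multiplication per weight/activation pair."""
    result = sum(w * a for w, a in zip(weights, activations))
    return result, len(weights)


def grouped_dot(weights, activations):
    """Skip zero weights and multiply once per distinct nonzero weight value."""
    sums = defaultdict(int)
    for w, a in zip(weights, activations):
        if w != 0:        # assumed sparsity: a zero weight contributes nothing
            sums[w] += a  # assumed redundancy: equal weights share one multiply
    result = sum(w * s for w, s in sums.items())
    return result, len(sums)


def cycles(num_mults, num_pes):
    """Cycles needed when multiplications are packed densely onto num_pes PEs."""
    return ceil(num_mults / num_pes)


# Hypothetical low-precision 3x3 filter and input window, flattened.
weights = [3, 0, -1, 3, 0, 2, -1, 3, 2]
activations = [5, 7, 1, 2, 9, 4, 6, 8, 3]

dense_result, dense_mults = dense_dot(weights, activations)
grouped_result, grouped_mults = grouped_dot(weights, activations)

print(dense_result, dense_mults, cycles(dense_mults, 4))
print(grouped_result, grouped_mults, cycles(grouped_mults, 4))
```

Both functions return the same dot product, but the grouped version needs fewer multiplications because low-precision weights take few distinct values. The cycle count only drops if the remaining multiplications are packed onto the freed processing elements, which is the role the abstract attributes to remapping.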