In Search of the Performance- and Energy-Efficient CNN Accelerators
Authors: S. Sedukhin, Yoichi Tomioka, Kohei Yamamoto
Venue: 2021 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS)
Published: 2021-04-14
DOI: 10.1109/COOLCHIPS52128.2021.9410350
URL: https://doi.org/10.1109/COOLCHIPS52128.2021.9410350
Starting from the algorithm, this paper systematically searches for and evaluates a performance- and energy-efficient 3D structure, or shape, of the Tensor Processing Engine (TPE) for CNN acceleration. An optimal accelerator shape maximizes the number of concurrent MAC operations per clock cycle while minimizing the number of redundant operations. The proposed 3D vector-parallel TPE architecture with an optimal shape can be used efficiently for CNN acceleration. Because image data in different blocks are independent, multiple such TPEs can be used for additional acceleration. Moreover, the proposed TPE is shown to apply uniformly to different CNN models such as VGG, ResNet, YOLO, and SSD.
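
The trade-off between concurrent MAC operations and redundant operations can be illustrated with a rough sketch. The model below is an assumption made for illustration and is not the search procedure of the paper: it maps the three output dimensions of a convolutional layer onto a hypothetical 3D array of processing elements, pads partial tiles at the edges, and counts the padded MAC slots as redundant work. The names `conv_macs` and `evaluate_shape`, and the layer sizes used in the usage note, are likewise hypothetical.

```python
from math import ceil, prod


def conv_macs(out_c, out_h, out_w, in_c, k_h, k_w):
    """Useful MAC operations in one convolutional layer.

    Stride and padding are assumed to be already reflected in out_h and out_w.
    """
    return out_c * out_h * out_w * in_c * k_h * k_w


def evaluate_shape(layer, shape):
    """Return (cycles, utilization) for a 3D PE array of the given shape.

    Assumed model (not taken from the paper):
    - the array axes cover (out_c, out_h, out_w);
    - the remaining loops (in_c, k_h, k_w) run sequentially, one step per cycle;
    - edge tiles are padded, so padded PE slots perform redundant MACs.
    """
    out_c, out_h, out_w, in_c, k_h, k_w = layer
    # Number of array-sized tiles needed to cover the output volume.
    tiles = prod(ceil(d / s) for d, s in zip((out_c, out_h, out_w), shape))
    cycles = tiles * in_c * k_h * k_w
    issued = cycles * prod(shape)          # MAC slots issued, useful or not
    utilization = conv_macs(*layer) / issued
    return cycles, utilization


def best_shape(layer, candidates):
    """Pick the candidate shape with the fewest cycles under the model above."""
    return min(candidates, key=lambda s: evaluate_shape(layer, s)[0])
```

As a usage example, take a hypothetical layer `(64, 56, 56, 64, 3, 3)` and two arrays of 512 processing elements each. Under this model `evaluate_shape(layer, (8, 8, 8))` reports full utilization, because every axis divides evenly, while `(2, 16, 16)` needs more cycles because 56 is not a multiple of 16 and the padded slots are wasted. Both arrays have the same number of processing elements, so the difference comes only from the shape.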