Liao-Chuan Chen, Zhaofang Li, Yi-Jhen Lin, Kuang Lee, K. Tang
{"title":"基于SRAM访问优化的可重构处理单元阵列1.93TOPS/W深度学习处理器","authors":"Liao-Chuan Chen, Zhaofang Li, Yi-Jhen Lin, Kuang Lee, K. Tang","doi":"10.1109/APCCAS55924.2022.10090334","DOIUrl":null,"url":null,"abstract":"Deep convolutional neural networks feature numerous parameters, causing data movement to usually dominate the power consumed when computing inferences. This paper proposes an on-chip buffer access optimization method and high-data-reuse architecture that can reduce the power consumed by an on-chip buffer by up to 67.8%. The chip is designed in a TSMC 40 nm process running at 200 MHz and achieves energy efficiency of 1.93 TOPS/W.","PeriodicalId":243739,"journal":{"name":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","volume":"2002 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A 1.93TOPS/W Deep Learning Processor with a Reconfigurable Processing Element Array Based on SRAM Access Optimization\",\"authors\":\"Liao-Chuan Chen, Zhaofang Li, Yi-Jhen Lin, Kuang Lee, K. Tang\",\"doi\":\"10.1109/APCCAS55924.2022.10090334\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep convolutional neural networks feature numerous parameters, causing data movement to usually dominate the power consumed when computing inferences. This paper proposes an on-chip buffer access optimization method and high-data-reuse architecture that can reduce the power consumed by an on-chip buffer by up to 67.8%. 
The chip is designed in a TSMC 40 nm process running at 200 MHz and achieves energy efficiency of 1.93 TOPS/W.\",\"PeriodicalId\":243739,\"journal\":{\"name\":\"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)\",\"volume\":\"2002 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/APCCAS55924.2022.10090334\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/APCCAS55924.2022.10090334","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A 1.93TOPS/W Deep Learning Processor with a Reconfigurable Processing Element Array Based on SRAM Access Optimization
Deep convolutional neural networks feature numerous parameters, so data movement usually dominates the power consumed during inference. This paper proposes an on-chip buffer access optimization method and a high-data-reuse architecture that reduce the power consumed by the on-chip buffer by up to 67.8%. The chip is implemented in a TSMC 40 nm process, runs at 200 MHz, and achieves an energy efficiency of 1.93 TOPS/W.
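The abstract does not describe the reuse scheme itself. As a rough, back-of-the-envelope illustration of why data reuse cuts on-chip buffer power, the sketch below compares SRAM read counts for one convolutional layer with no input reuse (every output pixel re-fetches its receptive field) versus ideal full reuse (every input activation fetched once). All layer dimensions and function names here are hypothetical example values, not figures from the paper.

```python
# Illustrative only: compare SRAM input-activation reads for one conv layer.
# These dimensions are made-up examples, not taken from the paper.

def sram_reads_no_reuse(h_out, w_out, k, c_in):
    """Every output pixel re-fetches its full k*k*c_in receptive field."""
    return h_out * w_out * k * k * c_in

def sram_reads_full_reuse(h_in, w_in, c_in):
    """Ideal reuse: each input activation is read from SRAM exactly once."""
    return h_in * w_in * c_in

# Example layer: 56x56 output, 3x3 kernel, 64 input channels,
# stride 1 with padding (58x58 padded input).
naive = sram_reads_no_reuse(56, 56, 3, 64)
reused = sram_reads_full_reuse(58, 58, 64)
reduction = 1 - reused / naive

print(f"naive reads:  {naive}")
print(f"reused reads: {reused}")
print(f"reduction:    {reduction:.1%}")
```

Since SRAM access energy scales roughly with access count, even partial reuse between these two extremes yields large buffer-power savings, which is the general motivation behind high-data-reuse accelerator dataflows.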