A 617 TOPS/W All Digital Binary Neural Network Accelerator in 10nm FinFET CMOS
Phil C. Knag, Gregory K. Chen, H. Sumbul, Raghavan Kumar, M. Anders, Himanshu Kaul, S. Hsu, A. Agarwal, Monodeep Kar, Seongjong Kim, R. Krishnamurthy
2020 IEEE Symposium on VLSI Circuits, June 2020
DOI: 10.1109/VLSICircuits18222.2020.9162949
Citations: 13
Abstract
A 10nm digital Binary Neural Network (BNN) chip implements 1-bit activations and weights, reaching a compute density of 418 TOPS/mm² and a memory density of 414 KB/mm². The chip achieves an energy efficiency of 617 TOPS/W by leveraging Compute Near Memory (CNM), parallel inner product compute, and Near-Threshold Voltage (NTV) operation. The digital BNN design approaches the energy efficiency of analog in-memory techniques while also ensuring deterministic, scalable, and precise operation.
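
To illustrate why 1-bit activations and weights make the inner product cheap in digital logic, here is a minimal Python sketch of the XNOR-and-popcount formulation commonly used for binary neural networks. The abstract does not describe the chip's datapath, so this is an assumption about the general technique, not the authors' circuit; the helper names `pack` and `binary_dot` are hypothetical.

```python
def pack(values):
    """Pack a list of +1/-1 values into an integer bitmask (+1 -> 1, -1 -> 0).

    Hypothetical helper for illustration; element i maps to bit i.
    """
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, w_bits, n):
    """Inner product of two length-n {-1, +1} vectors stored as bitmasks.

    Assumed formulation (not taken from the paper): XNOR marks positions
    where activation and weight agree, and popcount counts them. Each
    agreement contributes +1 and each disagreement -1, so the result is
    2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask   # XNOR, limited to the low n bits
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n


activations = [1, -1, 1, 1]
weights = [1, 1, -1, 1]
result = binary_dot(pack(activations), pack(weights), len(activations))
# Matches the ordinary dot product of the +1/-1 vectors.
assert result == sum(a * w for a, w in zip(activations, weights))
```

Under this formulation, each multiply-accumulate reduces to one XNOR gate plus a share of a popcount adder tree, which is one reason binary networks lend themselves to dense digital implementations.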