Zhongyuan Feng, Bo Wang, Zhaoyang Zhang, An Guo, Xin Si
2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), published 2022-11-11. DOI: 10.1109/APCCAS55924.2022.10090256 (https://doi.org/10.1109/APCCAS55924.2022.10090256)
A Booth-based Digital Compute-in-Memory Marco for Processing Transformer Model
Transformer models have achieved excellent results in many fields, but owing to their huge data volume and high precision requirements, traditional analog compute-in-memory circuits can no longer meet their needs. To resolve this dilemma, this paper proposes a digital compute-in-memory circuit based on an improved Booth algorithm. The 6T SRAM array stores the multiplicand, and a Booth encoder encodes the multiplier; the local computing cells (LCCs) then read the corresponding values from the array according to the encoding result. These values are finally sent to the dual-mode shift-and-add module (DMSA) to obtain the computation results. The proposed circuit achieves an energy efficiency of 33.11 TOPS/W at INT8 and 8.3 TOPS/W at INT16, at least 1.92× better than previous works.
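The encode, select, and shift-and-add flow described in the abstract can be illustrated in software. The sketch below assumes the textbook radix-4 (modified) Booth recoding; the abstract does not specify which Booth variant the paper's "improved" algorithm uses, so this is not the authors' circuit. The function names `booth_radix4_digits` and `booth_multiply` are hypothetical, and the `bits` parameter only loosely mirrors the INT8/INT16 dual-mode operation.

```python
from typing import List


def booth_radix4_digits(multiplier: int, bits: int) -> List[int]:
    """Recode a signed `bits`-wide multiplier into radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, 1, 2}; digits are returned least
    significant first. Assumption: standard radix-4 Booth recoding.
    """
    if bits <= 0 or bits % 2 != 0:
        raise ValueError("bits must be a positive even number")
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= multiplier <= high:
        raise ValueError("multiplier does not fit in the given width")

    pattern = multiplier & ((1 << bits) - 1)  # two's complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        b0 = (pattern >> i) & 1
        b1 = (pattern >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)  # digit = -2*b1 + b0 + prev
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Multiply by selecting 0, +/-X or +/-2X per Booth digit, then shift and add."""
    acc = 0
    for i, digit in enumerate(booth_radix4_digits(multiplier, bits)):
        # Stand-in for the local computing cell: pick a multiple of the
        # stored multiplicand according to the encoded digit.
        partial = digit * multiplicand
        # Stand-in for the shift-and-add module: weight by 4**i and accumulate.
        acc += partial << (2 * i)
    return acc


if __name__ == "__main__":
    print(booth_radix4_digits(127, 8))   # [-1, 0, 0, 2]
    print(booth_multiply(-45, 127, 8))   # -5715
    print(booth_multiply(12345, -6789, bits=16))
```

With radix-4 recoding, an 8-bit multiplier needs four partial products instead of eight and a 16-bit multiplier needs eight instead of sixteen, which is the usual motivation for Booth encoding in multiply-accumulate hardware.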