Authors: Kouhei Toyotaka; Yuto Yakubo; Kazuma Furutani; Haruki Katagiri; Masashi Fujita; Yoshinori Ando; Toru Nakura; Shunpei Yamazaki
DOI: 10.1109/JEDS.2024.3418036
Published: 2024-06-24
Full text: https://ieeexplore.ieee.org/document/10568946/
Citations: 0
A 3-D Bank Memory System for Low-Power Neural Network Processing Achieved by Instant Context Switching and Extended Power Gating Time
Using a 3-D monolithic stacking memory technology of crystalline oxide semiconductor (OS) transistors, we fabricated a test chip having AI accelerator (ACC) memory for weight data of a neural network (NN), backup memory of flip-flops (FF), and CPU memory storing instructions and data. These memories are composed of two-layer OS transistors on Si CMOS, where memories in each layer correspond to a bank. In this structure, bank switching of the ACC memory and the FF backup memory work together, and thus inference of different NNs is switched with low latency and low power so that the power gating standby time can be extended. Consequently, a 92% reduction in power consumption is achieved in inference at a frame rate of 60 fps as compared with a chip using static random access memory (SRAM) as the ACC memory.
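To illustrate why a cheaper context switch and a longer power-gated standby interval lower average power at a fixed frame rate, the following is a minimal sketch of a per-frame energy budget. The model, the `FrameBudget` fields, and every number in it are illustrative assumptions and are not taken from the paper; it does not reproduce the reported 92% figure.

```python
from dataclasses import dataclass


@dataclass
class FrameBudget:
    """Hypothetical per-frame budget for one inference at a fixed frame rate."""
    frame_period_s: float    # e.g. 1/60 s at 60 fps
    active_power_w: float    # power drawn while the accelerator is inferring
    active_time_s: float     # time spent inferring per frame
    switch_energy_j: float   # energy to switch to another NN (reload weights or flip a bank)
    switch_time_s: float     # time the switch takes
    standby_power_w: float   # power drawn for the rest of the frame (leakage if power gated)


def average_power(b: FrameBudget) -> float:
    """Average power over one frame: total energy divided by the frame period."""
    busy = b.active_time_s + b.switch_time_s
    if busy > b.frame_period_s:
        raise ValueError("inference plus context switch does not fit in one frame")
    standby_time = b.frame_period_s - busy
    energy = (b.active_power_w * b.active_time_s
              + b.switch_energy_j
              + b.standby_power_w * standby_time)
    return energy / b.frame_period_s


def reduction(baseline: FrameBudget, proposed: FrameBudget) -> float:
    """Fractional reduction in average power of `proposed` relative to `baseline`."""
    return 1.0 - average_power(proposed) / average_power(baseline)


# Assumed numbers only: a volatile weight memory that must be reloaded and kept
# powered, versus a banked non-volatile memory that switches quickly and can be
# power gated for almost the whole frame.
volatile = FrameBudget(1 / 60, 10e-3, 1e-3, 20e-6, 2e-3, 1e-3)
banked = FrameBudget(1 / 60, 10e-3, 1e-3, 0.1e-6, 1e-6, 0.01e-3)
print(f"assumed reduction: {reduction(volatile, banked):.1%}")
```

In this simplified model the saving comes from two terms: the switch energy shrinks when a bank select replaces a weight reload, and the standby term shrinks when the remaining frame time can be spent power gated.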