{"title":"MINOTAUR:基于位置的0.42-0.50-TOPS /W边缘变压器推理和训练加速器","authors":"Kartik Prabhu;Robert M. Radway;Jeffrey Yu;Kai Bartolone;Massimo Giordano;Fabian Peddinghaus;Yonatan Urman;Win-San Khwa;Yu-Der Chih;Meng-Fan Chang;Subhasish Mitra;Priyanka Raina","doi":"10.1109/JSSC.2025.3545731","DOIUrl":null,"url":null,"abstract":"Transformer models have revolutionized natural language processing (NLP) and enabled many new applications, but are challenging to deploy on resource-constrained edge devices due to their high computation and memory demands. We present MINOTAUR, an edge system-on-chip (SoC) for inference and fine-tuning of Transformer models with all memory on the chip. MINOTAUR utilizes a configurable 8-bit posit-based accelerator to achieve highly accurate and efficient inference and fine-tuning. To minimize memory power, MINOTAUR employs fine-grained spatiotemporal power gating of on-chip resistive-RAM (RRAM). MINOTAUR enables on-chip fine-tuning through full-network low-rank adaptation (LoRA). MINOTAUR fabricates in a 40-nm CMOS process, achieves ResNet-18 inference in 8.1 mJ and MobileBERTTINY inference in 8.2 mJ, and performs on-chip fine-tuning with an accuracy that is within 1.7% of offline training.","PeriodicalId":13129,"journal":{"name":"IEEE Journal of Solid-state Circuits","volume":"60 4","pages":"1311-1323"},"PeriodicalIF":4.6000,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MINOTAUR: A Posit-Based 0.42–0.50-TOPS/W Edge Transformer Inference and Training Accelerator\",\"authors\":\"Kartik Prabhu;Robert M. Radway;Jeffrey Yu;Kai Bartolone;Massimo Giordano;Fabian Peddinghaus;Yonatan Urman;Win-San Khwa;Yu-Der Chih;Meng-Fan Chang;Subhasish Mitra;Priyanka Raina\",\"doi\":\"10.1109/JSSC.2025.3545731\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Transformer models have revolutionized natural language processing (NLP) and enabled many new applications, but are challenging to deploy on resource-constrained edge devices due to their high computation and memory demands. We present MINOTAUR, an edge system-on-chip (SoC) for inference and fine-tuning of Transformer models with all memory on the chip. MINOTAUR utilizes a configurable 8-bit posit-based accelerator to achieve highly accurate and efficient inference and fine-tuning. To minimize memory power, MINOTAUR employs fine-grained spatiotemporal power gating of on-chip resistive-RAM (RRAM). MINOTAUR enables on-chip fine-tuning through full-network low-rank adaptation (LoRA). 
MINOTAUR fabricates in a 40-nm CMOS process, achieves ResNet-18 inference in 8.1 mJ and MobileBERTTINY inference in 8.2 mJ, and performs on-chip fine-tuning with an accuracy that is within 1.7% of offline training.\",\"PeriodicalId\":13129,\"journal\":{\"name\":\"IEEE Journal of Solid-state Circuits\",\"volume\":\"60 4\",\"pages\":\"1311-1323\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2025-03-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Journal of Solid-state Circuits\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10916649/\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Journal of Solid-state Circuits","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10916649/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
MINOTAUR: A Posit-Based 0.42–0.50-TOPS/W Edge Transformer Inference and Training Accelerator
Transformer models have revolutionized natural language processing (NLP) and enabled many new applications, but are challenging to deploy on resource-constrained edge devices due to their high computation and memory demands. We present MINOTAUR, an edge system-on-chip (SoC) for inference and fine-tuning of Transformer models with all memory on the chip. MINOTAUR utilizes a configurable 8-bit posit-based accelerator to achieve highly accurate and efficient inference and fine-tuning. To minimize memory power, MINOTAUR employs fine-grained spatiotemporal power gating of on-chip resistive-RAM (RRAM). MINOTAUR enables on-chip fine-tuning through full-network low-rank adaptation (LoRA). MINOTAUR fabricates in a 40-nm CMOS process, achieves ResNet-18 inference in 8.1 mJ and MobileBERTTINY inference in 8.2 mJ, and performs on-chip fine-tuning with an accuracy that is within 1.7% of offline training.
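The accelerator's datapath is built on 8-bit posits. As a reference, a posit value decodes as (-1)^s · useed^k · 2^e · (1 + f) with useed = 2^(2^es). Below is a minimal Python sketch of that decoding for a generic posit(nbits, es) format; the es = 1 default and the function name decode_posit are illustrative assumptions, since the abstract does not specify MINOTAUR's posit configuration.

```python
def decode_posit(bits: int, nbits: int = 8, es: int = 1) -> float:
    """Decode an unsigned integer holding an nbits-wide posit bit pattern."""
    mask = (1 << nbits) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (nbits - 1):
        return float("nan")               # Not-a-Real (NaR)

    sign = (bits >> (nbits - 1)) & 1
    if sign:
        bits = (-bits) & mask             # two's-complement negation of the pattern

    # Regime: run of identical bits immediately after the sign bit.
    regime_bit = (bits >> (nbits - 2)) & 1
    run, i = 0, nbits - 2
    while i >= 0 and ((bits >> i) & 1) == regime_bit:
        run += 1
        i -= 1
    k = run - 1 if regime_bit else -run

    # 'rem' bits remain below the regime terminator: exponent first, then fraction.
    rem = i
    exp_bits = min(es, max(rem, 0))
    exponent = ((bits >> (rem - exp_bits)) & ((1 << exp_bits) - 1)) if exp_bits else 0
    exponent <<= es - exp_bits            # truncated exponent bits are implied zeros

    frac_bits = rem - exp_bits
    fraction = (bits & ((1 << frac_bits) - 1)) if frac_bits > 0 else 0
    mantissa = 1.0 + (fraction / (1 << frac_bits) if frac_bits > 0 else 0.0)

    useed = 2 ** (2 ** es)
    value = (useed ** k) * (2 ** exponent) * mantissa
    return -value if sign else value


# Examples for posit(8, 1): 0b01000000 -> 1.0, 0b01011000 -> 3.0, 0b11000000 -> -1.0
print(decode_posit(0b01000000), decode_posit(0b01011000), decode_posit(0b11000000))
```

Compared with an 8-bit float of fixed exponent width, the run-length-encoded regime gives posits tapered precision: more fraction bits near 1.0 and a wider dynamic range at the extremes, which is one reason posit formats are attractive for low-bit training and inference.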
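The abstract also names full-network low-rank adaptation (LoRA) as the on-chip fine-tuning method. The following is a minimal NumPy sketch of the generic LoRA update, where a frozen weight matrix W is augmented with a trainable low-rank product; the function name lora_forward, the rank, and the scaling factor alpha are assumptions for illustration, not details of MINOTAUR's 8-bit posit implementation.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=8.0, rank=4):
    """y = x @ (W + (alpha / rank) * (A @ B)), with W frozen and A, B trainable."""
    return x @ W + (alpha / rank) * (x @ A @ B)

rng = np.random.default_rng(0)
d_in, d_out, rank = 128, 128, 4
W = 0.02 * rng.standard_normal((d_in, d_out))   # frozen pretrained weights
A = 0.02 * rng.standard_normal((d_in, rank))    # trainable down-projection
B = np.zeros((rank, d_out))                     # trainable up-projection, zero-init so the update starts at 0
x = rng.standard_normal((1, d_in))
y = lora_forward(x, W, A, B, alpha=8.0, rank=rank)
print(y.shape)   # (1, 128)
```

Because only the rank-r factors are updated, the trainable parameter count per layer drops from d_in * d_out to r * (d_in + d_out), which is what makes fine-tuning feasible within on-chip memory and energy budgets.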
About the Journal:
The IEEE Journal of Solid-State Circuits publishes papers each month in the broad area of solid-state circuits, with particular emphasis on transistor-level design of integrated circuits. It also provides coverage of topics such as circuit modeling, technology, systems design, layout, and testing that relate directly to IC design. Integrated circuits and VLSI are of principal interest; material related to discrete circuit design is seldom published. Experimental verification is strongly encouraged.