A RISC-V Domain-Specific Processor for Deep Learning-Based Channel Estimation
Meng Guo; Qi Wu; Chuanning Wang; Yangcan Zhou; Shaowei Wang; Chuan Zhang; Zhongfeng Wang; Jun Lin
IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 72, no. 5, pp. 2380-2393. Published 2025-03-26. DOI: 10.1109/TCSI.2025.3547319
https://ieeexplore.ieee.org/document/10939000/
Citations: 0
Abstract
Channel estimation (CE) is a critical component of massive multiple-input multiple-output (MIMO) communication systems. Compared with conventional CE algorithms, deep learning (DL)-based approaches have become a promising alternative because they offer enhanced performance and robustness across diverse scenarios. However, efficient DL-based CE algorithms have two key properties that make them challenging to implement on existing edge-side architectures: the diversity of deep neural networks (DNNs) and CE strategies, and the involvement of multiple computation-intensive tasks that encompass conventional signal processing, artificial intelligence (AI) inference, and online learning. To address these challenges, a domain-specific processor based on an extended RISC-V instruction set architecture (ISA) is proposed to execute these DL-based CE algorithms. First, a dedicated RISC-V ISA extension is developed to flexibly support all essential operations required by a DL-based CE algorithm, such as matrix inversion. Building on the customized ISA extension, a highly adaptable and scalable RISC-V processor is developed, featuring scalar and vector posit arithmetic units that alleviate the high computational and memory demands of DNNs during both the inference and training phases. Additionally, a coarse-grained matrix accelerator is integrated to expedite various matrix operations, ensuring high throughput. In this way, both high flexibility and high computational efficiency are achieved. Finally, the processor is implemented in TSMC 28-nm technology. Implementation results show that the processor achieves a speedup of $5.16\times$ to $6.80\times$ for all matrix operations compared with the state-of-the-art work. Moreover, the proposed processor provides an area efficiency improvement of $1.61\times$ and an energy efficiency improvement of $6.6\times$ to $15.4\times$ compared with the open-source vector processor Ara. Notably, this work is the first RISC-V domain-specific processor tailored for diverse DL-based CE algorithms.
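To make the posit number format mentioned in the abstract more concrete, the following is a minimal software sketch of how a posit bit pattern maps to a real value. It is an illustration only and not the processor's hardware design: the abstract does not state the posit width or exponent size, so the function name `decode_posit` and the parameters `n` (total bits) and `es` (exponent bits) are assumptions chosen for this example.

```python
def decode_posit(bits: int, n: int = 8, es: int = 1) -> float:
    """Decode an n-bit posit with es exponent bits into a Python float.

    Hypothetical helper for illustration; n and es are assumed values,
    not the configuration used by the processor in the paper.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float("nan")  # NaR ("not a real") pattern: 100...0

    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask  # negative posits are stored in two's complement

    body = bits & ((1 << (n - 1)) - 1)  # the n-1 bits after the sign bit

    # Regime: a run of identical bits, ended by the opposite bit or the word end.
    first = (body >> (n - 2)) & 1
    run = 0
    i = n - 2
    while i >= 0 and ((body >> i) & 1) == first:
        run += 1
        i -= 1
    k = run - 1 if first else -run

    i -= 1  # skip the regime terminator bit, if there is one
    remaining = max(i + 1, 0)  # bits left for exponent and fraction
    tail = body & ((1 << remaining) - 1)

    # Exponent: up to es bits; bits cut off by the word end count as zero.
    exp_bits = min(es, remaining)
    exponent = (tail >> (remaining - exp_bits)) << (es - exp_bits)

    # Fraction: whatever is left, with an implicit leading 1.
    frac_bits = remaining - exp_bits
    fraction = tail & ((1 << frac_bits) - 1)

    value = 2.0 ** (k * (1 << es) + exponent) * (1 + fraction / (1 << frac_bits))
    return -value if sign else value


print(decode_posit(0x40))  # 1.0
print(decode_posit(0x48))  # 1.5
print(decode_posit(0x01))  # 2**-12, the smallest positive posit for n=8, es=1
```

Because the regime field has variable length, values near 1 keep more fraction bits while very large or very small values trade fraction bits for range. This tapered precision is a commonly cited reason for using short posits with DNN weights and activations.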
About the journal:
TCAS I publishes regular papers in the field specified by the theory, analysis, design, and practical implementations of circuits, and the application of circuit techniques to systems and to signal processing. Included is the whole spectrum from basic scientific theory to industrial applications. The field of interest covered includes:
- Circuits: Analog, Digital and Mixed Signal Circuits and Systems
- Nonlinear Circuits and Systems, Integrated Sensors, MEMS and Systems on Chip, Nanoscale Circuits and Systems, Optoelectronic Circuits and Systems, Power Electronics and Systems
- Software for Analog-and-Logic Circuits and Systems
- Control aspects of Circuits and Systems