{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Publication Information","authors":"","doi":"10.1109/TVLSI.2025.3549990","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3549990","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"C2-C2"},"PeriodicalIF":2.8,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10937138","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information","authors":"","doi":"10.1109/TVLSI.2025.3549993","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3549993","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"C3-C3"},"PeriodicalIF":2.8,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10937163","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Publication Information","authors":"","doi":"10.1109/TVLSI.2025.3539514","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3539514","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"C2-C2"},"PeriodicalIF":2.8,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10903159","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial: Renewed Excellence for 2025–2026","authors":"Mircea R. Stan","doi":"10.1109/TVLSI.2024.3520396","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3520396","url":null,"abstract":"I am happy and honored to have been reappointed as Editor in Chief (EiC) for the IEEE Transactions on VLSI Systems (TVLSI) for another two-year term. As I continue my efforts to improve the quality of the journal, I am grateful for the renewed trust placed in me by the three IEEE sponsoring societies (CASS, SSCS and CS) and by the VLSI community at large. Contrary to a feared slowdown due to increased difficulties with scaling, the field of Very Large Scale Integration (VLSI) has actually grown at an increasingly fast rate as it provides the hardware backbone for the insatiable AI applications which are taking over the world. The H100/200 GPUs, which are essential for AI training, are the largest “conventional” integrated circuits (IC) with 80 billion transistors, while the wafer-scale WSE2/3, which can provide significant improvements in AI inference, are absolute behemoths with 4 trillion transistors! Mr. Moore can be proud there in heaven for what our industry is able to deliver!","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"603-626"},"PeriodicalIF":2.8,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10903548","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information","authors":"","doi":"10.1109/TVLSI.2025.3539516","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3539516","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 3","pages":"C3-C3"},"PeriodicalIF":2.8,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10903516","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143496460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ISARA: An Island-Style Systolic Array Reconfigurable Accelerator Based on Memristors for Deep Neural Networks","authors":"Fan Yang;Nan Li;Letian Wang;Pinfeng Jiang;Xiangshui Miao;Xingsheng Wang","doi":"10.1109/TVLSI.2024.3521394","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3521394","url":null,"abstract":"The demand for edge artificial intelligence (AI) is significant, particularly in revolutionary technological areas such as the Internet of Things, autonomous driving, and industrial control. However, reliable and high-performance edge AI is still constrained by computing hardware, and improving the performance and reliability of edge AI accelerators remains a key focus for researchers. This work proposes a memristor/resistive random access memory (RRAM)-based island-style systolic array reconfigurable accelerator (ISARA) that meets the reliability and performance requirements of edge AI. Inspired by the island-style architecture of FPGAs, this work proposes a flexible-tile architecture based on RRAM processing element (PE) islands, optimizing the data flow within the systolic array. The design of network-on-chip reduces data processing latency. In addition, to enhance computational efficiency, this work incorporates a bit-fusion scheme within the flexible tile, which reduces analog-to-digital converter (ADC) power consumption and addresses the conductance variation of RRAM. To date, only a few works have completed the entire process from simulation, design, and fabrication to hardware testing. This work fully realizes the design and validation of a new accelerator based on RRAM chips, demonstrating the reliability of RRAM-based systolic array accelerators for the first time. After deploying algorithms, the hardware accelerator achieved recognition rates comparable to software. Compared to similar works, ISARA’s computational efficiency exceeds theirs and has flexible reconfigurability. 
The same deep neural network (DNN) models are adopted for evaluation and compared to other accelerators, and ISARA’s processing latency is reduced by 200 times.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"963-975"},"PeriodicalIF":2.8,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FANNS: An FPGA-Based Approximate Nearest-Neighbor Search Accelerator","authors":"Wei Yuan;Xi Jin","doi":"10.1109/TVLSI.2024.3496589","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3496589","url":null,"abstract":"Approximate nearest-neighbor search (ANNS) based on high-dimensional vectors has been extensively utilized in data science and neural networks. However, deploying ANNS in production systems requires minimal redundant computation, high recall rates, and low on-chip memory usage, which existing hardware accelerators fail to offer. We propose FANNS, a solution for ANNS based on high-dimensional vectors that can eliminate redundant computations and reuse on-chip data. Extensive evaluations show that FANNS achieves an average of <inline-formula> <tex-math>$184.1times $ </tex-math></inline-formula>, <inline-formula> <tex-math>$33.0times $ </tex-math></inline-formula>, <inline-formula> <tex-math>$2.9times $ </tex-math></inline-formula>, and <inline-formula> <tex-math>$2.5times $ </tex-math></inline-formula> better energy efficiency than CPUs, GPUs, and two state-of-the-art ANNS architectures, i.e., DF-GAS and Vstore, respectively.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"1197-1201"},"PeriodicalIF":2.8,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Flexible DA-Based Architecture for Computation of Inner Product of Variable Vectors","authors":"Anil Kali;Samrat L. Sabat;Pramod Kumar Meher","doi":"10.1109/TVLSI.2025.3528244","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3528244","url":null,"abstract":"The computation of inner products of any given pair of vectors is an indispensable requirement in several applications including artificial intelligence (AI), machine learning (ML), signal processing, image processing, communication, and many others. The throughput requirement of inner product computation varies widely for different applications. Moreover, the throughput of computation must match the requirements of the applications. It is therefore important to design flexible hardware for inner product computation that produces the desired throughput. Distributed arithmetic (DA) is a well-known approach for efficient inner product computation. This article presents an efficient DA-based architecture for computing the inner product of variable vectors, which could be tailored according to the throughput requirement of any given application and reused for different inner product lengths. The proposed designs could also be deployed to achieve a trade-off between throughput and area/energy consumption. In this article, we have used modified Booth encoding (MBE) to reduce the number of partial products and proposed a novel carry-save accumulator (CSA) for shortening the critical path delay. The proposed designs are synthesized by Cadence Genus using GPDK 90-nm technology library and place-and-route using Cadence Innovus for different inner product lengths and word lengths. 
As found from the postlayout synthesis results, the proposed designs offer savings of nearly 30% and 29% EPC and ADP over the bit-serial DA-based design on average for word lengths 8 and 16 and inner product lengths 8, 16, and 32, respectively.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"953-962"},"PeriodicalIF":2.8,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Virtual_N2_PDK: A Predictive Process Design Kit for 2-nm Nanosheet FET Technology","authors":"Yiying Liu;Minghui Yin;Huanhuan Zhou;Yunxia You;Weihua Zhang;Hongwei Liu;Chen Wang;Yajie Zou;Zhiqiang Li","doi":"10.1109/TVLSI.2025.3529504","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3529504","url":null,"abstract":"Nanosheet FETs (NSFETs) are considered promising candidates to replace FinFETs as the dominant devices in sub-5-nm processes. To encourage further research into NSFET-based integrated circuits, we present Virtual_N2_PDK, a predictive process design kit (PDK) for 2-nm NSFET technology. All assumptions are based on publicly available sources. Ruthenium (Ru) interconnects are employed for the buried power rail (BPR) and tight-pitch layers. Wrap-around contact (WAC) is also integrated into Virtual_N2_PDK to investigate its impact on circuit performance. By calibrating the BSIM-CMG model with 3-D technology computer-aided design (TCAD) electrothermal simulation results, SPICE models that account for self-heating effects (SHEs) are generated for devices with and without WAC. The simulation results show that with the WAC structure, the energy-delay product (EDP) of standard cells is reduced by an average of 25.18%, while the frequency of a 15-stage ring oscillator circuit increases by 26.05%.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"1004-1013"},"PeriodicalIF":2.8,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143676065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconfigurable 10T SRAM for Energy-Efficient CAM Operation and In-Memory Computing","authors":"Zhang Zhang;Zhihao Chen;Jiedong Wang;Guangjun Xie;Gang Liu","doi":"10.1109/TVLSI.2025.3526973","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3526973","url":null,"abstract":"The limitations of the von Neumann architecture in terms of power consumption and throughput are increasingly evident. In-memory computing is a promising computing paradigm to alleviate this limitation. This article proposes a high-speed and low-power 10T compute-static random-access memory (CSRAM) capable of conducting rowwise search operations and executing in-memory logic functions efficiently. A self-suppressed discharge scheme is implemented to curtail the power consumption of the search operation by reducing the discharge swing of the match lines (MLs). The rowwise search scheme avoids vertical data storage, enhancing the compatibility between different operation modes. The proposed 10T SRAM architecture addresses the issue of sneak currents effectively when multiple lines are activated. Additionally, decoupled read ports eliminate compute access disturbance. To validate the design, a 4Kb array is designed with a 40-nm CMOS technology. At a supply voltage (VDD) of 1.1 V, the in-memory logic operations are capable of operating at a frequency of 752 MHz, consuming 29.2 fJ/bit. 
In binary content-addressable memory (BCAM) search mode, the minimum energy consumption of 0.51 fJ/bit occurs at 0.8 V and 120 MHz.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 4","pages":"1065-1072"},"PeriodicalIF":2.8,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143675978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}