{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information","authors":"","doi":"10.1109/TVLSI.2024.3435251","DOIUrl":"https://doi.org/10.1109/TVLSI.2024.3435251","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10648917","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142077622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 65 nm CMOS Analog Programmable Standard Cell Library for Mixed-Signal Computing","authors":"Pranav O. Mathews;Praveen Raj Ayyappan;Afolabi Ige;Swagat Bhattacharyya;Linhao Yang;Jennifer O. Hasler","doi":"10.1109/TVLSI.2024.3432916","DOIUrl":"10.1109/TVLSI.2024.3432916","url":null,"abstract":"Integrated circuit (IC) design for analog computing requires toolflows and synthesis similar to those of large-scale digital systems, in turn necessitating a library of general-purpose analog cells. To this end, we present a programmable, floating-gate (FG)-based analog standard cell library in a commercially available 65 nm process that allows analog IC designers to use synthesis tools with an abstracted design mindset similar to large-scale digital design. We fabricate the test cells, which include filters with programmable corners, an analog classifier, and an arbitrary waveform generator (AWG); experimentally characterize FG programming; and experimentally demonstrate the performance of the standard cells. Overall, the standard cells achieve a similar or smaller footprint than previous approaches while leveraging the benefits of FG programming at smaller technology nodes.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142194482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unrolled, Pipelined, and Stage-Folded Architectures for Encoding of Multi-Kernel Polar Codes","authors":"Hossein Rezaei;Elham Abbasi;Nandana Rajatheva;Matti Latva-Aho","doi":"10.1109/TVLSI.2024.3436872","DOIUrl":"10.1109/TVLSI.2024.3436872","url":null,"abstract":"Over the past decade, polar codes have received significant attention and have been selected as the coding method for the control channel in fifth-generation (5G) wireless communication systems. However, conventional polar codes rely solely on binary ($2 \\times 2$) kernels, which restricts their block length to powers of 2. In response, multi-kernel (MK) polar codes have been proposed as a viable solution to achieve increased flexibility in code length. This article proposes unrolled and pipelined architectures for encoding both systematic and nonsystematic MK polar codes, capable of high-throughput encoding of codes constructed with binary, ternary ($3 \\times 3$), or binary-ternary mixed kernels. Furthermore, two novel nonsystematic stage-folded encoders, designed to minimize resource usage, are introduced for the encoding of pure-ternary and MK codes. The proposed MK encoders additionally provide the functionality of dynamic kernel assignment. The proposed architectures exhibit an unprecedented level of flexibility by supporting 83 different codes and offering various architectures that provide tradeoffs between throughput and resource consumption. The FPGA implementation results demonstrate that a partially pipelined polar encoder of size $N=4096$ operating at a frequency of 270 MHz gives a throughput of 1080 Gb/s. In addition, a new compiler scripted in Python is introduced to automatically generate HDL modules for the desired encoders. By inserting the desired parameters, a designer can simply obtain all the necessary VHDL files for FPGA implementation.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142194484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Conjugated Current Mirrors: A General Enhancement in Transconductance Amplifiers","authors":"Meysam Akbari;Kea-Tiong Tang","doi":"10.1109/TVLSI.2024.3439525","DOIUrl":"10.1109/TVLSI.2024.3439525","url":null,"abstract":"This work presents a general enhancement in operational transconductance amplifiers (OTAs) by conjugating the diode-connected topologies of the current mirrors (CMs). The proposed conjugation method provides an internal high-impedance node, by which the transconductance of the amplifier is significantly increased. Since the central node of the conjugated CMs is virtually grounded for small differential signals, the cascode devices of the diode-connected topologies can be employed as an extra differential pair causing a further enhancement in transconductance. Moreover, the large signal behavior of the circuit shows that the conjugated CMs are capable of copying a dynamic current with a higher gain in comparison with a traditional CM amplifier. This advantage results in faster charging and discharging of the output capacitive load, which provides a larger slew rate (SR) without increasing the quiescent current. The proposed amplifier was manufactured with TSMC 0.18-$\\mu$m CMOS technology occupying a silicon area of $55.5 \\times 48.9~\\mu$m. Experimental results at a supply voltage of 1.8 V show a gain bandwidth (GBW) of 104.9 MHz, a dc gain of 79.1 dB, and an SR of 55.7 V/$\\mu$s for a capacitive load of 10 pF, while the circuit consumes 489-$\\mu$W power.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142194485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"M2M: A Fine-Grained Mapping Framework to Accelerate Multiple DNNs on a Multi-Chiplet Architecture","authors":"Jinming Zhang;Xuyan Wang;Yaoyao Ye;Dongxu Lyu;Guojie Xiong;Ningyi Xu;Yong Lian;Guanghui He","doi":"10.1109/TVLSI.2024.3438549","DOIUrl":"10.1109/TVLSI.2024.3438549","url":null,"abstract":"With the advancement of artificial intelligence, the collaboration of multiple deep neural networks (DNNs) has become crucial to existing embedded systems and cloud systems, especially for automatic driving applications as well as augmented and virtual reality (AR/VR) applications. To trade off between cost and performance, chiplet-based DNN accelerators have emerged as a promising solution for accelerating DNN workloads. However, most existing mapping methods for multiple DNNs target monolithic chips and fail to solve the problems faced by the emerging multi-chiplet architecture, such as distributed memory access, complex heterogeneous interconnect networks, and the scaling-up of computing resources. In this work, we propose M2M, a fine-grained mapping framework for accelerating multiple DNNs on a multi-chiplet architecture. It includes temporal and spatial task scheduling for reconfigurable dataflow accelerators and communication-aware task mapping in a heterogeneous interconnect network. To enhance communication efficiency and reduce the overall latency, we further propose a fine-tuned quality-of-service (QoS) policy for network-on-package (NoP) links. To the best of our knowledge, this is the first fine-grained mapping framework for multiple DNNs on a multi-chiplet architecture. We implemented the proposed fine-grained mapping framework using a genetic algorithm and a simulated annealing algorithm. Experimental results show that our work achieves 7.18%–61.09% latency reduction under vision, language, and mixed workloads when compared with the state-of-the-art related work.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142194489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DLB-CNet: Difference Learning-Based Convolution Network for Building Change Detection","authors":"Zipeng Fan;Sanqian Wang;Xueting Pu;Yuting Cong;Yuan Liu;Xiubao Sui;Qian Chen","doi":"10.1109/TVLSI.2024.3438728","DOIUrl":"10.1109/TVLSI.2024.3438728","url":null,"abstract":"Change detection (CD) in remote sensing (RS) images is a technique used to analyze and characterize surface changes from remotely sensed data at different time periods. However, current deep-learning-based methods sometimes struggle with the diversity of targets in complex RS scenarios, leading to issues such as false detections and loss of detail. To address these challenges, we propose a method called difference learning-based convolution network (DLB-CNet) for building CD (BCD). In DLB-CNet, we use a difference learning module (DLM), which extracts building change features by enhancing the feature differences between the two images and improves model robustness. Additionally, an innovative attention module called integration attention (IA) is introduced to efficiently process semantic information by jointly focusing on global representation subspaces. Our model achieves impressive results on the LEVIR-CD dataset, WHU-CD dataset, and CDD dataset, with $F1$-scores of 90.56%, 92.28%, and 94.98%, respectively, demonstrating its superiority over the state-of-the-art methods.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141938986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 20-V Pulse Driver Based on All-nMOS Charge Pump Without Reversion Loss and Overstress in 65-nm Standard CMOS Technology","authors":"Ziliang Zhou;Min Tan","doi":"10.1109/TVLSI.2024.3435974","DOIUrl":"10.1109/TVLSI.2024.3435974","url":null,"abstract":"This article proposes a high-efficiency all-nMOS bidirectional charge pump (CP) cell and constructs a CP-based high-voltage (HV) pulse driver based on it. Double-diode substrate isolation (DDSI) can extend the maximum supported voltage in a bulk CMOS process, but it requires an all-nMOS implementation of CP cells. Existing all-nMOS CPs either do not support the bidirectional charge transfer required for HV pulse drivers, or achieve it with additional penalties such as reversion charge loss and overstress on transistors. The proposed all-nMOS CP with novel gate voltage control strategies is the first one reported in the literature that can support the bidirectional charge transfer required for HV pulse drivers without suffering from reversion loss and threshold voltage loss or causing overstress on transistors. A ten-stage CP-based HV pulse driver is implemented in a 65-nm CMOS process utilizing this cell. Postlayout simulation results demonstrate that it can reliably generate 20-V HV pulses from a 2.5 V supply for a 15 pF // 200 k$\\Omega$ load at 55 kHz. The driver exhibits a peak power efficiency of 46.4% and occupies an area of 0.262 mm$^2$.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RosebudVirt: A High-Performance and Partially Reconfigurable FPGA Virtualization Framework for Multitenant Networks","authors":"Yiwei Chang;Zhichuan Guo","doi":"10.1109/tvlsi.2024.3436017","DOIUrl":"https://doi.org/10.1109/tvlsi.2024.3436017","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":null,"pages":null},"PeriodicalIF":2.8,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141939170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}