IEEE Journal on Emerging and Selected Topics in Circuits and Systems: Latest Articles

IEEE Journal on Emerging and Selected Topics in Circuits and Systems
IF 3.7 · Q2 · Engineering & Technology
Pub Date: 2025-06-25 · DOI: 10.1109/JETCAS.2025.3573432 · Vol. 15, No. 2, p. C3
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11050017
Citations: 0
IEEE Journal on Emerging and Selected Topics in Circuits and Systems Publication Information
IF 3.7 · Q2 · Engineering & Technology
Pub Date: 2025-06-25 · DOI: 10.1109/JETCAS.2025.3573428 · Vol. 15, No. 2, p. C2
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11050009
Citations: 0
IEEE Journal on Emerging and Selected Topics in Circuits and Systems Information for Authors
IF 3.7 · Q2 · Engineering & Technology
Pub Date: 2025-06-25 · DOI: 10.1109/JETCAS.2025.3573430 · Vol. 15, No. 2, p. 361
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11050010
Citations: 0
Guest Editorial Generative Artificial Intelligence Compute: Algorithms, Implementations, and Applications to CAS
IF 3.7 · Q2 · Engineering & Technology
Authors: Chuan Zhang; Naigang Wang; Jongsun Park; Li Zhang
Pub Date: 2025-06-25 · DOI: 10.1109/JETCAS.2025.3572258 · Vol. 15, No. 2, pp. 144-148
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11050018
Citations: 0
Generative AI Through CAS Lens: An Integrated Overview of Algorithmic Optimizations, Architectural Advances, and Automated Designs
IF 3.7 · Q2 · Engineering & Technology
Authors: Chuan Zhang; You You; Naigang Wang; Jongsun Park; Li Zhang
Pub Date: 2025-06-06 · DOI: 10.1109/JETCAS.2025.3575272 · Vol. 15, No. 2, pp. 149-185
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11024158
Abstract: Generative artificial intelligence (GenAI) has emerged as a pivotal focus in global innovation agendas, revealing transformative potential that extends beyond technological applications to reshape diverse societal domains. Because GenAI deployment depends fundamentally on circuits and systems (CAS), a co-evolutionary approach integrating both technological paradigms becomes imperative. This synergistic framework confronts three interrelated challenges, each a critical innovation vector: 1) developing deployment-ready GenAI algorithms, 2) engineering implementation-efficient CAS architectures, and 3) leveraging GenAI for autonomous CAS design. Given the rapid advancement of GenAI-CAS technologies, a comprehensive synthesis has become an urgent priority across academia and industry. This timely review therefore systematically analyzes current advancements, provides integrative perspectives, and identifies emerging research trajectories. It is intended to serve both the AI and CAS communities, catalyzing an innovation feedback loop in which GenAI-optimized CAS architectures in turn accelerate GenAI evolution through algorithm-hardware co-empowerment.
Citations: 0
RDDM: A Rate-Distortion Guided Diffusion Model for Learned Image Compression Enhancement
IF 3.7 · Q2 · Engineering & Technology
Authors: Sanxin Jiang; Jiro Katto; Heming Sun
Pub Date: 2025-04-22 · DOI: 10.1109/JETCAS.2025.3563228 · Vol. 15, No. 2, pp. 186-199
Abstract: Denoising diffusion probabilistic models (DDPMs) have achieved significant success in various image generation tasks, but their application to image compression, especially learned image compression (LIC), remains limited. In this study, we introduce a rate-distortion (RD) guided diffusion model, referred to as RDDM, to enhance the performance of LIC. In RDDM, an LIC model is treated as a lossy codec function constrained by RD, dividing the input image into two parts through its encoding and decoding operations: the reconstructed image and the residual image. The construction of RDDM rests on two ideas. First, RDDM treats diffusion models as repositories of image structures and textures, built from extensive real-world datasets; under the guidance of RD constraints, it extracts the structural and textural priors needed to restore the input image. Second, RDDM employs a Bayesian network to progressively infer the input image from the reconstructed image and its codec function. Our research also reveals that RDDM's performance declines when its codec function does not match the reconstructed image, and that using the highest-bitrate codec function minimizes this drop; the resulting model is referred to as RDDM*. Experimental results indicate that both RDDM and RDDM* can be applied to LICs of various architectures, such as CNN, Transformer, and their hybrids, significantly improving the fidelity of these codecs while maintaining or even enhancing perceptual quality.
Citations: 0
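To make the codec-plus-diffusion decomposition above concrete, here is a minimal Python sketch. The `codec_encode`, `codec_decode`, and `denoiser` callables are hypothetical stand-ins, and the noise schedule is a generic DDPM ancestral-sampling loop, not the paper's RD-guided procedure.

```python
import numpy as np

def split_by_codec(x, codec_encode, codec_decode):
    """Split an input image into (reconstruction, residual) via a lossy codec,
    following the decomposition described in the abstract: x = x_rec + x_res."""
    bitstream = codec_encode(x)       # rate-constrained representation
    x_rec = codec_decode(bitstream)   # distorted reconstruction
    return x_rec, x - x_rec           # residual the diffusion model must restore

def ddpm_enhance(x_rec, denoiser, num_steps=50, beta_max=0.02, rng=None):
    """Illustrative DDPM reverse loop conditioned on the codec reconstruction.
    denoiser(x_t, x_rec, t) is a hypothetical noise-prediction network; the
    paper's RD guidance and Bayesian inference are not reproduced here."""
    rng = rng or np.random.default_rng()
    betas = np.linspace(1e-4, beta_max, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x_t = rng.standard_normal(x_rec.shape)          # start from pure noise
    for t in reversed(range(num_steps)):
        eps_hat = denoiser(x_t, x_rec, t)           # noise estimate guided by x_rec
        mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x_rec.shape) if t > 0 else 0.0
        x_t = mean + np.sqrt(betas[t]) * noise      # ancestral sampling step
    return x_t                                       # enhanced estimate of the input image
```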
A Flexible Template for Edge Generative AI With High-Accuracy Accelerated Softmax and GELU
IF 3.7 · Q2 · Engineering & Technology
Authors: Andrea Belano; Yvan Tortorella; Angelo Garofalo; Luca Benini; Davide Rossi; Francesco Conti
Pub Date: 2025-04-21 · DOI: 10.1109/JETCAS.2025.3562734 · Vol. 15, No. 2, pp. 200-216
Abstract: Transformer-based generative Artificial Intelligence (GenAI) models achieve remarkable results in a wide range of fields, including natural language processing, computer vision, and audio processing. However, this comes at the cost of increased complexity and the need for sophisticated non-linearities such as softmax and GELU. Even though Transformers are computationally dominated by matrix multiplications (MatMul), these non-linearities can become a performance bottleneck, especially when dedicated hardware is used to accelerate the MatMul operators. In this work, we introduce a GenAI BFloat16 Transformer acceleration template based on a heterogeneous tightly-coupled cluster containing 256 KiB of shared SRAM, 8 general-purpose RISC-V cores, a 24×8 systolic-array MatMul accelerator, and SoftEx, a novel accelerator for the Transformer softmax, GELU, and SiLU non-linearities. SoftEx introduces an approximate exponentiation algorithm balancing efficiency (121× speedup over glibc's implementation) with accuracy (mean relative error of 0.14%). In 12 nm technology, SoftEx occupies 0.039 mm², only 3.22% of the cluster, which achieves an operating frequency of 1.12 GHz. Compared to optimized software running on the RISC-V cores, SoftEx accelerates softmax and GELU computations by up to 10.8× and 5.11×, respectively, while reducing their energy consumption by up to 10.8× and 5.29×. These enhancements translate into a 1.58× increase in throughput (310 GOPS at 0.8 V) and a 1.42× improvement in energy efficiency (1.34 TOPS/W at 0.55 V) on end-to-end ViT inference workloads.
Citations: 0
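The abstract does not specify SoftEx's exponentiation algorithm. As a rough illustration of the accuracy-for-speed trade such units make, here is a Schraudolph-style bit-manipulation exp inside a stabilized softmax; this is a standard, different approximation shown only for context, and its error (a few percent) is far worse than the 0.14% the paper reports.

```python
import numpy as np

def fast_exp(x):
    """Schraudolph-style exp approximation for float32: writes x/ln(2) into the
    IEEE-754 exponent field so the resulting bit pattern is roughly 2^(x/ln 2).
    Illustrative only; this is not the SoftEx algorithm."""
    x = np.asarray(x, dtype=np.float32)
    x = np.maximum(x, np.float32(-87.0))        # avoid underflow below float32 range
    a = np.float32(2.0 ** 23 / np.log(2.0))     # scale into the exponent field
    b = np.float32(127.0 * 2.0 ** 23)           # exponent bias shifted by the 23 mantissa bits
    return (a * x + b).astype(np.int32).view(np.float32)

def approx_softmax(logits):
    """Numerically stabilized softmax built on the approximate exp."""
    z = logits - np.max(logits, axis=-1, keepdims=True)   # shift so all inputs are <= 0
    e = fast_exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

# Quick check against the exact softmax on random logits.
v = np.random.randn(8).astype(np.float32)
exact = np.exp(v - v.max()); exact /= exact.sum()
print(np.max(np.abs(approx_softmax(v) - exact)))
```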
Adaptive Two-Range Quantization and Hardware Co-Design for Large Language Model Acceleration
IF 3.7 · Q2 · Engineering & Technology
Authors: Siqi Cai; Gang Wang; Wenjie Li; Dongxu Lyu; Guanghui He
Pub Date: 2025-04-21 · DOI: 10.1109/JETCAS.2025.3562937 · Vol. 15, No. 2, pp. 272-284
Abstract: Large language models (LLMs) impose high computational and memory demands. While prior studies have leveraged quantization to reduce memory requirements, critical challenges persist: unaligned memory accesses, significant quantization errors when handling outliers that span larger quantization ranges, and the increased hardware overhead of processing high-bit-width outliers. To address these issues, we propose a quantization algorithm and hardware architecture co-design for efficient LLM acceleration. Algorithmically, a grouped adaptive two-range quantization (ATRQ) scheme with an in-group embedded identifier encodes outliers and normal values in distinct ranges, achieving hardware-friendly aligned memory access and reducing quantization errors. From a hardware perspective, we develop a low-overhead ATRQ decoder and an outlier-bit-split processing element (PE) that reduce the hardware overhead associated with high-bit-width outliers, effectively exploiting their inherent sparsity. To support mixed-precision computation and accommodate the diverse dataflows of the prefilling and decoding phases, we design a reconfigurable local accumulator that mitigates the overhead of additional adders. Experimental results show that the ATRQ-based accelerator outperforms existing solutions, achieving up to 2.48× speedup and 2.01× energy reduction in the LLM prefilling phase, and 1.87× speedup and 2.03× energy reduction in the decoding phase, with superior model performance under post-training quantization.
Citations: 0
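To illustrate the general two-range idea described above (normal values and outliers quantized against separate ranges within a group), here is a minimal Python sketch. The percentile split, bit widths, and flag encoding are assumptions for illustration, not the paper's ATRQ format or its in-group identifier layout.

```python
import numpy as np

def two_range_quantize(group, inner_bits=4, outer_bits=8, split_pct=95.0):
    """Quantize one weight group with two ranges: a fine 'inner' range for normal
    values and a coarse 'outer' range for outliers, with a per-element flag marking
    which range was used (illustrative encoding, not the ATRQ bitstream)."""
    group = np.asarray(group, dtype=np.float32)
    threshold = np.percentile(np.abs(group), split_pct)       # normal/outlier split point
    is_outlier = np.abs(group) > threshold

    inner_scale = threshold / (2 ** (inner_bits - 1) - 1) + 1e-12
    outer_scale = np.max(np.abs(group)) / (2 ** (outer_bits - 1) - 1) + 1e-12

    q = np.where(is_outlier,
                 np.round(group / outer_scale),
                 np.round(group / inner_scale)).astype(np.int32)
    return q, is_outlier, inner_scale, outer_scale

def two_range_dequantize(q, is_outlier, inner_scale, outer_scale):
    """Reconstruct the group: each element uses the scale of the range its flag selects."""
    return np.where(is_outlier, q * outer_scale, q * inner_scale).astype(np.float32)

# Example: a few large outliers no longer destroy resolution for the bulk of the group.
w = np.concatenate([np.random.randn(62) * 0.05, [2.5, -3.1]]).astype(np.float32)
q, flags, s_in, s_out = two_range_quantize(w)
err = np.abs(w - two_range_dequantize(q, flags, s_in, s_out)).max()
print(f"outliers: {int(flags.sum())}, max abs error: {err:.4f}")
```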
An Overview of Neural Rendering Accelerators: Challenges, Trends, and Future Directions
IF 3.7 · Q2 · Engineering & Technology
Authors: Junha Ryu; Hoi-Jun Yoo
Pub Date: 2025-04-17 · DOI: 10.1109/JETCAS.2025.3561777 · Vol. 15, No. 2, pp. 299-311
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10967345
Abstract: Rapid advancements in neural rendering have revolutionized augmented reality (AR) and virtual reality (VR) by enabling photorealistic 3D modeling and rendering. However, deploying neural rendering on edge devices presents significant challenges due to computational complexity, memory inefficiencies, and energy constraints. This paper provides a comprehensive overview of neural rendering accelerators, identifying the major hardware inefficiencies across the sampling, positional encoding, and multi-layer perceptron (MLP) stages. We explore hardware-software co-optimization techniques that address these challenges and summarize them for in-depth analysis. Additionally, emerging trends such as 3D Gaussian Splatting (3DGS) and hybrid rendering approaches are briefly introduced, highlighting their potential to improve rendering quality and efficiency. By presenting a unified analysis of challenges, solutions, and future directions, this work aims to guide the development of next-generation neural rendering accelerators, especially for resource-constrained environments.
Citations: 0
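As a concrete example of the positional-encoding stage the survey discusses, here is the standard NeRF-style frequency encoding in Python. This is one common formulation of that stage, not a reproduction of any accelerator-specific optimization (lookup tables, reduced precision, etc.) analyzed in the paper.

```python
import numpy as np

def positional_encoding(x, num_freqs=10, include_input=True):
    """Standard NeRF-style frequency encoding of coordinates.

    x: (..., D) array of sample coordinates.
    Returns (..., D * 2 * num_freqs [+ D]) features fed to the MLP stage."""
    x = np.asarray(x, dtype=np.float32)
    freqs = 2.0 ** np.arange(num_freqs, dtype=np.float32) * np.pi  # 2^k * pi
    angles = x[..., None] * freqs                 # (..., D, num_freqs)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    enc = enc.reshape(*x.shape[:-1], -1)          # flatten per-coordinate features
    return np.concatenate([x, enc], axis=-1) if include_input else enc

# Example: encode a batch of 3-D sample points along camera rays.
pts = np.random.rand(4, 3).astype(np.float32)
print(positional_encoding(pts).shape)             # (4, 3 + 3*2*10) = (4, 63)
```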
LightRot: A Light-Weighted Rotation Scheme and Architecture for Accurate Low-Bit Large Language Model Inference
IF 3.7 · Q2 · Engineering & Technology
Authors: Sangjin Kim; Yuseon Choi; Jungjun Oh; Byeongcheol Kim; Hoi-Jun Yoo
Pub Date: 2025-04-08 · DOI: 10.1109/JETCAS.2025.3558300 · Vol. 15, No. 2, pp. 231-243
Abstract: As large language models (LLMs) continue to demonstrate exceptional capabilities across various domains, achieving energy-efficient and accurate inference becomes increasingly critical. This work presents LightRot, a lightweight rotation scheme and dedicated hardware accelerator designed for low-bit LLM inference. The proposed architecture integrates Grouped Local Rotation (GLR) and Outlier Direction Aligning (ODA) algorithms with a hierarchical Fast Hadamard Transform (FHT)-based rotation unit to address key challenges in low-bit quantization, including the energy overhead of rotation operations. The accelerator, implemented in a 28 nm CMOS process, achieves a peak energy efficiency of 27.4 TOPS/W for 4-bit inference, surpassing prior state-of-the-art designs. Unlike conventional approaches that rely on higher-precision inference or evaluate only basic language-modeling tasks such as GPT-2, LightRot is optimized for advanced models such as LLaMA2-13B and LLaMA3-8B. Its performance is further validated on MT-Bench, demonstrating robust applicability to real-world conversational scenarios and setting new benchmarks for chat-based AI systems. By combining algorithmic innovations with hardware efficiency, this work establishes a new paradigm for scalable, low-bit LLM inference, paving the way for sustainable AI advancements.
Citations: 0
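The rotation unit above is built on the Fast Hadamard Transform, which is standard and worth sketching: rotating a vector with an orthonormal Hadamard matrix spreads outliers across all elements before uniform quantization. The sketch below shows only that generic building block; the paper's GLR grouping, ODA alignment, and hierarchical hardware mapping are not modeled, and the quantizer parameters are illustrative.

```python
import numpy as np

def fht(x):
    """Orthonormal Fast Hadamard Transform (length must be a power of two),
    computed with O(n log n) butterflies. Applying it twice returns the input."""
    x = np.array(x, dtype=np.float32, copy=True)
    n = x.shape[-1]
    assert n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[..., i:i + h].copy()
            b = x[..., i + h:i + 2 * h]
            x[..., i:i + h] = a + b
            x[..., i + h:i + 2 * h] = a - b
        h *= 2
    return x / np.sqrt(n)                  # orthonormal scaling: fht(fht(v)) == v

def rotate_then_quantize(v, bits=4):
    """Rotate activations with the Hadamard transform to spread outliers,
    then apply symmetric uniform quantization (illustrative settings)."""
    r = fht(v)
    scale = np.max(np.abs(r)) / (2 ** (bits - 1) - 1) + 1e-12
    q = np.clip(np.round(r / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q.astype(np.int8), scale

# Example: a vector with one large outlier quantizes more accurately after rotation.
v = np.zeros(64, dtype=np.float32); v[0] = 8.0; v[1:] = np.random.randn(63) * 0.1
q, s = rotate_then_quantize(v)
v_hat = fht(q.astype(np.float32) * s)       # inverse = same transform (orthonormal)
print(np.abs(v - v_hat).max())
```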