IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems: Latest Articles

IncreMacro: Incremental Macro Placement Refinement
IF 2.7, Zone 3, Computer Science
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-01-20 DOI: 10.1109/TCAD.2025.3531776
Yuan Pu;Tinghuan Chen;Zhuolun He;Jiajun Qin;Chen Bai;Haisheng Zheng;Yibo Lin;Bei Yu
Abstract: This article proposes IncreMacro, a novel approach for macro placement refinement in integrated circuit (IC) design. The approach iteratively and incrementally optimizes the placement of macros to enhance IC layout routability and timing performance. To achieve this, IncreMacro combines several methods: kd-tree-based macro diagnosis, gradient-based macro shifting, constraint-graph-based LP for macro legalization, and diffusion-based cell migration. By applying these techniques iteratively, IncreMacro meets two critical solution requirements of macro placement: 1) pushing macros toward the chip boundary and 2) preserving the original relative positional relationship among macros. The proposed approach has been incorporated into AutoDMP and DREAMPlace 4.0, and is evaluated on seven RISC-V benchmark circuits and four TILOS macro placement circuit designs at the 7-nm technology node. Experimental results show that, compared with the macro placement solution provided by AutoDMP (DREAMPlace 4.0), our approach reduces routed wirelength by 15.1% (14.9%), improves the routed worst negative slack (WNS) and total negative slack (TNS) by 99.9% (82.6%) and 99.9% (81.3%), and reduces the total power consumption by 4.4% (4.3%). Meanwhile, compared with IncreMacro [1], our approach augmented with the cell migration algorithm improves the routed WNS and TNS by 24.7% and 23.1%, while leaving the average routed wirelength and total power consumption almost unchanged.
Vol. 44, No. 8, pp. 3222-3235. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10845818
Citations: 0
Finite Element Approach Based Numerical Framework for Device Simulator
IF 2.7, Zone 3, Computer Science
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-01-17 DOI: 10.1109/TCAD.2025.3531343
Da-Wei Wang;Qing Zhang;Hang Wan;Wen-Sheng Zhao
Abstract: In this work, a finite element method (FEM)-based numerical framework is proposed to solve the drift-diffusion equations efficiently, and it is built into a parallel-computing device simulator. In this framework, a novel upwind FEM is developed to solve the convection-dominated continuity equations. In the implementation of the upwind method, vector basis functions are employed to interpolate the edge streamline upwind (SU) current densities onto the mesh grid to obtain the spatial current density, and the scalar FEM is then used to construct the element matrix equation. The accuracy of the proposed framework is first verified by comparing the computed results for a 2-D PN junction with those obtained with COMSOL Semiconductor. Then, through several numerical cases, its advantages over FBSG-, SU Petrov Galerkin (SUPG)-, or control-volume-finite-element method SUPG-based frameworks are presented in terms of mesh grid adaptivity, computing stability, and efficiency. Finally, by combining the proposed framework with a domain decomposition scheme and a fully coupled Newton's approach, a parallel-computing device simulator is developed, including both steady-state and transient solvers. The performance of the in-house simulator is evaluated in terms of computational accuracy, capability on large-scale problems, and scalability.
Vol. 44, No. 8, pp. 3197-3207.
Citations: 0
Bridge-NDP: Efficient Communication-Computation Overlap in Near Data Processing System
IF 2.7, Zone 3, Computer Science
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-01-17 DOI: 10.1109/TCAD.2025.3531254
Liyan Chen;Pengyu Liu;Dongxu Lyu;Jianfei Jiang;Qin Wang;Zhigang Mao;Naifeng Jing
Abstract: Near data processing (NDP), enabled by near data accelerators (NDAs) within DIMM-based main memory, enhances performance by providing more aggregated bandwidth and reducing long-distance data transfers. While the performance of NDAs has received widespread attention, the overhead of host-NDA communication has been overlooked, becoming a bottleneck in NDP systems. To alleviate performance degradation from communication, we propose Bridge-NDP, the first NDP architecture that implements a workflow with efficient communication-computation overlap. Bridge-NDP is built upon the conventional NDP architecture and can be easily applied to existing NDP designs, regardless of the memory level where NDAs are attached. Specifically, we introduce a novel direct host-NDA communication method that utilizes existing memory buses as bridge buses, avoiding the need for new interconnections. It enables seamless integration with other memory accesses while achieving high bandwidth utilization with minimal hardware overhead. For the system-level workflow design, we optimize and extend existing dataflow to achieve richer computing paradigms with fewer redundant memory accesses. Additionally, we provide programming support with efficient API designs and data management to hide low-level resource details and ensure correctness guarantees. Comprehensive experiments demonstrate that Bridge-NDP achieves significant performance improvements, with speedups of $1.8\times$–$3.1\times$ and bandwidth utilization improvement of $2.0\times$–$2.9\times$ over the state-of-the-art NDP solutions.
Vol. 44, No. 8, pp. 2939-2951.
Citations: 0
HISIM: Analytical Performance Modeling and Design Space Exploration of 2.5D/3D Integration for AI Computing
IF 2.7, Zone 3, Computer Science
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-01-17 DOI: 10.1109/TCAD.2025.3531348
Zhenyu Wang;Pragnya Sudershan Nalla;Jingbo Sun;A. Alper Goksoy;Sumit K. Mandal;Jae-Sun Seo;Vidya A. Chhabria;Jeff Zhang;Chaitali Chakrabarti;Umit Y. Ogras;Yu Cao
Abstract: Monolithic designs face significant fabrication cost and data movement challenges, especially when executing complex and diverse AI models. Advanced 2.5D/3D packaging promises high bandwidth and connection density to overcome these challenges, yet it also introduces new electro-thermal constraints. This article develops a suite of analytical performance models to enable efficient benchmarking of a 2.5D/3D heterogeneous system for energy-efficient AI computing. These models encompass various performance metrics related to computing units, network-on-chip (NoC), and network-on-package (NoP). The results are summarized into a new tool, HISIM, which is $10^{4}\times$–$10^{6}\times$ faster than state-of-the-art AI benchmark tools. Furthermore, HISIM integrates rapid thermal simulation for the 2.5D/3D system, helping shed light on both the potential and limitations of 2.5D/3D heterogeneous integration (HI) on representative AI algorithms. The code of HISIM is available at https://github.com/mec-UMN/HISIM.
Vol. 44, No. 8, pp. 3208-3221.
Citations: 0
A Modeling Method of Reverse Biased Electric Field for JBS Diodes Based on Schwarz-Christoffel Transformation
IF 2.7, Zone 3, Computer Science
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-01-17 DOI: 10.1109/TCAD.2025.3531252
Yanqiu Li;Zhiqiang Wang;Kuan Wang;Jun Yuan;Guoqing Xin;Xiaojie Shi
Abstract: This article introduces a modeling method for the reverse-biased electric field of junction barrier Schottky (JBS) diodes, utilizing the Schwarz-Christoffel transformation. Building upon prior research on JBS diode electric field modeling, this approach is rooted in a purely theoretical derivation and avoids depending on conclusions from simulation software, which was a limitation of earlier modeling methods. In this study, complex boundary conditions are transformed mathematically into simpler ones so that the electric field can be solved for. The analytic solution of the electric field distribution is then obtained by using the superposition theorem. To validate this modeling method, the electric field distribution from the model is compared with results from simulation software, and a way of applying the analytic solution of the electric field distribution in commercial simulation software is given.
Vol. 44, No. 7, pp. 2832-2835.
Citations: 0
Bayesian Learning-Enhanced Embedded Memory Design With Automated Circuit Variant Generation
IF 2.7, Zone 3, Computer Science
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-01-17 DOI: 10.1109/TCAD.2025.3531337
Dongho Kim;Junseo Lee;Seokhun Kim;Jihwan Park;Sangheon Lee;Hanwool Jeong
Abstract: This article proposes a Bayesian-learning-driven automated embedded memory design methodology that aims to minimize leakage current, minimize power, and maximize performance while meeting predefined constraints. To achieve this objective effectively, we present an automatic tool that leverages a reference initial circuit design to generate a diverse set of schematic and layout options for logic-equivalent circuit variants and various transistor threshold voltage ($V_{th}$) modifications, while ensuring compliance with design rules. Subsequently, leveraging the range of circuit options generated, Bayesian optimization is employed not only to identify optimal circuit parameters but also to select the most appropriate circuit topology and individual transistor $V_{th}$ to attain the desired design objectives. TSMC 28 nm process simulation results show that the proposed methodology reduces power by 26.28%–46.44%, $T_{\mathrm{access}}$ by 25.60%–42.29%, and leakage current by 22.73%–50.11% compared to the compiler-generated design, with a runtime of 10–40 h.
Vol. 44, No. 8, pp. 3099-3111.
Citations: 0
Efficient Modeling Attack on Multiplexer PUFs via Kronecker Matrix Multiplication
IF 2.7, Zone 3, Computer Science
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-01-17 DOI: 10.1109/TCAD.2025.3531336
Hongfei Wang;Caixue Wan;Hai Jin
Abstract: The physical unclonable function (PUF) is valued for its lightweight nature and unique functionality, making it a common choice for securing hardware products requiring authentication and key generation mechanisms. In response to the susceptibility of individual PUFs to modeling attacks, advanced PUF variants have been developed to improve security measures. One notable type in this regard is the multiplexer-based composition of arbiter PUFs, known as MPUF, which aims to meet high reliability and security standards simultaneously. Current research on attacking MPUFs encounters challenges such as substantial demands for training challenge-response pairs (CRPs) and low success rates. In this work, we propose a novel numerical modeling attack strategy for MPUFs. Using Kronecker products, this method precisely describes the MPUF model without complex network architectures, boosting attack accuracy and overall efficiency. Experimental comparison with state-of-the-art works demonstrates that our method achieves better performance in terms of attack accuracy, robustness, and efficiency. Our method successfully attacks a (512, 8)-MPUF in 32.71 min with 97.14% accuracy, outperforming all existing attack methods on MPUFs. Moreover, we validate our method through experiments with hardware implementations on FPGAs. The advantages of our method also include its adaptability to other MPUF variations such as cMPUF and rMPUF, and its capability to be integrated with an existing attack method to launch efficient attacks on MPUFs that leverage reliability information.
Vol. 44, No. 8, pp. 2883-2896. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10844882
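The abstract above does not give the paper's formulation, so the following is only a minimal sketch of why Kronecker products are a natural tool here, under stated assumptions. It uses the standard additive delay model of a single arbiter PUF (a linear form over parity features of the challenge) and checks the identity (w1·φ)(w2·φ) = (w1 ⊗ w2)·(φ ⊗ φ), which lets a product of two linear PUF models be written as one linear model in a larger feature space. The helper names `kron`, `dot`, `parity_features`, and `product_as_linear_form` are hypothetical and are not taken from the paper.

```python
import random


def kron(u, v):
    """Kronecker product of two vectors given as Python lists."""
    return [a * b for a in u for b in v]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def parity_features(challenge):
    """Arbiter-PUF feature map: phi_i = prod_{j >= i} (1 - 2*c_j), plus a bias term 1."""
    feats = []
    acc = 1
    for bit in reversed(challenge):
        acc *= 1 - 2 * bit
        feats.append(acc)
    feats.reverse()
    return feats + [1]


def product_as_linear_form(w1, w2, phi):
    """Return both sides of (w1.phi)(w2.phi) = (w1 ⊗ w2).(phi ⊗ phi)."""
    lhs = dot(w1, phi) * dot(w2, phi)
    rhs = dot(kron(w1, w2), kron(phi, phi))
    return lhs, rhs


if __name__ == "__main__":
    rng = random.Random(0)
    challenge = [rng.randint(0, 1) for _ in range(8)]
    phi = parity_features(challenge)
    # Integer weights (an illustrative choice) keep the comparison exact.
    w1 = [rng.randint(-5, 5) for _ in phi]
    w2 = [rng.randint(-5, 5) for _ in phi]
    lhs, rhs = product_as_linear_form(w1, w2, phi)
    print(lhs == rhs)
```

The trade-off is dimensionality: the Kronecker feature vector has length (n+1)² for an n-bit challenge, so this sketch says nothing about how the paper keeps the attack efficient at 512 bits.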
Citations: 0
Atleus: Accelerating Transformers on the Edge Enabled by 3D Heterogeneous Manycore Architectures
IF 2.7, Zone 3, Computer Science
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-01-17 DOI: 10.1109/TCAD.2025.3531255
Pratyush Dhingra;Janardhan Rao Doppa;Partha Pratim Pande
Abstract: Transformer architectures have become the standard neural network model for various machine learning (ML) applications, including natural language processing and computer vision. However, the compute and memory requirements introduced by transformer models make them challenging to adopt for edge applications. Furthermore, fine-tuning pretrained transformers (e.g., foundation models) is a common task to enhance the model's predictive performance on specific tasks/applications. Existing transformer accelerators are oblivious to complexities introduced by fine-tuning. In this article, we propose the design of a three-dimensional (3D) heterogeneous architecture referred to as Atleus that incorporates heterogeneous computing resources specifically optimized to accelerate transformer models for the dual purposes of fine-tuning and inference. Specifically, Atleus utilizes nonvolatile memory and systolic arrays to accelerate transformer computational kernels on an integrated 3D platform. Moreover, we design a suitable NoC to achieve high performance and energy efficiency. Finally, Atleus adopts an effective quantization scheme to support model compression. Experimental results demonstrate that Atleus outperforms the existing state of the art by up to $56\times$ and $64.5\times$ in terms of performance and energy efficiency, respectively.
Vol. 44, No. 8, pp. 2842-2855.
Citations: 0
TeMACLE: A Technology Mapping-Aware Area-Efficient Standard Cell Library Extension Framework
IF 2.7, Zone 3, Computer Science
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-01-15 DOI: 10.1109/TCAD.2025.3529802
Rongliang Fu;Chao Wang;Bei Yu;Tsung-Yi Ho
Abstract: Standard cell libraries play a crucial role in modern very large-scale integration design by providing predesigned, precharacterized, and preverified building blocks to simplify the design process. However, the increasing complexity of circuits demands more specialized and optimized cells, thereby necessitating the extension of standard cell libraries. This article proposes TeMACLE, a technology mapping-aware area-efficient framework to extend the standard cell library. Aiming at the area optimization of digital circuits, TeMACLE extends the given original standard cell library through two feasible approaches: 1) the area compaction of standard cells and 2) the area-efficient facilitation of technology mapping. TeMACLE employs K-feasible cones to extract subcircuits and designs a subcircuit encoding method to divide them. Then, an SAT-based subcircuit matching algorithm is proposed to further identify all equivalent subcircuits. Finally, new standard cells are determined by a technology mapping-aware area-efficient strategy. The experimental results on the EPFL benchmark using the FreePDK45 process design kit show the effectiveness and efficiency of TeMACLE. TeMACLE is available at https://github.com/Flians/TeMACLE.
Vol. 44, No. 8, pp. 3034-3045.
Citations: 0
Error Resilient Online Reinforcement Learning Using Adaptive Statistical Checks
IF 2.7, Zone 3, Computer Science
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-01-15 DOI: 10.1109/TCAD.2025.3529820
Chandramouli Amarnath;Mohamed Mejri;Jackson Isenberg;Abhijit Chatterjee
Abstract: Online deep reinforcement learning (deep RL)-based systems are being increasingly deployed in a variety of safety-critical applications. Due to the dynamic nature of the environments they work in, onboard reinforcement learning (RL) hardware is vulnerable to soft errors from radiation, thermal effects, and electrical noise that corrupt the results of computations. Existing approaches to online error resilience in machine learning systems have relied on the availability of large training datasets to configure resilience parameters. This is not always feasible for online RL systems. Similarly, other approaches involving specialized hardware or modifications to training algorithms are difficult to implement for onboard RL applications. In contrast, we present a novel error resilience approach for online RL that leverages running statistics of neuron output values collected across the (real-time) RL training process to configure error detection thresholds (called checks) for the deep RL forward pass. Similarly, we formulate checks on the deep RL backward pass using running statistical thresholds on reduced-dimension checksums of online learning weight updates to rapidly detect and correct errors in online deep RL training. In this methodology, statistical concentration bounds leveraging running statistics are used to diagnose neuron outputs or weights as erroneous. The use of running statistics allows the checks to adapt to changes caused by continual online RL training. Erroneous neurons are set to zero (suppressed) in the forward pass. Erroneous weight updates are frozen, allowing nonerroneous weight updates to proceed and allowing online learning without rerunning training episodes. Our approach is compared against the state of the art and validated on several RL algorithms as well as a hardware validation platform.
Vol. 44, No. 8, pp. 3112-3125.
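The forward-pass check described above can be illustrated with a minimal sketch. The abstract does not say which concentration bound or which statistics the authors use, so this example assumes a Chebyshev-style k-sigma threshold over a running mean and variance maintained with Welford's algorithm. The class name `RunningCheck` and the parameters `k` and `warmup` are hypothetical, chosen only for illustration.

```python
import math


class RunningCheck:
    """Running mean/variance (Welford) with a k-sigma check on new values.

    Assumption: a Chebyshev-style bound, P(|x - mean| >= k*std) <= 1/k**2,
    stands in for the unspecified concentration bound in the abstract.
    """

    def __init__(self, k=6.0, warmup=10):
        self.k = k            # threshold width in standard deviations
        self.warmup = warmup  # samples to observe before checking
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0         # running sum of squared deviations

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def is_erroneous(self, x):
        if self.n < self.warmup:
            return False
        return abs(x - self.mean) > self.k * self.std()

    def update(self, x):
        # Welford's online update of mean and squared-deviation sum.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def filter(self, x):
        """Suppress a flagged value to zero; otherwise absorb it into the statistics."""
        if self.is_erroneous(x):
            return 0.0
        self.update(x)
        return x


if __name__ == "__main__":
    check = RunningCheck()
    for i in range(20):
        check.filter(0.9 if i % 2 == 0 else 1.1)  # nominal neuron outputs
    print(check.filter(1e6))   # a corrupted value is suppressed
    print(check.filter(1.05))  # a nominal value passes through
```

Because flagged values are never folded into the statistics, a single corrupted output cannot widen the threshold, while accepted values keep the check tracking the drift caused by continued training.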
Citations: 0