IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems: Latest Articles, Page 2

Corrections to “A Bidirectional Deep Learning Approach for Designing MEMS Sensors”

IF 2.7 · Zone 3, Computer Science

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-04-22 DOI: 10.1109/TCAD.2025.3528617

Xiong Cheng;Pengfei Zhang;Yiqi Zhou;Daying Sun;Wenhua Gu;Yutao Yue;Xiaodong Huang

Citations: 0

Fully Programmatic Automated Design Procedure of Comparators for Analog-to-Digital Converters

IF 2.9 · Zone 3, Computer Science

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-04-11 DOI: 10.1109/TCAD.2025.3560224

Veeti Lahtinen;Santeri Porrasmaa;Altti Heikkinen;Miikka Tenhunen;Jussi Ryynänen;Marko Kosunen

Citations: 0

An Efficient Placement Speedup Technique Based on Graph Signal Processing

IF 2.9 · Zone 3, Computer Science

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-04-01 DOI: 10.1109/TCAD.2025.3556968

Yiting Liu;Hai Zhou;Jia Wang;Fan Yang;Xuan Zeng;Li Shang

Abstract: Placement is a critical task with high computational complexity in VLSI physical design. Modern analytical placers formulate the placement objective as a nonlinear optimization task, which suffers from long iteration times. To accelerate and enhance the placement process, recent studies have turned to deep learning-based approaches, particularly leveraging graph convolution networks (GCNs). However, learning-based placers require time- and data-consuming model training due to the complexity of circuit placement, which involves large-scale cells and design-specific graph statistics. This article proposes GiFt, a parameter-free initialization technique for accelerating placement, rooted in graph signal processing. GiFt excels at capturing multiresolution smooth signals of circuit graphs to generate optimized initial placement solutions without the need for time-consuming model training, while significantly reducing the number of iterations required by analytical placers. Moreover, we present GiFtPlus, an enhanced version of GiFt, which is more efficient in handling large-scale circuit placement and can accommodate location constraints. Experimental results on public benchmarks show that GiFt and GiFtPlus significantly improve placement efficiency, while achieving competitive or superior performance compared to state-of-the-art placers. In particular, the recently proposed GPU-accelerated analytical placer DREAMPlace uses up to 50% more total runtime than GiFtPlus-DREAMPlace.

Vol. 44, No. 10, pp. 3924-3937
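The abstract above describes generating an initial placement from smooth signals on the circuit graph. The sketch below is not the GiFt algorithm itself; it only illustrates the general idea of low-pass filtering cell coordinates over a netlist graph, under stated assumptions. The function name `smooth_positions`, the dictionary-based graph encoding, and the choice of a simple self-plus-neighbour averaging filter are all hypothetical and introduced here for illustration.

```python
def smooth_positions(adjacency, positions, fixed, rounds=50):
    """Repeatedly average each movable cell with its neighbours.

    adjacency: dict mapping node -> list of neighbouring nodes (hypothetical encoding)
    positions: dict mapping node -> (x, y) starting coordinates
    fixed:     set of nodes (e.g. I/O pads) whose coordinates never change
    rounds:    number of filtering passes; more passes give a smoother signal
    """
    pos = dict(positions)
    for _ in range(rounds):
        nxt = {}
        for node, nbrs in adjacency.items():
            if node in fixed or not nbrs:
                # Fixed or isolated cells keep their location.
                nxt[node] = pos[node]
                continue
            # One pass of a row-normalised (I + A) filter: a simple low-pass
            # graph filter that pulls connected cells towards each other.
            xs = [pos[node][0]] + [pos[n][0] for n in nbrs]
            ys = [pos[node][1]] + [pos[n][1] for n in nbrs]
            nxt[node] = (sum(xs) / len(xs), sum(ys) / len(ys))
        pos = nxt
    return pos


# Usage: a four-cell chain with both end cells fixed as pads.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
start = {"a": (0.0, 0.0), "b": (10.0, 10.0), "c": (10.0, 10.0), "d": (3.0, 0.0)}
initial = smooth_positions(chain, start, fixed={"a", "d"}, rounds=200)
print(initial)
```

In this toy chain the movable cells settle evenly between the two pads, which is the kind of connectivity-aware starting point an analytical placer could then refine. A real flow would work on a weighted graph derived from hypergraph nets and would have to handle overlap, which this sketch ignores.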

Citations: 0

An r-DFA-Based Layout Pattern Match Method Supporting Fuzzy Matching

IF 2.9 · Zone 3, Computer Science

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-04-01 DOI: 10.1109/TCAD.2025.3556969

Qianxi Chen;Yujiao Deng;Qiang Wu;Zhixiong Di

Abstract: As chip manufacturing approaches physical limits, the probability of defects due to specific chip layout structures has significantly increased. These defect-prone structures are known as lithographic hotspots. Pattern matching is widely used in hotspot detection algorithms due to its efficiency and accuracy. However, traditional pattern matching algorithms face major challenges in both solution efficiency and flexibility for fuzzy matching. To overcome these limitations, an integer range-based deterministic finite automaton (r-DFA)-based layout pattern matching method supporting parallelization and fuzzy matching is proposed. Manhattan polygons in the layout are represented as multiple path strings, thereby transforming the pattern matching problem into a string regular expression search problem. To simplify the construction of large integer range elements in fuzzy matching, the r-DFA is employed, enhancing construction efficiency and enabling the algorithm to achieve linear time complexity. Moreover, this approach focuses most of the matching process within each individual layout polygon, enabling parallelized matching across diverse layout polygons. Compared to the state of the art, our approach supports fuzzy matching and shows an average efficiency improvement of 1.23 times.

Vol. 44, No. 10, pp. 3938-3947
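The abstract above turns polygon matching into a string search in which each pattern element may accept a range of integer lengths. The sketch below illustrates only that encoding idea with a naive matcher; it is not the r-DFA construction and does not achieve the linear time complexity the abstract reports. The names `polygon_to_path` and `fuzzy_match`, the N/E/S/W direction letters, and the `(direction, lo, hi)` pattern format are hypothetical choices made for this example.

```python
def polygon_to_path(vertices):
    """Encode a Manhattan polygon as a list of (direction, length) edges.

    vertices: list of (x, y) corners in traversal order; the polygon is
    closed implicitly from the last vertex back to the first.
    """
    path = []
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        if x0 == x1 and y0 != y1:
            path.append(("N" if y1 > y0 else "S", abs(y1 - y0)))
        elif y0 == y1 and x0 != x1:
            path.append(("E" if x1 > x0 else "W", abs(x1 - x0)))
        else:
            raise ValueError("edge is not axis-aligned")
    return path


def fuzzy_match(path, pattern):
    """Return every start index where the pattern matches the cyclic path.

    pattern: list of (direction, lo, hi); an edge matches when its direction
    is equal and its length lies in the inclusive integer range [lo, hi].
    This is a naive O(len(path) * len(pattern)) scan, not an automaton.
    """
    n, m = len(path), len(pattern)
    if m > n:
        return []
    hits = []
    for start in range(n):
        if all(
            path[(start + k) % n][0] == d and lo <= path[(start + k) % n][1] <= hi
            for k, (d, lo, hi) in enumerate(pattern)
        ):
            hits.append(start)
    return hits


# Usage: an L-shaped polygon and a "step" pattern with tolerant lengths.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 5), (0, 5)]
edges = polygon_to_path(l_shape)
step = [("W", 1, 3), ("N", 2, 4), ("W", 1, 3)]
print(edges)
print(fuzzy_match(edges, step))  # → [2]
```

Because each polygon is matched independently of the others, a loop over polygons like this one could be distributed across workers, which mirrors the per-polygon parallelism mentioned in the abstract.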

Citations: 0

MATCH: Model-Aware TVM-Based Compilation for Heterogeneous Edge Devices

IF 2.9 · Zone 3, Computer Science

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-04-01 DOI: 10.1109/TCAD.2025.3556967

Mohamed Amine Hamdi;Francesco Daghero;Giuseppe Maria Sarda;Josse Van Delm;Arne Symons;Luca Benini;Marian Verhelst;Daniele Jahier Pagliari;Alessio Burrello

Abstract: Streamlining the deployment of Deep Neural Networks (DNNs) on heterogeneous edge platforms, which couple instruction processors and hardware accelerators for tensor computations within the same micro-controller unit (MCU), is becoming one of the crucial challenges of the TinyML field. The best-performing DNN compilation toolchains are usually deeply customized for a single MCU family, and porting them to a different one implies labor-intensive redevelopment of almost the entire compiler. On the opposite side, retargetable toolchains, such as TVM, fail to exploit the capabilities of custom accelerators, producing general but unoptimized code. To overcome this duality, we introduce MATCH, a novel TVM-based DNN deployment framework designed for easy, agile retargeting across different MCU processors and accelerators, thanks to a customizable model-based hardware abstraction. We show that a general and retargetable mapping framework can compete with, and even outperform, custom toolchains on diverse targets while only needing the definition of an abstract hardware cost model and a SoC-specific API. We tested MATCH on two state-of-the-art heterogeneous MCUs, GAP9 and DIANA. On the four DNN models of the MLPerf Tiny suite, MATCH reduces inference latency on average by $60.87\times$ on DIANA, compared to using plain TVM, thanks to the exploitation of the on-board HW accelerator. Compared to HTVM, a fully customized toolchain for DIANA, we still reduce the latency by 16.94%. On GAP9, using the same benchmarks, we improve the latency by $2.15\times$ compared to the dedicated DORY compiler, thanks to our heterogeneous DNN mapping approach that synergistically exploits the DNN accelerator and the eight-core cluster available on board.

Vol. 44, No. 10, pp. 3844-3857

Citations: 0

Placement Refinement Strategies for Security Closure

IF 2.9 · Zone 3, Computer Science

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-03-28 DOI: 10.1109/TCAD.2025.3555964

Marcelo Danigno;Mateus Fogaça;Rafael Schvittz;Paulo Butzen

Citations: 0

POFGSP: Priority-Based Out-of-Order Scheduling and Fine-Grain Status Polling for SSD Performance Improvement

IF 2.9 · Zone 3, Computer Science

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-03-28 DOI: 10.1109/TCAD.2025.3555507

Wentian Wu;Qianhui Li;Tong Qu;Qi Wang;Zongliang Huo;Tianchun Ye

Abstract: With the development of flash technology, the increasing throughput gap between NAND flash memory (NFM) arrays and the I/O interface has become a performance bottleneck for NFM-based solid-state drives (SSDs). Multilevel parallelism techniques have been employed on modern SSDs to meet the challenge of increasing demands for bandwidth in I/O-intensive workloads. However, conventional parallel methods only monitor the status of ways, resulting in “idle bubbles”: dies that have finished sit idle and cannot execute subsequent operations until all the dies in the way complete command execution. This issue limits the resource utilization and performance of SSDs. To minimize the idle bubble, we propose priority-based out-of-order scheduling and fine-grain status polling (POFGSP). The priority-based out-of-order scheduling relaxes constraints on command execution order and schedules commands with the same execution time to be executed in parallel. Therefore, the scheduler reduces the idle bubbles caused by differences in command execution times. Moreover, the fine-grain status polling approach polls the die-level status during the interface’s idle time, reducing idle bubbles with accurate status. Compared to state-of-the-art schedulers, our POFGSP approach can reduce request response time by 35.6% under real-world cloud block storage workloads and improve the SSD system’s maximum bandwidth by 8.7%–74.9%.

Vol. 44, No. 10, pp. 3724-3737
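The “idle bubble” described in the abstract above can be made concrete with a small timing model. The sketch below is an illustrative simplification, not the POFGSP scheduler: it assumes each die has its own command queue, ignores channel transfer time and polling cost, and uses made-up durations (50 for a read, 700 for a program, in arbitrary time units). The function names are hypothetical.

```python
def way_level_makespan(die_queues):
    """Completion time when only way-level status is visible.

    Each round issues one command per die, and the next round cannot start
    until the slowest die in the way has finished, so faster dies sit idle.
    """
    rounds = max(len(q) for q in die_queues)
    total = 0
    for r in range(rounds):
        total += max(q[r] for q in die_queues if r < len(q))
    return total


def die_level_makespan(die_queues):
    """Completion time when each die's status is known individually.

    A die starts its next command as soon as its previous one completes.
    """
    return max(sum(q) for q in die_queues)


# Two dies in one way; durations are assumed values, not measured data.
mixed = [[50, 700], [700, 50]]       # each round pairs a fast and a slow command
grouped = [[50, 700], [50, 700]]     # commands with equal duration run together

print(way_level_makespan(mixed))     # → 1400
print(die_level_makespan(mixed))     # → 750
print(way_level_makespan(grouped))   # → 750
```

In this toy model the 650-unit gap between the first two results is the idle bubble. The third result shows why issuing commands of equal execution time together helps even under way-level status, and the second shows what die-level status recovers when the queue order is left unchanged.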

Citations: 0

Intermittent-Friendly Neural Architecture Search: Demystifying Accuracy and Overhead Tradeoffs

IF 2.9 · Zone 3, Computer Science

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-03-28 DOI: 10.1109/TCAD.2025.3555963

Hashan Roshantha Mendis;Chih-Hsuan Yen;Chih-Kai Kang;Pi-Cheng Hsiu

Abstract: The fusion of tiny energy harvesting devices with deep neural networks (DNNs) optimized for intermittent execution is vital for sustainable intelligent applications at the edge. However, current intermittent-aware neural architecture search (NAS) frameworks overlook the inherent intermittency management overhead (IMO) of DNNs, leading to under-performance upon deployment. Moreover, we observe that straightforward IMO minimization within NAS may degrade solution accuracy. This work explores the relationship between DNN architectural characteristics, IMO, and accuracy, uncovering the varying sensitivity toward IMO across different DNN characteristics. Inspired by our insights, we present two guidelines for leveraging IMO sensitivity in NAS. First, the overall architecture search space can be reduced to exclude parameters with low IMO sensitivity, and second, network blocks with high IMO sensitivity can be prioritized during the search, facilitating the discovery of highly accurate networks with low IMO. We incorporate these guidelines into TiNAS, which integrates cutting-edge tiny NAS and intermittent-aware NAS frameworks. Evaluations are conducted across various datasets and latency requirements, as well as deployment experiments on a Texas Instruments device under different intermittent power profiles. Compared to two variants, one minimizing IMO and the other disregarding IMO, TiNAS achieves up to 38% higher accuracy and 33% lower IMO, respectively, with greater improvements for larger datasets. Its deployed solutions also achieve up to a 1.33 times inference speedup, especially under fluctuating power conditions.

Vol. 44, No. 10, pp. 3990-4003

Citations: 0

OpenLS-DGF: An Adaptive Open-Source Dataset Generation Framework for Machine-Learning Tasks in Logic Synthesis

IF 2.9 · Zone 3, Computer Science

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-03-27 DOI: 10.1109/TCAD.2025.3555506

Liwei Ni;Rui Wang;Miao Liu;Xingyu Meng;Xiaoze Lin;Junfeng Liu;Guojie Luo;Zhufei Chu;Weikang Qian;Xiaoyan Yang;Biwei Xie;Xingquan Li;Huawei Li

Abstract: This article introduces OpenLS-DGF, an adaptive logic synthesis dataset generation framework, to enhance machine-learning (ML) applications within the logic synthesis process. Previous dataset generation flows were tailored for specific tasks or lacked integrated ML capabilities. In contrast, OpenLS-DGF supports various ML tasks by encapsulating the three fundamental steps of logic synthesis: 1) Boolean representation; 2) logic optimization; and 3) technology mapping. It preserves the original information in both Verilog and ML-friendly GraphML formats. The Verilog files offer semi-customizable capabilities, enabling researchers to insert additional steps and incrementally refine the generated dataset. Furthermore, OpenLS-DGF includes an adaptive circuit engine that facilitates the final dataset management and downstream tasks. The generated OpenLS-D-v1 dataset comprises 46 combinational designs from established benchmarks, totaling over 966,000 Boolean circuits. OpenLS-D-v1 supports integrating new data features, making it more versatile for new tasks. This article demonstrates the versatility of OpenLS-D-v1 through four distinct downstream tasks: circuit classification, circuit ranking, quality of results (QoR) prediction, and probability prediction. Each task is chosen to represent essential steps of logic synthesis, and the experimental results show that the dataset generated by OpenLS-DGF achieves prominent diversity and applicability. The source code and datasets are available at https://github.com/Logic-Factory/ACE/blob/master/OpenLS-DGF.

Vol. 44, No. 10, pp. 3830-3843

Citations: 0

Formal Synthesis of Neural Barrier Certificates for Dynamical Systems via DC Programming

IF 2.9 · Zone 3, Computer Science

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Pub Date : 2025-03-27 DOI: 10.1109/TCAD.2025.3555513

Yang Wang;Hanlong Chen;Wang Lin;Zuohua Ding

Citations: 0