{"title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems publication information","authors":"","doi":"10.1109/TCAD.2025.3547450","DOIUrl":"https://doi.org/10.1109/TCAD.2025.3547450","url":null,"abstract":"","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 4","pages":"C3-C3"},"PeriodicalIF":2.7,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10934977","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems publication information","authors":"","doi":"10.1109/TCAD.2025.3566794","DOIUrl":"https://doi.org/10.1109/TCAD.2025.3566794","url":null,"abstract":"","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 6","pages":"C3-C3"},"PeriodicalIF":2.7,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11007761","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144100074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting Assumptions Ordering in CAR-Based Model Checking","authors":"Yibo Dong;Yu Chen;Jianwen Li;Geguang Pu;Ofer Strichman","doi":"10.1109/TCAD.2025.3551658","DOIUrl":"https://doi.org/10.1109/TCAD.2025.3551658","url":null,"abstract":"Model checking is an automatic formal verification technique that is widely used in hardware verification. The state-of-the-art complete model-checking techniques, based on IC3/PDR and its general variant CAR, symbolically compute under- and over-approximating state sets (called “frames”) with multiple calls to a SAT solver. The performance of those techniques is sensitive to the order of the assumptions with which the SAT solver is invoked, because the order affects the unsatisfiable cores that the solver emits when the formula is unsatisfiable, and these cores crucially affect the search process. This observation was previously published (Dureja et al., 2020), where two partial assumption ordering strategies, intersection and rotation, were suggested (partial in the sense that they determine the order of only a subset of the literals). In this article we extend and improve these strategies based on an analysis of the reason for their effectiveness. We prove that intersection is effective because of what we call locality of the cores, and our improved strategy is based on this observation. We conclude our paper with an extensive empirical evaluation of the various ordering techniques. 
One of our strategies, Hybrid-CAR, which switches between strategies at runtime, not only outperforms other, fixed ordering strategies, but also outperforms other state-of-the-art bug-finding algorithms, such as ABC-BMC.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 10","pages":"4032-4037"},"PeriodicalIF":2.9,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145100331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CMD: A Cache-Assisted GPU Memory Deduplication Architecture","authors":"Wei Zhao;Dan Feng;Wei Tong;Xueliang Wei;Bing Wu","doi":"10.1109/TCAD.2025.3552674","DOIUrl":"https://doi.org/10.1109/TCAD.2025.3552674","url":null,"abstract":"Massive off-chip accesses in graphics processing units (GPUs) are the main performance bottleneck. We find that many writes are duplicate, and the duplication can be <monospace>inter-dup</monospace> or <monospace>intra-dup</monospace>: <monospace>inter-dup</monospace> means different memory blocks are identical, while <monospace>intra-dup</monospace> means all the 4B elements in a line are the same. In this work, we propose a cache-assisted GPU memory deduplication architecture (CMD) that reduces off-chip accesses by exploiting the data duplication in GPU applications. CMD includes three key design contributions which aim to reduce the three kinds of accesses: 1) a novel GPU memory deduplication architecture that removes the <monospace>intra-dup</monospace> and <monospace>inter-dup</monospace> lines. We design several techniques to manage duplicate blocks, reducing massive off-chip writes; 2) we propose a cache-assisted read scheme to reduce the reads to duplicate data. When an L2 cache miss targets a duplicate block, if the reference block has been fetched to L2 and it is clean, we can copy it to the L2 missed block without accessing off-chip DRAM. As for the reads to <monospace>intra-dup</monospace> data, CMD uses the on-chip metadata cache to get the data; and 3) when a cache line is evicted, the clean sectors in the line are invalidated while the dirty sectors are written back. However, most read-only victims are rereferenced from DRAM more than twice. Therefore, we add a fully associative FIFO to accommodate the read-only (and therefore clean) victims to reduce the rereference counts. 
Experiments show that CMD can decrease the off-chip accesses by 31.01%, reduce the energy by 32.78% and improve performance by 42.53%. Besides, CMD can improve the performance of memory-intensive workloads by 57.56%.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 10","pages":"3752-3763"},"PeriodicalIF":2.9,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145090255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ICMarks: A Robust Watermarking Framework for Integrated Circuit Physical Design IP Protection","authors":"Ruisi Zhang;Rachel Selina Rajarathnam;David Z. Pan;Farinaz Koushanfar","doi":"10.1109/TCAD.2025.3552503","DOIUrl":"https://doi.org/10.1109/TCAD.2025.3552503","url":null,"abstract":"Physical design watermarking (WM) on contemporary integrated circuit (IC) layout encodes signatures without considering the dense connections and design constraints, which could lead to performance degradation on the watermarked products. This article presents <monospace>ICMarks</monospace>, a quality-preserving and robust WM framework for modern IC physical design. <monospace>ICMarks</monospace> embeds unique watermark signatures during the physical design’s placement stage, thereby authenticating the IC layout ownership. <monospace>ICMarks</monospace>’s novelty lies in 1) strategically identifying a region of cells to watermark with minimal impact on the layout performance and 2) a two-level WM framework for augmented robustness toward potential removal and forging attacks. Extensive evaluations on benchmarks of different design objectives and sizes validate that <monospace>ICMarks</monospace> incurs no wirelength and timing metrics degradation, while successfully proving ownership. 
Furthermore, we demonstrate <monospace>ICMarks</monospace> is robust against two major WM attack categories, namely, watermark removal and forging attacks; even if the adversaries have prior knowledge of the WM schemes, the signatures cannot be removed without significantly undermining the layout quality.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 10","pages":"3910-3923"},"PeriodicalIF":2.9,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145089960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of FONOS FDSOI FET for Synapse-Inspired Artificial Neural Network","authors":"Rameez Raja Shaik;K. P. Pradhan","doi":"10.1109/TCAD.2025.3552664","DOIUrl":"https://doi.org/10.1109/TCAD.2025.3552664","url":null,"abstract":"In this article, a well-calibrated FDSOI ferroelectric (Fe)-FET/FeFET is gate-stacked with a charge trap nitride (CTN) for investigating memory and synaptic applications. The CTN is sandwiched between the Fe and the silicon channel of the FeFET to obtain a ferroelectric/oxide/nitride/oxide/semiconductor (FONOS) FET for improved memory and synaptic applications. This approach of using both Fe and CTN layers to improve memory and synaptic applications is labeled the hybrid approach. Either the Fe or the CTN, as a standalone gate-stack dielectric, can give a threshold-voltage (<inline-formula> <tex-math>$V_{T}$ </tex-math></inline-formula>) shift, leading to hysteresis transfer characteristics with program (PGM)/erase (ERS) schemes that predict a memory window (MW). The FeFET uses spontaneous polarization (<inline-formula> <tex-math>$P_{S}$ </tex-math></inline-formula>) to obtain memory operation, which offers low power and area. The CTN uses a highly mature and reliable charge-trapping-memory (CTM)-based e<inline-formula> <tex-math>${}^{-} $ </tex-math></inline-formula>/h+ trapping <inline-formula> <tex-math>$\\Leftrightarrow $ </tex-math></inline-formula> de-trapping mechanism with a PGM/ERS scheme to obtain an MW that has high-<inline-formula> <tex-math>$V_{T}$ </tex-math></inline-formula> (HVT) and low-<inline-formula> <tex-math>$V_{T}$ </tex-math></inline-formula> (LVT) transfer characteristics. Hence, a combination of these leads to a FONOS FET architecture, which is investigated for improved memory properties like MW and retention, followed by an examination of synaptic attributes for realistic device training accuracy toward artificial neural networks (ANNs). 
From the investigations: 1) the FONOS memory device has predicted improvement in MW with 2.65 V for an 8-nm thick CTN layer and retention of Fe-electric field (<inline-formula> <tex-math>$E_{\\mathrm { Fe}}$ </tex-math></inline-formula>) and charge density for <inline-formula> <tex-math>$\\approx 11$ </tex-math></inline-formula> years and 2) the synaptic device has shown an accuracy of 93% for the MNIST digit dataset in ANNs. These predictions from the FONOS FET are aimed at providing hybrid solutions to next-generation devices with extended use in memory and neuromorphic applications with reliable device operation.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 10","pages":"3882-3889"},"PeriodicalIF":2.9,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145090054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ultralow Power Zirconium Electrolyte-Based Synaptic Device for Neuromorphic Computing","authors":"Milind Kumar;Reetwik Bhadra;Amitesh Kumar","doi":"10.1109/TCAD.2025.3551654","DOIUrl":"https://doi.org/10.1109/TCAD.2025.3551654","url":null,"abstract":"This work presents a novel electrolyte-based artificial synaptic device mimicking mammalian synaptic behavior because of an Electric Double Layer (EDL) for various neural-based applications. ZrO2, as an Electrolyte, with Indium-Gallium-Zinc-Oxide (IGZO) as a channel material, has been utilised for Synaptic Device (ZEISD). A novel electrolyte, ZrO2, with intercalation property, produces synaptic behaviors similar to biological neurons. The device shows sound synaptic functions, such as excitatory postsynaptic current (EPSC), inhibitory postsynaptic current (IPSC), and paired-pulse facilitation (PPF). With different spike amplitudes and frequencies, the device transitions from short-term potentiation (STP) to long-term potentiation (LTP) and depression characteristics, an essential neural behavior. FERMI, PRINT, and SRH modelling have been utilised to obtain neural behaviors. A higher On/Off ratio <inline-formula> <tex-math>$\\approx ~10^{5}$ </tex-math></inline-formula> has been obtained in line with designed and calibrated synaptic devices with ultralow power consumption as low as 0.162 pJ. 
Our findings hold significant implications for advancing the field of artificial neuromorphic devices, allowing for the creation of adaptable, dynamic functionalities.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 10","pages":"3890-3895"},"PeriodicalIF":2.9,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145090250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Catch the Star: Weight Recovery Attack Using Side-Channel Star Map Against DNN Accelerator","authors":"Le Wu;Liji Wu;Xiangmin Zhang","doi":"10.1109/TCAD.2025.3551652","DOIUrl":"https://doi.org/10.1109/TCAD.2025.3551652","url":null,"abstract":"The rapid development of Artificial Intelligence (AI) technology must be connected to the arithmetic support of high-performance hardware. However, when the deep neural network (DNN) accelerator performs inference tasks at the edge end, the sensitive data of DNN will generate leakage through side-channel information. The adversary can recover the model structure and weight parameters of DNN by using the side-channel information, which seriously affects the protection of necessary intellectual property (IP) of DNN, so the hardware security of the DNN accelerator is critical. In the current research of Side-channel attack (SCA) for matrix multiplication units, such as systolic arrays, the linear multiplication operation leads to a more extensive weights search space for the SCA, and extracting all the weight parameters requires higher attack conditions. This article proposes a new power SCA method, which includes a Collision-Correlation Power Analysis (Collision-CPA) and Correlation-based Weight Search Algorithm (C-WSA) to address the problem. The Collision-CPA reduces the attack conditions for the SCA by building multiple Hamming Distance (HD)-based power leakage models for the systolic array. Meanwhile, the C-WSA dramatically reduces the weights search space. In addition, the concept of a Side-channel star map (SCSM) is proposed for the first time in this article, and the adversary can quickly and accurately locate the correct weight information in the SCSM. Through experiments, we recover all the weight parameters of a <inline-formula> <tex-math>$3\\times 3$ </tex-math></inline-formula> systolic array based on 100000 power traces, in which the weight search space is reduced by up to 97.7%. 
For the DNN accelerator at the edge, especially the systolic array structure, our proposed novel SCA aligns more with practical attack scenarios, with lower attack conditions, and higher attack efficiency.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 10","pages":"3697-3709"},"PeriodicalIF":2.9,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145090256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Optimizing Inter- and Intra-Chiplet Interconnection Topologies for Robust Multi-Chiplet Systems","authors":"Xiaohang Wang;Miao Xu;Amit Kumar Singh;Yingtao Jiang;Mei Yang","doi":"10.1109/TCAD.2025.3550432","DOIUrl":"https://doi.org/10.1109/TCAD.2025.3550432","url":null,"abstract":"Inter- and intra-chiplet interconnection networks play a vital role in the operation of many core systems made of multiple chiplets. However, these networks are susceptible to faults caused by manufacturing defects and attacks resulting from the malicious insertion of hardware Trojans and backdoors. Unlike conventional fault-tolerant or countermeasure methods, this article focuses on optimizing network robustness to withstand both faults and attacks, while considering the constraints of chiplet area and power budget. To achieve this, this article first defines network robustness as a quantifiable measure based on various network parameters, after which an optimization problem is formulated to optimize the robustness of the network topology. To efficiently solve this problem, a reinforcement learning algorithm is proposed. Experimental results demonstrate that the proposed method is capable of generating inter- and intra-chiplet interconnection networks that are significantly more robust than existing topology generation methods. Specifically, the proposed method improves robustness over ButterDonut and Kite, respectively, by an average of 10.88% and 14.06% under random faults and by 9.37% and 7.81% under targeted attacks. These experimental results confirm that the proposed method is capable of generating robust inter- and intra-chiplet interconnection networks that can withstand both faults and attacks. 
By optimizing the network topology’s robustness, it provides a valuable contribution to the design and security of chiplet-based core systems.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 10","pages":"3976-3989"},"PeriodicalIF":2.9,"publicationDate":"2025-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145100330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Improved Two-Step Attack on Lattice-Based Cryptography: A Case Study of Kyber","authors":"Kai Wang;Dejun Xu;Jing Tian","doi":"10.1109/TCAD.2025.3550443","DOIUrl":"https://doi.org/10.1109/TCAD.2025.3550443","url":null,"abstract":"After three rounds of post-quantum cryptography (PQC) strict evaluations conducted by NIST, CRYSTALS-Kyber was successfully selected in July 2022 and standardized in August 2024. It becomes urgent to further evaluate Kyber’s physical security for the upcoming deployment phase. In this brief, we present an improved two-step attack on Kyber to quickly recover the full secret key, s, by using much fewer power traces and less time. In the first step, we use the correlation power analysis (CPA) to obtain a portion of guess values of s with a small number of power traces. The CPA is enhanced by utilizing both Pearson and Kendall’s rank correlation coefficients and modifying the leakage model to improve the accuracy. In the second step, we adopt the lattice attack to recover s based on the results of CPA. The success rate is largely built up by constructing a trial-and-error method. 
We deploy the reference implementations of Kyber-512, -768, and -1024 on an ARM Cortex-M4 target board and successfully recover s in approximately <inline-formula> <tex-math>$9\\sim 10$ </tex-math></inline-formula> min with at most 15 power traces, using a Xeon Gold 6342-equipped machine for the attack.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"44 9","pages":"3643-3647"},"PeriodicalIF":2.9,"publicationDate":"2025-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144887808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}