{"title":"Efficient Discovery of Actual Causality Using Abstraction Refinement","authors":"Arshia Rafieioskouei;Borzoo Bonakdarpour","doi":"10.1109/TCAD.2024.3448299","DOIUrl":"https://doi.org/10.1109/TCAD.2024.3448299","url":null,"abstract":"Causality is the relationship where one event contributes to the production of another, with the cause being partly responsible for the effect and the effect partly dependent on the cause. In this article, we propose a novel and effective method to formally reason about the causal effect of events in engineered systems, with application to finding the root cause of safety violations in embedded and cyber-physical systems. We are motivated by the notion of actual causality by Halpern and Pearl, which focuses on the causal effect of particular events rather than type-level causality, which attempts to make general statements about scientific and natural phenomena. Our first contribution is formulating discovery of actual causality in computing systems modeled by transition systems as a satisfiability modulo theories (SMT) solving problem. Since datasets for causality analysis tend to be large, in order to tackle the scalability problem of automated formal reasoning, our second contribution is a novel technique based on abstraction refinement that allows identifying actual causes within smaller abstract causal models. 
We demonstrate the effectiveness of our approach (by several orders of magnitude) using three case studies to find the actual cause of violations of safety in 1) a neural network controller for a mountain car; 2) a controller for a Lunar Lander obtained by reinforcement learning; and 3) an MPC controller for an F-16 autopilot simulator.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4274-4285"},"PeriodicalIF":2.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142636466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hyper Parametric Timed CTL","authors":"Masaki Waga;Étienne André","doi":"10.1109/TCAD.2024.3443704","DOIUrl":"https://doi.org/10.1109/TCAD.2024.3443704","url":null,"abstract":"Hyperproperties enable simultaneous reasoning about multiple execution traces of a system and are useful to reason about noninterference, opacity, robustness, fairness, observational determinism, etc. We introduce hyper parametric timed computation tree logic (HyperPTCTL), extending hyperlogics with timing reasoning and, notably, parameters to express unknown values. We mainly consider its nest-free fragment, where the temporal operators cannot be nested. However, we allow extensions that enable counting actions and comparing the duration since the most recent occurrence of specific actions. We show that our nest-free fragment with this extension is sufficiently expressive to encode properties such as opacity, (un)fairness, or robust observational (non)determinism. We propose semi-algorithms for the model checking and synthesis of parametric timed automata (TAs) (an extension of TAs with timing parameters) against this nest-free fragment with the extension via reduction to the PTCTL model checking and synthesis. While the general model checking (and thus synthesis) problem is undecidable, we show that a large part of our extended (yet nest-free) fragment is decidable, provided the parameters only appear in the property, not in the model. We also exhibit additional decidable fragments where the parameters within the model are allowed. We implemented our semi-algorithms on top of the IMITATOR model checker and performed experiments. Our implementation supports most of the nest-free fragments (beyond the decidable classes). 
The experimental results highlight our method’s practical relevance.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4286-4297"},"PeriodicalIF":2.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142636468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interval Image Abstraction for Verification of Camera-Based Autonomous Systems","authors":"P. Habeeb;Deepak D’Souza;Kamal Lodaya;Pavithra Prabhakar","doi":"10.1109/TCAD.2024.3448306","DOIUrl":"https://doi.org/10.1109/TCAD.2024.3448306","url":null,"abstract":"We propose an abstraction-refinement-based algorithm for the problem of verifying the safety of a camera-based autonomous system in a synthetic 3D-scene, based on the notion of interval images. An interval image is an abstract data structure that represents a set of images in a 3D-scene. We give a computer graphics style rendering algorithm to efficiently compute interval images from a given region. Our proposed abstraction-refinement algorithm leverages recent abstract interpretation tools for neural networks. We have implemented and evaluated the proposed technique on complex 3D-scenes, demonstrating its effectiveness and scalability in comparison with earlier techniques.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4310-4321"},"PeriodicalIF":2.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142636559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical Reachability Analysis of Stochastic Cyber-Physical Systems Under Distribution Shift","authors":"Navid Hashemi;Lars Lindemann;Jyotirmoy V. Deshmukh","doi":"10.1109/TCAD.2024.3438072","DOIUrl":"https://doi.org/10.1109/TCAD.2024.3438072","url":null,"abstract":"Reachability analysis is a popular method to give safety guarantees for stochastic cyber-physical systems (SCPSs) that takes in a symbolic description of the system dynamics and uses set-propagation methods to compute an overapproximation of the set of reachable states over a bounded time horizon. In this article, we investigate the problem of performing reachability analysis for an SCPS that does not have a symbolic description of the dynamics, but instead is described using a digital twin model that can be simulated to generate system trajectories. An important challenge is that the simulator implicitly models a probability distribution over the set of trajectories of the SCPS; however, it is typical to have a sim2real gap, i.e., the actual distribution of the trajectories in a deployment setting may be shifted from the distribution assumed by the simulator. We thus propose a statistical reachability analysis technique that, given a user-provided threshold $1-\\epsilon$, provides a set that guarantees that any trajectory during deployment lies in this set with probability not smaller than this threshold. Our method is based on three main steps: 1) learning a deterministic surrogate model from sampled trajectories; 2) conducting reachability analysis over the surrogate model; and 3) employing robust conformal inference (CI) using an additional set of sampled trajectories to quantify the surrogate model’s distribution shift with respect to the deployed SCPS. 
To counter conservatism in reachable sets, we propose a novel method to train surrogate models that minimizes a quantile loss term (instead of the usual mean squared loss), and a new method that provides tighter guarantees by applying CI to a normalized surrogate error. We demonstrate the effectiveness of our technique on various case studies.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4250-4261"},"PeriodicalIF":2.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142636374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximate Conformance Checking for Closed-Loop Systems With Neural Network Controllers","authors":"P. Habeeb;Lipsy Gupta;Pavithra Prabhakar","doi":"10.1109/TCAD.2024.3445813","DOIUrl":"https://doi.org/10.1109/TCAD.2024.3445813","url":null,"abstract":"In this article, we consider the problem of checking approximate conformance of closed-loop systems with the same plant but different neural network (NN) controllers. First, we introduce a notion of approximate conformance on NNs, which allows us to quantify semantically the deviations in closed-loop system behaviors with different NN controllers. Next, we consider the problem of computationally checking this notion of approximate conformance on two NNs. We reduce this problem to that of reachability analysis on a combined NN, thereby, enabling the use of existing NN verification tools for conformance checking. Our experimental results on an autonomous rocket landing system demonstrate the feasibility of checking approximate conformance on different NNs trained for the same dynamics, as well as the practical semantic closeness exhibited by the corresponding closed-loop systems.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4322-4333"},"PeriodicalIF":2.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142636560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CaBaFL: Asynchronous Federated Learning via Hierarchical Cache and Feature Balance","authors":"Zeke Xia;Ming Hu;Dengke Yan;Xiaofei Xie;Tianlin Li;Anran Li;Junlong Zhou;Mingsong Chen","doi":"10.1109/TCAD.2024.3446881","DOIUrl":"https://doi.org/10.1109/TCAD.2024.3446881","url":null,"abstract":"Federated learning (FL) as a promising distributed machine learning paradigm has been widely adopted in Artificial Intelligence of Things (AIoT) applications. However, the efficiency and inference capability of FL are seriously limited due to the presence of stragglers and data imbalance across massive AIoT devices, respectively. To address the above challenges, we present a novel asynchronous FL approach named CaBaFL, which includes a hierarchical cache-based aggregation mechanism and a feature balance-guided device selection strategy. CaBaFL maintains multiple intermediate models simultaneously for local training. The hierarchical cache-based aggregation mechanism enables each intermediate model to be trained on multiple devices to align the training time and mitigate the straggler issue. Specifically, each intermediate model is stored in a low-level cache for local training, and once it has been trained by sufficiently many local devices, it is stored in a high-level cache for aggregation. To address the problem of imbalanced data, the feature balance-guided device selection strategy in CaBaFL adopts the activation distribution as a metric, which enables each intermediate model to be trained across devices with totally balanced data distributions before aggregation. 
Experimental results show that compared to the state-of-the-art FL methods, CaBaFL achieves up to 9.26X training acceleration and 19.71% accuracy improvements.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4057-4068"},"PeriodicalIF":2.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142594996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Backdoor Attacks on Safe Reinforcement Learning-Enabled Cyber–Physical Systems","authors":"Shixiong Jiang;Mengyu Liu;Fanxin Kong","doi":"10.1109/TCAD.2024.3447468","DOIUrl":"https://doi.org/10.1109/TCAD.2024.3447468","url":null,"abstract":"Safe reinforcement learning (RL) aims to derive a control policy that navigates a safety-critical system while avoiding unsafe explorations and adhering to safety constraints. While safe RL has been extensively studied, its vulnerabilities during the policy training have barely been explored in an adversarial setting. This article bridges this gap and investigates the training time vulnerability of formal language-guided safe RL. Such vulnerability allows a malicious adversary to inject backdoor behavior into the learned control policy. First, we formally define backdoor attacks for safe RL and divide them into active and passive ones depending on whether to manipulate the observation. Second, we propose two novel algorithms to synthesize the two kinds of attacks, respectively. Both algorithms generate backdoor behaviors that may go unnoticed after deployment but can be triggered when specific states are reached, leading to safety violations. Finally, we conduct both theoretical analysis and extensive experiments to show the effectiveness and stealthiness of our methods.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4093-4104"},"PeriodicalIF":2.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142595095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bank on Compute-Near-Memory: Design Space Exploration of Processing-Near-Bank Architectures","authors":"Rafael Medina;Giovanni Ansaloni;Marina Zapater;Alexandre Levisse;Saeideh Alinezhad Chamazcoti;Timon Evenblij;Dwaipayan Biswas;Francky Catthoor;David Atienza","doi":"10.1109/TCAD.2024.3442989","DOIUrl":"https://doi.org/10.1109/TCAD.2024.3442989","url":null,"abstract":"Near-DRAM computing strategies advocate for providing computational capabilities close to where data is stored. Although this paradigm can effectively address the memory-to-processor communication bottleneck, it also presents new challenges: The strict resource constraints in the memory periphery demand careful tailoring of architectural elements. We herein propose a novel framework and methodology to explore compute-near-memory designs that interface to DRAM memory banks, demonstrating the area, energy, and performance tradeoffs subject to the architectural configuration. We exemplify this methodology by conducting two studies on compute-near-bank designs: 1) analyzing the interaction between control and data resources, and 2) exploring the integration of processing units with different DRAM standards. According to our study, the optimal size ratios between instruction and data capacity vary from $2\\times$ to $4\\times$ across benchmarks from representative application domains. The retrieved Pareto-optimal solutions from our framework improve state-of-the-art designs, e.g., achieving a 50% performance increase on matrix operations with 15% energy overhead relative to the FIMDRAM design. In addition, the exploration of DRAM shows the interplay between available internal bandwidth, performance, and area overhead. 
For example, a threefold increase in bandwidth raises performance by 47% across workloads at a 34% extra area cost.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4117-4129"},"PeriodicalIF":2.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142595037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FlexFL: Heterogeneous Federated Learning via APoZ-Guided Flexible Pruning in Uncertain Scenarios","authors":"Zekai Chen;Chentao Jia;Ming Hu;Xiaofei Xie;Anran Li;Mingsong Chen","doi":"10.1109/TCAD.2024.3444695","DOIUrl":"https://doi.org/10.1109/TCAD.2024.3444695","url":null,"abstract":"Along with the increasing popularity of deep learning (DL) techniques, more and more Artificial Intelligence of Things (AIoT) systems are adopting federated learning (FL) to enable privacy-aware collaborative learning among the AIoT devices. However, due to the inherent data and device heterogeneity issues, the existing FL-based AIoT systems suffer from the model selection problem. Although various heterogeneous FL methods have been investigated to enable collaborative training among the heterogeneous models, there is still a lack of 1) wise heterogeneous model generation methods for the devices; 2) consideration of uncertain factors; and 3) performance guarantee for the large models, thus strongly limiting the overall FL performance. To address the above issues, this article introduces a novel heterogeneous FL framework named FlexFL. By adopting our average percentage of zeros (APoZ)-guided flexible pruning strategy, FlexFL can effectively derive best-fit models for the heterogeneous devices to explore their greatest potential. Meanwhile, our proposed adaptive local pruning strategy allows the AIoT devices to prune their received models according to their varying resources within uncertain scenarios. Moreover, based on the self-knowledge distillation, FlexFL can enhance the inference performance of the large models by learning the knowledge from the small models. Comprehensive experimental results show that, compared to the state-of-the-art heterogeneous FL methods, FlexFL can significantly improve the overall inference accuracy by up to 14.24%. 
Our code can be found at https://github.com/mastlab-T3S/FlexFL.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4069-4080"},"PeriodicalIF":2.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142595055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deeploy: Enabling Energy-Efficient Deployment of Small Language Models on Heterogeneous Microcontrollers","authors":"Moritz Scherer;Luka Macan;Victor J. B. Jung;Philip Wiese;Luca Bompani;Alessio Burrello;Francesco Conti;Luca Benini","doi":"10.1109/TCAD.2024.3443718","DOIUrl":"https://doi.org/10.1109/TCAD.2024.3443718","url":null,"abstract":"With the rise of embodied foundation models (EFMs), most notably small language models (SLMs), adapting Transformers for the edge applications has become a very active field of research. However, achieving the end-to-end deployment of SLMs on the microcontroller (MCU)-class chips without high-bandwidth off-chip main memory access is still an open challenge. In this article, we demonstrate high efficiency end-to-end SLM deployment on a multicore RISC-V (RV32) MCU augmented with ML instruction extensions and a hardware neural processing unit (NPU). To automate the exploration of the constrained, multidimensional memory versus computation tradeoffs involved in the aggressive SLM deployment on the heterogeneous (multicore+NPU) resources, we introduce Deeploy, a novel deep neural network (DNN) compiler, which generates highly optimized C code requiring minimal runtime support. We demonstrate that Deeploy generates the end-to-end code for executing SLMs, fully exploiting the RV32 cores’ instruction extensions and the NPU. 
We achieve leading-edge energy and throughput of $490\\;\\mu$J per token, at 340 tokens per second for an SLM trained on the TinyStories dataset, running for the first time on an MCU-class device without the external memory.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4009-4020"},"PeriodicalIF":2.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142595056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}