{"title":"System-level simulation-based verification of Autonomous Driving Systems with the VIVAS framework and CARLA simulator","authors":"Srajan Goyal , Alberto Griggio , Stefano Tonetta","doi":"10.1016/j.scico.2024.103253","DOIUrl":"10.1016/j.scico.2024.103253","url":null,"abstract":"<div><div>Ensuring the safety and reliability of increasingly complex Autonomous Driving Systems (ADS) poses significant challenges, particularly when these systems rely on AI components for perception and control. In the ESA-funded project VIVAS, we developed a comprehensive framework for system-level, simulation-based Verification and Validation (V&V) of autonomous systems. This framework integrates a simulation model of the system, an abstract model describing system behavior symbolically, and formal methods for scenario generation and verification of simulation executions. The automated scenario generation process is guided by diverse coverage criteria.</div><div>In this paper, we present the application of the VIVAS framework to ADS by integrating it with CARLA, a widely-used driving simulator, and its ScenarioRunner tool. This integration facilitates the creation of diverse and complex driving scenarios to validate different state-of-the-art AI-based ADS agents shared by the CARLA community through its Autonomous Driving Challenge. We detail the development of a symbolic ADS model and the formulation of a coverage criterion focused on the behaviors of vehicles surrounding the ADS. Using the VIVAS framework, we generate and execute various highway-driving scenarios, evaluating the capabilities of the AI components. 
The results demonstrate the effectiveness of VIVAS in automating scenario generation for different off-the-shelf AI solutions.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"242 ","pages":"Article 103253"},"PeriodicalIF":1.5,"publicationDate":"2024-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143167307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modelling and verifying BDI agents under uncertainty","authors":"Blair Archibald , Michele Sevegnani , Mengwei Xu","doi":"10.1016/j.scico.2024.103254","DOIUrl":"10.1016/j.scico.2024.103254","url":null,"abstract":"<div><div>Belief-Desire-Intention (BDI) agents feature uncertain beliefs (e.g. sensor noise), probabilistic action outcomes (e.g. attempting an action and failing), and non-deterministic choices (e.g. what plan to execute next). To apply such agents safely in real-world scenarios, we need to reason about them: for example, we need probabilities of mission success and the <em>strategies</em> used to maximise them. Most agents do not currently consider uncertain beliefs; instead, a belief either holds or does not. We show how to use epistemic states to model uncertain beliefs, and define a Markov Decision Process for the semantics of the Conceptual Agent Notation (<span>Can</span>) agent language, allowing support for uncertain beliefs, non-deterministic event, plan, and intention selection, and probabilistic action outcomes. The model is executable using an automated tool, <span>CAN-verify</span>, that supports error checking, agent simulation, and exhaustive exploration via an encoding to Bigraphs that produces transition systems for probabilistic model checkers such as PRISM. These model checkers allow reasoning over quantitative properties and strategy synthesis. 
Using the example of an autonomous submarine and drone surveillance together with scalability experiments, we demonstrate our approach supports uncertain belief modelling, quantitative model checking, and strategy synthesis in practice.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"242 ","pages":"Article 103254"},"PeriodicalIF":1.5,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143167308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The SGSM framework: Enabling the specification and monitor synthesis of safe driving properties through scene graphs","authors":"Trey Woodlief, Felipe Toledo, Sebastian Elbaum, Matthew B. Dwyer","doi":"10.1016/j.scico.2024.103252","DOIUrl":"10.1016/j.scico.2024.103252","url":null,"abstract":"<div><div>As autonomous vehicles (AVs) become mainstream, assuring that they operate in accordance with safe driving properties becomes paramount. The ability to specify and monitor driving properties is at the center of such assurance. Yet, the mismatch between the semantic space over which typical driving properties are asserted (e.g., vehicles, pedestrians) and the sensed inputs of AVs (e.g., images, point clouds) poses a significant assurance gap. Related efforts bypass this gap by either assuming that data at the right semantic level is available, or they develop bespoke methods for capturing such data. Our recent Scene Graph Safety Monitoring (SGSM) framework addresses this challenge by extracting scene graphs (SGs) from sensor inputs to capture the entities related to the AV, specifying driving properties using a domain-specific language that enables building propositions over those graphs and composing them through temporal logic, and synthesizing monitors to detect property violations. Through this paper we further explain, formalize, analyze, and extend the SGSM framework, producing SGSM++. 
This extension is significant in that it incorporates the ability for the framework to encode the semantics of <em>resetting</em> a property violation, enabling the framework to count the quantity and duration of violations.</div><div>We implemented SGSM++ to monitor for violations of 9 properties of 3 AVs from the CARLA Autonomous Driving Leaderboard, confirming the viability of the framework, which found that the AVs violated 71% of properties during at least one test, including almost 1400 unique violations over 30 total test executions, with violations lasting up to 9.25 minutes. Artifact available at <span><span>https://github.com/less-lab-uva/ExtendingSGSM</span></span>.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"242 ","pages":"Article 103252"},"PeriodicalIF":1.5,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143167306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing the coverage of W-based conformance testing methods over code faults","authors":"Khaled El-Fakih , Faiz Hassan , Ayman Alzaatreh , Nina Yevtushenko","doi":"10.1016/j.scico.2024.103234","DOIUrl":"10.1016/j.scico.2024.103234","url":null,"abstract":"<div><div>We present novel empirical assessments of prominent finite state machine (FSM) conformance test derivation methods against their coverage of code faults. We consider a number of realistic extended FSM examples with their related Java implementations and derive for these examples complete test suites using the <em>W</em> method and its <em>HSI</em> and <em>H</em> derivatives, considering the case when the implementation under test (IUT) has the same number of states as the specification FSM. We also consider <span><math><msup><mrow><mi>W</mi></mrow><mrow><mo>+</mo><mo>+</mo></mrow></msup></math></span>, <span><math><mi>H</mi><mi>S</mi><msup><mrow><mi>I</mi></mrow><mrow><mo>+</mo><mo>+</mo></mrow></msup></math></span>, and <span><math><msup><mrow><mi>H</mi></mrow><mrow><mo>+</mo><mo>+</mo></mrow></msup></math></span> test suites, derived considering the case when the IUT can have one extra state. For each pair of considered test suites, we determine if there is a difference between the pair in covering the implementation faults. If the difference is significant, we determine which test suite outperforms the other. We run two other assessments which show that the obtained results are not due to the size or length of the test suites. In addition, we conduct assessments to determine whether each of the methods has better coverage of certain classes of faults than others and whether the <em>W</em> method outperforms the <em>HSI</em> and <em>H</em> methods over only certain classes of faults. The results and outcomes of the conducted experiments are summarized. 
Major artifacts used in the assessments are provided as benchmarks for further studies.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"241 ","pages":"Article 103234"},"PeriodicalIF":1.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142720058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis and formal specification of OpenJDK's BitSet: Proof files","authors":"Andy S. Tatman , Hans-Dieter A. Hiep , Stijn de Gouw","doi":"10.1016/j.scico.2024.103232","DOIUrl":"10.1016/j.scico.2024.103232","url":null,"abstract":"<div><div>This artifact <span><span>[1]</span></span> (accompanying our iFM 2023 paper <span><span>[2]</span></span>) describes the software we developed that contributed towards our analysis of OpenJDK's <span>BitSet</span> class. This class represents a vector of bits that grows as needed. Our analysis exposed numerous bugs. In our paper, we proposed and compared a number of solutions supported by formal specifications. Full mechanical verification of the <span>BitSet</span> class is not yet possible due to limited support for bitwise operations in KeY and bugs in BitSet. Our artifact contains proofs for a subset of the methods and new proof rules to support bitwise operators.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"241 ","pages":"Article 103232"},"PeriodicalIF":1.5,"publicationDate":"2024-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142702235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parametric ontologies in formal software engineering","authors":"Achim D. Brucker , Idir Ait-Sadoune , Nicolas Méric , Burkhart Wolff","doi":"10.1016/j.scico.2024.103231","DOIUrl":"10.1016/j.scico.2024.103231","url":null,"abstract":"<div><div>Isabelle/DOF is an ontology framework on top of Isabelle/HOL. It allows for the formal development of ontologies and continuous conformity-checking of integrated documents, including the tracing of typed meta-data of documents. Isabelle/DOF deeply integrates into the Isabelle/HOL ecosystem, allowing users to write documents containing (informal) text, executable code, (formal and semiformal) definitions, and proofs. Users of Isabelle/DOF can either use HOL or one of the many formal methods that have been embedded into Isabelle/HOL to express formal parts of their documents.</div><div>In this paper, we extend Isabelle/DOF with annotations of λ-terms, a pervasive data-structure underlying Isabelle to syntactically represent expressions and formulas. We achieve this by using Higher-order Logic (HOL) itself for query-expressions and data-constraints (ontological invariants) executed via code-generation and reflection. Moreover, we add support for <em>parametric</em> ontological classes, thus exploiting HOL's polymorphic type system.</div><div>The benefits are: First, the HOL representation allows for flexible and efficient run-time checking of abstract properties of formal content under evolution. Second, it is possible to prove properties over generic ontological classes. 
We demonstrate these new features by a number of smaller ontologies from various domains and a case study using a substantial ontology for formal system development targeting certification according to CENELEC 50128.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"241 ","pages":"Article 103231"},"PeriodicalIF":1.5,"publicationDate":"2024-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142702236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CAN-Verify: Automated analysis for BDI agents","authors":"Mengwei Xu , Blair Archibald , Michele Sevegnani","doi":"10.1016/j.scico.2024.103233","DOIUrl":"10.1016/j.scico.2024.103233","url":null,"abstract":"<div><div>We present <span>CAN-Verify</span>, an automated tool for analysing BDI agents written in the Conceptual Agent Notation (<span>Can</span>) language. <span>CAN-Verify</span> includes support for syntactic error detection before agent execution, agent program interpretation (running agents), and model-checking of agent programs (analysing agents). The model checking supports verifying the correctness of agents against both generic agent requirements, such as if a task is accomplished, and user-defined requirements, such as certain beliefs eventually holding. The latter can be expressed in structured natural language, allowing the tool to be used by agent programmers without formal training in the underlying verification techniques.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"241 ","pages":"Article 103233"},"PeriodicalIF":1.5,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142702234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient interaction-based offline runtime verification of distributed systems with lifeline removal","authors":"Erwan Mahe , Boutheina Bannour , Christophe Gaston , Pascale Le Gall","doi":"10.1016/j.scico.2024.103230","DOIUrl":"10.1016/j.scico.2024.103230","url":null,"abstract":"<div><div>Runtime Verification (RV) refers to a family of techniques in which system executions are observed and checked against formal specifications, with the aim of identifying faults. In offline RV, observation and verification are done in two separate and successive steps. In this paper, we define an approach to offline RV of Distributed Systems (DS) against interactions. Interactions are formal models describing communications within a DS. A DS is composed of subsystems deployed on different machines and interacting via message passing to achieve common goals. Therefore, observing executions of a DS entails logging a collection of local execution traces, one for each subsystem, collected on its host machine. We call such observational artifacts <em>multi-traces</em>. A major challenge in analyzing multi-traces is that there are no practical means to synchronize the ends of observations of all the local traces. We address this via an operation called lifeline removal, which we apply on-the-fly to the specification during the verification of a multi-trace once a local trace has been entirely analyzed. This operation removes from the interaction the specification of actions occurring on the subsystem that is no longer observed. This may allow further execution of the specification by removing potential deadlocks. We prove the correctness of the resulting RV algorithm and introduce two optimization techniques, which we also prove correct. We implement a Partial Order Reduction (POR) technique by selecting a one-unambiguous action (as a unique first step to a linearization) whose existence is determined via the lifeline removal operator. 
Additionally, Local Analyses (LOC), i.e., the verification of local traces, can be leveraged during the global multi-trace analysis to prove failure more quickly. Experiments illustrate the application of our RV approach and the benefits of our optimizations.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"241 ","pages":"Article 103230"},"PeriodicalIF":1.5,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142702237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Verification of forward simulations with thread-local, step-local proof obligations","authors":"Gerhard Schellhorn, Stefan Bodenmüller, Wolfgang Reif","doi":"10.1016/j.scico.2024.103227","DOIUrl":"10.1016/j.scico.2024.103227","url":null,"abstract":"<div><div>This paper presents a proof technique for proving refinements for general state-based models of concurrent systems that reduces proving forward simulations to thread-local, step-local proof obligations. The approach has been implemented in our theorem prover KIV, which translates imperative programs to a set of transition rules and generates proof obligations accordingly. Instances of this proof technique should also be applicable to systems specified with ASM rules, B events, or Z operations. To exemplify the proof methodology, we demonstrate it with two case studies. The first verifies linearizability of a lock-free implementation of concurrent hash sets by showing that it refines an abstract concurrent system with atomic operations. The second applies the proof technique to the verification of opacity of Transactional Mutex Locks (TML), a Software Transactional Memory algorithm. Compared to the standard approach of proving a forward simulation directly, both case studies show a significant reduction in proof effort.</div></div>","PeriodicalId":49561,"journal":{"name":"Science of Computer Programming","volume":"241 ","pages":"Article 103227"},"PeriodicalIF":1.5,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142660651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}