{"title":"RATE: A model‐based testing approach that combines model refinement and test execution","authors":"A. Bombarda, S. Bonfanti, A. Gargantini, Yu Lei, Feng Duan","doi":"10.1002/stvr.1835","DOIUrl":"https://doi.org/10.1002/stvr.1835","url":null,"abstract":"In this paper, we present an approach to conformance testing based on abstract state machines (ASMs) that combines model refinement and test execution (RATE) and its application to three case studies. The RATE approach consists in generating test sequences from ASMs and checking the conformance between code and models in multiple iterations. The process follows these steps: (1) model the system as an abstract state machine; (2) validate and verify the model; (3) generate test sequences automatically from the ASM model; (4) execute the tests over the implementation and compute the code coverage; (5) if the coverage is below the desired threshold, then refine the abstract state machine model to add the uncovered functionalities and return to step 2. We have applied the proposed approach in three case studies: a traffic light control system (TLCS), the IEEE 11073‐20601 personal health device (PHD) protocol, and the mechanical ventilator Milano (MVM). By applying RATE, at each refinement level, we have increased code coverage and identified some faults or conformance errors for all the case studies. 
The fault detection capability of RATE has also been confirmed by mutation analysis, which highlighted that many mutants can be killed even by the most abstract models.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"49 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86789770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
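The iterative refine-until-coverage process described in the RATE abstract can be sketched as follows. This is a minimal illustration with hypothetical helper names (`generate_tests`, `measure_coverage`, `refine` are stand-ins, not the paper's tooling), using a toy coverage figure in place of real test execution:

```python
# Minimal sketch of the RATE loop (hypothetical helper names; the real
# approach generates test sequences from ASM models with dedicated tools).

def generate_tests(model):
    # Stand-in for step (3): derive test sequences from the ASM model.
    return [f"seq_{model}_{i}" for i in range(3)]

def measure_coverage(model, tests):
    # Stand-in for step (4): execute tests and compute code coverage.
    # Toy figure: each refinement level covers more of the code.
    return min(1.0, 0.5 + 0.2 * model)

def refine(model):
    # Stand-in for step (5): add uncovered functionality to the model.
    return model + 1

def rate(model, threshold=0.9, max_refinements=5):
    for _ in range(max_refinements):
        coverage = measure_coverage(model, generate_tests(model))
        if coverage >= threshold:
            break
        model = refine(model)  # refine, then return to validation (step 2)
    return model, coverage
```

With the toy coverage function, the loop refines the model twice before reaching the 90% threshold, mirroring the multiple refinement iterations the abstract describes.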
{"title":"Uncertainty quantification for deep neural networks: An empirical comparison and usage guidelines","authors":"Michael Weiss, P. Tonella","doi":"10.1002/stvr.1840","DOIUrl":"https://doi.org/10.1002/stvr.1840","url":null,"abstract":"Deep neural networks (DNN) are increasingly used as components of larger software systems that need to process complex data, such as images, written text, and audio/video signals. DNN predictions cannot be assumed to be always correct, for several reasons: the huge input space being handled, the ambiguity of some input data, and the intrinsic properties of learning algorithms, which can provide only statistical guarantees. Hence, developers have to cope with some residual error probability. An architectural pattern commonly adopted to manage failure-prone components is the supervisor, an additional component that can estimate the reliability of the predictions made by untrusted (e.g., DNN) components and can activate an automated healing procedure when these are likely to fail, ensuring that the deep learning‐based system (DLS) does not cause damage even while its main functionality is suspended.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"91 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85874069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
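A common and simple uncertainty proxy in this line of work is the maximum softmax probability; a supervisor can compare it against a threshold to decide whether to trust a prediction. The sketch below is illustrative only (the threshold, function names, and "healing" action are assumptions, not the paper's method):

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def supervise(logits, threshold=0.8):
    """Supervisor sketch: trust the DNN prediction only when its maximum
    softmax probability exceeds a confidence threshold; otherwise signal
    that a (hypothetical) healing procedure should take over."""
    probs = softmax(logits)
    confidence = max(probs)
    if confidence >= threshold:
        return ("accept", probs.index(confidence))
    return ("heal", None)  # e.g., hand over to a human or a safe fallback
```

A highly peaked logit vector is accepted, while a flat one (maximum probability near 1/3 for three classes) triggers the fallback path.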
{"title":"Fuzz testing for digital TV receivers and multitasking control software verification","authors":"Yves Le Traon, Tao Xie","doi":"10.1002/stvr.1836","DOIUrl":"https://doi.org/10.1002/stvr.1836","url":null,"abstract":"This issue contains two very different papers, in terms of subjects and proposed test and verification techniques. The first paper focuses on testing the robustness of digital TV (DTV) receivers through (non)compliance fuzz testing. The second one focuses on a model-based approach to enable the verification of multitasking control software, proposing an OS-in-the-Loop (OiL) verification framework. The first paper, ‘A fuzzing-based test-creation approach for evaluating digital TV receivers via transport streams’ by Fabricio Izumi, Eddie B. de Lima Filho, Lucas C. Cordeiro, Orlewilson Maia, Rômulo Fabrício, Bruno Farias and Aguinaldo Silva, concerns the generation of noncompliance tests using grammar-based guided fuzzing. The originality of this contribution resides in the nature of the test subjects, which are DTV receivers, their (mis)configurations and transport streams. The originality extends to conformance testing by targeting robustness improvements: Instead of checking whether it behaves as expected, the goal is to verify the DTV receiver response against inaccurate or inconsistent data, based on fuzzing input generation. Finally, the approach is supported by a complete evaluation framework, which includes a testing environment, audio and video verification algorithms and a strategy for test creation (recommended by Paul Strooper, Rob Hierons and Yves Le Traon). The second paper, ‘OS-in-the-Loop verification for multi-tasking control software’ by Yunja Choi, presents an original approach to perform verification for embedded control software, specifically an OiL verification framework. 
This framework is based on a model of embedded operating systems, enabling the composition of the OS model with the device controllers through an algorithm described in the paper. Multitasking is thus handled by this composition mechanism. The framework makes it possible to apply various verification methods for multitasking (random simulation, dynamic concolic testing and model checking). The application of OiL verification to a small case study illustrates the benefit of the framework, which has also been successfully applied to two typical pieces of multitasking embedded software from industry (recommended by Benoit Baudry, Rob Hierons and Yves Le Traon). We hope you will find these papers interesting and inspiring for your future work.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"26 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84592095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OS‐in‐the‐Loop verification for multi‐tasking control software","authors":"Yunja Choi","doi":"10.1002/stvr.1834","DOIUrl":"https://doi.org/10.1002/stvr.1834","url":null,"abstract":"Embedded control software that controls safety‐critical IoT devices requires systematic and comprehensive verification to ensure safe operation of the device. However, rigorous verification in this domain has not been feasible due to the high complexity of embedded control software, which is characterized by the frequent use of multi‐tasking, interrupts, and periodic alarms. Realizing that two major factors, scalability and exactness, are extremely difficult to achieve at the same time but critical for effective and efficient verification in this domain, this work introduces a domain‐specific compositional OS‐in‐the‐Loop (OiL) verification approach and sets out to push the boundary in achieving both factors. The suggested approach (1) models the behavior of the underlying operating system to limit the search space using the notion of controlled concurrency, (2) performs heterogeneous composition of controllers with the formal OS model to reduce verification complexity, and (3) utilizes state‐of‐the‐art verification techniques for the purpose of comprehensive verification up to a given search depth.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"127 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89887924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
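The idea of "controlled concurrency" from the OiL abstract, where an OS model constrains which task runs next rather than exploring arbitrary interleavings, can be illustrated with a toy composition of a fixed-priority scheduler and task controllers. This is not the paper's algorithm; all names and the priority policy are illustrative assumptions:

```python
# Toy illustration (not the paper's composition algorithm): an OS model
# deterministically schedules the next controller step, so a verifier
# explores one controlled interleaving instead of all possible ones.

def compose_and_run(tasks, steps=6):
    """tasks: dict mapping name -> (priority, action generator or None).
    Mutates `tasks` in place as controllers finish; returns the trace."""
    trace = []
    for _ in range(steps):
        ready = {n: t for n, t in tasks.items() if t[1] is not None}
        if not ready:
            break
        # Controlled concurrency: the OS model picks the highest-priority
        # ready task, removing scheduler nondeterminism from the search.
        name = max(ready, key=lambda n: ready[n][0])
        prio, gen = tasks[name]
        try:
            trace.append((name, next(gen)))
        except StopIteration:
            tasks[name] = (prio, None)  # controller finished
    return trace
```

Running two controllers of different priority yields a single deterministic trace, which is the search-space reduction the controlled-concurrency notion aims at.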
{"title":"A fuzzing‐based test‐creation approach for evaluating digital TV receivers via transport streams","authors":"Fabrício Izumi, E. Filho, L. Cordeiro, O. Maia, Rômulo Fabrício, B. Farias, Aguinaldo Silva","doi":"10.1002/stvr.1833","DOIUrl":"https://doi.org/10.1002/stvr.1833","url":null,"abstract":"Digital TV (DTV) receivers are usually submitted to testing systems for conformity and robustness assessment, and their approval implies correct operation under a given DTV specification protocol. However, many broadcasters inadvertently misconfigure their devices and transmit the wrong information concerning data structures and protocol format. Since most receivers were not designed to operate under such conditions, malfunction and incorrect behaviour may be noticed, often recognized as field problems, thus compromising a given system's operation. Moreover, the way those problems are usually introduced in DTV signals presents some randomness, but with known restrictions given by the underlying transport protocols used in DTV systems, which resembles fuzzing techniques. Indeed, almost any behaviour may occur, since any deviation can cause problems, depending on each specific implementation. This error scenario is addressed here, and a novel receiver robustness evaluation methodology based on non‐compliance tests using grammar‐based guided fuzzing is proposed. In particular, devices are subjected to unforeseen conditions and incorrect configurations, which are created with guided fuzzing based on real problems, protocol structure, and system architecture, thus providing resources for handling such situations and ensuring correct operation. 
Experiments using such a fuzzing scheme have shown its efficacy and provided opportunities to improve the robustness of commercial DTV platforms.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"1 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85851921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
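Grammar-based guided fuzzing, as used in this paper, derives test inputs by expanding a grammar whose productions include deliberately non-compliant alternatives. The miniature sketch below is purely illustrative: the grammar is hypothetical and vastly simpler than real MPEG-TS transport streams (which use 188-byte packets with a 0x47 sync byte):

```python
import random

# Miniature grammar-based fuzzer (illustrative only). Some alternatives
# are compliant ("47" sync), others deliberately malformed, so expansion
# yields both valid and non-compliance test inputs.
GRAMMAR = {
    "packet": ["<sync><pid><payload>"],
    "sync": ["47", "00", "FF"],        # only "47" would be compliant
    "pid": ["0000", "1FFF", "ZZZZ"],   # "ZZZZ" is deliberately malformed
    "payload": ["DEADBEEF", ""],
}

def fuzz(symbol="packet", rng=random):
    """Expand `symbol` by recursively choosing grammar productions."""
    production = rng.choice(GRAMMAR[symbol])
    out = ""
    i = 0
    while i < len(production):
        if production[i] == "<":
            j = production.index(">", i)
            out += fuzz(production[i + 1:j], rng)  # expand nonterminal
            i = j + 1
        else:
            out += production[i]
            i += 1
    return out
```

Every generated string follows the grammar's shape (sync, then PID, then optional payload), which is what lets the fuzzer stay within protocol structure while injecting invalid field values.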
{"title":"Combinatorial testing and model checking","authors":"Yves Le Traon, Tao Xie","doi":"10.1002/stvr.1832","DOIUrl":"https://doi.org/10.1002/stvr.1832","url":null,"abstract":"This issue contains two papers. The first paper focuses on combinatorial testing, and the second one focuses on model checking. The first paper, ‘Combinatorial methods for dynamic grey-box SQL injection testing’ by Bernhard Garn, Jovan Zivanovic, Manuel Leithner and Dimitris E. Simos, concerns combinatorial testing for SQL injection. Code injection attacks, and in particular SQL injection (SQLi) attacks, are still among the most critical threats for web applications. These attacks rely on exploiting vulnerabilities, which must be actively chased to deploy a secure system. Leveraging combinatorial testing, the authors propose novel attack grammars to generate SQLi attacks against MySQL-compatible databases. One originality of this contribution resides in dynamically optimizing and adapting the attack grammars to the context. This context-sensitive adaptation technique is supported by a prototype tool named SQLInjector+ and is validated and benchmarked on a representative set of web applications under test. The contribution is accompanied by a nice addition to the field: a simple framework called WAFTF for testing the filtering techniques of web application firewalls such as ModSecurity (recommended by Yves Le Traon). The second paper, ‘Comprehensive evaluation of file systems robustness with SPIN model checking’ by Jingcheng Yuan, Toshiaki Aoki and Xiaoyun Guo, presents a study that comprehensively evaluates the robustness of file systems using a model checking approach, covering the majority of the mainstream file system types and both single-thread and multi-thread modes. 
In particular, to abstract real file systems, the authors developed Promela models optimized to avoid state explosion during model checking and used the SPIN model checker to check these models for detecting corner-case errors during an unexpected power outage. The authors analysed counterexamples generated by model checking to determine an improved file system model that is capable of preventing errors in most mainstream file system types and then rechecked the improved file system model and verified the absence of all critical errors.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"16 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78328631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
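Generating SQLi payloads from an attack grammar combinatorially can be sketched with a tiny slot-based grammar. The slots and values below are hypothetical and far simpler than the paper's grammars; a full cartesian product stands in here for the covering arrays that combinatorial testing normally uses:

```python
from itertools import product

# Simplified SQLi attack grammar (hypothetical slots and values).
SLOTS = {
    "quote": ["'", '"'],
    "bool": ["1=1", "2>1"],
    "comment": ["-- ", "#"],
}

def sqli_payloads():
    # Combine every value of every slot (2 * 2 * 2 = 8 payloads).
    for quote, boolean, comment in product(*SLOTS.values()):
        yield f"{quote} OR {boolean} {comment}"
```

Each payload instantiates the classic tautology pattern (quote, always-true predicate, trailing comment); a covering-array generator would pick a smaller subset that still exercises all pairwise slot combinations.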
{"title":"Mutation analysis and its industrial applications","authors":"R. Gopinath, Jie M. Zhang, Marinos Kintis, Mike Papadakis","doi":"10.1002/stvr.1830","DOIUrl":"https://doi.org/10.1002/stvr.1830","url":null,"abstract":"from EFSM specifications using numerous coverage criteria, which are evaluated using mutation analysis. The authors present their results and provide recommendations for practitioners. 3. The third paper is Learning-based Mutant Reduction using Fine-grained Mutation Operators by Shin Hong and Yunho Kim. This paper proposes MUTRAIN, a technique for reducing the cost of mutation testing. It uses cost-considerate linear regression to learn a mutation model that allows prediction of the mutation score from a much smaller set of fine-grained mutation operators.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"10 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75144061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
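The idea of predicting a full mutation score from a cheaper set of operators, as MUTRAIN does, can be illustrated with plain least-squares regression. The sketch below is not MUTRAIN (which uses cost-considerate linear regression over many operators): it fits one hypothetical cheap-operator feature against invented mutation scores:

```python
# Illustrative only: predict the full mutation score from the kill rate of
# a single cheap, fine-grained operator via simple least-squares fitting.

def fit_line(xs, ys):
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical training data: (cheap-operator kill rate, full mutation score)
cheap = [0.2, 0.4, 0.6, 0.8]
full = [0.3, 0.5, 0.7, 0.9]

slope, intercept = fit_line(cheap, full)

def predict(x):
    return slope * x + intercept
```

Once fitted, only the cheap operators need to be run on new programs; the model estimates the score the full, expensive operator set would have produced.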
{"title":"Comprehensive evaluation of file systems robustness with SPIN model checking","authors":"Jingcheng Yuan, Toshiaki Aoki, Xiaoyun Guo","doi":"10.1002/stvr.1828","DOIUrl":"https://doi.org/10.1002/stvr.1828","url":null,"abstract":"In existing computer systems, file systems are indispensable for organizing user data and system code. However, several studies have reported certain file system errors that cause significant data loss or system crashes. Most of these errors are due to external failures, such as an unexpected power outage. However, comprehensively evaluating file system robustness to detect these errors is challenging. The various types of file systems use different data structures and algorithms for various applications. Moreover, file system errors may be triggered by an unpredictable external condition. In addition, a file system works in an operating system's kernel layer as a passive module and runs in a multi‐thread mode, which makes file system testing time‐intensive. Furthermore, the large number of states in file systems leads to greedy checking, which results in a state explosion. In this study, we comprehensively evaluated the robustness of file systems with respect to multiple properties using a model checking approach. The evaluation covered the majority of the mainstream file system types and included both single‐thread and multi‐thread modes. We developed Promela models that abstracted the real file systems and subsequently checked them using the SPIN model checker. Our model was optimized to avoid state explosion during model checking. Using model checking, we successfully detected corner‐case errors during an unexpected power outage. By analysing counterexamples generated by model checking, we determined an improved file system model capable of preventing errors in most mainstream file system types. 
Finally, we rechecked the improved file system model and verified the absence of all critical errors.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"151 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79551036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
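The core idea of the paper's approach, exhaustively exploring crash points and checking a consistency invariant, can be sketched in miniature. This toy Python analogue is not Promela/SPIN: it models only two writes and one hypothetical invariant (metadata must never reference data that was not yet written), but it shows how exhaustive exploration surfaces corner-case crash states:

```python
# Toy analogue of the SPIN/Promela approach: explore a power cut after
# every prefix of a write sequence and check a consistency invariant.

def consistent(state):
    # Invariant: metadata present implies its data block is present.
    return not (state.get("meta") and not state.get("data"))

def check(order):
    """Return the first inconsistent crash state, or None if robust."""
    for i in range(len(order) + 1):
        crash_state = {k: True for k in order[:i]}  # power cut after i writes
        if not consistent(crash_state):
            return crash_state  # counterexample, as SPIN would report
    return None

safe = check(("data", "meta"))    # write data first: no bad crash state
unsafe = check(("meta", "data"))  # metadata first: a crash exposes an error
```

The counterexample for the metadata-first ordering plays the role of the traces the authors analysed to derive their improved file system model.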
{"title":"IEEE International Conference on Software Testing, Verification and Validation (ICST 2020)","authors":"C. Pasareanu, A. Zeller","doi":"10.1002/stvr.1829","DOIUrl":"https://doi.org/10.1002/stvr.1829","url":null,"abstract":"This special issue contains articles which are extended versions of some of the best papers presented at the IEEE International Conference on Software Testing, Verification and Validation (ICST 2020). ICST is intended as a common forum for researchers, scientists, engineers and practitioners throughout the world to present their latest research findings, ideas, developments and applications in the area of Software Testing, Verification and Validation. The articles are ‘ Fostering the Diversity of Exploratory Testing in Web Applications ’ , by Leveau et al., ‘ RVPRIO: a Tool for Prioritizing Runtime Verification Violations ’ , by Cabral et al., and ‘ Automated Black-Box Testing of Nominal and Error Scenarios in RESTful APIs ’ , by Corradini et al., covering diverse topics in software testing and verification. In the first article, the authors investigate exploratory testing, a form of software testing that leverages business expertise, in the context of web applications. They propose a new approach that monitors online interactions performed by testers to suggest new interactions, thus enabling deeper explorations of the applications. In the second article, the authors leverage machine learning to prioritise violations reported by runtime verification, leading to the discovery of previously unknown bugs in open-source projects. In the third article, the authors develop black-box testing techniques for RESTful APIs, a mainstream approach for web API design, leading to the discovery of new faults in already deployed web services. We would like to thank the authors for submitting their contributions and the reviewers for their excellent job. 
We would also like to thank Rob Hierons for kind guidance and great patience with this volume.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"18 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72519647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The integration of machine learning into automated test generation: A systematic mapping study","authors":"Afonso Fontes, Gregory Gay","doi":"10.1002/stvr.1845","DOIUrl":"https://doi.org/10.1002/stvr.1845","url":null,"abstract":"Machine learning (ML) may enable effective automated test generation. We characterize emerging research in this intersection, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges, by performing a systematic mapping study on a sample of 124 publications. ML generates inputs for system, GUI, unit, performance, and combinatorial testing or improves the performance of existing generation methods. ML is also used to generate test verdict, property‐based, and expected‐output oracles. Supervised learning, often based on neural networks, and reinforcement learning, often based on Q‐learning, are common, and some publications also employ unsupervised or semi‐supervised learning. (Semi‐/Un‐)Supervised approaches are evaluated using both traditional testing metrics and ML‐related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. The work to date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, the ML algorithms employed and how they are applied, benchmarks, and replicability. 
Our findings can serve as a roadmap and inspiration for researchers in this field.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"28 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75501628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
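The abstract notes that reinforcement learning for test generation is often Q-learning rewarded through testing metrics. A toy single-state sketch, with an entirely hypothetical system under test and reward tied to newly covered branches, looks like this:

```python
import random

# Toy sketch: Q-learning picks the next test input, rewarded whenever the
# input covers a previously unseen branch of a hypothetical function.

def sut(x):
    # Hypothetical system under test with three branches.
    if x < 0:
        return "neg"
    if x == 0:
        return "zero"
    return "pos"

def q_learn(episodes=200, alpha=0.5, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    actions = [-1, 0, 1]
    q = {a: 0.0 for a in actions}
    covered = set()
    for _ in range(episodes):
        # Epsilon-greedy choice between exploring and exploiting.
        if rng.random() < epsilon:
            a = rng.choice(actions)
        else:
            a = max(q, key=q.get)
        branch = sut(a)
        reward = 1.0 if branch not in covered else 0.0
        covered.add(branch)
        q[a] += alpha * (reward - q[a])  # single-state Q update
    return covered
```

The reward function here is branch novelty, one instance of the coverage-tied rewards the mapping study reports; real approaches use richer state spaces and testing metrics.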