{"title":"High‐coverage metamorphic testing of concurrency support in C compilers","authors":"Matt Windsor, A. Donaldson, John Wickerson","doi":"10.1002/stvr.1812","DOIUrl":"https://doi.org/10.1002/stvr.1812","url":null,"abstract":"We present a technique and automated toolbox for randomized testing of C compilers. Unlike prior compiler‐testing approaches, we generate concurrent test cases in which threads communicate using fine‐grained atomic operations, and we study actual compiler implementations rather than abstract mappings. Our approach is (1) to generate test cases with precise oracles directly from an axiomatization of the C concurrency model; (2) to apply metamorphic fuzzing to each test case, aiming to amplify the coverage they are likely to achieve on compiler codebases; and (3) to execute each fuzzed test case extensively on a range of real machines. Our tool, C4, benefits compiler developers in two ways. First, test cases generated by C4 can achieve line coverage of parts of the LLVM C compiler that are reached by neither the LLVM test suite nor an existing (sequential) C fuzzer. This information can be used to guide further development of the LLVM test suite and can also shed light on where and how concurrency‐related compiler optimizations are implemented. Second, C4 can be used to gain confidence that a compiler implements concurrency correctly. As evidence of this, we show that C4 achieves high strong mutation coverage with respect to a set of concurrency‐related mutants derived from a recent version of LLVM and that it can find historic concurrency‐related bugs in GCC. 
As a by‐product of concurrency‐focused testing, C4 also revealed two previously unknown sequential compiler bugs in recent versions of GCC and the IBM XL compiler.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"48 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79762517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Farewell after an 11‐year journey as joint editor‐in‐chief","authors":"R. Hierons","doi":"10.1002/stvr.1816","DOIUrl":"https://doi.org/10.1002/stvr.1816","url":null,"abstract":"","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"139 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79940594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integration testing and metamorphic testing","authors":"Yves Le Traon, Tao Xie","doi":"10.1002/stvr.1817","DOIUrl":"https://doi.org/10.1002/stvr.1817","url":null,"abstract":"The first paper, ‘ Towards using coupling measures to guide black-box integration testing in component-based systems ’ concerns integration testing in component-based systems. The authors investigate the correlation between component and interface coupling measures found in literature and the number of observed failures at two architectural levels: the component level and the software interface level. The finding serves as a first step towards an approach for systematic selection of test cases during integration testing of a distributed component-based software system with black-box components. For example, the number of coupled elements may be an indicator for failure-proneness and can be used to guide test case prioritisation during system integration testing; data-flow-based coupling measurements may not capture the nature of an automotive software system and thus are inapplicable; having a grey box model may improve system integration testing. Overall, prioritising testing of highly coupled components/interfaces can be a valid approach for systematic integration testing. ‘ High-coverage metamorphic testing of concurrency C compilers an approach and automated toolbox randomised testing of C compilers, checking whether C compilers concurrency in accordance the expected C11 semantics. 
’ experimental results some interesting code relating concurrency, detects fence","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"14 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91274600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MuFBDTester: A mutation‐based test sequence generator for FBD programs implementing nuclear power plant software","authors":"Lingjun Liu, Eunkyoung Jee, Doo-Hwan Bae","doi":"10.1002/stvr.1815","DOIUrl":"https://doi.org/10.1002/stvr.1815","url":null,"abstract":"Function block diagram (FBD) is a standard programming language for programmable logic controllers (PLCs). PLCs have been widely used to develop safety‐critical systems such as nuclear reactor protection systems. It is crucial to test FBD programs for such systems effectively. This paper presents an automated test sequence generation approach using mutation testing techniques for FBD programs and the developed tool, MuFBDTester. Given an FBD program, MuFBDTester analyses the program and generates mutated programs based on mutation operators. MuFBDTester translates the given program and mutants into the input language of a satisfiability modulo theories (SMT) solver to derive a set of test sequences. The primary objective is to find the test data that can distinguish between the results of the given program and mutants. We conducted experiments with several examples including real industrial cases to evaluate the effectiveness and efficiency of our approach. With the control of test size, the results indicated that the mutation‐based test suites were statistically more effective at revealing artificial faults than structural coverage‐based test suites. Furthermore, the mutation‐based test suites detected more reproduced faults, found in industrial programs, than structural coverage‐based test suites. Compared to structural coverage‐based test generation time, the time required by MuFBDTester to generate one test sequence from industrial programs is approximately 1.3 times longer; however, it is considered to be worth paying the price for high effectiveness. Using MuFBDTester, the manual effort of creating test suites was significantly reduced from days to minutes due to automated test generation. 
MuFBDTester can provide highly effective test suites for FBD engineers.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"89 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79473208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Metamorphic testing and test automation","authors":"R. Hierons, Tao Xie","doi":"10.1002/stvr.1814","DOIUrl":"https://doi.org/10.1002/stvr.1814","url":null,"abstract":"This issue contains two papers. The first paper focuses on metamorphic testing and the second one focuses on test automation.Thefirst paper, ‘ Metamorphic relation prioritization for effective regression testing ’ by Madhusudan Srinivasan and Upulee Kanewala, concerns metamorphic testing. Metamorphic testing (MT) is an approach devised to support the testing of software that is untestable in the sense that it is not feasible to determine, in advance, the expected output for a given test input. The basic idea behind MT is that it is sometimes possible to provide a property (metamorphic relation) over multiple test runs that use inputs that are related in some way. A classic example is that we may not know what the cosine of x should be for some arbitrary x but we do know that cos( x ) should be the same as cos( (cid:1) x ). Previous work has proposed the use of multiple metamorphic relations (MRs), but the authors explore how one might prioritize (order) such MRs. Prioritization is based on information regarding a previous version of the software under test. The authors propose two approaches: prioritize on coverage or on fault detection. Optimization is achieved using a greedy algorithm that is sometimes called Additional Greedy. (Recommended by Dan Hao). The second paper, ‘ Improving test automation maturity: A multivocal literature review ’ by Yuqing Wang, Mika V. Mäntylä, Zihao Liu, Jouni Markkula and Päivi Raulamo-jurvanen, presents a multivocal literature review to survey and synthesize the guidelines given in the literature for improving test automation maturity. The authors select and review 81 primary studies (26 academic literature sources and 55 grey literature sources). 
From these primary studies, the authors extract 26 test automation best practices along with advice on how to conduct these best practices in forms of implementation/improvement approaches, actions, technical techniques, concepts and experience-based opinions. In particular, the literature review results contribute test automation best practices to suggest steps for improving test automation maturity, narrow the gap between practice and research in terms of the industry ’ s need to improve test automation maturity, provide a centralized knowledge base of existing guidelines for test automation maturity improvement and identify related research challenge and opportunities.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"8 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79357656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RVprio: A tool for prioritizing runtime verification violations","authors":"Lucas Cabral, Breno Miranda, Igor Lima, Marcelo d’Amorim","doi":"10.1002/stvr.1813","DOIUrl":"https://doi.org/10.1002/stvr.1813","url":null,"abstract":"Runtime verification (RV) helps to find software bugs by monitoring formally specified properties during testing. A key problem in using RV during testing is how to reduce the manual inspection effort for checking whether property violations are true bugs. To date, there was no automated approach for determining the likelihood that property violations were true bugs to reduce tedious and time‐consuming manual inspection. We present RVprio, the first automated approach for prioritizing RV violations in order of likelihood of being true bugs. RVprio uses machine learning classifiers to prioritize violations. For training, we used a labelled dataset of 1170 violations from 110 projects. On that dataset, (1) RVprio reached 90% of the effectiveness of a theoretically optimal prioritizer that ranks all true bugs at the top of the ranked list, and (2) 88.1% of true bugs were in the top 25% of RVprio‐ranked violations; 32.7% of true bugs were in the top 10%. RVprio was also effective when we applied it to new unlabelled violations, from which we found previously unknown bugs—54 bugs in 8 open‐source projects. 
Our dataset is publicly available online.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"70 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83410992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards using coupling measures to guide black‐box integration testing in component‐based systems","authors":"Dominik Hellhake, J. Bogner, Tobias Schmid, S. Wagner","doi":"10.1002/stvr.1811","DOIUrl":"https://doi.org/10.1002/stvr.1811","url":null,"abstract":"In component‐based software development, integration testing is a crucial step in verifying the composite behaviour of a system. However, very few formally or empirically validated approaches are available for systematically testing if components have been successfully integrated. In practice, integration testing of component‐based systems is usually performed in a time‐ and resource‐limited context, which further increases the demand for effective test selection strategies. In this work, we therefore analyse the relationship between different component and interface coupling measures found in literature and the distribution of failures found during integration testing of an automotive system. By investigating the correlation for each measure at two architectural levels, we discuss its usefulness to guide integration testing at the software component level as well as for the hardware component level where coupling is measured among multiple electronic control units (ECUs) of a vehicle. Our results indicate that there is a positive correlation between coupling measures and failure‐proneness at both architectural level for all tested measures. However, at the hardware component level, all measures achieved a significantly higher correlation when compared to the software‐level correlation. 
Consequently, we conclude that prioritizing testing of highly coupled components and interfaces is a valid approach for systematic integration testing, as coupling proved to be a valid indicator for failure‐proneness.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"57 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72686650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving test automation maturity: A multivocal literature review","authors":"Yuqing Wang, M. Mäntylä, Zihao Liu, Jouni Markkula, Päivi Raulamo-Jurvanen","doi":"10.1002/stvr.1804","DOIUrl":"https://doi.org/10.1002/stvr.1804","url":null,"abstract":"Mature test automation is key for achieving software quality at speed. In this paper, we present a multivocal literature review with the objective to survey and synthesize the guidelines given in the literature for improving test automation maturity. We selected and reviewed 81 primary studies, consisting of 26 academic literature and 55 grey literature sources. From primary studies, we extracted 26 test automation best practices (e.g., Define an effective test automation strategy, Set up good test environments, and Develop high‐quality test scripts) and collected many pieces of advice (e.g., in forms of implementation/improvement approaches, technical techniques, concepts, and experience‐based heuristics) on how to conduct these best practices. 
We made the following main observations: (1) there are only six best practices whose positive effect on maturity improvement has been evaluated by academic studies using formal empirical methods; (2) several technically oriented best practices in this MLR were not presented in test maturity models; (3) some best practices can be linked to success factors and maturity impediments proposed by other scholars; (4) most pieces of advice on how to conduct the proposed best practices were identified from experience studies, and their effectiveness needs to be further evaluated with cross‐site empirical evidence using formal empirical methods; (5) in the literature, some advice on how to conduct certain best practices is conflicting, and some of it still needs further qualitative analysis.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"14 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86890026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combinatorial testing and model‐based testing","authors":"R. Hierons, Tao Xie","doi":"10.1002/stvr.1810","DOIUrl":"https://doi.org/10.1002/stvr.1810","url":null,"abstract":"This issue contains two papers. The first paper focuses on combinatorial testing and the second one focuses on model-based testing. The first paper, ‘Combinatorial methods for testing Internet of Things smart home systems’ by Bernhard Garn, Dominik-Philip Schreiber, Dimitris E. Simos, Rick Kuhn, Jeff Voas, and Raghu Kacker, presents an approach for applying combinatorial testing (CT) to the internal configuration and functionality of Internet of Things (IoT) home automation hub systems. The authors first create an input parameter model of an IoT home automation hub system for use with test generation strategies of combinatorial testing and then propose an automated test execution framework and two test oracles for evaluation purposes. The proposed approach makes use of the appropriately formulated model of the hub and generates test sets derived from this model satisfying certain combinatorial coverage conditions. The authors conduct an evaluation of the proposed approach on a real-world IoT system. The evaluation results show that the proposed approach reveals multiple errors in the devices under test, and all approaches under comparison perform nearly equally well (recommended by W. K. Chan). The second paper, ‘Effective grey-box testing with partial FSM models’ by Robert Sachtleben and Jan Peleska, explores the problem of testing from a finite state machine (FSM) and considers the scenario in which an input can be enabled in some states and disabled in other states. 
There is already a body of work on testing from FSMs in which inputs are not always defined (partial FSMs), but such work typically allows the system under test (SUT) to have inputs that are defined in a state of the SUT but not defined in the corresponding state of the specification FSM (the SUT can be ‘more’ defined). The paper introduces a conformance relation, called strong reduction, which requires that exactly the same inputs are defined in the specification and the SUT. A new test generation technique is given for strong reduction; it returns test suites that are complete: a test suite is guaranteed to fail if the SUT is faulty, provided the SUT satisfies certain conditions that place an upper bound on its number of states. The overall approach also requires that the tester can determine which inputs are enabled in the current state of the SUT, so testing is grey-box (recommended by Helene Waeselynck).","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"6 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75275755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated black‐box testing of nominal and error scenarios in RESTful APIs","authors":"Davide Corradini, Amedeo Zampieri, Michele Pasqua, Emanuele Viglianisi, Michael Dallago, M. Ceccato","doi":"10.1002/stvr.1808","DOIUrl":"https://doi.org/10.1002/stvr.1808","url":null,"abstract":"RESTful APIs (or REST APIs for short) represent a mainstream approach to design and develop web APIs using the REpresentational State Transfer architectural style. Black‐box testing, which assumes only the access to the system under test with a specific interface, is the only viable option when white‐box testing is impracticable. This is the case for REST APIs: their source code is usually not (or just partially) available, or a white‐box analysis across many dynamically allocated distributed components (typical of a micro‐services architecture) is computationally challenging. This paper presents RestTestGen, a novel black‐box approach to automatically generate test cases for REST APIs, based on their interface definition (an OpenAPI specification). Input values and requests are generated for each operation of the API under test with the twofold objective of testing nominal execution scenarios and error scenarios. Two distinct oracles are deployed to detect when test cases reveal implementation defects. While this approach is mainly targeting the research community, it is also of interest to developers because, as a black‐box approach, it is universally applicable across different programming languages, or in the case external (compiled only) libraries are used in a REST API. 
The validation of our approach has been performed on more than 100 real‐world REST APIs, highlighting the effectiveness of the approach in revealing actual faults in already deployed services.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"73 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73685752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}