{"title":"Message from the AST 2021 Program Chairs","authors":"","doi":"10.1109/ast52587.2021.00005","DOIUrl":"https://doi.org/10.1109/ast52587.2021.00005","url":null,"abstract":"","PeriodicalId":315603,"journal":{"name":"2021 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129307676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model-based Automation of Test Script Generation Across Product Variants: a Railway Perspective","authors":"Alessio Bucaioni, F. D. Silvestro, I. Singh, Mehrdad Saadatmand, H. Muccini, Thorvaldur Jochumsson","doi":"10.1109/AST52587.2021.00011","DOIUrl":"https://doi.org/10.1109/AST52587.2021.00011","url":null,"abstract":"In this work, we report on our experience in defining and applying a model-based approach for the automatic generation of test scripts for product variants in software product lines. The proposed approach is the result of an effort leveraging the experiences and results from the technology transfer activities with our industrial partner Bombardier Transportation. The proposed approach employs metamodelling and model transformations for representing different testing artefacts and making their generation automatic. We demonstrate the industrial applicability and efficiency of the proposed approach using the Bombardier Transportation Aventra software product line. We observe that the proposed approach mitigates the development effort, time consumption and consistency drawbacks typical of traditional strategies.","PeriodicalId":315603,"journal":{"name":"2021 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115489893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A framework for the automation of testing computer vision systems","authors":"F. Wotawa, Lorenz Klampfl, Ledio Jahaj","doi":"10.1109/AST52587.2021.00023","DOIUrl":"https://doi.org/10.1109/AST52587.2021.00023","url":null,"abstract":"Vision systems, i.e., systems that enable the detection and tracking of objects in images, have gained substantial importance over the past decades. They are used in quality assurance applications, e.g., for finding surface defects in products during manufacturing, surveillance, but also automated driving, requiring reliable behavior. Interestingly, there is only little work on quality assurance and especially testing of vision systems in general. In this paper, we contribute to the area of testing vision software, and present a framework for the automated generation of tests for systems based on vision and image recognition with the focus on easy usage, uniform usability and expandability. The framework makes use of existing libraries for modifying the original images and to obtain similarities between the original and modified images. We show how such a framework can be used for testing a particular industrial application on identifying defects on riblet surfaces and present preliminary results from the image classification domain.","PeriodicalId":315603,"journal":{"name":"2021 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115178618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards exhaustive branch coverage with PathCrawler","authors":"Nicky Williams","doi":"10.1109/AST52587.2021.00022","DOIUrl":"https://doi.org/10.1109/AST52587.2021.00022","url":null,"abstract":"Branch coverage of source code is a very widely used test criterion. Moreover, branch coverage is a similar problem to line coverage, MC/DC and the coverage of assertion violations, certain runtime errors and various other types of test objective. Indeed, establishing that a large number of test objectives are unreachable, or conversely, providing the test inputs which reach them, is at the heart of many verification tasks. However, automatic test generation for exhaustive branch coverage remains an elusive goal: many modern tools obtain high coverage scores without being able to provide an explanation for why some branches are not covered, such as a demonstration that they are unreachable. Concolic test generation offers the promise of exhaustive coverage but covers paths more efficiently than branches. In this paper, I explain why, and propose different strategies to improve its performance on exhaustive branch coverage. A comparison of these strategies on examples of real code shows promising results.","PeriodicalId":315603,"journal":{"name":"2021 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121826648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Surprise Adequacy Analysis of Inputs for Natural Language Processing DNN Models","authors":"Seah Kim, Shin Yoo","doi":"10.1109/AST52587.2021.00017","DOIUrl":"https://doi.org/10.1109/AST52587.2021.00017","url":null,"abstract":"As Deep Neural Networks (DNNs) are rapidly adopted in various domains, many test adequacy metrics for DNN inputs have been introduced to help evaluating, and validating, trained DNN models. Surprise Adequacy (SA) is one such metric that aims to quantitatively measure how surprising a new input is with respect to the data used to train the given model. While SA has been shown to be effective for computer vision tasks such as image classification or object segmentation, its efficacy for DNN based Natural Language Processing has not been thoroughly studied. This paper evaluates whether it is feasible to apply SA analysis to DNN models trained for NLP tasks. We also show that the input distribution captured in the latent embedding space can be multimodal1 for some NLP tasks, unlike those observed in computer vision tasks, and investigate if catering for the multimodal property of NLP models can improve SA analysis. An empirical evaluation of extended SA metrics with three NLP tasks and nine DNN models shows that, while unimodal SAs perform sufficiently well for text classification, multimodal SA can outperform unimodal metrics.","PeriodicalId":315603,"journal":{"name":"2021 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115680248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Test Suites as a Source of Training Data for Static Analysis Alert Classifiers","authors":"Lori Flynn, William Snavely, Z. Kurtz","doi":"10.1109/AST52587.2021.00019","DOIUrl":"https://doi.org/10.1109/AST52587.2021.00019","url":null,"abstract":"Flaw-finding static analysis tools typically generate large volumes of code flaw alerts including many false positives. To save on human effort to triage these alerts, a significant body of work attempts to use machine learning to classify and prioritize alerts. Identifying a useful set of training data, however, remains a fundamental challenge in developing such classifiers in many contexts. We propose using static analysis test suites (i.e., repositories of \"benchmark\" programs that are purpose-built to test coverage and precision of static analysis tools) as a novel source of training data. In a case study, we generated a large quantity of alerts by executing various static analyzers on the Juliet C/C++ test suite, and we automatically derived ground truth labels for these alerts by referencing the Juliet test suite metadata. Finally, we used this data to train classifiers to predict whether an alert is a false positive. Our classifiers obtained high precision (90.2%) and recall (88.2%) for a large number of code flaw types on a hold-out test set. This preliminary result suggests that pre-training classifiers on test suite data could help to jumpstart static analysis alert classification in data-limited contexts.","PeriodicalId":315603,"journal":{"name":"2021 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"11 31","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114044345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatically Assessing and Extending Code Coverage for NPM Packages","authors":"Haiyang Sun, Andrea Rosà, Daniele Bonetta, Walter Binder","doi":"10.1109/AST52587.2021.00013","DOIUrl":"https://doi.org/10.1109/AST52587.2021.00013","url":null,"abstract":"Typical Node.js applications extensively rely on packages hosted in the npm registry. As such packages may be used by thousands of other packages or applications, it is important to assess their code coverage. Moreover, increasing code coverage may help detect previously unknown issues. In this paper, we introduce TESA, a new tool that automatically assembles a test suite for any package in the npm registry. The test suite includes 1) tests written for the target package and usually hosted in its development repository, and 2) tests selected from dependent packages. The former tests allow assessing the code coverage of the target package, while the latter ones can increase code coverage by exploiting third-party tests that also exercise code in the target package. We use TESA to assess the code coverage of 500 popular npm packages. Then, we demonstrate that TESA can significantly increase code coverage by including tests from dependent packages. Finally, we show that the test suites assembled by TESA increase the effectiveness of existing dynamic program analyses to identify performance issues that are not detectable when only executing the developer’s tests.","PeriodicalId":315603,"journal":{"name":"2021 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123610436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Continuous Testing Improvement Model","authors":"M. A. Mascheroni, E. Irrazábal, G. Rossi","doi":"10.1109/AST52587.2021.00020","DOIUrl":"https://doi.org/10.1109/AST52587.2021.00020","url":null,"abstract":"Continuous Delivery is a practice where high-quality software is built in a way that it can be released into production at any time. However, systematic literature reviews and surveys performed as part of this Doctoral Research report that both the literature and the industry are still facing problems related to testing using practices like Continuous Delivery or Continuous Deployment. Thus, we propose Continuous Testing Improvement Model (CTIM) as a solution to the testing problems in continuous software development environments. It brings together proposals and approaches from different authors which are presented as good practices grouped by type of tests and divided into four levels. These levels indicate an improvement hierarchy and an evolutionary path in the implementation of Continuous Testing. Also, an application called EvalCTIM was developed to support the appraisal of a testing process using the proposed model. Finally, to validate the model, an action-research methodology was employed through an interpretive theoretical evaluation followed by case studies conducted in real software development projects. After several improvements made as part of the validation outcomes, the results demonstrate that the model can be used as a solution for implementing Continuous Testing gradually at companies using Continuous Deployment or Continuous Delivery and measuring its progress.","PeriodicalId":315603,"journal":{"name":"2021 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126581775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SuMo: A Mutation Testing Strategy for Solidity Smart Contracts","authors":"Morena Barboni, A. Morichetta, A. Polini","doi":"10.1109/AST52587.2021.00014","DOIUrl":"https://doi.org/10.1109/AST52587.2021.00014","url":null,"abstract":"Smart Contracts are software programs that are deployed and executed within a blockchain infrastructure. Due to their immutable nature, directly resulting from the specific characteristics of the deploying infrastructure, smart contracts must be thoroughly tested before their release. Testing is one of the main activities that can help to improve the reliability of a smart contract, so as to possibly prevent considerable loss of valuable assets. It is therefore important to provide the testers with tools that permit them to assess the activity they performed.Mutation testing is a powerful approach for assessing the fault-detection capability of a test suite. In this paper, we propose SuMo, a novel mutation testing tool for Ethereum Smart Contracts. SuMo implements a set of 44 mutation operators that were designed starting from the latest Solidity documentation, and from well-known mutation testing tools. These allow to simulate a wide variety of faults that can be made by smart contract developers. The set of operators was designed to limit the generation of stillborn mutants, which slow down the mutation testing process and limit the usability of the tool. We report a first evaluation of SuMo on open-source projects for which test suites were available. The results we got are encouraging, and they suggest that SuMo can effectively help developers to deliver more reliable smart contracts.","PeriodicalId":315603,"journal":{"name":"2021 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132825183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Machine Learning to Build Test Oracles: an Industrial Case Study on Elevators Dispatching Algorithms","authors":"Aitor Arrieta, J. Ayerdi, M. Illarramendi, Aitor Agirre, Goiuria Sagardui Mendieta, Maite Arratibel","doi":"10.1109/AST52587.2021.00012","DOIUrl":"https://doi.org/10.1109/AST52587.2021.00012","url":null,"abstract":"The software of elevators requires maintenance over several years to deal with new functionality, correction of bugs or legislation changes. To automatically validate this software, test oracles are necessary. A typical approach in industry is to use regression oracles. These oracles have to execute the test input both, in the software version under test and in a previous software version. This practice has several issues when using simulation to test elevators dispatching algorithms at system level. These issues include a long test execution time and the impossibility of re-using test oracles both at different test levels and in operation. To deal with these issues, we propose DARIO, a test oracle that relies on regression learning algorithms to predict the Qualify of Service of the system. The regression learning algorithms of this oracle are trained by using data from previously tested versions. An empirical evaluation with an industrial case study demonstrates the feasibility of using our approach in practice. A total of five regression learning algorithms were validated, showing that the regression tree algorithm performed best. For the regression tree algorithm, the accuracy when predicting verdicts by DARIO ranged between 79 to 87%.","PeriodicalId":315603,"journal":{"name":"2021 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128988622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}