Latest Publications from the 2016 IEEE International Conference on Software Testing, Verification and Validation (ICST)

A Framework to Evaluate the Effectiveness of Different Load Testing Analysis Techniques
Ruoyu Gao, Z. Jiang, C. Barna, Marin Litoiu
DOI: 10.1109/ICST.2016.9 (https://doi.org/10.1109/ICST.2016.9) | Published 2016-04-11
Abstract: Large-scale software systems like Amazon and eBay must be load tested to ensure they can handle hundreds of millions of concurrent requests in the field. Load testing usually lasts for a few hours or even days and generates large volumes of system behavior data (execution logs and counters). This data must be properly analyzed to check whether there are any performance problems in a load test, but the sheer size of the data prevents effective manual analysis. In addition, unlike functional tests, a load test usually has no associated test oracle. To cope with these challenges, many analysis techniques have been proposed that automatically detect problems in a load test by comparing the behavior of the current test against previous tests. Unfortunately, none of these techniques have been compared against one another. In this paper, we propose a framework that evaluates and compares the effectiveness of different test analysis techniques, and we use it to evaluate a total of 23 techniques on load testing data from three open source systems. Based on our experiments, we find that all of the test analysis techniques can effectively build performance models from both buggy and non-buggy tests and flag the performance deviations between them. It is most cost-effective to compare the current test against the two most recent previous tests, using data collected at longer sampling intervals (≥180 seconds). Among all the techniques, Control Chart, Descriptive Statistics, and Regression Tree yield the best performance. Our evaluation framework and findings can be very useful for load testing practitioners and researchers. To encourage further research on this topic, we have made our testing data publicly available for download.
Citations: 16
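
Among the evaluated techniques, Control Chart is one of the best performers. As a minimal sketch of the control-chart idea (our illustration, not the authors' framework; the response-time counter, the 3-sigma control limits, and the 10% violation threshold are all assumptions), a baseline test supplies the control limits and the current test is flagged when too many of its samples fall outside them:

```python
import statistics

def violation_ratio(baseline: list[float], current: list[float], k: float = 3.0) -> float:
    """Learn control limits (mean +/- k * stddev) from a baseline load test,
    then report the fraction of current-test samples outside those limits."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    lower, upper = mu - k * sigma, mu + k * sigma
    violations = sum(1 for x in current if x < lower or x > upper)
    return violations / len(current)

# Hypothetical response times (ms), sampled every 180 seconds as the
# paper's cost-effectiveness finding suggests.
baseline = [102, 98, 105, 99, 101, 97, 103, 100]
current = [104, 99, 250, 260, 101, 255, 98, 240]
if violation_ratio(baseline, current) > 0.1:  # assumed alarm threshold
    print("Potential performance problem in this load test")
```
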
Canopus: A Domain-Specific Language for Modeling Performance Testing
Maicon Bernardino, A. Zorzo, E. Rodrigues
DOI: 10.1109/ICST.2016.13 (https://doi.org/10.1109/ICST.2016.13) | Published 2016-04-11
Abstract: Despite all the efforts to reduce the cost of the testing phase in software development, it remains one of the most expensive phases. To further reduce those costs, in this paper we propose a Domain-Specific Language (DSL), built on top of the MetaEdit+ language workbench, to model performance testing for web applications. Our DSL, called Canopus, was developed in the context of a collaboration between our university and a Technology Development Laboratory (TDL) of an Information Technology (IT) company. We present the Canopus metamodels and their domain analysis, describe a process that integrates Canopus into Model-Based Performance Testing, and apply it to an industrial case study.
Citations: 26

Profiting from Unit Tests for Integration Testing
Dominik Holling, Andreas Hofbauer, A. Pretschner, Matthias Gemmar
DOI: 10.1109/ICST.2016.28 (https://doi.org/10.1109/ICST.2016.28) | Published 2016-04-11
Abstract: In practice, integration testing typically focuses on a small selection of components or subsystems to integrate and test. This reduces the effort required to create test cases and test environments. However, many defects are only detected when performing integration testing on all possible integrations; such defects are typically found late in the development process and lead to increased testing and fault localization effort. By describing and operationalizing knowledge of such defects, we are able to detect them (semi-)automatically during integration testing. Our OUTFIT tool targets superfluous or missing functionality and untested exception/fault handling in Matlab Simulink models and generated code. It reuses existing high-coverage test cases, or automatically generates them, to measure coverage in an opportunistically assembled integration of components or subsystems. A manual inspection of the coverage results then reveals missing or potentially superfluous behavior, and thus defects of the targeted kind. Used in a bottom-up integration testing strategy, OUTFIT front-loads the detection of such defects and reduces the fault localization effort. We evaluate OUTFIT using three components of a real-world electrical engine control system of a hybrid car and find that its results are reproducible, effective, and efficiently produced: the achieved coverage varies only within a small standard deviation across 10 executions, OUTFIT is effective at finding a potential defect, and it analyzes all evaluated components within a worst-case execution time of 110 minutes.
Citations: 11
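
The signal behind OUTFIT's manual inspection step is the gap between coverage achieved by unit tests in isolation and coverage observed in the assembled integration. A minimal sketch of that comparison (ours, not the tool's implementation; representing coverage as sets of line numbers is an assumption):

```python
def classify_coverage(unit_covered: set[int], integration_covered: set[int],
                      all_lines: set[int]) -> dict[str, set[int]]:
    """Compare line coverage from unit tests in isolation with coverage
    observed in the opportunistically assembled integration. Lines covered
    only in isolation hint at superfluous functionality or untested fault
    handling; lines covered by neither hint at missing tests."""
    return {
        "potentially_superfluous": unit_covered - integration_covered,
        "uncovered_everywhere": all_lines - (unit_covered | integration_covered),
    }

# Hypothetical coverage sets for one generated-code module.
report = classify_coverage(unit_covered={1, 2, 3, 4, 7},
                           integration_covered={1, 2, 3},
                           all_lines=set(range(1, 10)))
print(report)  # lines 4 and 7 deserve manual inspection
```
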
A Theoretical Framework for Understanding Mutation-Based Testing Methods
Donghwan Shin, Doo-Hwan Bae
DOI: 10.1109/ICST.2016.22 (https://doi.org/10.1109/ICST.2016.22) | Published 2016-01-25
Abstract: In the field of mutation analysis, mutation is the systematic generation of mutated programs (i.e., mutants) from an original program. The concept of mutation has been widely applied to various testing problems, including test set selection, fault localization, and program repair. However, surprisingly little attention has been given to the theoretical foundation of mutation-based testing methods, making it difficult to understand, organize, and describe them. This paper develops a theoretical framework for understanding mutation-based testing methods. While there is a solid framework for general testing, it is incongruent with mutation-based testing, because it focuses on the correctness of a program for a test, whereas the essence of mutation-based testing concerns the differences between programs (including mutants) for a test. We begin the construction of our framework by defining a novel testing factor, called a test differentiator, to shift the paradigm of testing from the notion of correctness to the notion of difference. We formally define the behavioral differences of programs over a set of tests as a mathematical vector, called a d-vector. We explore the multi-dimensional space represented by d-vectors and provide a graphical model for describing it. Based on our framework and formalization, we interpret existing mutation-based fault localization methods and mutant set minimization as applications, and identify novel implications for future work.
Citations: 20
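
The abstract does not spell out the formal definitions, but it suggests the following shape for a test differentiator and a d-vector (a LaTeX sketch of our reading, not the paper's exact notation):

```latex
% A test differentiator replaces "is p correct on t?" with
% "do two programs differ on t?":
\[
  d(p, q, t) =
  \begin{cases}
    1 & \text{if } p(t) \neq q(t), \\
    0 & \text{otherwise.}
  \end{cases}
\]
% For a test set T = {t_1, ..., t_n}, the behavioral difference of a
% mutant m from the original program p is the d-vector, a point in the
% n-dimensional space the paper explores:
\[
  \vec{d}(p, m, T) = \bigl( d(p, m, t_1), \dots, d(p, m, t_n) \bigr) \in \{0, 1\}^n
\]
```
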

Test Set Diameter: Quantifying the Diversity of Sets of Test Cases
R. Feldt, Simon M. Poulding, D. Clark, S. Yoo
DOI: 10.1109/ICST.2016.33 (https://doi.org/10.1109/ICST.2016.33) | Published 2015-06-10
Abstract: A common and natural intuition among software testers is that test cases need to differ if a software system is to be tested properly and its quality ensured. Consequently, much research has gone into formulating distance measures for how test cases, their inputs, and/or their outputs differ. However, these proposals are data-type specific and/or calculate diversity only between pairs of test inputs, traces, or outputs. We propose a new metric to measure the diversity of sets of tests: the test set diameter (TSDm). It extends our earlier pairwise test diversity metrics, building on recent advances in information theory regarding the calculation of the normalized compression distance (NCD) for multisets. A key advantage is that TSDm is a universal measure of diversity, so it can be applied to any test set regardless of the data type of the test inputs (and, moreover, to other test-related data such as execution traces). This universality comes at the cost of greater computational effort compared to competing approaches. Our experiments on four different systems show that the test set diameter can help select test sets with higher structural and fault coverage than random selection, even when applied only to test inputs. This can enable early test design and selection, prior to even having a software system to test, and can complement other types of test automation and analysis. We argue that this quantification of test set diversity creates a number of opportunities to better understand software quality and provides practical ways to increase it.
Citations: 106
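
TSDm rests on the normalized compression distance for multisets, instantiated with a real compressor. The sketch below assumes zlib as the compressor and uses only the one-step form of the multiset NCD, (C(X) - min C(x)) / max C(X without x); the underlying information-theoretic definition is more involved, so treat this as an illustrative approximation rather than the authors' implementation:

```python
import zlib

def C(x: bytes) -> int:
    """Approximate the information content of x by its compressed size."""
    return len(zlib.compress(x, 9))

def test_set_diameter(tests: list[bytes]) -> float:
    """One-step NCD for a multiset of test inputs:
    (C(X) - min over x of C(x)) / max over x of C(X without x)."""
    whole = C(b"".join(tests))
    min_single = min(C(t) for t in tests)
    max_without_one = max(
        C(b"".join(tests[:i] + tests[i + 1:])) for i in range(len(tests))
    )
    return (whole - min_single) / max_without_one

# Diverse inputs share little information, so they compress poorly
# together and typically yield a larger diameter than near-duplicates.
diverse = [bytes([i, i * 3 % 251]) * 40 for i in range(1, 6)]
similar = [b"abcabcab" * 10 for _ in range(5)]
print(test_set_diameter(diverse) > test_set_diameter(similar))  # typically True
```

Because compression works on raw bytes, the same code applies unchanged to string inputs, serialized structures, or execution traces, which is the universality the abstract emphasizes.
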
Making System User Interactive Tests Repeatable: When and What Should We Control?
Zebao Gao
DOI: 10.1109/ICST.2016.53 (https://doi.org/10.1109/ICST.2016.53) | Published 2015-05-16
Abstract: System user interactive tests are widely used to evaluate the behavior of an application as a whole. To automate this process, many techniques have been proposed whose effectiveness is evaluated using metrics such as code coverage and fault detection. However, most previous work assumes that the outputs of interactive tests are deterministic. In this paper, we propose three layers of testing outputs to examine: the code layer (code coverage), the behavioral layer (invariant detection), and the user interaction layer (fault detection with a GUI oracle). We further study the impact of a common set of factors, such as operating system, Java version, initial starting state, and time delay, on these metrics. A comprehensive experiment on Java Swing applications shows that as many as 184 lines can be covered differently across runs, and fault detection can exhibit up to 96% false positives. We plan to study the repeatability of interactive tests on the Android platform.
Citations: 41
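
At the code layer, a finding like "184 lines covered differently" comes from diffing coverage across repeated runs of the same test under varied factors. A minimal sketch of that diff (our illustration; the set-of-covered-lines representation is an assumption):

```python
def unstable_lines(runs: list[set[int]]) -> set[int]:
    """Lines covered in some runs of the same interactive test but not in
    others, i.e. the code-layer non-determinism the paper quantifies."""
    return set.union(*runs) - set.intersection(*runs)

# Three runs of one test under different OS / Java / delay configurations.
runs = [{10, 11, 12, 20}, {10, 11, 12}, {10, 11, 12, 20, 21}]
print(sorted(unstable_lines(runs)))  # [20, 21]
```
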