{"title":"FILO: FIx-LOcus Recommendation for Problems Caused by Android Framework Upgrade","authors":"M. Mobilio, O. Riganelli, D. Micucci, L. Mariani","doi":"10.1109/ISSRE.2019.00043","DOIUrl":"https://doi.org/10.1109/ISSRE.2019.00043","url":null,"abstract":"Dealing with the evolution of operating systems is challenging for developers of mobile apps, who have to deal with frequent upgrades that often include backward incompatible changes of the underlying API framework. As a consequence of framework upgrades, apps may show misbehaviours and unexpected crashes once executed within an evolved environment. Identifying the portion of the app that must be modified to correctly execute on a newly released operating system can be challenging. Although incompatibilities are visible at the level of the interactions between the app and its execution environment, the actual methods to be changed are often located in classes that do not directly interact with any external element. To facilitate debugging activities for problems introduced by backward incompatible upgrades of the operating system, this paper presents FILO, a technique that can recommend the method that must be changed to implement the fix from the analysis of a single failing execution. FILO can also select key symptomatic anomalous events that can help the developer understand the reason for the failure and facilitate the implementation of the fix. 
Our evaluation with multiple known compatibility problems introduced by Android upgrades shows that FILO can effectively and efficiently identify the faulty methods in the apps.","PeriodicalId":254749,"journal":{"name":"2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129032382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating Return on Investment for GUI Test Automation Frameworks","authors":"Felix Dobslaw, R. Feldt, David Michaëlsson, Patrick Haar, F. D. O. Neto, Richard Torkar","doi":"10.1109/ISSRE.2019.00035","DOIUrl":"https://doi.org/10.1109/ISSRE.2019.00035","url":null,"abstract":"Automated graphical user interface (GUI) tests can reduce manual testing activities and increase test frequency. This motivates the conversion of manual test cases into automated GUI tests. However, it is not clear whether such automation is cost-effective given that GUI automation scripts add to the code base and demand maintenance as a system evolves. In this paper, we introduce a method for estimating maintenance cost and Return on Investment (ROI) for Automated GUI Testing (AGT). The method utilizes the existing source code change history and has the potential to be used for the evaluation of other testing or quality assurance automation technologies. We evaluate the method for a real-world, industrial software system and compare two fundamentally different AGT frameworks, namely Selenium and EyeAutomate, to estimate and compare their ROI. We also report on their defect-finding capabilities and usability. The quantitative data is complemented by interviews with employees at the company where the study was conducted. The method was successfully applied, and estimated maintenance cost and ROI for both frameworks are reported. Overall, the study supports earlier results showing that implementation time is the leading cost for introducing AGT. 
The findings further suggest that, while EyeAutomate tests are significantly faster to implement, Selenium tests require more of a programming background but less maintenance.","PeriodicalId":254749,"journal":{"name":"2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130614287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Back to Basics - Redefining Quality Measurement for Hybrid Software Development Organizations","authors":"Satyabrata Pradhan, Venky Nanniyur","doi":"10.1109/ISSRE.2019.00047","DOIUrl":"https://doi.org/10.1109/ISSRE.2019.00047","url":null,"abstract":"As the software industry transitions from a license-based model to a subscription-based Software-as-a-Service (SaaS) model, many software development groups are using a hybrid development model that incorporates Agile and Waterfall methodologies in different parts of the organization. The traditional metrics used for measuring software quality in Waterfall or Agile paradigms do not apply to this new hybrid methodology. In addition, to respond to higher quality demands from customers and to gain a competitive advantage in the market, many companies are starting to prioritize quality as a strategic differentiator. As a result, quality metrics are included in the decision-making activities all the way up to the executive level, including Board of Directors reviews. This paper presents key challenges associated with measuring software quality in organizations using the hybrid development model. We developed a framework called PIER (Prevention-Inspection-Evaluation-Removal) to provide a comprehensive metric definition for hybrid organizations. The framework includes quality measurements, quality enforcement, and quality decision points at different organizational levels and project milestones during the software development life cycle (SDLC). The metrics framework defined in this paper is being used for all Cisco Systems products used in customer premises. 
Preliminary field metrics data for one of the product groups show quality improvement after implementation of the proposed measurement system.","PeriodicalId":254749,"journal":{"name":"2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125625633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Textout: Detecting Text-Layout Bugs in Mobile Apps via Visualization-Oriented Learning","authors":"Yaohui Wang, Hui Xu, Yangfan Zhou, Michael R. Lyu, Xin Wang","doi":"10.1109/ISSRE.2019.00032","DOIUrl":"https://doi.org/10.1109/ISSRE.2019.00032","url":null,"abstract":"Layout bugs commonly exist in mobile apps. Due to the fragmentation issues of smartphones, a layout bug may occur only on particular versions of smartphones. It is quite challenging to detect such bugs for state-of-the-art commercial automated testing platforms, although they can test an app with thousands of different smartphones in parallel. The main reason is that typical layout bugs neither crash an app nor generate any error messages. In this paper, we present our work for detecting text-layout bugs, which account for a large portion of layout bugs. We model text-layout bug detection as a classification problem. This then allows us to address it with sophisticated image processing and machine learning techniques. To this end, we propose an approach which we call Textout. Textout takes screenshots as its input and adopts a specifically-tailored text detection method and a convolutional neural network (CNN) classifier to perform automatic text-layout bug detection. We collect 33,102 text-region images as our training dataset and verify the effectiveness of our tool with 1,481 text-region images collected from real-world apps. Textout achieves an AUC (area under the curve) of 0.956 on the test dataset and shows an acceptable overhead. 
The dataset is released as open source for follow-up research.","PeriodicalId":254749,"journal":{"name":"2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133864393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Test Case Generation Based on Client-Server of Web Applications by Memetic Algorithm","authors":"Wen Wang, Xiaohong Guo, Zheng Li, Ruilian Zhao","doi":"10.1109/ISSRE.2019.00029","DOIUrl":"https://doi.org/10.1109/ISSRE.2019.00029","url":null,"abstract":"Currently, more than 90% of web applications are potentially vulnerable to attacks from both the client side and the server side. Test case generation plays a crucial role in testing web applications, where most existing studies focus on test case generation either from the client side or from the server side to detect vulnerabilities, regardless of the interactions between client and server. Consequently, it is difficult for those test cases to discover certain faults which involve both client and server. In this paper, server-side sensitive paths are considered vulnerable code paths due to insufficient or erroneous filtering mechanisms. An evolutionary testing approach based on the memetic algorithm is proposed to connect the server side and client side, in which test cases are generated from the client-side behavior model, while guided by the coverage of sensitive paths from the server side. 
The experiments are conducted on four open source web applications, and the results demonstrate that our approach can generate test cases from the client-side behavior model that can cover the server-side sensitive paths, on which the vulnerabilities can be detected more effectively.","PeriodicalId":254749,"journal":{"name":"2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129405421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Criteria to Systematically Evaluate (Safety) Assurance Cases","authors":"T. Chowdhury, Alan Wassyng, R. Paige, M. Lawford","doi":"10.1109/ISSRE.2019.00045","DOIUrl":"https://doi.org/10.1109/ISSRE.2019.00045","url":null,"abstract":"An assurance case (AC) captures explicit reasoning associated with assuring critical properties, such as safety. A vital attribute of an AC is that it facilitates the identification of fallacies in the validity of any claim. There is considerable published research related to confidence in ACs, which primarily relates to a measure of soundness of reasoning. Evaluation of an AC is more general than measuring confidence and considers multiple aspects of the quality of an AC. Evaluation criteria thus play a significant role in making the evaluation process more systematic. This paper contributes to the identification of effective evaluation criteria for ACs, the rationale for their use, and initial tests of the criteria on existing ACs. We classify these criteria as to whether they apply to the structure of the AC, or to the content of the AC. This paper focuses on safety as the critical property to be assured, but only a very small number of the criteria are specific to safety, and these can serve as placeholders for evaluation criteria specific to other critical properties. All of the other evaluation criteria are generic. This separation is useful when evaluating ACs developed using different notations, and when evaluating ACs against safety standards. 
We explore the rationale for these criteria as well as the way they are used by the developers of the AC and also when they are used by a third-party evaluator.","PeriodicalId":254749,"journal":{"name":"2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123923116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Density and Diversity of Degradation Symptoms in Refactored Classes: A Multi-case Study","authors":"W. Oizumi, L. Sousa, Anderson Oliveira, L. Carvalho, Alessandro F. Garcia, T. Colanzi, R. Oliveira","doi":"10.1109/ISSRE.2019.00042","DOIUrl":"https://doi.org/10.1109/ISSRE.2019.00042","url":null,"abstract":"Root canal refactoring is a software development activity that is intended to improve dependability-related attributes such as modifiability and reusability. Despite being an activity that contributes to these attributes, deciding when to apply root canal refactoring is far from trivial. In fact, finding which elements should be refactored is not a cut-and-dried task. One of the main reasons is the lack of consensus on which characteristics indicate the presence of structural degradation. Thus, we evaluated whether the density and diversity of multiple automatically detected symptoms can be used as consistent indicators of the need for root canal refactoring. To achieve our goal, we conducted a multi-case exploratory study involving 6 open source systems and 2 systems from our industry partners. For each system, we identified the classes that were changed through one or more root canal refactorings. After that, we compared refactored and non-refactored classes with respect to the density and diversity of degradation symptoms. We also investigated if the most recurrent combinations of symptoms in refactored classes can be used as strong indicators of structural degradation. Our results show that refactored classes usually present higher density and diversity of symptoms than non-refactored classes. However, root canal refactorings that are performed by developers in practice may not be enough for reducing degradation, since the vast majority had little to no impact on the density and diversity of symptoms. Finally, we observed that symptom combinations in refactored classes are similar to the combinations in non-refactored classes. 
Based on our findings, we elicited an initial set of requirements for automatically recommending root canal refactorings.","PeriodicalId":254749,"journal":{"name":"2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128455093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inferring Performance Bug Patterns from Developer Commits","authors":"Yiqun Chen, Stefan Winter, N. Suri","doi":"10.1109/ISSRE.2019.00017","DOIUrl":"https://doi.org/10.1109/ISSRE.2019.00017","url":null,"abstract":"Performance bugs, i.e., program source code that is unnecessarily inefficient, have received significant attention from the research community in recent years. A number of empirical studies have investigated how these bugs differ from \"ordinary\" bugs that cause functional deviations, and several approaches to aid their detection, localization, and removal have been proposed. Many of these approaches focus on certain subclasses of performance bugs, e.g., those resulting from redundant computations or unnecessary synchronization, and the evaluation of their effectiveness is usually limited to a small number of known instances of these bugs. To provide researchers working on performance bug detection and localization techniques with a larger corpus of performance bugs to evaluate against, we conduct a study of more than 700 performance bug fixing commits across 13 popular open source projects written in C and C++ and investigate the relative frequency of bug types as well as their complexity. 
Our results show that many of these fixes follow a small set of bug patterns, that they are contributed by experienced developers, and that the number of lines needed to fix performance bugs is highly project dependent.","PeriodicalId":254749,"journal":{"name":"2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","volume":"44 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114081392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning and Constraint Solving for Automated Form Testing","authors":"D. Santiago, Justin Phillips, Patrick Alt, Brian R. Muras, Tariq M. King, Peter J. Clarke","doi":"10.1109/ISSRE.2019.00030","DOIUrl":"https://doi.org/10.1109/ISSRE.2019.00030","url":null,"abstract":"In recent years there has been a focus on the automatic generation of test cases using white-box testing techniques; however, the same cannot be said for the generation of test cases at the system level from natural language system requirements. Some of the white-box techniques include: the use of constraint solvers for the automatic generation of test inputs at the white-box level; the use of control flow graphs generated from code; and the use of path generation and symbolic execution to generate test inputs and test for path feasibility. Techniques such as boundary value analysis (BVA) may also be used for generating stronger test suites. However, for black-box testing we rely on specifications or implicit requirements and spend considerable time and effort designing and executing test cases. This paper presents an approach that leverages natural language processing and machine learning techniques to capture black-box system behavior in the form of constraints. Constraint solvers are then used to generate test cases using BVA and equivalence class partitioning. 
We also conduct a proof of concept that applies this approach to a simplified task management application and an enterprise job recruiting application.","PeriodicalId":254749,"journal":{"name":"2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122475037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of Anomaly Detection Algorithms Made Easy with RELOAD","authors":"T. Zoppi, A. Ceccarelli, A. Bondavalli","doi":"10.1109/ISSRE.2019.00051","DOIUrl":"https://doi.org/10.1109/ISSRE.2019.00051","url":null,"abstract":"Anomaly detection aims at identifying patterns in data that do not conform to the expected behavior. Although anomaly detection has emerged as one of the most powerful techniques for suspecting attacks or failures, dedicated support for its experimental evaluation is scarce. In fact, existing frameworks are mostly intended for the broad purposes of data mining and machine learning. Intuitive tools tailored for evaluating anomaly detection algorithms for failure and attack detection, with intuitive support for sliding windows, are currently missing. This paper presents RELOAD, a flexible and intuitive tool for the Rapid EvaLuation Of Anomaly Detection algorithms. RELOAD is able to automatically i) fetch data from an existing data set, ii) identify the most informative features of the data set, iii) run anomaly detection algorithms, including those based on sliding windows, iv) apply multiple strategies to features and decide on anomalies, and v) provide conclusive results following an extensive set of metrics, along with plots of algorithm scores. Finally, RELOAD includes a simple GUI to set up the experiments and examine results. 
After describing the structure of the tool and detailing inputs and outputs of RELOAD, we exercise RELOAD to analyze an intrusion detection dataset available on a public platform, showing its setup, metric scores and plots.","PeriodicalId":254749,"journal":{"name":"2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115028049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}