{"title":"Failure Prediction for Cloud Applications through Ensemble Learning","authors":"Jomar Domingos","doi":"10.1109/ISSREW53611.2021.00095","DOIUrl":"https://doi.org/10.1109/ISSREW53611.2021.00095","url":null,"abstract":"Faults are an inherent threat to computer systems and software. Predicting system failures that may occur in the near future will allow preventive actions to avoid or considerably reduce failure impact. In this work, we aim to develop a new methodology to accomplish failure prediction in cloud applications through ensemble machine learning. Our failure prediction approach consists of identifying sequences of system state patterns that precede failures (i.e., symptom detection) using failure datasets (obtained using realistic failure injection) to train different models. These ensembles will be subsequently validated using fault injection. An aspect necessarily addressed in our research is the study of the timing properties of failures and their impact on the failure prediction task, since the feasibility of failure prediction is strictly coupled with the notion of lead time. Failure prediction is feasible if there is enough time to predict the failure and to run prevention measures. Although cloud computing presents characteristics that allow applications to be more dependable (with high availability and reliability through fault tolerance mechanisms), the ability to take countermeasures before failure occurrence will allow cloud-based solutions to be extended to critical application scenarios. 
Therefore, using machine learning (i.e., ensemble) models to predict failures is a promising path to achieve this goal.","PeriodicalId":385392,"journal":{"name":"2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133058491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a Secure Software Lifecycle for Autonomous Vehicles","authors":"Lama J. Moukahal, Mohammad Zulkernine, Martin Soukup","doi":"10.1109/ISSREW53611.2021.00104","DOIUrl":"https://doi.org/10.1109/ISSREW53611.2021.00104","url":null,"abstract":"The race for driverless vehicles is on the rise among industry players. The success of Connected and Autonomous Vehicles (CAVs) is founded on software integration that employs advanced technologies to offer valuable services. Software integration and network connectivity expose vehicles to numerous cyberattacks, making software security development the core factor affecting the reliability and safety of autonomous vehicles. The architecture of CAVs introduces unique challenges for automotive security development and operation that traditional security lifecycles are insufficient to manage. This paper presents a Secure Vehicle Software Engineering (SVSE) lifecycle that ensures security-by-design, devoting security considerations throughout all phases of the vehicle software development process. The SVSE lifecycle incorporates security activities that mitigate the development and operation challenges, reducing cybersecurity violations. It assists the automotive industry in complying with international security standards by granting security considerations throughout the development lifecycle that accommodate the requirements of industrial standards. 
The SVSE lifecycle promises manageability and deliverability of security practices throughout the full lifespan of vehicles, making CAVs more resilient to cyberattacks.","PeriodicalId":385392,"journal":{"name":"2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131328848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MindFI: A Fault Injection Tool for Reliability Assessment of MindSpore Applications","authors":"Yang Zheng, Zhenye Feng, Zheng Hu, Ke Pei","doi":"10.1109/ISSREW53611.2021.00068","DOIUrl":"https://doi.org/10.1109/ISSREW53611.2021.00068","url":null,"abstract":"With the emergence of big data and remarkable improvement of computational power, deep neural network (DNN) based intelligent systems, with their superb performance on computer vision, natural language processing, optimization processing, etc., have been rapidly replacing traditional software in various aspects. However, due to the uncertainty of DNN modules learned from data, the intelligent systems are more likely to exhibit incorrect behaviors. Faults in software and hardware are also inevitable in practice, where the hidden defects can easily cause model failure. These will lead to severe accidents and losses in safety- and reliability-critical scenarios, such as autonomous driving. Techniques to test the differences between actual and desired behaviors and evaluate the reliability of DNN applications under faulty conditions are therefore significant for building a trustworthy DNN system. A popular method is fault injection, and various fault injection tools have been developed for ML frameworks such as TensorFlow and PyTorch. In this paper, we present a tool, MindFI, which aims to cover a variety of faults in ML programs written in MindSpore. Data, software and hardware faults can be easily injected in general MindSpore programs. 
We also use MindFI to evaluate the resilience of several commonly used ML programs against assessment metrics.","PeriodicalId":385392,"journal":{"name":"2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132690232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Genetic Algorithm for Scheduling Communication Networks in Time-Triggered Systems-of-Systems","authors":"S. Majidi, R. Obermaisser","doi":"10.1109/ISSREW53611.2021.00053","DOIUrl":"https://doi.org/10.1109/ISSREW53611.2021.00053","url":null,"abstract":"Systems engineering and management have evolved from developing distributed systems to the integration of complex adaptive systems and the advent of Systems of Systems (SoS). The predictable collaboration among constituent systems in an SoS plays a crucial role, especially in time-critical applications. In an SoS, we face independent constituent systems without global knowledge and central control. At the same time, resource reservations using appropriate scheduling algorithms are required to satisfy real-time requirements. Most of the existing scheduling solutions are defined for monolithic systems or complex systems with centralized authorities. In this paper, we present Time-Triggered Systems of Systems (TTSoS) and introduce a scheduling model for SoS applications with real-time requirements. We propose a two-level iterative genetic algorithm (GA) for scheduling the tasks and messages within and between time-triggered constituent systems. The results show the capability of temporal guarantees. 
Also, the results of the GA scheduler are compared with the performance of local search heuristics for the generated examples.","PeriodicalId":385392,"journal":{"name":"2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114862954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting gray fault based on context graph in container-based cloud","authors":"Siyu Yu, Ningjiang Chen, Birui Liang","doi":"10.1109/ISSREW53611.2021.00067","DOIUrl":"https://doi.org/10.1109/ISSREW53611.2021.00067","url":null,"abstract":"Distributed container-based cloud systems have the advantages of rapid deployment, efficient virtualization, simplified configuration, and good scalability. However, good scalability may slow down a container-based cloud because it makes the cloud more vulnerable to gray faults. As a new fault model similar to fail-slow and limping, gray fault has so many root causes that current studies, which focus only on a certain type of fault, are not sufficient. And unlike in a traditional cloud, a container is a black box provided by service providers, making it difficult to implement traditional API intrusion-based diagnosis methods. A better approach should shield low-level causes from high-level processing. A Gray Fault Prediction Strategy based on Context Graph is proposed according to the correlation between gray faults and application scenarios. From historical data, the performance metrics related to how the above context evolves into fault scenarios are established, and scenarios represented by corresponding data are stored in a graph. A scenario will be predicted as a fault scenario if its isomorphic scenario is found in the graph. 
The experimental results show that the success rate of prediction is stable at more than 90%, and the overhead is verified to be well optimized.","PeriodicalId":385392,"journal":{"name":"2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114291827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IDEA: Runtime Collection of Android Data","authors":"L. Baresi, Kostandin Caushi","doi":"10.1109/ISSREW53611.2021.00055","DOIUrl":"https://doi.org/10.1109/ISSREW53611.2021.00055","url":null,"abstract":"A single Android app is an app family supposed to work well on diverse, heterogeneous devices and on different versions of the operating system. This means that some problems can only be discovered when the app is run on a particular device and a specific version of the operating system. The availability of device data, set preferences, execution logs, measured performance, and actual activity layouts is key for identifying and scoping these problems. The more data one can collect and analyze, the more accurate fault identification can be. Android does not ease the collection of these data, and existing tools, to the best of our knowledge, have huge limitations (e.g., restrictions imposed by the execution model or security constraints). To overcome them, and provide a viable solution, the paper proposes a dedicated library called IDEA (Inclusive Data Extraction for Android). If IDEA is used while implementing the app, a dedicated service can be activated on the device to collect all the aforementioned data and send them to a user-defined server, which can then carry out the appropriate analyses. The paper summarizes the limitations that motivated the development of a library, describes what IDEA provides, and presents a first assessment. 
While we are aware that imposing the use of IDEA for implementing monitorable Android apps can be seen as a quite strong requirement, we are also confident that the benefits can pay off.","PeriodicalId":385392,"journal":{"name":"2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114441665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Issue Categorization and Analysis of an Open-Source Driving Assistant System","authors":"Shuncheng Tang, Zhenya Zhang, Jia Tang, Lei Ma, Yinxing Xue","doi":"10.1109/ISSREW53611.2021.00057","DOIUrl":"https://doi.org/10.1109/ISSREW53611.2021.00057","url":null,"abstract":"Autonomous driving systems (ADS) have attracted much attention from both academia and industry in recent years. Since these systems are safety-critical, assurance of their safety and reliability is of great significance. Research efforts have been devoted to Level-4 ADS systems to understand their safety concerns and vulnerabilities; however, no progress has been made in Level-2 systems, though they have been deployed more widely. In this work, we focus on an open-source Level-2 driver assistant system, namely, OPENPILOT, and perform an empirical study on the issues raised by developers and users in the developers' communities. We first overview and introduce the logical architecture of OPENPILOT; then, we present our methodologies of collecting pull requests and issues from two developers' communities; as a result, we collect 1293 pull requests and 694 issues, and then we classify them into 5 categories; lastly, we discuss the strengths and weaknesses of OPENPILOT and the future directions, based on the collected issues. 
Our work is the first attempt to perform a comprehensive study on the issue analysis for OPENPILOT, and it also motivates more future studies on the systematic testing and analysis of these systems.","PeriodicalId":385392,"journal":{"name":"2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)","volume":"27 9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134417248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Disclosing the Fragility Problem of Virtual Safety Testing for Autonomous Driving Systems","authors":"Zhisheng Hu, Shengjian Guo, Zhenyu Zhong, Kang Li","doi":"10.1109/ISSREW53611.2021.00106","DOIUrl":"https://doi.org/10.1109/ISSREW53611.2021.00106","url":null,"abstract":"In order to overcome the insufficiency of road testing in cost and coverage, vendors have been using 3D simulators to conduct virtual safety testing (VST) for autonomous driving. In this work, we report our observations of the unexpected issues in virtual safety testing: 1. Some commonly overlooked factors in simulator usage can subtly affect the running simulation, leading to false positive collision cases; 2. Minor changes in the simulation world can lead to drastically different testing results. We collectively refer to these unexpected or uncertain issues as the VST fragility problem. With the developed concrete cases, we explain the fragility problem in detail and reason about the root causes accordingly. Moreover, we propose mitigation guidance based on the underlying causes of the fragility points.","PeriodicalId":385392,"journal":{"name":"2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133682787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software Testing Strategies for Detecting Hypercall Handlers' Aging-related Bugs","authors":"Lukas Beierlieb, Alberto Avritzer, Lukas Iffländer, Nuno Antunes, Aleksandar Milenkoski, Samuel Kounev","doi":"10.1109/ISSREW53611.2021.00043","DOIUrl":"https://doi.org/10.1109/ISSREW53611.2021.00043","url":null,"abstract":"With the continuing rise of cloud technology, hypervisors play a vital role in the performance and reliability of current services. As long-running applications, they are susceptible to software aging. Hypervisors offer so-called hypercall interfaces for communication with the hosted virtual machines. These interfaces require thorough testing to ensure their long-term reliability. Existing research deals with the aging properties of hypervisors in general without considering the hypercalls. In this work, we share the experience that we gained while trying to understand hypercalls and their parameters and to use them to construct test cases for hypervisor aging of Microsoft Hyper-V. We present a bug that we detected, which was reported to and acknowledged by Microsoft. Further, based on our manual binary code analysis, we propose the idea of automating the analysis process to detect valid parameter ranges and execution conditions of hypercalls without manual effort.","PeriodicalId":385392,"journal":{"name":"2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116227550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Issue Classifier: A Transfer Learning Framework for Classifying Issue Reports","authors":"Anas Nadeem, Muhammad Usman Sarwar, Muhammad Zubair Malik","doi":"10.1109/ISSREW53611.2021.00113","DOIUrl":"https://doi.org/10.1109/ISSREW53611.2021.00113","url":null,"abstract":"Issue tracking systems are used in the software industry for the facilitation of maintenance activities that keep the software robust and up to date with ever-changing industry requirements. Usually, users report issues that can be categorized into different labels such as bug reports, enhancement requests, and questions related to the software. Most issue tracking systems make the labelling of these issue reports optional for the issue submitter, which leads to a large number of unlabeled issue reports. In this paper, we present a state-of-the-art method to classify the issue reports into their respective categories, i.e., bug, enhancement, and question. This is a challenging task because of the common use of informal language in the issue reports. Existing studies use traditional natural language processing approaches adopting keyword-based features, which fail to incorporate the contextual relationship between words and therefore result in a high rate of false positives and false negatives. Moreover, previous works utilize a uni-label approach to classify the issue reports; however, in reality, an issue submitter can tag one issue report with more than one label at a time. This paper presents our approach to classify the issue reports in a multi-label setting. We use an off-the-shelf neural network called RoBERTa and fine-tune it to classify the issue reports. We validate our approach on issue reports belonging to numerous industrial projects from GitHub. We were able to achieve promising F-1 scores of 81%, 74%, and 80% for bug reports, enhancements, and questions, respectively. 
We also develop an industry tool called Automatic Issue Classifier (AIC), which automatically assigns labels to newly reported issues on GitHub repositories with high accuracy.","PeriodicalId":385392,"journal":{"name":"2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116586243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}