{"title":"MetaA: Multi-Dimensional Evaluation of Testing Ability via Adversarial Examples in Deep Learning","authors":"Siqi Gu, Jiawei Liu, Zhan-wei Hui, Wenhong Liu, Zhenyu Chen","doi":"10.1109/QRS57517.2022.00104","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00104","url":null,"abstract":"Deep learning (DL) has shown superior performance in many areas, making the quality assurance of DL-based software particularly important. Adversarial examples are generated by deliberately adding subtle perturbations to input samples and can easily attack less reliable DL models. Most existing works utilize only a single metric to evaluate the generated adversarial examples, such as attack success rate or a structural similarity measure. The problem is that they cannot avoid extreme testing situations or provide multifaceted evaluation results. This paper presents MetaA, a multi-dimensional evaluation framework for the testing ability of adversarial examples in deep learning. Evaluating the testing ability means measuring the testing performance in order to make improvements. Specifically, MetaA performs comprehensive validation of adversarial example generation along two horizontal and five vertical dimensions. We design MetaA according to the definition of adversarial examples and the issue raised in [1] of how to enrich the evaluation dimensions rather than merely quantifying the improvement of DL and software. We conduct several analyses and comparative experiments vertically and horizontally to evaluate the reliability and effectiveness of MetaA. The experimental results show that MetaA can avoid speculation and reach agreement among different indicators when they reflect inconsistencies. The detailed and comprehensive analysis of evaluation results can further guide the optimization of adversarial examples and the quality assurance of DL-based software.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129165016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adopting Misclassification Detection and Outlier Modification to Fault Correction in Deep Learning-Based Systems","authors":"Chuan-Min Chu, Chin-Yu Huang, Neil C. Fang","doi":"10.1109/QRS57517.2022.00108","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00108","url":null,"abstract":"Over the past few decades, researchers in software engineering (SE) have focused on testing, analyzing, repairing, and generating programs automatically and effectively. Today, combining neural networks and traditional software engineering techniques has major potential to benefit software quality and productivity. Regarding the development of neural networks, deep learning (DL) and convolutional neural networks (CNNs) have been widely adopted by software applications for making decisions or providing suggestions. Considering life-critical DL-based applications, there is a need to correct the wrong decisions made by DL systems immediately. Therefore, we propose a novel fault-correction framework, called Outlier Modification for DL Systems (OMDLS), for alleviating potential misclassification issues of DL systems. Our experimental results on two public datasets with different scales and label numbers show that modifying the outliers based on the misclassification pairs can improve accuracy by up to 2.12% without retraining the model and modifying the inference immediately.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128698115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coverage Testing of Industrial Simulink Models using Monte-Carlo and SMT-Based Methods","authors":"Daisuke Ishii, Takashi Tomita, Toshiaki Aoki, The Quyen Ngo, Thi Bich Ngoc Do, Hideaki Takai","doi":"10.1109/QRS57517.2022.00050","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00050","url":null,"abstract":"Simulink is a popular tool for modeling cyber-physical systems. As more models are produced in industry, automated quality assurance of models becomes increasingly important. This paper describes an empirical evaluation of four methods for the coverage testing of Simulink models: A) Simulink Design Verifier (SLDV), a dedicated official tool; B) Template-Based Monte-Carlo (TBMC) method, a random test generation method that utilizes input signal templates; C) SMT-Based Model Checking (SBMC) method that conducts static analysis via encoding models into logic formulas; and D) a hybrid method of B and C. Based on the evaluation results, we carefully designed the hybrid method to complement the features of TBMC and SBMC. In the experiments, we have applied the methods to fourteen models and evaluated their performance. The results show that the hybrid method achieved better results than SLDV for several models.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121559621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Effective Low-Dimensional Software Code Representation using BERT and ELMo","authors":"Srijoni Majumdar, Ashutosh Varshney, Partha Pratim Das, Paul D. Clough, S. Chattopadhyay","doi":"10.1109/QRS57517.2022.00082","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00082","url":null,"abstract":"Contextualised word representations (e.g., ELMo and BERT) have been shown to outperform static representations (e.g., Word2vec, Fasttext, and GloVe) for many NLP tasks. In this paper, we investigate the use of contextualised embeddings for code search and classification, an area receiving less attention. We construct CodeELMo by training ELMo from scratch and fine-tuning CodeBERT embeddings using masked language modeling based on natural language (NL) texts related to software development concepts and programming language (PL) texts consisting of method-comment pairs from open source code bases. The dimensionality of the fine-tuned CodeBERT embeddings is reduced using linear transformations and augmented with a CodeELMo representation to develop CodeELBE, a low-dimensional contextualised software code representation. Results for binary classification and retrieval tasks show that CodeELBE considerably improves retrieval performance on standard deep code search datasets compared to CodeBERT and baseline BERT models.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132440310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uncertainty-Aware Behavior Modeling and Quantitative Safety Evaluation for Automatic Flight Control Systems","authors":"Huiyu Liu, Jing Liu, Haiying Sun, Tengfei Li, John Zhang","doi":"10.1109/QRS57517.2022.00062","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00062","url":null,"abstract":"Automatic flight control systems (AFCS) are safety-critical systems tightly integrating computation, networking and physical processes. However, the uncertainty resulting from evolving dynamics in cyberspace and the physical world can affect the reliability of decision-making in the controller, threatening the system’s safety. How to accurately capture the uncertainty, effectively control the aircraft and improve safety has become an unavoidable challenge for the software industry. To this end, we define an uncertainty-aware modeling language (UAML), which supports modeling the AFCS’s dynamic behavior and environmental uncertainty using formal specifications. We use a machine learning-based method to predict the risk levels in operating environments as the representation of uncertainty from the physical world. The prediction result is transferred to UAML as the parameters. On this basis, we present a framework for quantitative safety evaluation using statistical model checking based on UPPAAL-SMC to help AFCS make reliable decisions at runtime. We illustrate our approach by modeling and analyzing a realistic example, and the experimental result demonstrates the effectiveness of our approach.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128030760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Ontological Analysis of Safety-Critical Software and Its Anomalies","authors":"Hezhen Liu, Zhi Jin, Zheng Zheng, Chengqiang Huang, Xun Zhang","doi":"10.1109/QRS57517.2022.00040","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00040","url":null,"abstract":"The progressively dominant role of software in safety-critical systems raises concerns about software dependability. There are limited mature practices and guides for assessing software dependability and analyzing system-level hazards triggered by software anomalies. A problem is that faults, errors, and failures that represent software anomalies, albeit with different natures, are usually used interchangeably to predict software dependability, leading to unreliable results. The lack of a consensual conceptualization also leads to poor interoperability between supporting tools and, consequently, difficulties in anomaly management and software maintenance. Anomaly analysis and management are more difficult for safety-critical software due to its higher complexity and safety-critical nature. The complex context of safety-critical software causes difficulties in determining the evolution/propagation path of software anomalies and the impact on system safety. To capture the nature of safety-critical software and support an understanding of mechanisms of software anomalies and associated hazards, we propose three reference ontologies: Safety-critical Software Ontology, Software Fault Ontology and Software-failure-induced Hazard Ontology, which are built based on international standards, guides, and relevant conceptual models. We also discuss the relationships among them. This will facilitate a better understanding of the software anomaly mechanisms and the design of intervening/mitigation solutions. We demonstrate how these ontologies can help analyze software problems of real-world safety-critical systems.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133932252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MC/DC Test Case Automatic Generation for Safety-Critical Systems","authors":"Cong Wang, Haiying Sun, Hui Dou, HongTao Chen, Jing Liu","doi":"10.1109/QRS57517.2022.00079","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00079","url":null,"abstract":"Testing is an essential part of the software development of Safety-Critical Systems (SCSs). Since it can automatically generate test cases using the system requirement models, Model-Based Testing (MBT) is suitable for SCSs. However, most of the existing system modeling languages for SCSs mainly focus on representing functional requirements rather than safety, e.g., SysML. In this paper, we first propose a modeling language, Safety SysML State Machine (S2MSM), to guarantee safety during the requirement modeling stage. Second, we propose a model transformation algorithm to transform the S2MSM model into an intermediate model. Then, we design a time flow operation sequence that simulates the external real-time environment. Finally, we generate test cases from the intermediate model according to the MC/DC criterion and time flow operation sequence. We conduct a case study on a real-world SCS application to demonstrate the effectiveness and efficiency of the proposed approach.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115164525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Availability and Cost aware Multi-Domain Service Deployment Optimization","authors":"Chuangchuang Zhang, Yanming Liu, Hong-yong Yang, Yihao Li, Shuning Zhang","doi":"10.1109/QRS57517.2022.00051","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00051","url":null,"abstract":"Network Function Virtualization (NFV) achieves flexible provisioning of network services by using Service Function Chain (SFC) composed of a set of Virtual Network Functions (VNFs). However, complex multi-domain networks pose serious challenges to multi-domain service deployment with availability guarantee. In this paper, we study the availability and cost aware multi-domain service deployment optimization problem. We formulate a multi-objective optimization model with the aim to minimize resource consumption cost and operating cost, while guaranteeing availability by jointly considering VNF failures and server failures, as well as cross-domain deployment operating cost. Then, we design a VNF backup based multi-domain SFC deployment algorithm to reduce resource consumption cost and operating cost. The evaluation results demonstrate that our proposed algorithm can achieve lower resource consumption cost and operating cost than comparison algorithms.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115960383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Identification of Performance Changes at Code Level","authors":"D. Reichelt, Stefan Kühne, W. Hasselbring","doi":"10.1109/QRS57517.2022.00096","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00096","url":null,"abstract":"To develop software with optimal performance, even small performance changes need to be identified. Identifying performance changes is challenging since the performance of software is influenced by non-deterministic factors. Therefore, not every performance change is measurable with reasonable effort. In this work, we discuss which performance changes are measurable at code level with reasonable measurement effort and how to identify them. We present (1) an analysis of the boundaries of measuring performance changes, (2) an approach for determining a configuration for reproducible performance change identification, and (3) an evaluation of how well our approach is able to identify performance changes in the application server Jetty compared with the usage of Jetty’s own performance regression benchmarks. Thereby, we find (1) that small performance differences are only measurable by fine-grained measurement workloads, (2) that performance changes caused by the change of one operation can be identified using a unit-test-sized workload definition and a suitable configuration, and (3) that using our approach identifies small performance regressions more efficiently than using Jetty’s performance regression benchmarks.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122530498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}