ACM Transactions on Software Engineering and Methodology (TOSEM)最新文献_第3页

Opinion Mining for Software Development: A Systematic Literature Review 软件开发中的意见挖掘:系统的文献综述

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-03-07 DOI: 10.1145/3490388

B. Lin, Nathan Cassee, Alexander Serebrenik, G. Bavota, Nicole Novielli, Michele Lanza

{"title":"Opinion Mining for Software Development: A Systematic Literature Review","authors":"B. Lin, Nathan Cassee, Alexander Serebrenik, G. Bavota, Nicole Novielli, Michele Lanza","doi":"10.1145/3490388","DOIUrl":"https://doi.org/10.1145/3490388","url":null,"abstract":"Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies. SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in code comments and extracting users’ critics toward mobile apps. Given the large amount of relevant studies available, it can take considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils these approaches entail. We conducted a systematic literature review involving 185 papers. More specifically, we present (1) well-defined categories of opinion mining-related software development activities, (2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, (3) available datasets for performance evaluation and tool customization, and (4) concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques. The results of our study serve as references to choose suitable opinion mining tools for software development activities and provide critical insights for the further development of opinion mining techniques in the SE domain.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"39 1","pages":"1 - 41"},"PeriodicalIF":0.0,"publicationDate":"2022-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79873296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 13

Verification of Distributed Systems via Sequential Emulation 基于顺序仿真的分布式系统验证

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-03-07 DOI: 10.1145/3490387

Luca Di Stefano, R. De Nicola, Omar Inverso

{"title":"Verification of Distributed Systems via Sequential Emulation","authors":"Luca Di Stefano, R. De Nicola, Omar Inverso","doi":"10.1145/3490387","DOIUrl":"https://doi.org/10.1145/3490387","url":null,"abstract":"Sequential emulation is a semantics-based technique to automatically reduce property checking of distributed systems to the analysis of sequential programs. An automated procedure takes as input a formal specification of a distributed system, a property of interest, and the structural operational semantics of the specification language and generates a sequential program whose execution traces emulate the possible evolutions of the considered system. The problem as to whether the property of interest holds for the system can then be expressed either as a reachability or as a termination query on the program. This allows to immediately adapt mature verification techniques developed for general-purpose languages to domain-specific languages, and to effortlessly integrate new techniques as soon as they become available. We test our approach on a selection of concurrent systems originated from different contexts from population protocols to models of flocking behaviour. By combining a comprehensive range of program verification techniques, from traditional symbolic execution to modern inductive-based methods such as property-directed reachability, we are able to draw consistent and correct verification verdicts for the considered systems.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"9 1","pages":"1 - 41"},"PeriodicalIF":0.0,"publicationDate":"2022-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89468964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

L2S: A Framework for Synthesizing the Most Probable Program under a Specification 在一个规范下合成最可能程序的框架

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-03-07 DOI: 10.1145/3487570

Yingfei Xiong, Bo Wang

{"title":"L2S: A Framework for Synthesizing the Most Probable Program under a Specification","authors":"Yingfei Xiong, Bo Wang","doi":"10.1145/3487570","DOIUrl":"https://doi.org/10.1145/3487570","url":null,"abstract":"In many scenarios, we need to find the most likely program that meets a specification under a local context, where the local context can be an incomplete program, a partial specification, natural language description, and so on. We call such a problem program estimation. In this article, we propose a framework, LingLong Synthesis Framework (L2S), to address this problem. Compared with existing work, our work is novel in the following aspects. (1) We propose a theory of expansion rules to describe how to decompose a program into choices. (2) We propose an approach based on abstract interpretation to efficiently prune off the program sub-space that does not satisfy the specification. (3) We prove that the probability of a program is the product of the probabilities of choosing expansion rules, regardless of the choosing order. (4) We reduce the program estimation problem to a pathfinding problem, enabling existing pathfinding algorithms to solve this problem. L2S has been applied to program generation and program repair. In this article, we report our instantiation of this framework for synthesizing conditional expressions (L2S-Cond) and repairing conditional statements (L2S-Hanabi). The experiments on L2S-Cond show that each option enabled by L2S, including the expansion rules, the pruning technique, and the use of different pathfinding algorithms, plays a major role in the performance of the approach. The default configuration of L2S-Cond correctly predicts nearly 60% of the conditional expressions in the top 5 candidates. Moreover, we evaluate L2S-Hanabi on 272 bugs from two real-world Java defects benchmarks, namely Defects4J and Bugs.jar. L2S-Hanabi correctly fixes 32 bugs with a high precision of 84%. In terms of repairing conditional statement bugs, L2S-Hanabi significantly outperforms all existing approaches in both precision and recall.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"29 1","pages":"1 - 45"},"PeriodicalIF":0.0,"publicationDate":"2022-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91493150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 11

Stateful Serverless Computing with Crucial 有状态无服务器计算与关键

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-03-07 DOI: 10.1145/3490386

Daniel Barcelona Pons, P. Sutra, Marc Sánchez Artigas, Gerard París, P. López

{"title":"Stateful Serverless Computing with Crucial","authors":"Daniel Barcelona Pons, P. Sutra, Marc Sánchez Artigas, Gerard París, P. López","doi":"10.1145/3490386","DOIUrl":"https://doi.org/10.1145/3490386","url":null,"abstract":"Serverless computing greatly simplifies the use of cloud resources. In particular, Function-as-a-Service (FaaS) platforms enable programmers to develop applications as individual functions that can run and scale independently. Unfortunately, applications that require fine-grained support for mutable state and synchronization, such as machine learning (ML) and scientific computing, are notoriously hard to build with this new paradigm. In this work, we aim at bridging this gap. We present Crucial, a system to program highly-parallel stateful serverless applications. Crucial retains the simplicity of serverless computing. It is built upon the key insight that FaaS resembles to concurrent programming at the scale of a datacenter. Accordingly, a distributed shared memory layer is the natural answer to the needs for fine-grained state management and synchronization. Crucial allows to port effortlessly a multi-threaded code base to serverless, where it can benefit from the scalability and pay-per-use model of FaaS platforms. We validate Crucial with the help of micro-benchmarks and by considering various stateful applications. Beyond classical parallel tasks (e.g., a Monte Carlo simulation), these applications include representative ML algorithms such as k-means and logistic regression. Our evaluation shows that Crucial obtains superior or comparable performance to Apache Spark at similar cost (18%–40% faster). We also use Crucial to port (part of) a state-of-the-art multi-threaded ML library to serverless. The ported application is up to 30% faster than with a dedicated high-end server. Finally, we attest that Crucial can rival in performance with a single-machine, multi-threaded implementation of a complex coordination problem. Overall, Crucial delivers all these benefits with less than 6% of changes in the code bases of the evaluated applications.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"344 1","pages":"1 - 38"},"PeriodicalIF":0.0,"publicationDate":"2022-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76918123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 16

Industry–Academia Research Collaboration and Knowledge Co-creation: Patterns and Anti-patterns 产学研合作与知识共同创造:模式与反模式

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-03-07 DOI: 10.1145/3494519

D. Marijan, Sagar Sen

{"title":"Industry–Academia Research Collaboration and Knowledge Co-creation: Patterns and Anti-patterns","authors":"D. Marijan, Sagar Sen","doi":"10.1145/3494519","DOIUrl":"https://doi.org/10.1145/3494519","url":null,"abstract":"Increasing the impact of software engineering research in the software industry and the society at large has long been a concern of high priority for the software engineering community. The problem of two cultures, research conducted in a vacuum (disconnected from the real world), or misaligned time horizons are just some of the many complex challenges standing in the way of successful industry–academia collaborations. This article reports on the experience of research collaboration and knowledge co-creation between industry and academia in software engineering as a way to bridge the research–practice collaboration gap. Our experience spans 14 years of collaboration between researchers in software engineering and the European and Norwegian software and IT industry. Using the participant observation and interview methods, we have collected and afterwards analyzed an extensive record of qualitative data. Drawing upon the findings made and the experience gained, we provide a set of 14 patterns and 14 anti-patterns for industry–academia collaborations, aimed to support other researchers and practitioners in establishing and running research collaboration projects in software engineering.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"24 1","pages":"1 - 52"},"PeriodicalIF":0.0,"publicationDate":"2022-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90951508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Boosting Compiler Testing via Compiler Optimization Exploration 通过编译器优化探索促进编译器测试

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-03-05 DOI: 10.1145/3508362

Junjie Chen, Chenyao Suo

{"title":"Boosting Compiler Testing via Compiler Optimization Exploration","authors":"Junjie Chen, Chenyao Suo","doi":"10.1145/3508362","DOIUrl":"https://doi.org/10.1145/3508362","url":null,"abstract":"Compilers are a kind of important software, and similar to the quality assurance of other software, compiler testing is one of the most widely-used ways of guaranteeing their quality. Compiler bugs tend to occur in compiler optimizations. Detecting optimization bugs needs to consider two main factors: (1) the optimization flags controlling the accessability of the compiler buggy code should be turned on; and (2) the test program should be able to trigger the buggy code. However, existing compiler testing approaches only consider the latter to generate effective test programs, but just run them under several pre-defined optimization levels (e.g., -O0, -O1, -O2, -O3, -Os in GCC). To better understand the influence of compiler optimizations on compiler testing, we conduct the first empirical study, and find that (1) all the bugs detected under the widely-used optimization levels are also detected under the explored optimization settings (we call a combination of optimization flags turned on for compilation an optimization setting), while 83.54% of bugs are only detected under the latter; (2) there exist both inhibition effect and promotion effect among optimization flags for compiler testing, indicating the necessity and challenges of considering the factor of compiler optimizations in compiler testing. We then propose the first approach, called COTest, by considering both factors to test compilers. Specifically, COTest first adopts machine-learning (the XGBoost algorithm) to model the relationship between test programs and optimization settings, to predict the bug-triggering probability of a test program under an optimization setting. Then, it designs a diversity augmentation strategy to select a set of diverse candidate optimization settings for prediction for a test program. Finally, Top-K optimization settings are selected for compiler testing according to the predicted bug-triggering probabilities. Then, it designs a diversity augmentation strategy to select a set of diverse candidate optimization settings for prediction for a test program. Finally, Top-K optimization settings are selected for compiler testing according to the predicted bug-triggering probabilities. The experiments on GCC and LLVM demonstrate its effectiveness, especially COTest detects 17 previously unknown bugs, 11 of which have been fixed or confirmed by developers.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"37 1","pages":"1 - 33"},"PeriodicalIF":0.0,"publicationDate":"2022-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73725119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 13

Buddy Stacks: Protecting Return Addresses with Efficient Thread-Local Storage and Runtime Re-Randomization 伙伴栈:用有效的线程本地存储和运行时再随机化保护返回地址

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-03-04 DOI: 10.1145/3494516

Changwei Zou, Xudong Wang, Yaoqing Gao, Jingling Xue

{"title":"Buddy Stacks: Protecting Return Addresses with Efficient Thread-Local Storage and Runtime Re-Randomization","authors":"Changwei Zou, Xudong Wang, Yaoqing Gao, Jingling Xue","doi":"10.1145/3494516","DOIUrl":"https://doi.org/10.1145/3494516","url":null,"abstract":"Shadow stacks play an important role in protecting return addresses to mitigate ROP attacks. Parallel shadow stacks, which shadow the call stack of each thread at the same constant offset for all threads, are known not to support multi-threading well. On the other hand, compact shadow stacks must maintain a separate shadow stack pointer in thread-local storage (TLS), which can be implemented in terms of a register or the per-thread Thread-Control-Block (TCB), suffering from poor compatibility in the former or high performance overhead in the latter. In addition, shadow stacks are vulnerable to information disclosure attacks. In this paper, we propose to mitigate ROP attacks for single- and multi-threaded server programs running on general-purpose computing systems by using a novel stack layout, called a buddy stack (referred to as Bustk), that is highly performant, compatible with existing code, and provides meaningful security. These goals are met due to three novel design aspects in Bustk. First, Bustk places a parallel shadow stack just below a thread’s call stack (as each other’s buddies allocated together), avoiding the need to maintain a separate shadow stack pointer and making it now well-suited for multi-threading. Second, Bustk uses an efficient stack-based thread-local storage mechanism, denoted STK-TLS, to store thread-specific metadata in two TLS sections just below the shadow stack in dual redundancy (as each other’s buddies), so that both can be accessed and updated in a lightweight manner from the call stack pointer rsp alone. Finally, Bustk re-randomizes continuously (on the order of milliseconds) the return addresses on the shadow stack by using a new microsecond-level runtime re-randomization technique, denoted STK-MSR. This mechanism aims to obsolete leaked information, making it extremely unlikely for the attacker to hijack return addresses, particularly against a server program that sits often tens of milliseconds away from the attacker. Our evaluation using web servers, Nginx and Apache Httpd, shows that Bustk works well in terms of performance, compatibility, and security provided, with its parallel shadow stacks incurring acceptable memory overhead for real-world applications and its STK-TLS mechanism costing only two pages per thread. In particular, Bustk can protect the Nginx and Apache servers with an adaptive 1-ms re-randomization policy (without observable overheads when IO is intensive, with about 17,000 requests per second). In addition, we have also evaluated Bustk using other non-server applications, Firefox, Python, LLVM, JDK and SPEC CPU2006, to demonstrate further the same degree of performance and compatibility provided, but the protection provided for, say, browsers, is weaker (since network-access delays can no longer be assumed).","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"16 1","pages":"1 - 37"},"PeriodicalIF":0.0,"publicationDate":"2022-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87322726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

A Common Terminology for Software Risk Management 软件风险管理的通用术语

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-02-12 DOI: 10.1145/3498539

J. Masso, F. García, César J. Pardo, F. Pino, M. Piattini

{"title":"A Common Terminology for Software Risk Management","authors":"J. Masso, F. García, César J. Pardo, F. Pino, M. Piattini","doi":"10.1145/3498539","DOIUrl":"https://doi.org/10.1145/3498539","url":null,"abstract":"In order to improve and sustain their competitiveness over time, organisations nowadays need to undertake different initiatives to adopt frameworks, models and standards that will allow them to align and improve their business processes. In spite of these efforts, organisations may still encounter governance and management problems. This is where Risk Management (RM) can play a major role, since its purpose is to contribute to the creation and preservation of value in the context of the organisation's processes. RM is a complex and subjective activity that requires experience and a high level of knowledge about risks, and it is for this reason that standardisation institutions and researchers have made great efforts to define initiatives to overcome these challenges. However, the RM field nevertheless presents a lack of uniformity in its terms and concepts, due to the different contexts and scopes of application, a situation that can generate ambiguities and misunderstandings. To address these issues, this paper aims to present an ontology called SRMO (Software Risk Management Ontology), which seeks to unify the terms and concepts associated with RM and provide an integrated and holistic view of risk. In doing so, the Pipeline framework has been applied in order to assure and verify the quality of the proposed ontology, and it has been implemented in Protégé and validated by means of competency questions. Three application scenarios of this ontology demonstrating their usefulness in the software engineering field are presented in this paper. We believe that this ontology can be useful for organisations that are interested in: (i) establishing an RM strategy from an integrated approach, (ii) defining the elements that help to identify risks and the criteria that support decision-making in risk assessment, and (iii) helping the involved stakeholders during the process of risk management.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"23 1","pages":"1 - 47"},"PeriodicalIF":0.0,"publicationDate":"2022-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88891007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

How Do Successful and Failed Projects Differ? A Socio-Technical Analysis 成功和失败的项目有何不同?社会技术分析

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-02-08 DOI: 10.1145/3504003

Mitchell Joblin, S. Apel

{"title":"How Do Successful and Failed Projects Differ? A Socio-Technical Analysis","authors":"Mitchell Joblin, S. Apel","doi":"10.1145/3504003","DOIUrl":"https://doi.org/10.1145/3504003","url":null,"abstract":"Software development is at the intersection of the social realm, involving people who develop the software, and the technical realm, involving artifacts (code, docs, etc.) that are being produced. It has been shown that a socio-technical perspective provides rich information about the state of a software project. In particular, we are interested in socio-technical factors that are associated with project success. For this purpose, we frame the task as a network classification problem. We show how a set of heterogeneous networks composed of social and technical entities can be jointly embedded in a single vector space enabling mathematically sound comparisons between distinct software projects. Our approach is specifically designed using intuitive metrics stemming from network analysis and statistics to ease the interpretation of results in the context of software engineering wisdom. Based on a selection of 32 open source projects, we perform an empirical study to validate our approach considering three prediction scenarios to test the classification model’s ability generalizing to (1) randomly held-out project snapshots, (2) future project states, and (3) entirely new projects. Our results provide evidence that a socio-technical perspective is superior to a pure social or technical perspective when it comes to early indicators of future project success. To our surprise, the methodology proposed here even shows evidence of being able to generalize to entirely novel (project hold-out set) software projects reaching predication accuracies of 80%, which is a further testament to the efficacy of our approach and beyond what has been possible so far. In addition, we identify key features that are strongly associated with project success. Our results indicate that even relatively simple socio-technical networks capture highly relevant and interpretable information about the early indicators of future project success.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"21 1","pages":"1 - 24"},"PeriodicalIF":0.0,"publicationDate":"2022-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88569009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 6

NPC: Neuron Path Coverage via Characterizing Decision Logic of Deep Neural Networks 通过表征深度神经网络决策逻辑的神经元路径覆盖

ACM Transactions on Software Engineering and Methodology (TOSEM) Pub Date : 2022-01-31 DOI: 10.1145/3490489

Xiaofei Xie, Tianlin Li, Jian Wang, L. Ma, Qing Guo, Felix Juefei-Xu, Yang Liu

{"title":"NPC: Neuron Path Coverage via Characterizing Decision Logic of Deep Neural Networks","authors":"Xiaofei Xie, Tianlin Li, Jian Wang, L. Ma, Qing Guo, Felix Juefei-Xu, Yang Liu","doi":"10.1145/3490489","DOIUrl":"https://doi.org/10.1145/3490489","url":null,"abstract":"Deep learning has recently been widely applied to many applications across different domains, e.g., image classification and audio recognition. However, the quality of Deep Neural Networks (DNNs) still raises concerns in the practical operational environment, which calls for systematic testing, especially in safety-critical scenarios. Inspired by software testing, a number of structural coverage criteria are designed and proposed to measure the test adequacy of DNNs. However, due to the blackbox nature of DNN, the existing structural coverage criteria are difficult to interpret, making it hard to understand the underlying principles of these criteria. The relationship between the structural coverage and the decision logic of DNNs is unknown. Moreover, recent studies have further revealed the non-existence of correlation between the structural coverage and DNN defect detection, which further posts concerns on what a suitable DNN testing criterion should be. In this article, we propose the interpretable coverage criteria through constructing the decision structure of a DNN. Mirroring the control flow graph of the traditional program, we first extract a decision graph from a DNN based on its interpretation, where a path of the decision graph represents a decision logic of the DNN. Based on the control flow and data flow of the decision graph, we propose two variants of path coverage to measure the adequacy of the test cases in exercising the decision logic. The higher the path coverage, the more diverse decision logic the DNN is expected to be explored. Our large-scale evaluation results demonstrate that: The path in the decision graph is effective in characterizing the decision of the DNN, and the proposed coverage criteria are also sensitive with errors, including natural errors and adversarial examples, and strongly correlate with the output impartiality.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"31 1","pages":"1 - 27"},"PeriodicalIF":0.0,"publicationDate":"2022-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85755737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 17