CriticalFuzz: A critical neuron coverage-guided fuzz testing framework for deep neural networks

IF 3.8 · CAS Zone 2 (Computer Science) · JCR Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS

Information and Software Technology · Pub Date: 2024-04-24 · DOI: 10.1016/j.infsof.2024.107476

Tongtong Bai, Song Huang, Yifan Huang, Xingya Wang, Chunyan Xia, Yubin Qu, Zhen Yang

Information and Software Technology, Volume 172, Article 107476. https://www.sciencedirect.com/science/article/pii/S0950584924000818

Citations: 0

Abstract

Context:

Deep neural networks (DNNs) are widely deployed in safety-critical domains such as autonomous driving and healthcare, where erroneous behavior can lead to serious accidents, so testing DNNs is extremely important. Neuron coverage-guided fuzz testing (NCFT) has become an effective white-box approach to testing DNNs: it iteratively generates new test cases, guided by neuron coverage, to explore different logic of the DNN, and it has found numerous defects. However, existing NCFT approaches ignore the fact that neurons contribute differently to the final output of a DNN. Given an input, only a fraction of the neurons determine the final output. These neurons hold the essential logic of the DNN.
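
As an illustration of the coverage signal that guides NCFT, the sketch below computes a threshold-based neuron coverage: the fraction of neurons whose activation exceeds a threshold for at least one test input. The abstract does not say which coverage definition the paper uses, so this common definition, the function name `neuron_coverage`, and the default threshold are assumptions made for the example.

```python
def neuron_coverage(activations_per_input, threshold=0.5):
    """Fraction of neurons activated above `threshold` by at least one input.

    `activations_per_input` is a list with one entry per test input; each
    entry is a flat list of neuron activations (assumed already collected
    from the model under test).
    """
    if not activations_per_input:
        return 0.0
    num_neurons = len(activations_per_input[0])
    covered = set()
    for activations in activations_per_input:
        for index, value in enumerate(activations):
            if value > threshold:
                covered.add(index)
    return len(covered) / num_neurons


# Two inputs, four neurons: neurons 0 and 1 are each activated once.
coverage = neuron_coverage([[0.9, 0.1, 0.2, 0.0],
                            [0.1, 0.7, 0.3, 0.0]])
```

A fuzzer that uses this signal keeps a mutated input only when it raises the coverage value.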

Objective:

To ensure the quality of DNNs and improve testing efficiency, NCFT should first cover the neurons that contain the major logic of the DNN.

Method:

In this paper, we propose the notion of critical neurons, which hold the essential logic of a DNN. To prioritize the detection of potential defects in critical neurons, we propose a fuzz testing framework named CriticalFuzz, which mainly consists of energy-based test case generation and critical neuron coverage criteria. Energy-based test case generation produces test cases that are more likely to cover critical neurons, and it involves energy-based seed selection, a power schedule, and seed mutation. Critical neuron coverage serves as the feedback mechanism that guides CriticalFuzz to prioritize the coverage of critical neurons. To evaluate the significance of critical neurons and the performance of CriticalFuzz, we conducted experiments on popular DNNs and datasets.
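
The interplay of the components named above (energy-based seed selection, power schedule, seed mutation, and critical neuron coverage feedback) can be sketched as a small fuzzing loop. This is a minimal illustration and not the authors' implementation: the abstract does not define how critical neurons are identified or how energy is computed. The toy model `toy_activations`, the energy rule (1 plus the number of newly covered critical neurons), and the Gaussian mutation are all hypothetical choices.

```python
import random


def toy_activations(x):
    """Hypothetical stand-in for a DNN forward pass: one activation per neuron."""
    a, b = x
    return [max(0.0, a - b), max(0.0, b - a),
            max(0.0, a + b - 1.0), max(0.0, 1.0 - a - b)]


def covered_critical(x, critical, threshold=0.0):
    """Indices of critical neurons that input x activates above the threshold."""
    acts = toy_activations(x)
    return {i for i in critical if acts[i] > threshold}


def mutate(x, rng, sigma=0.2):
    """Seed mutation (assumed): Gaussian noise per coordinate, clamped to [0, 1]."""
    return [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in x]


def fuzz(seeds, critical, iterations=200, seed=0):
    rng = random.Random(seed)
    critical = set(critical)
    covered = set()
    queue = []  # entries are (input, energy)
    for s in seeds:
        new = covered_critical(s, critical) - covered
        covered |= new
        queue.append((list(s), 1 + len(new)))  # assumed energy rule
    if not queue:
        return covered, queue
    for _ in range(iterations):
        if covered == critical:
            break
        # Energy-based seed selection: sample seeds in proportion to energy.
        weights = [energy for _, energy in queue]
        parent, energy = rng.choices(queue, weights=weights, k=1)[0]
        # Power schedule: a seed with more energy receives more mutations.
        for _ in range(energy):
            child = mutate(parent, rng)
            new = covered_critical(child, critical) - covered
            if new:  # coverage feedback: keep inputs that cover new critical neurons
                covered |= new
                queue.append((child, 1 + len(new)))
    return covered, queue


covered, queue = fuzz([[0.9, 0.05]], critical={0, 1, 2, 3})
coverage_ratio = len(covered) / 4
```

In this sketch the initial seed already activates critical neurons 0 and 3, and the loop then looks for mutants that reach the remaining ones. A real setup would replace `toy_activations` with instrumented layer outputs of the model under test.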

Results:

The experimental results show that (1) the critical neurons have a 100% impact on the output of models, while the non-critical neurons have a lesser effect; (2) CriticalFuzz achieves 100% coverage of critical neurons and covers 10 classes of critical neurons, outperforming both DeepHunter and TensorFuzz; and (3) CriticalFuzz exhibits exceptional error detection capabilities, identifying thousands of errors across 10 diverse error classes within DNNs.

Conclusion:

The critical neurons defined in this paper hold more significant DNN logic than non-critical neurons. CriticalFuzz can preferentially cover critical neurons, thereby improving the efficiency of the NCFT process. Additionally, CriticalFuzz identifies a greater number of errors, thus enhancing the reliability and effectiveness of NCFT.


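Source journal

Information and Software Technology (Engineering and Technology; Computer Science: Software Engineering)

CiteScore

9.10

Self-citation rate

7.70%

Articles published

164

Review time

9.6 weeks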

About the journal: Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "negative" results and much more. Read the Guide for Authors for more information. The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.