Tongtong Bai, Song Huang, Yifan Huang, Xingya Wang, Chunyan Xia, Yubin Qu, Zhen Yang
CriticalFuzz: A critical neuron coverage-guided fuzz testing framework for deep neural networks
Information and Software Technology, Volume 172, Article 107476. Published 2024-04-24. DOI: 10.1016/j.infsof.2024.107476
https://www.sciencedirect.com/science/article/pii/S0950584924000818
Abstract
Context:
Deep neural networks (DNNs) have been widely deployed in safety-critical domains, such as autonomous cars and healthcare, where erroneous behaviors can lead to serious accidents, so testing DNNs is extremely important. Neuron coverage-guided fuzz testing (NCFT) has become an effective whitebox approach for testing DNNs: it iteratively generates new test cases under the guidance of neuron coverage to explore different logic of the DNN, and it has found numerous defects. However, existing NCFT approaches ignore the fact that neurons play distinct roles in the final output of a DNN. Given an input, only a fraction of the neurons determines the final output of the DNN. These neurons hold the essential logic of the DNN.
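As a rough illustration of the neuron coverage signal that NCFT relies on, the sketch below computes the commonly used basic definition: the fraction of neurons whose activation exceeds a threshold for at least one test input. The function name, the threshold value, and the list-of-lists activation format are assumptions made for this example and are not taken from the paper.

```python
def neuron_coverage(activations_per_input, threshold=0.5):
    """Fraction of neurons activated above `threshold` by at least one input.

    `activations_per_input` is a list with one entry per test input; each
    entry is a flat list of neuron activations (an assumed format).
    """
    if not activations_per_input:
        return 0.0
    num_neurons = len(activations_per_input[0])
    covered = set()
    for activations in activations_per_input:
        for idx, value in enumerate(activations):
            if value > threshold:
                covered.add(idx)
    return len(covered) / num_neurons


# Two test inputs observed over four neurons (made-up activation values).
acts = [
    [0.9, 0.1, 0.0, 0.2],  # first input activates neuron 0
    [0.0, 0.7, 0.0, 0.3],  # second input activates neuron 1
]
coverage = neuron_coverage(acts)  # neurons 0 and 1 covered out of 4
```

This basic metric treats every neuron equally, which is the limitation the abstract points to.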
Objective:
To ensure the quality of DNNs and improve testing efficiency, NCFT should first cover the neurons that contain the major logic of a DNN.
Method:
In this paper, we propose the notion of critical neurons, which hold the essential logic of a DNN. To prioritize the detection of potential defects in critical neurons, we propose a fuzz testing framework named CriticalFuzz, which mainly consists of energy-based test case generation and critical neuron coverage criteria. Energy-based test case generation produces test cases that are more likely to cover critical neurons and involves energy-based seed selection, a power schedule, and seed mutation. Critical neuron coverage serves as the feedback mechanism that guides CriticalFuzz in prioritizing the coverage of critical neurons. To evaluate the significance of critical neurons and the performance of CriticalFuzz, we conducted experiments on popular DNNs and datasets.
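The abstract does not give the details of these components, so the following is only a minimal sketch of how energy-based seed selection, a power schedule, seed mutation, and coverage feedback can fit together in one loop. Everything in it is an assumption for illustration: the toy one-layer network, the "most active neuron" proxy for criticality, the energy update rule, and all function names are hypothetical and are not CriticalFuzz's actual definitions.

```python
import random

# Tiny fixed one-layer ReLU "network"; purely illustrative, not a model from the paper.
WEIGHTS = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5], [-0.5, -0.5]]


def toy_activations(x):
    """Per-neuron ReLU activations for a two-dimensional input."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in WEIGHTS]


def critical_neurons(activations, top_k=1):
    """Assumed proxy: the top_k most active (non-zero) neurons count as critical."""
    ranked = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    return {i for i in ranked[:top_k] if activations[i] > 0.0}


def mutate(x, rng, scale=1.0):
    """Seed mutation: add bounded uniform noise to every input dimension."""
    return [xi + rng.uniform(-scale, scale) for xi in x]


def fuzz(initial_seeds, iterations=200, rng_seed=0):
    rng = random.Random(rng_seed)
    covered = set()   # critical neurons covered so far (the feedback signal)
    queue = []        # entries are [seed, energy]
    for seed in initial_seeds:
        covered |= critical_neurons(toy_activations(seed))
        queue.append([list(seed), 1.0])
    for _ in range(iterations):
        # Energy-based seed selection: sample a seed with probability
        # proportional to its current energy.
        energies = [entry[1] for entry in queue]
        idx = rng.choices(range(len(queue)), weights=energies, k=1)[0]
        parent, energy = queue[idx]
        gained = False
        # Power schedule (assumed): higher energy yields more mutants.
        for _ in range(max(1, round(energy))):
            child = mutate(parent, rng)
            new = critical_neurons(toy_activations(child)) - covered
            if new:
                # Coverage feedback: keep mutants that cover new critical neurons.
                covered |= new
                queue.append([child, 1.0 + len(new)])
                gained = True
        # Reward productive seeds and slowly decay unproductive ones.
        queue[idx][1] = energy + 1.0 if gained else max(0.1, energy * 0.9)
    return covered, queue


covered, queue = fuzz([[1.0, 0.0]])
```

The design choice worth noting is that only mutants that extend critical neuron coverage are added to the queue, so the search budget drifts toward seeds that keep reaching new critical neurons.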
Results:
The experimental results show that (1) the critical neurons have a 100% impact on the output of models, while the non-critical neurons have a lesser effect; (2) CriticalFuzz achieves 100% coverage of critical neurons and covers 10 classes of critical neurons, outperforming both DeepHunter and TensorFuzz; and (3) CriticalFuzz exhibits exceptional error detection capabilities, identifying thousands of errors across 10 diverse error classes within DNNs.
Conclusion:
The critical neurons defined in this paper hold more of a DNN's significant logic than non-critical neurons. CriticalFuzz can preferentially cover critical neurons, thereby improving the efficiency of the NCFT process. Additionally, CriticalFuzz can identify a greater number of errors, thus enhancing the reliability and effectiveness of NCFT.
About the journal:
Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information.
The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premiere outlet for systematic literature studies in software engineering.