BallPri: test cases prioritization for deep neuron networks via tolerant ball in variable space
Authors: Chengyu Jia, Jinyin Chen, Xiaohao Li, Haibin Zheng, Luxin Zhang
Journal: Automated Software Engineering, 32(1) (JCR Q3, Computer Science, Software Engineering; impact factor 2.0)
Publication type: Journal Article
Published: 2025-03-06
DOI: 10.1007/s10515-025-00498-5
URL: https://link.springer.com/article/10.1007/s10515-025-00498-5
Citations: 0
Abstract
Deep neural networks (DNNs) have been widely adopted in many applications, including safety-critical domains such as autonomous driving. Despite their strong performance, DNNs can exhibit incorrect behaviors that may lead to serious accidents, so they urgently require security assurance when deployed in safety-critical applications. Deep testing has been developed as an effective technique for detecting incorrect DNN behaviors and improving robustness where necessary, but it needs a large number of labeled test cases, which are expensive to obtain because data labeling is labor-intensive. Test case prioritization has been proposed to surface error-exposing test cases earlier, and techniques such as DeepGini and PRIMA achieve effective and efficient prioritization for classification tasks. However, these methods still face challenges such as unreliable validity, limited application scenarios, and high time complexity. To tackle these issues, we present BallPri, a novel test prioritization method for DNNs based on the tolerant ball in variable space. It extracts the tolerant balls of different test cases and uses the minimum non-parametric likelihood ratio (MinLR) to further enlarge the difference between their distributions in variable space, achieving effective and general test case prioritization. Extensive experiments on benchmark datasets and models show that BallPri outperforms state-of-the-art methods in three key aspects. (1) Effective: it leverages the tolerant ball in variable space to identify malicious bug-revealing inputs. Compared with the baselines, BallPri improves prioritization effectiveness by 47.83% and prioritization efficiency by 37.27% on average. (2) Extensible: it can be applied to various tasks, data, and models. We verify the superiority of BallPri on classification and regression tasks, on convolutional and recurrent neural network models, and on image, text, and speech datasets. (3) Efficient: it achieves low time complexity compared with existing methods. We further evaluate BallPri against potential adaptive attacks and provide guidance for its accuracy and robustness. The open-source code of BallPri can be downloaded at https://github.com/lixiaohaao/BallPri.
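To make the idea of test case prioritization concrete, the following is a minimal Python sketch of the kind of baseline the abstract mentions. It ranks unlabeled inputs by the Gini impurity of the model's softmax output, which is the uncertainty score used by DeepGini. It is not BallPri: the abstract does not specify how the tolerant ball or MinLR are computed, so they are not sketched here. The function names and the sample probabilities are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def gini_score(probs: Sequence[float]) -> float:
    """DeepGini-style impurity: 1 - sum(p_i^2). Higher means a less certain prediction."""
    return 1.0 - sum(p * p for p in probs)


def prioritize(softmax_outputs: Sequence[Sequence[float]]) -> List[int]:
    """Return test-case indices ordered from most to least uncertain."""
    scores = [gini_score(p) for p in softmax_outputs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical softmax outputs for four unlabeled test inputs (3 classes).
outputs = [
    [0.98, 0.01, 0.01],  # confident prediction, low priority
    [0.40, 0.35, 0.25],  # spread-out prediction, high priority
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]
order = prioritize(outputs)
print(order)  # [1, 3, 2, 0]
```

Inputs at the front of the ordering would be labeled first, on the assumption that uncertain predictions are more likely to expose errors. A score of this kind depends on class probabilities, which is one reason such baselines are tied to classification tasks.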
About the journal:
This journal publishes research, tutorial papers, surveys, and accounts of significant industrial experience in the foundations, techniques, tools, and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes.
Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences and workshops.