Search-Based DNN Testing and Retraining With GAN-Enhanced Simulations

IF 6.5 1区计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING

IEEE Transactions on Software Engineering Pub Date : 2025-02-11 DOI:10.1109/TSE.2025.3540549

Mohammed Oualid Attaoui;Fabrizio Pastore;Lionel C. Briand

{"title":"Search-Based DNN Testing and Retraining With GAN-Enhanced Simulations","authors":"Mohammed Oualid Attaoui;Fabrizio Pastore;Lionel C. Briand","doi":"10.1109/TSE.2025.3540549","DOIUrl":null,"url":null,"abstract":"In safety-critical systems (e.g., autonomous vehicles and robots), Deep Neural Networks (DNNs) are becoming a key component for computer vision tasks, particularly semantic segmentation. Further, since DNN behavior cannot be assessed through code inspection and analysis, test automation has become an essential activity to gain confidence in the reliability of DNNs. Unfortunately, state-of-the-art automated testing solutions largely rely on simulators, whose fidelity is always imperfect, thus affecting the validity of test results. To address such limitations, we propose to combine meta-heuristic search, used to explore the input space using simulators, with Generative Adversarial Networks (GANs), to transform the data generated by simulators into realistic input images. Such images can be used both to assess the DNN accuracy and to retrain the DNN more effectively. We applied our approach to a state-of-the-art DNN performing semantic segmentation, in two different case studies, and demonstrated that it outperforms a state-of-the-art GAN-based testing solution and several other baselines. Specifically, it leads to the largest number of diverse images leading to the worst DNN accuracy. Further, the images generated with our approach, lead to the highest improvement in DNN accuracy when used for retraining. In conclusion, we suggest to always integrate a trained GAN to transform test inputs when performing search-driven, simulator-based testing.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"51 4","pages":"1086-1103"},"PeriodicalIF":6.5000,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10880098/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

引用次数: 0

Abstract

In safety-critical systems (e.g., autonomous vehicles and robots), Deep Neural Networks (DNNs) are becoming a key component for computer vision tasks, particularly semantic segmentation. Further, since DNN behavior cannot be assessed through code inspection and analysis, test automation has become an essential activity to gain confidence in the reliability of DNNs. Unfortunately, state-of-the-art automated testing solutions largely rely on simulators, whose fidelity is always imperfect, thus affecting the validity of test results. To address such limitations, we propose to combine meta-heuristic search, used to explore the input space using simulators, with Generative Adversarial Networks (GANs), to transform the data generated by simulators into realistic input images. Such images can be used both to assess the DNN accuracy and to retrain the DNN more effectively. We applied our approach to a state-of-the-art DNN performing semantic segmentation, in two different case studies, and demonstrated that it outperforms a state-of-the-art GAN-based testing solution and several other baselines. Specifically, it leads to the largest number of diverse images leading to the worst DNN accuracy. Further, the images generated with our approach, lead to the highest improvement in DNN accuracy when used for retraining. In conclusion, we suggest to always integrate a trained GAN to transform test inputs when performing search-driven, simulator-based testing.

查看原文本刊更多论文

基于搜索的深度神经网络测试和再训练与gan增强模拟

在安全关键系统（如自动驾驶汽车和机器人）中，深度神经网络（dnn）正在成为计算机视觉任务的关键组成部分，特别是语义分割。此外，由于DNN的行为不能通过代码检查和分析来评估，测试自动化已成为对DNN可靠性获得信心的必要活动。不幸的是，最先进的自动化测试解决方案很大程度上依赖于模拟器，其保真度总是不完美的，从而影响了测试结果的有效性。为了解决这些限制，我们建议将元启发式搜索（用于使用模拟器探索输入空间）与生成对抗网络（gan）相结合，将模拟器生成的数据转换为真实的输入图像。这样的图像既可以用来评估深度神经网络的准确性，也可以更有效地重新训练深度神经网络。在两个不同的案例研究中，我们将我们的方法应用于执行语义分割的最先进的深度神经网络，并证明它优于最先进的基于gan的测试解决方案和其他几个基线。具体来说，它会导致大量不同的图像，从而导致最差的DNN精度。此外，用我们的方法生成的图像，当用于再训练时，DNN的准确性得到了最大的提高。总之，我们建议在执行搜索驱动的、基于模拟器的测试时，始终集成经过训练的GAN来转换测试输入。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

IEEE Transactions on Software Engineering 工程技术-工程：电子与电气

CiteScore

9.70

自引率

10.80%

发文量

724

审稿时长

6 months

期刊介绍： IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include: a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models. b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects. c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards. d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues. e) System issues: Hardware-software trade-offs. f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.