Semantic-aware testing for object detection systems

IF 4.3 | Zone 2, Computer Science | JCR Q2, Computer Science, Information Systems

Information and Software Technology Pub Date : 2025-09-28 DOI:10.1016/j.infsof.2025.107888

Xiaoxia Liu , Jingyi Wang , Hsiao-Ying Lin , Chengfang Fang , Jie Shi , Xiaodong Zhang , Wenhai Wang

Volume 189, Article 107888. Open access: no. URL: https://www.sciencedirect.com/science/article/pii/S0950584925002277

Citations: 0

Abstract

Context:

Deep learning-based object detection (OD) modules are rapidly becoming the common basis for many popular autonomous systems, such as self-driving cars and drones. Comprehensive robustness testing is essential before deploying an OD module in safety-critical applications.

Objective:

We aim to address the following limitations of existing work on OD testing: (1) it focuses primarily on 2D OD with single-camera inputs rather than 3D OD with multi-camera fusion; (2) it relies on limited environmental changes or GAN transformations that inadequately cover diverse and complex real-world inputs; and (3) existing testing metrics remain unevaluated in the OD setting.

Methods:

We propose and develop SeaT-OD, a systematic semantic-aware testing framework that tackles several key technical challenges in order to test practical 3D OD systems operating on fused image input. Our approach introduces: (1) novel semantic-aware metrics defined over deep feature spaces and applicable across diverse OD models; (2) a test generation algorithm that uses deep semantic transformation to enhance input semantic coverage; and (3) metric-guided test case selection that improves model robustness efficiently through targeted retraining.
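To make the idea of a feature-space metric guiding test selection more concrete, here is a minimal sketch. It is not SeaT-OD's actual metric or algorithm, which the abstract does not specify. It assumes, purely for illustration, that each test input has already been mapped to a feature vector with values in [0, 1], that coverage is the fraction of cells hit in a uniform grid over that space, and that selection greedily keeps candidates landing in uncovered cells. The names `to_cell`, `coverage`, and `select_tests` are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

Cell = Tuple[int, ...]


def to_cell(feature: Sequence[float], bins: int = 4) -> Cell:
    """Map a feature vector (values assumed in [0, 1]) to a grid cell.

    Out-of-range values are clamped to the first or last bin.
    """
    return tuple(max(0, min(int(v * bins), bins - 1)) for v in feature)


def coverage(features: List[Sequence[float]], bins: int = 4) -> float:
    """Fraction of grid cells hit by at least one feature vector.

    This is an illustrative stand-in for a feature-space adequacy metric.
    """
    if not features:
        return 0.0
    dim = len(features[0])
    cells: Set[Cell] = {to_cell(f, bins) for f in features}
    return len(cells) / bins ** dim


def select_tests(
    candidates: List[Sequence[float]],
    covered: Set[Cell],
    budget: int,
    bins: int = 4,
) -> List[int]:
    """Greedily pick up to `budget` candidate indices that hit new cells."""
    seen = set(covered)  # copy so the caller's set is not mutated
    chosen: List[int] = []
    for index, feature in enumerate(candidates):
        if len(chosen) >= budget:
            break
        cell = to_cell(feature, bins)
        if cell not in seen:
            seen.add(cell)
            chosen.append(index)
    return chosen


# Hypothetical 2-D features for an existing test suite and new candidates.
seed = [(0.1, 0.1), (0.2, 0.15)]
candidates = [(0.05, 0.2), (0.9, 0.1), (0.95, 0.2), (0.3, 0.8)]
already_covered = {to_cell(f, bins=2) for f in seed}
picked = select_tests(candidates, already_covered, budget=2, bins=2)
```

In this toy setup the selected candidates would be the ones fed back into retraining; a real system would derive the features from the detector's internal layers and would likely use a richer notion of coverage than a uniform grid.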

Result:

We evaluated SeaT-OD on state-of-the-art, commonly adopted 2D and 3D OD models based on fused image input, using popular autonomous driving datasets. Extensive experimental results show that existing OD testing approaches are insufficient, and that SeaT-OD is effective in measuring the adequacy of testing practical 3D OD systems, generating high-quality test cases, and selecting test cases that are meaningful for improving system robustness.

Conclusion:

Based on the results, we emphasize the importance of testing OD systems. Additionally, we present several observations that can direct future research and development.




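Source journal

Information and Software Technology (Engineering & Technology, Computer Science: Software Engineering)

CiteScore: 9.10

Self-citation rate: 7.70%

Articles published: 164

Review time: 9.6 weeks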

About the journal: Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development

Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "negative" results and much more. Read the Guide for Authors for more information. The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.