SFBDA: A Semantic-Decoupled Data Augmentation Framework for Infrared Few-Shot Object Detection on UAVs

Impact factor (IF): 4.4

IEEE Geoscience and Remote Sensing Letters (a publication of the IEEE Geoscience and Remote Sensing Society); Pub Date: 2025-08-11; DOI: 10.1109/LGRS.2025.3597530

Zhenhai Weng;Weijie He;Jianfeng Lv;Dong Zhou;Zhongliang Yu

Volume 22, pages 1-5. Full text: https://ieeexplore.ieee.org/document/11121893/

Citations: 0

Abstract


SFBDA: A Semantic-Decoupled Data Augmentation Framework for Infrared Few-Shot Object Detection on UAVs

Few-shot object detection (FSOD) is a critical frontier in computer vision research. However, infrared (IR) FSOD presents significant technical challenges, primarily for two reasons: 1) annotated training samples are scarce, and 2) thermal imagery is low in texture. To address these issues, we propose a semantic-guided foreground–background decoupling augmentation (SFBDA) framework. The method includes an instance-level foreground separation (ILFS) module that uses the Segment Anything Model (SAM) to separate the objects, and a semantic-constrained background generation network that uses adversarial learning to synthesize contextually compatible backgrounds. To address the limited scenario diversity of existing uncrewed aerial vehicle (UAV)-based IR object detection datasets, we introduce multiscene IR UAV object detection (MSIR-UAVDET), a novel multiscene IR UAV detection benchmark. The dataset covers 16 object categories across diverse environments (terrestrial, maritime, and aerial). To validate the proposed data augmentation method, we integrated it with existing FSOD frameworks and ran comparative experiments against existing data augmentation methods. The code and dataset are publicly available at: https://github.com/Sea814/SFBDA.git
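The decoupling idea in the abstract (separate the object, then place it on a different background) can be illustrated with a minimal sketch. This is not the authors' implementation: the boolean mask below stands in for a SAM segmentation output, the constant-valued array stands in for a generated background, and the helper names `composite_foreground` and `mask_to_box` are hypothetical.

```python
import numpy as np


def composite_foreground(image, mask, background):
    """Paste the masked foreground of `image` onto `background`.

    image, background: 2-D arrays of the same shape (single-channel IR frames).
    mask: boolean array of the same shape, True where the object is.
    """
    if image.shape != background.shape or image.shape != mask.shape:
        raise ValueError("image, mask and background must share one shape")
    # Object pixels come from the source frame, all others from the new background.
    return np.where(mask, image, background)


def mask_to_box(mask):
    """Return the tight (x_min, y_min, x_max, y_max) box around a boolean mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask is empty")
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Toy 6x6 frame with one warm 2x2 object (assumed data, for illustration only).
image = np.zeros((6, 6), dtype=np.uint8)
image[2:4, 1:3] = 200
mask = image > 0                                   # stand-in for a SAM mask
background = np.full((6, 6), 50, dtype=np.uint8)   # stand-in for a generated background

augmented = composite_foreground(image, mask, background)
box = mask_to_box(mask)  # the detection label carries over unchanged
```

Because the object pixels are copied without moving, the bounding-box label of the original sample remains valid for the augmented one; a fuller pipeline would also need to handle blending at the mask boundary.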


Source journal

IEEE Geoscience and Remote Sensing Letters (a publication of the IEEE Geoscience and Remote Sensing Society)

Self-citation rate

0.00%

Number of papers published