SFBDA: A Semantic-Decoupled Data Augmentation Framework for Infrared Few-Shot Object Detection on UAVs

Impact Factor: 4.4
Zhenhai Weng;Weijie He;Jianfeng Lv;Dong Zhou;Zhongliang Yu
DOI: 10.1109/LGRS.2025.3597530
Journal: IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5
Published: 2025-08-11 (Journal Article)
Source: https://ieeexplore.ieee.org/document/11121893/
Citations: 0

Abstract

Few-shot object detection (FSOD) is a critical frontier in computer vision research. However, infrared (IR) FSOD presents significant technical challenges, primarily due to: 1) scarce annotated training samples and 2) the low-texture nature of thermal imaging. To address these issues, we propose a semantic-guided foreground-background decoupling augmentation (SFBDA) framework. This method includes an instance-level foreground separation (ILFS) module that utilizes the segment anything model (SAM) to separate the objects, as well as a semantic-constrained background generation network that employs adversarial learning to synthesize contextually compatible backgrounds. To address the insufficient scenario diversity of existing uncrewed aerial vehicle (UAV)-based IR object detection datasets, we introduce multiscene IR UAV object detection (MSIR-UAVDET), a novel multiscene IR UAV detection benchmark. This dataset encompasses 16 object categories across diverse environments (terrestrial, maritime, and aerial). To validate the efficacy of the proposed data augmentation methodology, we integrated our approach with existing FSOD frameworks and conducted comparative experiments benchmarking it against existing data augmentation methods. The code and dataset are publicly available at: https://github.com/Sea814/SFBDA.git
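The core compositing step behind foreground-background decoupling can be illustrated in a few lines: a segmentation model (SAM, in the paper's ILFS module) yields a per-instance foreground mask, and the masked pixels are pasted onto a different background frame to form a new training sample. The sketch below is a minimal, hypothetical illustration of that mask-based compositing only; it uses a toy thresholded mask in place of SAM outputs and does not reproduce the paper's adversarial background generation network.

```python
import numpy as np

def composite_foreground(image: np.ndarray, mask: np.ndarray,
                         background: np.ndarray) -> np.ndarray:
    """Paste masked foreground pixels onto a new background frame.

    image, background: (H, W) float arrays (single-channel IR frames).
    mask: (H, W) boolean foreground mask, e.g. produced by a
    segmentation model such as SAM.
    """
    out = background.copy()
    out[mask] = image[mask]  # keep foreground pixels, replace the rest
    return out

# Toy example: a bright 2x2 "target" on a dark frame, recomposited
# onto a uniform mid-intensity background.
frame = np.zeros((8, 8))
frame[3:5, 3:5] = 1.0
mask = frame > 0.5          # stand-in for a SAM instance mask
new_bg = np.full((8, 8), 0.2)
aug = composite_foreground(frame, mask, new_bg)
```

In the full SFBDA pipeline the background would come from the semantic-constrained generation network rather than a fixed frame, so the composited sample remains contextually plausible.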