SFBDA: A Semantic-Decoupled Data Augmentation Framework for Infrared Few-Shot Object Detection on UAVs
Authors: Zhenhai Weng; Weijie He; Jianfeng Lv; Dong Zhou; Zhongliang Yu
Journal: IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5 (impact factor 4.4)
Published: 2025-08-11
DOI: 10.1109/LGRS.2025.3597530
URL: https://ieeexplore.ieee.org/document/11121893/
Few-shot object detection (FSOD) is a critical frontier in computer vision research. Infrared (IR) FSOD, however, presents significant technical challenges, primarily because of 1) the scarcity of annotated training samples and 2) the low-texture nature of thermal imaging. To address these issues, we propose a semantic-guided foreground–background decoupling augmentation (SFBDA) framework. The method includes an instance-level foreground separation (ILFS) module, which uses the segment anything model (SAM) to separate objects from the scene, and a semantic-constrained background generation network, which employs adversarial learning to synthesize contextually compatible backgrounds. To address the limited scenario diversity of existing uncrewed aerial vehicle (UAV)-based IR object detection datasets, we introduce multiscene IR UAV object detection (MSIR-UAVDET), a novel multiscene IR UAV detection benchmark. The dataset covers 16 object categories across diverse environments (terrestrial, maritime, and aerial). To validate the proposed data augmentation method, we integrated it with existing FSOD frameworks and ran comparative experiments against existing data augmentation methods. The code and dataset are publicly available at: https://github.com/Sea814/SFBDA.git
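The abstract gives no implementation details, so the following is only a rough illustration of the foreground-background decoupling idea: a masked object is pasted onto a different background and its bounding box is updated. The function name `composite_instance`, its arguments, and the toy arrays are hypothetical. In the paper the mask comes from SAM and the background from an adversarially trained generator; neither is reproduced here, and this sketch should not be read as the authors' code.

```python
import numpy as np


def composite_instance(image, mask, box, background, top_left):
    """Paste a masked object from `image` onto `background` (illustrative only).

    image:      (H, W) single-channel IR frame
    mask:       (H, W) boolean instance mask (assumed to come from a segmenter)
    box:        (x0, y0, x1, y1) object box in `image`, x1/y1 exclusive
    background: (Hb, Wb) replacement background (assumed to be given)
    top_left:   (x, y) position of the crop's top-left corner in `background`
    Returns the augmented frame and the object's new bounding box.
    """
    x0, y0, x1, y1 = box
    crop = image[y0:y1, x0:x1]
    crop_mask = mask[y0:y1, x0:x1]
    h, w = crop.shape
    x, y = top_left
    if x < 0 or y < 0 or y + h > background.shape[0] or x + w > background.shape[1]:
        raise ValueError("object does not fit at the requested position")

    out = background.copy()          # leave the input background untouched
    region = out[y:y + h, x:x + w]   # a view, so writes land in `out`
    region[crop_mask] = crop[crop_mask]  # copy object pixels only
    return out, (x, y, x + w, y + h)


# Toy data: a warm 2x3 "object" on a cold 8x8 frame.
frame = np.zeros((8, 8), dtype=np.uint8)
frame[2:4, 3:6] = 200
mask = frame > 0                     # stand-in for a predicted instance mask
background = np.full((10, 10), 50, dtype=np.uint8)

augmented, new_box = composite_instance(
    frame, mask, (3, 2, 6, 4), background, (1, 5)
)
print(new_box)
```

Because only the masked pixels are copied, the pasted object keeps its shape while everything around it comes from the new background. A practical pipeline would also need to deal with boundary blending and plausible placement, which this sketch ignores.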