More Accurate Constraints for Self-Supervised Learning in Remote Sensing Images-Based Object Detection
Authors: Shangdong Zheng; Zebin Wu; Yang Xu; Qian Liu; Zhihui Wei
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 12303-12314
Publication date: 2025-04-25
DOI: 10.1109/JSTARS.2025.3564368
URL: https://ieeexplore.ieee.org/document/10976539/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10976539
Publication type: Journal Article
Impact factor: 4.7; JCR: Q1 (ENGINEERING, ELECTRICAL & ELECTRONIC); Region: 2 (Earth Sciences)
Citations: 0; Open access: no
Source: Semanticscholar
Self-supervised learning (SSL) automatically generates internal labels by exploring auxiliary tasks within the network itself, and trains the same model on these annotations to learn latent representations of the data, which greatly improves the accuracy of object detection in remote sensing images (RSIs). However, most existing methods struggle to guarantee the quality of the generated SSL pseudoannotations, and the auxiliary tasks they construct are not detection-oriented, which makes it difficult to enhance the feature representations that benefit object detection. In this article, we focus on generating more accurate constraints by exploiting the intercorrelation between fully supervised learning (FSL) and weakly supervised learning (WSL) to improve the performance of object detection in RSIs. First, WSL assigns pseudo instance-level annotations to the high-scoring positive bags to train the detector, which can be regarded as a weakened version of the region proposal network (RPN). Fortunately, in FSL the RPN can be constrained by the ground truth (GT) bounding boxes, and the high-quality supervision it provides is unavailable to any WSL method. Moreover, we construct a proposal generation module that filters out the unreliable bounding boxes predicted by the RPN, and adds the remaining high-quality boxes to the GT and SS-generated candidate boxes to supervise the optimization of the WSL branch. By constructing an interactive learning paradigm between WSL and FSL, the former gains more accurate constraints for learning an efficient auxiliary task, while the latter benefits from the richer data representation provided by WSL, which is a win–win process. Finally, we cascade the losses of WSL and FSL to further explore the intrinsic correlation between them by sharing the same feature extraction network. Experimental comparisons on the DOTA and DIOR datasets demonstrate that our method outperforms many recent object detection approaches by a significant margin.
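The abstract does not specify how the proposal generation module decides which RPN boxes are reliable, or how the two losses are weighted. The following is only a minimal sketch of the general idea, under stated assumptions: RPN boxes are kept when they have a high score and overlap some GT box, the survivors are appended to the GT and SS-generated candidates, and the FSL and WSL losses are summed. The function names, the thresholds (`score_thresh`, `iou_thresh`), and the weight `wsl_weight` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_rpn_proposals(
    rpn_boxes: Sequence[Box],
    rpn_scores: Sequence[float],
    gt_boxes: Sequence[Box],
    score_thresh: float = 0.9,  # hypothetical value
    iou_thresh: float = 0.7,    # hypothetical value
) -> List[Box]:
    """Keep RPN boxes that are confident and overlap a GT box.

    This criterion is an assumption made for illustration; the paper's
    actual filtering rule is not described in the abstract.
    """
    kept = []
    for box, score in zip(rpn_boxes, rpn_scores):
        if score < score_thresh:
            continue
        if any(iou(box, gt) >= iou_thresh for gt in gt_boxes):
            kept.append(box)
    return kept


def build_wsl_candidates(
    gt_boxes: Sequence[Box],
    ss_boxes: Sequence[Box],
    rpn_boxes: Sequence[Box],
    rpn_scores: Sequence[float],
) -> List[Box]:
    """Supplement the GT and SS-generated boxes with filtered RPN boxes."""
    reliable = filter_rpn_proposals(rpn_boxes, rpn_scores, gt_boxes)
    return list(gt_boxes) + list(ss_boxes) + reliable


def joint_loss(loss_fsl: float, loss_wsl: float, wsl_weight: float = 1.0) -> float:
    """Combine both branch losses; the weighting scheme is an assumption."""
    return loss_fsl + wsl_weight * loss_wsl


if __name__ == "__main__":
    gt = [(0.0, 0.0, 10.0, 10.0)]
    ss = [(20.0, 20.0, 30.0, 30.0)]
    rpn = [(0.0, 0.0, 10.0, 9.0), (0.0, 0.0, 10.0, 9.0), (50.0, 50.0, 60.0, 60.0)]
    scores = [0.95, 0.50, 0.99]
    print(build_wsl_candidates(gt, ss, rpn, scores))
    print(joint_loss(1.0, 0.5, wsl_weight=2.0))
```

In a real detector both losses would be tensors computed on features from the shared backbone, so that back-propagating their sum updates the same feature extraction network from both branches.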
About the journal:
The IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing addresses the growing field of applications in Earth observations and remote sensing, and also provides a venue for the rapidly expanding special issues that are being sponsored by the IEEE Geoscience and Remote Sensing Society. The journal draws upon the experience of the highly successful “IEEE Transactions on Geoscience and Remote Sensing” and provides a complementary medium for the wide range of topics in applied Earth observations. The “Applications” areas encompass the societal benefit areas of the Global Earth Observation System of Systems (GEOSS) program. Through deliberations over two years, ministers from 50 countries agreed to identify nine areas where Earth observation could positively impact the quality of life and health of their respective countries. Some of these areas, including biodiversity, health, and climate, are not traditionally addressed in the IEEE context. Yet it is the skill sets of IEEE members, in areas such as observations, communications, computers, signal processing, standards, and ocean engineering, that form the technical underpinnings of GEOSS. Thus, the journal attracts a broad range of interests, which both serves present members in new ways and expands IEEE visibility into new areas.