Few-shot Remote Sensing Imagery Recognition with Compositionality Inductive Bias in Hierarchical Representation Space
Authors: Shichao Zhou; Zhuowei Wang; Zekai Zhang; Wenzheng Wang; Yingrui Zhao; Yunpu Zhang
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 3544-3555
Publication date: 2025-01-01
DOI: 10.1109/JSTARS.2024.3524573
URL: https://ieeexplore.ieee.org/document/10819630/
Impact factor: 4.7 (JCR Q1, Engineering, Electrical & Electronic)
Citations: 0
Abstract
Remote sensing scenes viewed from an aerial perspective can be composed of distinct visual parts in a combinatorial number of ways. This combinatorial explosion makes it hard to understand remote sensing imagery (RSI) from only a few prior instances (i.e., few-shot RSI recognition). Despite the empirical success of existing methods such as data augmentation and knowledge transfer, no large-scale dataset can cover all possible combinations of visual parts. The prior knowledge learned by these data-driven methods may therefore exhibit dataset bias, resulting in inadequate generalization to the recognition task at hand. In contrast to these purely data-driven strategies, we focus on careful feature modeling by constraining the mapping behavior of deep neural networks. Specifically, we embed an inductive bias of compositionality into a hierarchical latent representation space, which operates on two aspects. 1) Disentangled and reusable representation: we establish a clustering-oriented factorized representation with a mixture model to represent multipart distributions of tokens. Each cluster centroid represents a recurring part; new patches are allocated to the nearest cluster centroid, which yields the posterior representation. 2) Compositional and discriminative representation: we introduce a hierarchical context prediction mechanism for compositional representation learning, using a predictive noise-contrastive estimation (NCE) loss function to encourage global remote sensing scenes to accurately predict similar local parts, thereby automatically inferring compositional representations of high-level yet discriminative latent concepts. Extensive experiments, including comparisons with state-of-the-art (SOTA) methods, sensitivity evaluations, and ablation studies, demonstrate that our method achieves comparable or even superior performance in few-shot RSI recognition.
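The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an equal-prior, isotropic mixture for the part posterior and the commonly used InfoNCE form for the predictive NCE loss. All function names, the temperature values, and the toy vectors are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def part_posterior(patch, centroids, temperature=1.0):
    # Posterior over part clusters for one patch token.
    # Assumption: equal priors and isotropic components, so the posterior
    # is a softmax over negative squared distances to the centroids.
    scores = [-sum((p - c) ** 2 for p, c in zip(patch, centroid)) / temperature
              for centroid in centroids]
    return softmax(scores)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(global_vecs, local_vecs, temperature=0.1):
    # InfoNCE: the i-th global (scene) vector should score its own local
    # (part) vector higher than the other local vectors in the batch.
    losses = []
    for i, g in enumerate(global_vecs):
        probs = softmax([cosine(g, l) / temperature for l in local_vecs])
        losses.append(-math.log(probs[i]))
    return sum(losses) / len(losses)

# Toy part centroids and patch tokens (hypothetical 2-D features).
centroids = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
patches = [[0.1, -0.2], [3.8, 0.3], [0.2, 4.1], [4.2, -0.1]]
posteriors = [part_posterior(p, centroids) for p in patches]
# Hard allocation: index of the nearest centroid (highest posterior).
assignments = [max(range(len(centroids)), key=post.__getitem__)
               for post in posteriors]

# Toy scene-level and part-level vectors; matched pairs share an index.
scene_vecs = [[1.0, 0.1], [0.1, 1.0], [-1.0, 0.2]]
part_vecs = [[0.9, 0.2], [0.2, 0.9], [-0.8, 0.1]]
aligned_loss = info_nce(scene_vecs, part_vecs)
shuffled_loss = info_nce(scene_vecs, part_vecs[1:] + part_vecs[:1])
```

In this sketch the loss is lower when each scene vector is paired with its own part vector than when the pairing is shuffled, which is the behavior a contrastive objective of this kind is meant to reward.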
About the journal:
The IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing addresses the growing field of applications in Earth observations and remote sensing, and also provides a venue for the rapidly expanding special issues sponsored by the IEEE Geoscience and Remote Sensing Society. The journal draws upon the experience of the highly successful “IEEE Transactions on Geoscience and Remote Sensing” and provides a complementary medium for the wide range of topics in applied Earth observations. The ‘Applications’ areas encompass the societal benefit areas of the Global Earth Observation System of Systems (GEOSS) program. Through deliberations over two years, ministers from 50 countries agreed to identify nine areas where Earth observation could positively impact the quality of life and health of their respective countries. Some of these areas, including biodiversity, health, and climate, are not traditionally addressed in the IEEE context. Yet it is the skill sets of IEEE members, in areas such as observations, communications, computers, signal processing, standards, and ocean engineering, that form the technical underpinnings of GEOSS. Thus, the journal attracts a broad range of interests, both serving present members in new ways and expanding IEEE's visibility into new areas.