DKETFormer: Salient object detection in optical remote sensing images based on discriminative knowledge extraction and transfer
Authors: Yuze Sun, Hongwei Zhao, Jianhang Zhou
Journal: Neurocomputing, Volume 625, Article 129558 (Impact Factor 5.5; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2025-02-04
Publication type: Journal Article
DOI: 10.1016/j.neucom.2025.129558
URL: https://www.sciencedirect.com/science/article/pii/S0925231225002309
Citations: 0
Abstract
Most methods for salient object detection in optical remote sensing images (ORSI-SOD) are based on convolutional neural networks (CNNs). However, owing to their architectural characteristics, CNNs encode only local semantic information, so discriminative features at large scales remain underexplored. Therefore, to encode long-range dependencies within the image, enhance the extraction of discriminative knowledge, and transfer that knowledge across multiple scales, we introduce a Transformer-based architecture called DKETFormer. Specifically, DKETFormer uses a Transformer backbone to obtain multi-scale feature maps that encode long-range dependencies. It then builds a decoder from two modules: the Cross-spatial Knowledge Extraction Module (CKEM) and the Inter-layer Feature Transfer Module (IFTM). The CKEM extracts discriminative information across receptive fields while preserving the knowledge in each channel. It also uses global information encoding to calibrate channel weights, which improves knowledge aggregation and the capture of pixel-level pairwise relationships. The IFTM takes the information encoded by the backbone and extracted by the CKEM and applies a self-attention mechanism with cosine similarity knowledge to model and propagate discriminative features. Finally, a salient object detector generates the final detection map. Comparative and ablation experiments demonstrate the effectiveness of the proposed DKETFormer and its internal modules.
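The abstract does not spell out how the IFTM combines self-attention with cosine similarity, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that queries, keys, and values are the input tokens themselves (no learned projections) and introduces a hypothetical `temperature` parameter; the function names are illustrative.

```python
import math

def l2_normalize(vec, eps=1e-8):
    """Scale a vector to (approximately) unit L2 norm; eps avoids division by zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine_self_attention(tokens, temperature=0.1):
    """Self-attention whose scores are cosine similarities between tokens.

    tokens: list of N feature vectors, each a list of D floats.
    temperature: hypothetical scaling factor (an assumption of this sketch);
        smaller values sharpen the attention weights.
    Returns (outputs, weights): N output vectors and the N x N weight matrix.
    """
    unit = [l2_normalize(t) for t in tokens]
    dim = len(tokens[0])
    outputs, weights = [], []
    for q in unit:
        # The dot product of unit vectors is the cosine similarity, in [-1, 1].
        scores = [sum(a * b for a, b in zip(q, k)) / temperature for k in unit]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted sum of the original (unnormalized) tokens.
        outputs.append(
            [sum(w[j] * tokens[j][d] for j in range(len(tokens))) for d in range(dim)]
        )
    return outputs, weights

# Three toy 2-D tokens: the first two point in nearly the same direction.
tokens = [[1.0, 0.0], [2.0, 0.1], [0.0, 3.0]]
outputs, weights = cosine_self_attention(tokens)
```

Because the scores depend only on direction, the first token attends to the second (nearly parallel, though twice as long) far more strongly than to the third (orthogonal). Scaled dot-product attention, by contrast, would also be influenced by token magnitude.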
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.