Cross-Model Nested Fusion Network for Salient Object Detection in Optical Remote Sensing Images

Mingzhu Xu, Sen Wang, Yupeng Hu, Haoyu Tang, Runmin Cong, Liqiang Nie

IEEE Transactions on Cybernetics, published 2025-09-12. DOI: 10.1109/tcyb.2025.3571913
Citations: 0
Abstract
Recently, salient object detection (SOD) in optical remote sensing images, dubbed ORSI-SOD, has attracted increasing research interest. Although deep learning-based models have achieved impressive performance, several challenges remain: a single image may contain multiple objects of varying scales, complex topological structures, and background interference. These unresolved issues render ORSI-SOD a challenging task. To address these challenges, we introduce a distinctive cross-model nested fusion network (CMNFNet), which leverages heterogeneous features to improve ORSI-SOD performance. Specifically, the proposed model comprises two heterogeneous encoders: a conventional CNN-based encoder that models local features, and a specially designed graph convolutional network (GCN)-based encoder with both local and global receptive fields that models local and global features simultaneously. To effectively differentiate between multiple salient objects of different sizes or complex topological structures within an image, we project the image into two graphs with different receptive fields and conduct message passing through two parallel graph convolutions. Finally, the heterogeneous features extracted from the two encoders are fused in a well-designed attention-enhanced cross-model nested fusion module (AECMNFM). This module is crafted to integrate features progressively, allowing the model to adaptively suppress background interference while simultaneously refining the feature representations. We conducted comprehensive experimental analyses on benchmark datasets. The results demonstrate the superiority of our CMNFNet over 16 state-of-the-art (SOTA) models.
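The abstract does not give implementation details, but the core GCN-encoder idea, projecting the input into two graphs with different receptive fields and running message passing through two parallel graph convolutions before fusing, can be illustrated with a minimal NumPy sketch. All names, graph constructions, and the concatenation-based fusion here are hypothetical stand-ins, not the paper's actual design:

```python
import numpy as np

def normalize_adj(A):
    # Symmetric normalization D^-1/2 (A + I) D^-1/2, standard in GCNs
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(A_norm, X, W):
    # One round of message passing: aggregate neighbor features, project, ReLU
    return np.maximum(A_norm @ X @ W, 0.0)

rng = np.random.default_rng(0)
n, d_in, d_out = 6, 4, 3
X = rng.standard_normal((n, d_in))  # node features (e.g., from image regions)

# Two hypothetical graphs over the same nodes: a sparse "local" one
# (chain of neighbors) and a dense "global" one (different receptive fields).
A_local = np.diag(np.ones(n - 1), 1)
A_local = A_local + A_local.T
A_global = np.ones((n, n)) - np.eye(n)

W_local = rng.standard_normal((d_in, d_out))
W_global = rng.standard_normal((d_in, d_out))

# Parallel message passing on the two graphs, then a simple fusion
H_local = gcn_layer(normalize_adj(A_local), X, W_local)
H_global = gcn_layer(normalize_adj(A_global), X, W_global)
H_fused = np.concatenate([H_local, H_global], axis=1)
print(H_fused.shape)  # (6, 6)
```

In CMNFNet the fusion is performed by the attention-enhanced AECMNFM module rather than plain concatenation; the sketch only shows why two receptive fields yield complementary node representations.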
Journal Scope
The scope of the IEEE Transactions on Cybernetics includes computational approaches to the field of cybernetics. Specifically, the Transactions welcomes papers on communication and control across machines, or between machines, humans, and organizations. The scope includes such areas as computational intelligence, computer vision, neural networks, genetic algorithms, machine learning, fuzzy systems, cognitive systems, decision making, and robotics, to the extent that they contribute to the theme of cybernetics or demonstrate an application of cybernetics principles.