Shiyuan Li, Hongbo Bi, Disen Mo, Cong Zhang, Yue Li
{"title":"ECNet:用于协同伪装目标检测的边缘引导和交叉图像感知网络","authors":"Shiyuan Li , Hongbo Bi , Disen Mo , Cong Zhang , Yue Li","doi":"10.1016/j.imavis.2025.105697","DOIUrl":null,"url":null,"abstract":"<div><div>Traditional camouflaged object detection (COD) methods typically focus on individual images, ignoring the contextual information from multiple related images. However, objects are often captured in multiple images or from different viewpoints in real scenarios. Leveraging collaborative information from multiple images can achieve more robust and accurate detection. This collaborative approach, known as “Collaborative Camouflaged Object Detection (CoCOD)”, addresses the limitations of single-image methods by exploiting complementary information from multiple images, enhancing detection performance. Recent advancements in CoCOD have shown notable progress. However, challenges remain in effectively extracting multi-scale features and facilitating cross-attention feature interactions. To address these limitations, we propose a novel framework, named the Edge-Guided and Cross-Image Perception Network (ECNet). The ECNet consists of two core components: the edge-guided scale module (EGSM) and the cross-image perception enhancement module (CPEM). Specifically, EGSM enhances feature extraction by integrating edge-aware guidance with multi-scale asymmetric convolutions. Meanwhile, CPEM strengthens cross-image feature interaction by introducing collaborative attention, which reinforces semantic consistency among correlated targets and suppresses distracting background information. By integrating edge-aware features across multiple spatial scales and cross-image semantic consistency, ECNet effectively addresses the challenges of camouflage detection in visually complex scenarios. Extensive experiments on the CoCOD8K dataset demonstrate that our proposed ECNet outperforms 18 state-of-the-art COD methods, 11 co-salient object detection (CoSOD) models, and 4 CoCOD approaches, as evaluated by six widely used metrics.</div></div>","PeriodicalId":50374,"journal":{"name":"Image and Vision Computing","volume":"162 ","pages":"Article 105697"},"PeriodicalIF":4.2000,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ECNet: An edge-guided and cross-image perception network for collaborative camouflaged object detection\",\"authors\":\"Shiyuan Li , Hongbo Bi , Disen Mo , Cong Zhang , Yue Li\",\"doi\":\"10.1016/j.imavis.2025.105697\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Traditional camouflaged object detection (COD) methods typically focus on individual images, ignoring the contextual information from multiple related images. However, objects are often captured in multiple images or from different viewpoints in real scenarios. Leveraging collaborative information from multiple images can achieve more robust and accurate detection. This collaborative approach, known as “Collaborative Camouflaged Object Detection (CoCOD)”, addresses the limitations of single-image methods by exploiting complementary information from multiple images, enhancing detection performance. Recent advancements in CoCOD have shown notable progress. However, challenges remain in effectively extracting multi-scale features and facilitating cross-attention feature interactions. To address these limitations, we propose a novel framework, named the Edge-Guided and Cross-Image Perception Network (ECNet). 
The ECNet consists of two core components: the edge-guided scale module (EGSM) and the cross-image perception enhancement module (CPEM). Specifically, EGSM enhances feature extraction by integrating edge-aware guidance with multi-scale asymmetric convolutions. Meanwhile, CPEM strengthens cross-image feature interaction by introducing collaborative attention, which reinforces semantic consistency among correlated targets and suppresses distracting background information. By integrating edge-aware features across multiple spatial scales and cross-image semantic consistency, ECNet effectively addresses the challenges of camouflage detection in visually complex scenarios. Extensive experiments on the CoCOD8K dataset demonstrate that our proposed ECNet outperforms 18 state-of-the-art COD methods, 11 co-salient object detection (CoSOD) models, and 4 CoCOD approaches, as evaluated by six widely used metrics.</div></div>\",\"PeriodicalId\":50374,\"journal\":{\"name\":\"Image and Vision Computing\",\"volume\":\"162 \",\"pages\":\"Article 105697\"},\"PeriodicalIF\":4.2000,\"publicationDate\":\"2025-08-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Image and Vision Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0262885625002859\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Image and Vision Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0262885625002859","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
ECNet: An edge-guided and cross-image perception network for collaborative camouflaged object detection
Traditional camouflaged object detection (COD) methods typically focus on individual images, ignoring contextual information from multiple related images. In real-world scenarios, however, objects are often captured in multiple images or from different viewpoints, and leveraging collaborative information across those images enables more robust and accurate detection. This collaborative approach, known as “Collaborative Camouflaged Object Detection (CoCOD)”, addresses the limitations of single-image methods by exploiting complementary information from multiple images, enhancing detection performance. Recent work on CoCOD has made notable progress, but challenges remain in effectively extracting multi-scale features and facilitating cross-attention feature interactions. To address these limitations, we propose a novel framework, the Edge-Guided and Cross-Image Perception Network (ECNet). ECNet consists of two core components: the edge-guided scale module (EGSM) and the cross-image perception enhancement module (CPEM). Specifically, EGSM enhances feature extraction by integrating edge-aware guidance with multi-scale asymmetric convolutions, while CPEM strengthens cross-image feature interaction by introducing collaborative attention, which reinforces semantic consistency among correlated targets and suppresses distracting background information. By integrating edge-aware features across multiple spatial scales with cross-image semantic consistency, ECNet effectively addresses the challenges of camouflage detection in visually complex scenes. Extensive experiments on the CoCOD8K dataset demonstrate that ECNet outperforms 18 state-of-the-art COD methods, 11 co-salient object detection (CoSOD) models, and 4 CoCOD approaches on six widely used metrics.
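The abstract only names the two modules, so the following is a minimal PyTorch sketch of how such components could be wired: multi-scale asymmetric convolutions gated by a predicted edge map (an EGSM-like block), and group-level collaborative attention in which each image attends to features pooled over its related images (a CPEM-like block). Every class name, channel size, and wiring choice below is an assumption made for illustration; the paper's actual EGSM and CPEM may differ.

```python
# Hypothetical sketch of the two components described in the abstract.
# All module names, shapes, and design details are illustrative
# assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class EGSM(nn.Module):
    """Edge-guided scale module (sketch): multi-scale asymmetric
    convolutions modulated by a predicted edge-attention map."""
    def __init__(self, channels: int):
        super().__init__()
        # Asymmetric (k x 1 followed by 1 x k) convolution pairs
        # at several receptive-field scales.
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, (k, 1), padding=(k // 2, 0)),
                nn.Conv2d(channels, channels, (1, k), padding=(0, k // 2)),
            )
            for k in (3, 5, 7)
        )
        self.edge_head = nn.Conv2d(channels, 1, 1)  # predicts an edge map
        self.fuse = nn.Conv2d(channels * 3, channels, 1)

    def forward(self, x):
        edge = torch.sigmoid(self.edge_head(x))        # edge-aware guidance
        feats = [b(x) * edge + x for b in self.branches]
        return self.fuse(torch.cat(feats, dim=1)), edge


class CPEM(nn.Module):
    """Cross-image perception enhancement module (sketch): each image in
    a related group attends to group-pooled keys/values, reinforcing
    shared semantics and damping image-specific background."""
    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, x):                  # x: (N, C, H, W), N related images
        n, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)        # (N, HW, C)
        # Average keys/values over the group so every image attends to a
        # group-level "consensus" representation.
        k = self.k(x).flatten(2).mean(0, keepdim=True).expand(n, -1, -1)
        v = self.v(x).flatten(2).mean(0, keepdim=True).expand(n, -1, -1)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)  # (N, HW, HW)
        out = (attn @ v.transpose(1, 2)).transpose(1, 2).reshape(n, c, h, w)
        return out + x                                  # residual enhancement


# Toy usage on a group of 4 related feature maps.
x = torch.randn(4, 64, 44, 44)
y, edge = EGSM(64)(x)
z = CPEM(64)(y)
print(z.shape)  # torch.Size([4, 64, 44, 44])
```

Pooling keys and values over the image group is just one plausible way to realize "collaborative attention"; full pairwise cross-image attention would be another, at higher memory cost.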
Journal introduction:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high-quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real-world scenes. It seeks to deepen understanding of the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, and image databases.