Shiyuan Li, Hongbo Bi, Disen Mo, Cong Zhang, Yue Li
{"title":"ECNet:用于协同伪装目标检测的边缘引导和交叉图像感知网络","authors":"Shiyuan Li , Hongbo Bi , Disen Mo , Cong Zhang , Yue Li","doi":"10.1016/j.imavis.2025.105697","DOIUrl":null,"url":null,"abstract":"<div><div>Traditional camouflaged object detection (COD) methods typically focus on individual images, ignoring the contextual information from multiple related images. However, objects are often captured in multiple images or from different viewpoints in real scenarios. Leveraging collaborative information from multiple images can achieve more robust and accurate detection. This collaborative approach, known as “Collaborative Camouflaged Object Detection (CoCOD)”, addresses the limitations of single-image methods by exploiting complementary information from multiple images, enhancing detection performance. Recent advancements in CoCOD have shown notable progress. However, challenges remain in effectively extracting multi-scale features and facilitating cross-attention feature interactions. To address these limitations, we propose a novel framework, named the Edge-Guided and Cross-Image Perception Network (ECNet). The ECNet consists of two core components: the edge-guided scale module (EGSM) and the cross-image perception enhancement module (CPEM). Specifically, EGSM enhances feature extraction by integrating edge-aware guidance with multi-scale asymmetric convolutions. Meanwhile, CPEM strengthens cross-image feature interaction by introducing collaborative attention, which reinforces semantic consistency among correlated targets and suppresses distracting background information. By integrating edge-aware features across multiple spatial scales and cross-image semantic consistency, ECNet effectively addresses the challenges of camouflage detection in visually complex scenarios. Extensive experiments on the CoCOD8K dataset demonstrate that our proposed ECNet outperforms 18 state-of-the-art COD methods, 11 co-salient object detection (CoSOD) models, and 4 CoCOD approaches, as evaluated by six widely used metrics.</div></div>","PeriodicalId":50374,"journal":{"name":"Image and Vision Computing","volume":"162 ","pages":"Article 105697"},"PeriodicalIF":4.2000,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ECNet: An edge-guided and cross-image perception network for collaborative camouflaged object detection\",\"authors\":\"Shiyuan Li , Hongbo Bi , Disen Mo , Cong Zhang , Yue Li\",\"doi\":\"10.1016/j.imavis.2025.105697\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Traditional camouflaged object detection (COD) methods typically focus on individual images, ignoring the contextual information from multiple related images. However, objects are often captured in multiple images or from different viewpoints in real scenarios. Leveraging collaborative information from multiple images can achieve more robust and accurate detection. This collaborative approach, known as “Collaborative Camouflaged Object Detection (CoCOD)”, addresses the limitations of single-image methods by exploiting complementary information from multiple images, enhancing detection performance. Recent advancements in CoCOD have shown notable progress. However, challenges remain in effectively extracting multi-scale features and facilitating cross-attention feature interactions. To address these limitations, we propose a novel framework, named the Edge-Guided and Cross-Image Perception Network (ECNet). 
The ECNet consists of two core components: the edge-guided scale module (EGSM) and the cross-image perception enhancement module (CPEM). Specifically, EGSM enhances feature extraction by integrating edge-aware guidance with multi-scale asymmetric convolutions. Meanwhile, CPEM strengthens cross-image feature interaction by introducing collaborative attention, which reinforces semantic consistency among correlated targets and suppresses distracting background information. By integrating edge-aware features across multiple spatial scales and cross-image semantic consistency, ECNet effectively addresses the challenges of camouflage detection in visually complex scenarios. Extensive experiments on the CoCOD8K dataset demonstrate that our proposed ECNet outperforms 18 state-of-the-art COD methods, 11 co-salient object detection (CoSOD) models, and 4 CoCOD approaches, as evaluated by six widely used metrics.</div></div>\",\"PeriodicalId\":50374,\"journal\":{\"name\":\"Image and Vision Computing\",\"volume\":\"162 \",\"pages\":\"Article 105697\"},\"PeriodicalIF\":4.2000,\"publicationDate\":\"2025-08-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Image and Vision Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0262885625002859\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Image and Vision Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0262885625002859","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
ECNet: An edge-guided and cross-image perception network for collaborative camouflaged object detection
Traditional camouflaged object detection (COD) methods typically focus on individual images, ignoring contextual information from multiple related images. In real-world scenarios, however, objects are often captured in multiple images or from different viewpoints, and leveraging collaborative information across those images enables more robust and accurate detection. This collaborative approach, known as “Collaborative Camouflaged Object Detection (CoCOD)”, addresses the limitations of single-image methods by exploiting complementary information from multiple images, enhancing detection performance. Recent work on CoCOD has made notable progress, but challenges remain in effectively extracting multi-scale features and facilitating cross-attention feature interactions. To address these limitations, we propose a novel framework, the Edge-Guided and Cross-Image Perception Network (ECNet). ECNet consists of two core components: the edge-guided scale module (EGSM) and the cross-image perception enhancement module (CPEM). Specifically, EGSM enhances feature extraction by integrating edge-aware guidance with multi-scale asymmetric convolutions, while CPEM strengthens cross-image feature interaction by introducing collaborative attention, which reinforces semantic consistency among correlated targets and suppresses distracting background information. By integrating edge-aware features across multiple spatial scales with cross-image semantic consistency, ECNet effectively addresses the challenges of camouflage detection in visually complex scenes. Extensive experiments on the CoCOD8K dataset demonstrate that ECNet outperforms 18 state-of-the-art COD methods, 11 co-salient object detection (CoSOD) models, and 4 CoCOD approaches on six widely used metrics.
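The abstract only names the two modules, so the following is a minimal PyTorch sketch of how such components could be wired: multi-scale asymmetric convolutions gated by a predicted edge map (an EGSM-like block), and group-level collaborative attention in which each image attends to features pooled over its related images (a CPEM-like block). Every class name, channel size, and wiring choice below is an assumption made for illustration; the paper's actual EGSM and CPEM may differ.

```python
# Hypothetical sketch of the two components described in the abstract.
# All module names, shapes, and design details are illustrative
# assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class EGSM(nn.Module):
    """Edge-guided scale module (sketch): multi-scale asymmetric
    convolutions modulated by a predicted edge-attention map."""
    def __init__(self, channels: int):
        super().__init__()
        # Asymmetric (k x 1 followed by 1 x k) convolution pairs
        # at several receptive-field scales.
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, (k, 1), padding=(k // 2, 0)),
                nn.Conv2d(channels, channels, (1, k), padding=(0, k // 2)),
            )
            for k in (3, 5, 7)
        )
        self.edge_head = nn.Conv2d(channels, 1, 1)  # predicts an edge map
        self.fuse = nn.Conv2d(channels * 3, channels, 1)

    def forward(self, x):
        edge = torch.sigmoid(self.edge_head(x))        # edge-aware guidance
        feats = [b(x) * edge + x for b in self.branches]
        return self.fuse(torch.cat(feats, dim=1)), edge


class CPEM(nn.Module):
    """Cross-image perception enhancement module (sketch): each image in
    a related group attends to group-pooled keys/values, reinforcing
    shared semantics and damping image-specific background."""
    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, x):                  # x: (N, C, H, W), N related images
        n, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)        # (N, HW, C)
        # Average keys/values over the group so every image attends to a
        # group-level "consensus" representation.
        k = self.k(x).flatten(2).mean(0, keepdim=True).expand(n, -1, -1)
        v = self.v(x).flatten(2).mean(0, keepdim=True).expand(n, -1, -1)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)  # (N, HW, HW)
        out = (attn @ v.transpose(1, 2)).transpose(1, 2).reshape(n, c, h, w)
        return out + x                                  # residual enhancement


# Toy usage on a group of 4 related feature maps.
x = torch.randn(4, 64, 44, 44)
y, edge = EGSM(64)(x)
z = CPEM(64)(y)
print(z.shape)  # torch.Size([4, 64, 44, 44])
```

Pooling keys and values over the image group is just one plausible way to realize "collaborative attention"; full pairwise cross-image attention would be another, at higher memory cost.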
Journal introduction:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high-quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real-world scenes. It seeks to deepen understanding of the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, and image databases.