SPANet: Spatial perceptual activation network for camouflaged object detection

IF 1.3 | Zone 4 (Computer Science) | Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IET Computer Vision Pub Date : 2024-09-18 DOI:10.1049/cvi2.12310

Jianhao Zhang, Gang Yang, Xun Dai, Pengyu Yang

IET Computer Vision, Volume 18, Issue 8, pages 1300-1312
Article page: https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/cvi2.12310
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12310

Citations: 0

Abstract

Camouflaged object detection (COD) aims to segment objects that blend into their environment from the background. Most existing methods are easily affected by background interference in cluttered environments and cannot accurately locate camouflaged regions, resulting in over-segmentation or incomplete segmentation structures. To improve COD performance, the authors propose a spatial perceptual activation network (SPANet). SPANet extracts the spatial positional relationships among the objects in a scene by activating spatial perception and uses them as global information to guide segmentation. It consists mainly of three modules: a perceptual activation module (PAM), a feature inference module (FIM), and an interaction recovery module (IRM). Specifically, the PAM models the positional relationship between the camouflaged object and its surrounding environment to obtain semantic correlation information. The FIM then combines this correlation information to suppress background interference and re-encodes the result to generate multi-scale features. Finally, to further fuse the multi-scale features, the IRM mines the complementary information and the differences between features at different scales. Extensive experiments on four widely used benchmark datasets (CAMO, CHAMELEON, COD10K, and NC4K) show that the authors' method outperforms 13 state-of-the-art methods.
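The data flow described in the abstract (PAM produces correlation information, FIM uses it to suppress background and emit multi-scale features, IRM fuses those scales) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no layer-level details, so every function name (`pam`, `fim`, `irm`, `resize`, `spanet_sketch`), the 1-D feature vectors, the sigmoid gating, and the fusion formula are assumptions chosen only to make the three-stage structure concrete.

```python
import math
from typing import List, Sequence

Vec = List[float]  # toy stand-in for a feature map: one value per position


def resize(v: Vec, n: int) -> Vec:
    """Nearest-neighbour resampling of a 1-D feature vector to length n."""
    return [v[i * len(v) // n] for i in range(n)]


def pam(feat: Vec) -> Vec:
    """Hypothetical PAM stand-in: score each position against the scene-wide
    mean (used here as the 'global information') with a sigmoid."""
    mean = sum(feat) / len(feat)
    return [1.0 / (1.0 + math.exp(-(x - mean))) for x in feat]


def fim(feat: Vec, corr: Vec, scales: Sequence[int] = (1, 2, 4)) -> List[Vec]:
    """Hypothetical FIM stand-in: gate features with the correlation scores
    to damp background, then re-encode at several scales (fine to coarse)."""
    gated = [x * c for x, c in zip(feat, corr)]
    return [resize(gated, max(1, len(gated) // s)) for s in scales]


def irm(feats: List[Vec]) -> Vec:
    """Hypothetical IRM stand-in: fuse from coarse to fine, combining a
    complementary term (average) with a difference term (absolute gap)."""
    fused = feats[-1]
    for finer in reversed(feats[:-1]):
        up = resize(fused, len(finer))
        fused = [0.5 * (a + b) + abs(a - b) for a, b in zip(finer, up)]
    return fused


def spanet_sketch(feat: Vec) -> Vec:
    """Chain the three stand-in modules in the order the abstract describes."""
    corr = pam(feat)
    return irm(fim(feat, corr))


if __name__ == "__main__":
    # Positions 2 and 3 play the role of the camouflaged object.
    scene = [0.0, 0.0, 5.0, 5.0, 0.0, 0.0, 0.0, 0.0]
    print(spanet_sketch(scene))
```

In this toy, the output keeps the input resolution and the positions that stand out from the scene-wide mean end up with the largest responses. A real network would replace each stand-in with learned layers operating on 2-D backbone features.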




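Source journal

IET Computer Vision (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 3.30
Self-citation rate: 11.80%
Articles published:
Review time: 3.4 months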

About the journal: IET Computer Vision seeks original research papers in a wide range of areas of computer vision. The vision of the journal is to publish the highest quality research that is relevant and topical to the field, while not forgetting work that aims to introduce new horizons and set the agenda for future avenues of research in computer vision.

IET Computer Vision welcomes submissions on the following topics: Biologically and perceptually motivated approaches to low level vision (feature detection, etc.); Perceptual grouping and organisation; Representation, analysis and matching of 2D and 3D shape; Shape-from-X; Object recognition; Image understanding; Learning with visual inputs; Motion analysis and object tracking; Multiview scene analysis; Cognitive approaches in low, mid and high level vision; Control in visual systems; Colour, reflectance and light; Statistical and probabilistic models; Face and gesture; Surveillance; Biometrics and security; Robotics; Vehicle guidance; Automatic model acquisition; Medical image analysis and understanding; Aerial scene analysis and remote sensing; Deep learning models in computer vision.

Both methodological and application-oriented papers are welcome. Submitted manuscripts are expected to include a detailed and analytical review of the literature, a state-of-the-art exposition of the proposed original research and its methodology, a thorough experimental evaluation, and, last but not least, a comparative evaluation against relevant state-of-the-art methods. Submissions that do not meet these minimum requirements may be returned to the authors without being sent for review.

Special Issues, Current Calls for Papers:
Computer Vision for Smart Cameras and Camera Networks - https://digital-library.theiet.org/files/IET_CVI_SC.pdf
Computer Vision for the Creative Industries - https://digital-library.theiet.org/files/IET_CVI_CVCI.pdf