High-Precision Dichotomous Image Segmentation With Frequency and Scale Awareness

IF 8.9 | Zone 1, Computer Science | JCR Q1: Computer Science, Artificial Intelligence

IEEE Transactions on Neural Networks and Learning Systems | Pub Date: 2024-08-16 | DOI: 10.1109/TNNLS.2024.3426529

Qiuping Jiang;Jinguang Cheng;Zongwei Wu;Runmin Cong;Radu Timofte

Volume 36, Issue 5, pp. 8619-8631 | IEEE Xplore: https://ieeexplore.ieee.org/document/10638122/

Citations: 0

Abstract

Dichotomous image segmentation (DIS), which must recover rich fine-grained details within a single image, is a challenging task. Although deep learning-based methods achieve plausible results, most of them fail to segment generic objects when the boundary is cluttered with the background. The gradual decrease in feature map resolution during the encoding stage and misleading texture cues may be the main causes. To handle these issues, we devise a novel frequency- and scale-aware deep neural network (FSANet) for high-precision DIS. The core of FSANet is twofold. First, a multimodality fusion (MF) module integrates information from the spatial and frequency domains to enhance the representation capability of image features. Second, a collaborative scale fusion module (CSFM), which departs from traditional serial structures, maintains high resolution throughout the feature encoding stage. On the decoder side, hierarchical context fusion (HCF) and selective feature fusion (SFF) modules infer the segmentation results from the output features of the CSFM. We conduct extensive experiments on several benchmark datasets and compare our method with existing state-of-the-art (SOTA) methods. The experimental results demonstrate that FSANet achieves superior performance both qualitatively and quantitatively. The code will be made available at https://github.com/chasecjg/FSANet.
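As a rough illustration of what combining spatial- and frequency-domain information can look like, the sketch below extracts the high-frequency part of a grayscale image with a 2-D FFT and adds it back to the original pixels. This is not the MF module of FSANet, whose internals the abstract does not describe; it is a generic high-pass-and-add example, and the function names and the `cutoff` and `weight` parameters are hypothetical choices made only for this illustration.

```python
import numpy as np


def high_frequency_component(image: np.ndarray, cutoff: float = 0.1) -> np.ndarray:
    """Return the high-frequency part of a 2-D grayscale image.

    `cutoff` (hypothetical parameter) is a radius in normalized frequency
    units (cycles per pixel); every frequency at or below it is zeroed.
    """
    height, width = image.shape
    # Centered spectrum: the zero-frequency (DC) term sits in the middle.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # Per-axis frequency coordinates, shifted the same way as the spectrum.
    freq_y = np.fft.fftshift(np.fft.fftfreq(height))[:, None]
    freq_x = np.fft.fftshift(np.fft.fftfreq(width))[None, :]
    radius = np.sqrt(freq_x ** 2 + freq_y ** 2)
    # Ideal high-pass mask: keep only frequencies beyond the cutoff radius.
    mask = radius > cutoff
    filtered = spectrum * mask
    # Back to the spatial domain; the imaginary part is numerical noise.
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))


def fuse_spatial_and_frequency(
    image: np.ndarray, cutoff: float = 0.1, weight: float = 0.5
) -> np.ndarray:
    """Add the weighted high-frequency component back onto the image.

    A simple additive fusion (an assumption for this sketch) that
    emphasizes edges and fine structures relative to smooth regions.
    """
    return image + weight * high_frequency_component(image, cutoff)
```

A flat image has no high-frequency content, so the fusion leaves it unchanged, while a fine checkerboard pattern is amplified. In a learned network the fixed mask and weight would typically be replaced by trainable operations.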

Source journal

IEEE Transactions on Neural Networks and Learning Systems
Categories: Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture

CiteScore: 23.80
Self-citation rate: 9.60%
Articles published: 2102
Review time: 3-8 weeks

About the journal: IEEE Transactions on Neural Networks and Learning Systems publishes scholarly articles on the theory, design, and applications of neural networks and other learning systems. The journal primarily highlights technical and scientific research in this domain.