FRISNET: A Fast Real-Time Instance Segmentation Network Fusing Frequency Domain and Multilevel Features

IF 5.9 · Zone 2 (Engineering & Technology) · Q1 ENGINEERING, ELECTRICAL & ELECTRONIC
Ying Xie;Jingkai Shang;Ruixiang Deng;Xianlun Tang;Wuqiang Yang
DOI: 10.1109/TIM.2025.3583292
Journal: IEEE Transactions on Instrumentation and Measurement, vol. 74, pp. 1-14
Published: 2025-06-26
Publication type: Journal Article
Open access: no
Source: https://ieeexplore.ieee.org/document/11053186/
Citations: 0

Abstract

Obtaining accurate instance locations and segmentation masks is challenging given the complexity and diversity of practical scenes. This article presents a fast real-time instance segmentation network (FRISNET) that fuses information from the frequency domain and the spatial domain. Building on You Only Look At CoefficienTs (YOLACT), which is the fastest instance segmentation method, the frequency-domain representation is introduced into a convolutional neural network (CNN). Using the fast Fourier transform (FFT), features of different frequencies extracted in the frequency domain are fused with the spatial-domain feature map, and the CNN then yields accurate global location information and clear semantic information. To exploit the high-resolution target-location and feature information of the bottom levels, together with the supervision information available at the same level, a new bottom-up feature fusion branch and same-level skip connections are added to the top-down feature pyramid network (FPN), giving the feature extraction network more diverse feature representations. The proposed instance segmentation model is trained on the open standard datasets PASCAL Semantic Boundaries Dataset (PASCAL SBD) and Microsoft Common Objects in Context (MS COCO). The results show that the proposed method improves instance segmentation accuracy: it achieves a mean average precision (mAP) of 34.5 at 31.77 frames/s (FPS) on MS COCO, which is 1.17% higher than YOLACT++ with the ResNet-50 backbone. The model's speed also shows potential for use in motion planning and tactile sensors for robotic grasping tasks, which could further enhance execution efficiency and operational reliability.
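The frequency/spatial fusion described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `split_frequency_bands` and `fuse_features`, the `cutoff` radius, and the fixed channel weights are all assumptions made for illustration, and a real network would learn the fusion weights and operate on multi-channel tensors.

```python
import numpy as np


def split_frequency_bands(feature_map, cutoff=0.25):
    """Split a 2-D feature map into low- and high-frequency parts via the FFT.

    `cutoff` is a hypothetical radius in normalized frequency
    (cycles per sample, at most 0.5 per axis); it is not from the paper.
    """
    height, width = feature_map.shape
    spectrum = np.fft.fft2(feature_map)

    # Normalized frequency of every FFT bin along each axis.
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    low_pass_mask = np.sqrt(fy ** 2 + fx ** 2) <= cutoff

    # The two masks are complementary, so low + high reconstructs the input.
    low = np.fft.ifft2(spectrum * low_pass_mask).real
    high = np.fft.ifft2(spectrum * ~low_pass_mask).real
    return low, high


def fuse_features(spatial_map, low, high, weights=(1.0, 0.5, 0.25)):
    """Fuse spatial and frequency-derived maps with a per-channel weighted sum.

    This mimics a 1x1 convolution over three stacked channels. The weights
    are placeholders (assumed); in a trained network they would be learned.
    """
    stacked = np.stack([spatial_map, low, high])          # shape (3, H, W)
    return np.tensordot(np.asarray(weights), stacked, axes=1)  # shape (H, W)


rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 8))
low, high = split_frequency_bands(feature_map)
fused = fuse_features(feature_map, low, high)
print(fused.shape)
```

Because the low-pass and high-pass masks partition the spectrum, the two bands sum back to the original map, so the fusion step only re-weights information rather than discarding it.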
Source journal
IEEE Transactions on Instrumentation and Measurement (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 9.00
Self-citation rate: 23.20%
Articles published: 1294
Review time: 3.9 months
Journal scope: Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support to establishment and maintenance of technical standards in the field of Instrumentation and Measurement.