Ying Xie;Jingkai Shang;Ruixiang Deng;Xianlun Tang;Wuqiang Yang
IEEE Transactions on Instrumentation and Measurement, vol. 74, pp. 1-14, published 2025-06-26
DOI: 10.1109/TIM.2025.3583292
URL: https://ieeexplore.ieee.org/document/11053186/
Impact factor: 5.9; JCR Q1 (Engineering, Electrical & Electronic)
FRISNET: A Fast Real-Time Instance Segmentation Network Fusing Frequency Domain and Multilevel Features
Obtaining accurate instance locations and segmentation masks is challenging given the intricacy and diversity of practical scenarios. This article presents a fast real-time instance segmentation network (FRISNET) that fuses information from the frequency domain and the spatial domain. Building on You Only Look At CoefficienTs (YOLACT), which is the fastest instance segmentation method, the frequency-domain representation is introduced into a convolutional neural network (CNN). Using the fast Fourier transform (FFT), features of different frequencies extracted from the frequency domain are fused with the feature map of the spatial domain. Accurate global location information and clear semantic information are obtained using the CNN. To exploit the high-resolution target location and feature information from the bottom level, as well as the supervision information available at the same level, a new bottom-up feature fusion branch and same-level skip connections are added to the top-down feature pyramid network (FPN), giving the feature extraction network more diverse feature representations. The proposed instance segmentation model is trained on two open standard datasets, the PASCAL Semantic Boundaries Dataset (PASCAL SBD) and Microsoft Common Objects in Context (MS COCO). The results show that the proposed method improves instance segmentation accuracy: it achieves a mean average precision (mAP) of 34.5 at 31.77 frames/s (FPS) on MS COCO, which is 1.17% higher than YOLACT++ with the ResNet-50 architecture. The model’s speed also shows potential for use in motion planning and with tactile sensors in robotic grasping tasks, which could further enhance execution efficiency and operational reliability.
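The frequency-domain fusion described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how frequency bands are selected or how they are fused with the spatial feature map, so the radial cutoff, the two-band split, the weighted-sum fusion, and the names `split_frequency_bands`, `fuse_frequency_and_spatial`, `cutoff`, `low_weight`, and `high_weight` are all assumptions made for illustration.

```python
import numpy as np


def split_frequency_bands(feature_map, cutoff=0.25):
    """Split a (C, H, W) feature map into low- and high-frequency parts.

    `cutoff` is a hypothetical radial threshold in cycles per sample
    (frequencies along each axis lie in [-0.5, 0.5)); it is not a value
    taken from the paper.
    """
    _, h, w = feature_map.shape
    # 2-D FFT over the spatial axes of every channel.
    spectrum = np.fft.fft2(feature_map, axes=(-2, -1))

    # Radial distance of every FFT bin from the zero-frequency bin.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)
    low_mask = radius <= cutoff

    # Inverse-transform each band back to the spatial domain. The mask is
    # symmetric, so the imaginary part is only floating-point noise.
    low = np.fft.ifft2(spectrum * low_mask, axes=(-2, -1)).real
    high = np.fft.ifft2(spectrum * ~low_mask, axes=(-2, -1)).real
    return low, high


def fuse_frequency_and_spatial(feature_map, cutoff=0.25,
                               low_weight=0.5, high_weight=0.5):
    """Fuse frequency-band features with the spatial feature map.

    A weighted sum is an assumed fusion operator; a learned fusion
    (for example a convolution over concatenated bands) is equally plausible.
    """
    low, high = split_frequency_bands(feature_map, cutoff)
    return feature_map + low_weight * low + high_weight * high


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((4, 16, 16))  # stand-in for a CNN feature map
    fused = fuse_frequency_and_spatial(features, low_weight=1.0, high_weight=0.2)
    print(fused.shape)
```

Because the two masks are complementary, the low and high bands sum back to the input, so the weights only control how strongly each band is re-emphasized relative to the original spatial features.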
About the journal:
Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support to establishment and maintenance of technical standards in the field of Instrumentation and Measurement.