{"title":"FFBGNet: Full-Flow Bidirectional Feature Fusion Grasp Detection Network Based on Hybrid Architecture","authors":"Qin Wan;Shunxing Ning;Haoran Tan;Yaonan Wang;Xiaogang Duan;Zhi Li;Yang Yang;Jianhua Qiu","doi":"10.1109/LRA.2024.3511410","DOIUrl":null,"url":null,"abstract":"Effectively integrating the complementary information from RGB-D images presents a significant challenge in robotic grasping. In this letter, we propose a full-flow bidirectional feature fusion grasp detection network (FFBGNet) based on a hybrid architecture to generate accurate grasp poses from RGB-D images. First, we construct an efficient Cross-Modal Feature fusion module as a bridge for information interaction in the full flow of the two branches, where fusion is applied to each encoding and decoding layer. Then, the two branches can fully leverage the appearance information in the RGB images and the geometry information from the depth images. Second, a hybrid architecture module for CNNs and Transformer parallel is developed to achieve better local feature and global information representations. Finally, we conduct qualitative and quantitative comparative experiments on the Cornell and Jacquard datasets, achieving grasping detection accuracies of 99.2\n<inline-formula><tex-math>${\\%}$</tex-math></inline-formula>\n and 96.5\n<inline-formula><tex-math>${\\%}$</tex-math></inline-formula>\n, respectively. Simultaneously, in physical grasping experiments, the FFBGNet achieves a 96.7\n<inline-formula><tex-math>${\\%}$</tex-math></inline-formula>\n success rate in cluttered scenes, which further demonstrates the reliability of the proposed method.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 2","pages":"971-978"},"PeriodicalIF":4.6000,"publicationDate":"2024-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Robotics and Automation Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10777534/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ROBOTICS","Score":null,"Total":0}
Abstract
Effectively integrating the complementary information from RGB-D images presents a significant challenge in robotic grasping. In this letter, we propose a full-flow bidirectional feature fusion grasp detection network (FFBGNet) based on a hybrid architecture to generate accurate grasp poses from RGB-D images. First, we construct an efficient cross-modal feature fusion module that serves as a bridge for information interaction across the full flow of the two branches, with fusion applied at every encoding and decoding layer. The two branches can thus fully leverage the appearance information in the RGB images and the geometry information in the depth images. Second, a hybrid architecture module that runs CNNs and a Transformer in parallel is developed to obtain better representations of local features and global information. Finally, we conduct qualitative and quantitative comparative experiments on the Cornell and Jacquard datasets, achieving grasp detection accuracies of 99.2% and 96.5%, respectively. In physical grasping experiments, FFBGNet achieves a 96.7% success rate in cluttered scenes, which further demonstrates the reliability of the proposed method.
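The abstract names two architectural ideas without giving their internals: a bidirectional cross-modal fusion step applied at every encoder/decoder stage, and a block in which a CNN path and a Transformer path run in parallel. The following is a minimal PyTorch sketch of what such modules could look like; all module names, channel sizes, gating choices, and fusion operators here are assumptions for illustration, not the authors' actual design.

```python
# Illustrative sketch only: the paper's exact module designs are not
# described in the abstract, so every design choice below is assumed.
import torch
import torch.nn as nn


class CrossModalFusion(nn.Module):
    """Bidirectional information exchange between an RGB branch and a
    depth branch. Each branch receives a gated residual computed from
    the other branch, so appearance and geometry cues flow both ways
    (an assumed realization of the paper's cross-modal fusion module)."""

    def __init__(self, channels: int):
        super().__init__()
        self.rgb_gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())
        self.depth_gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
        joint = torch.cat([rgb, depth], dim=1)
        rgb_out = rgb + self.rgb_gate(joint) * depth      # depth -> RGB flow
        depth_out = depth + self.depth_gate(joint) * rgb  # RGB -> depth flow
        return rgb_out, depth_out


class HybridBlock(nn.Module):
    """CNN and Transformer paths in parallel: local features from a 3x3
    convolution, global context from self-attention over flattened
    spatial tokens, summed at the output (assumed form)."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.conv(x)                       # local CNN path
        tokens = x.flatten(2).transpose(1, 2)      # (B, H*W, C) tokens
        global_, _ = self.attn(tokens, tokens, tokens)
        global_ = self.norm(tokens + global_)      # residual + LayerNorm
        global_ = global_.transpose(1, 2).reshape(b, c, h, w)
        return local + global_                     # merge parallel paths


if __name__ == "__main__":
    rgb = torch.randn(1, 32, 56, 56)
    depth = torch.randn(1, 32, 56, 56)
    rgb, depth = CrossModalFusion(32)(rgb, depth)
    out = HybridBlock(32)(rgb)
    print(out.shape)  # torch.Size([1, 32, 56, 56])
```

In the full network described by the abstract, a fusion step of this kind would be applied at each encoding and decoding layer so that the exchange spans the full flow of both branches, rather than fusing only once at the input or the bottleneck.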
About the journal
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.