Xiaoge Cao;Tao Lu;Liming Zheng;Yinghao Cai;Shuo Wang
{"title":"PLOT: Human-Like Push-Grasping Synergy Learning in Clutter With One-Shot Target Recognition","authors":"Xiaoge Cao;Tao Lu;Liming Zheng;Yinghao Cai;Shuo Wang","doi":"10.1109/TCDS.2024.3357084","DOIUrl":null,"url":null,"abstract":"In unstructured environments, robotic grasping tasks are frequently required to interactively search for and retrieve specific objects from a cluttered workspace under the condition that only partial information about the target is available, like images, text descriptions, 3-D models, etc. It is a great challenge to correctly recognize the targets with limited information and learn synergies between different action primitives to grasp the targets from densely occluding objects efficiently. In this article, we propose a novel human-like push-grasping method that could grasp unknown objects in clutter using only one target RGB with Depth (RGB-D) image, called push-grasping synergy learning in clutter with one-shot target recognition (PLOT). First, we propose a target recognition (TR) method which automatically segments the objects both from the query image and workspace image, and extract the robust features of each segmented object. Through the designed feature matching criterion, the targets could be quickly located in the workspace. Second, we introduce a self-supervised target-oriented grasping system based on synergies between push and grasp actions. In this system, we propose a salient Q (SQ)-learning framework that focuses the \n<italic>Q</i>\n value learning in the area including targets and a coordination mechanism (CM) that selects the proper actions to search and isolate the targets from the surrounding objects, even in the condition of targets invisible. Our method is inspired by the working memory mechanism of human brain and can grasp any target object shown through the image and has good generality in application. Experimental results in simulation and real-world show that our method achieved the best performance compared with the baselines in finding the unknown target objects from the cluttered environment with only one demonstrated target RGB-D image and had the high efficiency of grasping under the synergies of push and grasp actions.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0000,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Cognitive and Developmental Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10411941/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
In unstructured environments, robotic grasping tasks are frequently required to interactively search for and retrieve specific objects from a cluttered workspace under the condition that only partial information about the target is available, like images, text descriptions, 3-D models, etc. It is a great challenge to correctly recognize the targets with limited information and learn synergies between different action primitives to grasp the targets from densely occluding objects efficiently. In this article, we propose a novel human-like push-grasping method that could grasp unknown objects in clutter using only one target RGB with Depth (RGB-D) image, called push-grasping synergy learning in clutter with one-shot target recognition (PLOT). First, we propose a target recognition (TR) method which automatically segments the objects both from the query image and workspace image, and extract the robust features of each segmented object. Through the designed feature matching criterion, the targets could be quickly located in the workspace. Second, we introduce a self-supervised target-oriented grasping system based on synergies between push and grasp actions. In this system, we propose a salient Q (SQ)-learning framework that focuses the
Q
value learning in the area including targets and a coordination mechanism (CM) that selects the proper actions to search and isolate the targets from the surrounding objects, even in the condition of targets invisible. Our method is inspired by the working memory mechanism of human brain and can grasp any target object shown through the image and has good generality in application. Experimental results in simulation and real-world show that our method achieved the best performance compared with the baselines in finding the unknown target objects from the cluttered environment with only one demonstrated target RGB-D image and had the high efficiency of grasping under the synergies of push and grasp actions.
期刊介绍:
The IEEE Transactions on Cognitive and Developmental Systems (TCDS) focuses on advances in the study of development and cognition in natural (humans, animals) and artificial (robots, agents) systems. It welcomes contributions from multiple related disciplines including cognitive systems, cognitive robotics, developmental and epigenetic robotics, autonomous and evolutionary robotics, social structures, multi-agent and artificial life systems, computational neuroscience, and developmental psychology. Articles on theoretical, computational, application-oriented, and experimental studies as well as reviews in these areas are considered.