Semi-Supervised Language-Conditioned Grasping With Curriculum-Scheduled Augmentation and Geometric Consistency
Authors: Jialong Xie; Fengyu Zhou; Jin Liu; Chaoqun Wang
Journal: IEEE Robotics and Automation Letters, vol. 10, no. 4, pp. 4021-4028
Publication date: 2025-03-04
Publication type: Journal Article
DOI: 10.1109/LRA.2025.3547619
URL: https://ieeexplore.ieee.org/document/10909187/
Impact factor: 4.6 (JCR Q2, Robotics)
Citations: 0
Abstract
Language-Conditioned Grasping (LCG) is an essential skill for robotic manipulation and has attracted increasing interest. Recent LCG models have made great progress, but fully supervised learning requires numerous paired image-text-pose annotations, which are tedious and expensive to collect. Semi-supervised learning offers a viable solution, but existing methods still face two challenges when applied to LCG: (i) over-distorted data perturbations cause slow and unstable convergence for multi-modal inputs in the early stage of training; (ii) inconsistency between the perceived object location and the grasping location degrades grasp accuracy. In this letter, we propose a semi-supervised language-conditioned grasping framework that achieves data-efficient object grounding and grasp detection from a language description. Concretely, we introduce a Curriculum-Scheduled augmentation and Geometric Consistency (CSGC) strategy to address the above problems. We design a curriculum-scheduled augmentation that progressively increases data diversity from easy to difficult, facilitating stable knowledge distillation and model convergence. Meanwhile, we present a geometry-aware consistency regularization that constrains the region alignment between object perception and grasping confidence, improving the quality of pseudo-labels and grasp accuracy. Extensive experimental results demonstrate the effectiveness and practicality of the proposed method when labeled data are limited.
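The abstract does not specify the augmentation schedule or the form of the consistency term, so the following is only an illustrative sketch of the two ideas, not the authors' implementation. It assumes a cosine ramp for the augmentation strength and a simple "confidence mass outside the object region" penalty; the function names `augmentation_strength` and `region_consistency_loss`, and the parameters `s_min` and `s_max`, are hypothetical.

```python
import math


def augmentation_strength(step, total_steps, s_min=0.1, s_max=1.0):
    """Hypothetical easy-to-hard schedule: cosine ramp from s_min to s_max.

    Early steps get weak perturbations; strength grows monotonically
    until it reaches s_max at the end of training.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp training progress to [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)
    ramp = 0.5 * (1.0 - math.cos(math.pi * progress))
    return s_min + (s_max - s_min) * ramp


def region_consistency_loss(object_mask, grasp_confidence):
    """Hypothetical alignment penalty between perception and grasping.

    Returns the fraction of grasp-confidence mass that lies outside the
    predicted object region. Both inputs are 2D lists of equal shape with
    values in [0, 1]. A value of 0.0 means all confidence is on the object.
    """
    total = 0.0
    outside = 0.0
    for mask_row, conf_row in zip(object_mask, grasp_confidence):
        for m, c in zip(mask_row, conf_row):
            total += c
            outside += c * (1.0 - m)
    return outside / total if total > 0 else 0.0


if __name__ == "__main__":
    for step in (0, 250, 500, 750, 1000):
        print(step, round(augmentation_strength(step, 1000), 3))

    mask = [[1.0, 0.0], [0.0, 0.0]]
    conf = [[0.8, 0.2], [0.0, 0.0]]
    print(region_consistency_loss(mask, conf))
```

In a teacher-student setup of the kind the abstract alludes to, a schedule like this would typically scale the perturbations applied to unlabeled inputs, while a penalty like this would be added to the unsupervised loss; both choices here are assumptions made for illustration.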
About the journal:
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.