{"title":"Exploring the Role of Robot's Movements for a Transparent Affective Communication","authors":"Luca Raggioli;Raffaella Esposito;Alessandra Rossi;Silvia Rossi","doi":"10.1109/LRA.2025.3548412","DOIUrl":"https://doi.org/10.1109/LRA.2025.3548412","url":null,"abstract":"Robots operating in human-populated environments must be able to convey their intentions clearly. Displaying emotions can be an effective way for robots to express their internal state and a means to react to humans' behaviors. While facial expressions provide an immediate representation of the robot's “feelings,” there might be situations where only facial expressions are not enough to express the robot's intent appropriately, and multi-modal affective modalities are required. However, the characterization of the robot's movements has not been sufficiently and thoroughly investigated. In this work, we argue that transparent non-verbal behaviors, with particular attention to the robot's movements (e.g., arms, head, velocity), can be crucial for effective communication between robots and humans. We collected responses from N = 967 people observing the robot during a science fair. Our results outline how movements can contribute to conveying emotions transparently. This is especially possible when no conflicting signals are present. However, facial expression is still the most dominant modality when other modalities are not aligned with the movement's intended emotion.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 5","pages":"4364-4371"},"PeriodicalIF":4.6,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10910153","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143688078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling and Neural-Network-Based Tail Oscillation Control of a Fish-Like Bionic Soft Actuation Mechanism","authors":"Qingxin Meng;Xuefeng Sun;Yawu Wang;Jundong Wu;Chun-Yi Su","doi":"10.1109/LRA.2025.3548407","DOIUrl":"https://doi.org/10.1109/LRA.2025.3548407","url":null,"abstract":"With the progress in ocean exploration, bionic soft robotic fish have garnered significant attention, with their key feature being the actuation mechanism made from soft materials. However, the complex properties of these materials pose challenges in modeling and control. In this letter, we design and fabricate a Fish-like Bionic Soft Actuation Mechanism (FBSAM) and aim to achieve its tail oscillation control. First, we construct an experimental platform to collect data on FBSAM's motion characteristics, revealing complex nonlinear hysteresis influenced by varying liquid environments. Next, we develop a phenomenological model for FBSAM based on the Hammerstein architecture and identify its parameters via nonlinear least squares algorithm. Subsequently, we propose an integral sliding mode hybrid control strategy, introducing an inverse hysteresis compensator to address hysteresis issue and using the neural network to estimate uncertain disturbances caused by liquid environments. Finally, experimental results demonstrate that the designed FBSAM can oscillate in water like a real fish, and the proposed control strategy adapts to various external environments, maintaining excellent performance even in dynamic flow conditions, showcasing its effectiveness and superiority.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"3827-3834"},"PeriodicalIF":4.6,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143611898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-Supervised Language-Conditioned Grasping With Curriculum-Scheduled Augmentation and Geometric Consistency","authors":"Jialong Xie;Fengyu Zhou;Jin Liu;Chaoqun Wang","doi":"10.1109/LRA.2025.3547619","DOIUrl":"https://doi.org/10.1109/LRA.2025.3547619","url":null,"abstract":"Language-Conditioned Grasping (LCG) is an essential skill for robotic manipulation and has attracted increasing interest. Recent LCG models have made great progress, but need numerous paired image-text-pose annotations for fully supervised learning, which are tedious and expensive. Semi-supervised learning has provided a viable solution, while they still encounter the following challenges for LCG: (i) Over-distorted data perturbations result in slow and unstable convergence for multi-modal inputs in the early stage. (ii) Inconsistency between the perceptive and grasping locations leads to a degradation of grasp accuracy. In this letter, we propose a semi-supervised language-conditioned grasping framework that achieves data-efficient object grounding and grasping detection based on language description. Concretely, we introduce a Curriculum-Scheduled augmentation and Geometric Consistency (CSGC) strategy to address the above problems. Concretely, We design a curriculum-scheduled augmentation to progressively improve data diversity from easy to difficult, facilitating stable knowledge distillation and model convergence. Meanwhile, we present a geometry-aware consistency regularization to constrain the region alignment between object perception and grasping confidence, improving the quality of pseudo-labels and grasp accuracy. Extensive experimental results demonstrate the effectiveness and practicability of our proposed method in the limited labeled data.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"4021-4028"},"PeriodicalIF":4.6,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143645287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Layout and Control Co-Design of Robust Multi-UAV Transportation Systems","authors":"Carlo Bosio;Mark W. Mueller","doi":"10.1109/LRA.2025.3547307","DOIUrl":"https://doi.org/10.1109/LRA.2025.3547307","url":null,"abstract":"The joint optimization of physical parameters and controllers in robotic systems is challenging. This is due to the difficulties of predicting the effect that changes in physical parameters have on final performances. At the same time, physical and morphological modifications can improve robot capabilities, perhaps completely unlocking new skills and tasks. We present a novel approach to co-optimize the physical layout and the control of a cooperative aerial transportation system. The goal is to achieve the most precise and robust flight when carrying a payload. We assume the agents are connected to the payload through rigid attachments, essentially transforming the whole system into a larger flying object with “thrust modules” at the attachment locations of the quadcopters. We investigate the optimal arrangement of the thrust modules around the payload, so that the resulting system achieves the best disturbance rejection capabilities. We propose a novel metric of robustness inspired by <inline-formula><tex-math>$mathcal {H}_{2}$</tex-math></inline-formula> control, and propose an algorithm to optimize the layout of the vehicles around the object and their controller altogether. We experimentally validate the effectiveness of our approach using fleets of three and four quadcopters and payloads of diverse shapes.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"3956-3963"},"PeriodicalIF":4.6,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143621915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Localized Coverage Planning for a Heat Transfer Tube Inspection Robot","authors":"Jiawei Li;Zhaojin Liu;Yuxiao Li;Yuanyue Li;Yimin Huang;Gang Wang","doi":"10.1109/LRA.2025.3547675","DOIUrl":"https://doi.org/10.1109/LRA.2025.3547675","url":null,"abstract":"The heat transfer tubes of the steam generator are critical components of the nuclear power system and require regular inspection to ensure safety. The SG-Climbot, a quadruped heat transfer tube inspection robot, is equipped with a guiding device capable of simultaneously aligning with and inspecting two heat transfer tubes. Furthermore, The guiding device must execute hundreds of pose configuration transformations to complete a localized coverage inspection, thereby presenting challenges to the robot's efficient autonomous planning. This letter presents a planning framework for the SG-Climbot's localized coverage inspection task. The framework consists of four planning levels: pair planning, position and orientation planning for the guiding device, inspection sequence planning, and time-optimal trajectory planning. A maximum matching algorithm suitable for robotic arms equipped with dual execution devices to perform tasks has been proposed, achieving the optimal pairing of heat transfer tubes and reducing inspection time by over 48 minutes (18.32% improvement). In addition, we analyze the impact of various Traveling Salesman Problem (TSP) solving algorithms on sequence planning issues that require reaching numerous nodes within short operation times, reducing the arm operating time by 33.20 s (6.99% improvement). Finally, the effectiveness of the proposed planning algorithm was validated through simulations and experiments.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"3916-3923"},"PeriodicalIF":4.6,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143621723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Point Cloud Structural Similarity-Based Underwater Sonar Loop Detection","authors":"Donghwi Jung;Andres Pulido;Jane Shin;Seong-Woo Kim","doi":"10.1109/LRA.2025.3547304","DOIUrl":"https://doi.org/10.1109/LRA.2025.3547304","url":null,"abstract":"In this letter, we propose a point cloud structural similarity-based loop detection method for underwater Simultaneous Localization and Mapping using sonar sensors. Existing sonar-based loop detection approaches often rely on 2D projection and keypoint extraction, which can lead to data loss and poor performance in feature-scarce environments. Additionally, methods based on neural networks or Bag-of-Words require extensive preprocessing, such as model training or vocabulary creation, reducing adaptability to new environments. To address these challenges, our method directly utilizes 3D sonar point clouds without projection and computes point-wise structural feature maps based on geometry, normals, and curvature. By leveraging rotation-invariant similarity comparisons, the proposed approach eliminates the need for keypoint detection and ensures robust loop detection across diverse underwater terrains. We validate our method using two real-world datasets: the Antarctica dataset obtained from deep underwater and the Seaward dataset collected from rivers and lakes. Experimental results show that our method achieves the highest loop detection performance compared to existing keypoint-based and learning-based approaches while requiring no additional training or preprocessing.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"3859-3866"},"PeriodicalIF":4.6,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143621910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Open-Vocabulary 3D Scene Graphs for Long-Term Language-Guided Mobile Manipulation","authors":"Zhijie Yan;Shufei Li;Zuoxu Wang;Lixiu Wu;Han Wang;Jun Zhu;Lijiang Chen;Jihong Liu","doi":"10.1109/LRA.2025.3547643","DOIUrl":"https://doi.org/10.1109/LRA.2025.3547643","url":null,"abstract":"Enabling mobile robots to perform long-term tasks in dynamic real-world environments is a formidable challenge, especially when the environment changes frequently due to human-robot interactions or the robot's own actions. Traditional methods typically assume static scenes, which limits their applicability in the continuously changing real world. To overcome these limitations, we present <monospace>DovSG</monospace>, a novel mobile manipulation framework that leverages dynamic open-vocabulary 3D scene graphs and a language-guided task planning module for long-term task execution. <monospace>DovSG</monospace> takes RGB-D sequences as input and utilizes vision-language models (VLMs) for object detection to obtain high-level object semantic features. Based on the segmented objects, a structured 3D scene graph is generated for low-level spatial relationships. Furthermore, an efficient mechanism for locally updating the scene graph, allows the robot to adjust parts of the graph dynamically during interactions without the need for full scene reconstruction. This mechanism is particularly valuable in dynamic environments, enabling the robot to continually adapt to scene changes and effectively support the execution of long-term tasks. We validated our system in real-world environments with varying degrees of manual modifications, demonstrating its effectiveness and superior performance in long-term tasks.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 5","pages":"4252-4259"},"PeriodicalIF":4.6,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143688080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BEV-DWPVO: BEV-Based Differentiable Weighted Procrustes for Low Scale-Drift Monocular Visual Odometry on Ground","authors":"Yufei Wei;Sha Lu;Wangtao Lu;Rong Xiong;Yue Wang","doi":"10.1109/LRA.2025.3547696","DOIUrl":"https://doi.org/10.1109/LRA.2025.3547696","url":null,"abstract":"Monocular Visual Odometry (MVO) provides a cost-effective, real-time positioning solution for autonomous vehicles. However, MVO systems face the common issue of lacking inherent scale information from monocular cameras. Traditional methods have good interpretability but can only obtain relative scale and suffer from severe scale drift in long-distance tasks. Learning-based methods under perspective view leverage large amounts of training data to acquire prior knowledge and estimate absolute scale by predicting depth values. However, their generalization ability is limited due to the need to accurately estimate the depth of each point. In contrast, we propose a novel MVO system called BEV-DWPVO. Our approach leverages the common assumption of a ground plane, using Bird's-Eye View (BEV) feature maps to represent the environment in a grid-based structure with a unified scale. This enables us to reduce the complexity of pose estimation from 6 Degrees of Freedom (DoF) to 3-DoF. Keypoints are extracted and matched within the BEV space, followed by pose estimation through a differentiable weighted Procrustes solver. The entire system is fully differentiable, supporting end-to-end training with only pose supervision and no auxiliary tasks. We validate BEV-DWPVO on the challenging long-sequence datasets NCLT, Oxford, and KITTI, achieving superior results over existing MVO methods on most evaluation metrics.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 5","pages":"4244-4251"},"PeriodicalIF":4.6,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143688166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GPT-Driven Gestures: Leveraging Large Language Models to Generate Expressive Robot Motion for Enhanced Human-Robot Interaction","authors":"Liam Roy;Elizabeth A. Croft;Alex Ramirez;Dana Kulić","doi":"10.1109/LRA.2025.3547631","DOIUrl":"https://doi.org/10.1109/LRA.2025.3547631","url":null,"abstract":"Expressive robot motion is a form of nonverbal communication that enables robots to convey their internal states, fostering effective human-robot interaction. A key step in designing expressive robot motions is developing a mapping from the desired states the robot will express to the robot's hardware and available degrees of freedom (design space). This letter introduces a novel framework to autonomously generate this mapping by leveraging a large language model (LLM) to select motion parameters and their values for target robot states. We evaluate expressive robot body language displayed on a Unitree Go1 quadruped as generated by a Generative Pre-trained Transformer (GPT) provided with a set of adjustable motion parameters. Through a two-part study (N = 120), we compared LLM-generated expressive motions with both randomly selected and human-selected expressions. Our results show that participants viewing LLM-generated expressions achieve a significantly higher state classification accuracy over random baselines and perform comparably with human-generated expressions. Additionally, in our post-hoc analysis we find that the Earth Movers Distance provides a useful metric for identifying similar expressions in the design space that lead to classification confusion.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 5","pages":"4172-4179"},"PeriodicalIF":4.6,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context-Aware Graph Inference and Generative Adversarial Imitation Learning for Object-Goal Navigation in Unfamiliar Environment","authors":"Yiyue Meng;Chi Guo;Aolin Li;Yarong Luo","doi":"10.1109/LRA.2025.3546860","DOIUrl":"https://doi.org/10.1109/LRA.2025.3546860","url":null,"abstract":"Object-goal navigation aims to guide an agent to find a specific target object in an unfamiliar environment based on first-person visual observations. It requires the agent to learn informative visual representations and robust navigation policy. To promote these two components, we proposed two complementary techniques, context-aware graph inference (CGI) and generative adversarial imitation learning (GAIL). CGI improves visual representation learning by integrating object relationships, including category proximity and spatial correlation. It uses the translation on hyperplane (TransH) method to infer context-aware object relationships under the guidance of various contexts over navigation episodes, including image, action, and memory. Both CGI and GAIL aim to improve robust navigation policy, enabling the agent to escape from deadlock states, such as looping or getting stuck. GAIL is an imitation learning (IL) technique that enables the agent to learn from expert demonstrations. Specifically, we propose GAIL to address the non-discriminative reward problem that exists in object-goal navigation. GAIL designs a dynamic reward function and combines it with environment rewards, thus providing guidance for effective navigation policy. Experiments in the AI2-Thor and RoboThor environments demonstrate that our method significantly improves the effectiveness and efficiency of navigation in unfamiliar environments.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 4","pages":"3803-3810"},"PeriodicalIF":4.6,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143601942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}