{"title":"HHI-Assist: A Dataset and Benchmark of Human-Human Interaction in Physical Assistance Scenario","authors":"Saeed Saadatnejad;Reyhaneh Hosseininejad;Jose Barreiros;Katherine M. Tsui;Alexandre Alahi","doi":"10.1109/LRA.2025.3586011","DOIUrl":"https://doi.org/10.1109/LRA.2025.3586011","url":null,"abstract":"The increasing labor shortage and aging population underline the need for assistive robots to support human care recipients. To enable safe and responsive assistance, robots require accurate human motion prediction in physical interaction scenarios. However, this remains a challenging task due to the variability of assistive settings and the complexity of coupled dynamics in physical interactions. In this work, we address these challenges through two key contributions: (1) <b>HHI-Assist</b>, a dataset comprising motion capture clips of human-human interactions in assistive tasks; and (2) a conditional Transformer-based denoising diffusion model for predicting the poses of interacting agents. Our model effectively captures the coupled dynamics between caregivers and care receivers, demonstrating improvements over baselines and strong generalization to unseen scenarios. By advancing interaction-aware motion prediction and introducing a new dataset, our work has the potential to significantly enhance robotic assistance policies.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 9","pages":"8746-8753"},"PeriodicalIF":4.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144671226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corrections to “Learning the Inverse Hitting Problem”","authors":"Harshit Khurana;James Hermus;Maxime Gautier;André Schakkal;Aude Billard","doi":"10.1109/LRA.2025.3583517","DOIUrl":"https://doi.org/10.1109/LRA.2025.3583517","url":null,"abstract":"The author list in [1] is updated by adding André Schakkal, due to his contribution to the work.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 8","pages":"8187-8187"},"PeriodicalIF":4.6,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11071941","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144572980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image-Based Roadmaps for Vision-Only Planning and Control of Robotic Manipulators","authors":"Sreejani Chatterjee;Abhinav Gandhi;Berk Calli;Constantinos Chamzas","doi":"10.1109/LRA.2025.3585760","DOIUrl":"https://doi.org/10.1109/LRA.2025.3585760","url":null,"abstract":"This work presents a motion planning framework for robotic manipulators that computes collision-free paths directly in image space. The generated paths can then be tracked using vision-based control, eliminating the need for an explicit robot model or proprioceptive sensing. At the core of our approach is the construction of a roadmap entirely in image space. To achieve this, we explicitly define sampling, nearest-neighbor selection, and collision checking based on visual features rather than geometric models. We first collect a set of image space samples by moving the robot within its workspace, capturing keypoints along its body at different configurations. These samples serve as nodes in the roadmap, which we construct using either learned or predefined distance metrics. At runtime, the roadmap generates collision-free paths directly in image space, removing the need for a robot model or joint encoders. We validate our approach through an experimental study in which a robotic arm follows planned paths using an adaptive vision-based control scheme to avoid obstacles. The results show that paths generated with the learned-distance roadmap achieved 100% success in control convergence, whereas the predefined image space distance roadmap enabled faster transient responses but had a lower success rate in convergence.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 8","pages":"8530-8537"},"PeriodicalIF":4.6,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144634782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Zero-Shot Recognition of Test Tube Types by Automatically Collecting and Labeling RGB Data","authors":"Yu Tang;Weiwei Wan;Hao Chen;Masaki Matsushita;Jun Takahashi;Takeyuki Kotaka;Kensuke Harada","doi":"10.1109/LRA.2025.3585759","DOIUrl":"https://doi.org/10.1109/LRA.2025.3585759","url":null,"abstract":"This work presents a method for automatically detecting and recognizing test tube types in a rack. It leverages automatic segmentation, clustering, and labeling processes to eliminate the need for explicitly preparing training data. These processes are addressed by using combined global prediction and local cropping, where global prediction estimates the slot occupation states of a rack, and local cropping extracts tube pictures in the local regions of each slot for clustering and labeling. With the help of the proposed method, the robotic tube manipulation system no longer needs tailored data and explicit training in the presence of new tubes, thus achieving flexibility and efficiency. Experimental evaluations conducted with a RealSense D405 camera and the UFactory xArm Lite6 robot manipulator confirm the method's effectiveness in accurately identifying novel test tube types under real-world conditions.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 8","pages":"8276-8283"},"PeriodicalIF":4.6,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144606210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GC-GAT: Multimodal Vehicular Trajectory Prediction Using Graph Goal Conditioning and Cross-Context Attention","authors":"Mahir Gulzar;Yar Muhammad;Naveed Muhammad","doi":"10.1109/LRA.2025.3585757","DOIUrl":"https://doi.org/10.1109/LRA.2025.3585757","url":null,"abstract":"Predicting future trajectories of surrounding vehicles heavily relies on what contextual information is given to a motion prediction model. The context itself can be static (lanes, regulatory elements, etc) or dynamic (traffic participants). This letter presents a lane graph-based motion prediction model that first predicts graph-based goal proposals and later fuses them with cross attention over multiple contextual elements. We follow the famous encoder-interactor-decoder architecture where the encoder encodes scene context using lightweight Gated Recurrent Units, the interactor applies cross-context attention over encoded scene features and graph goal proposals, and the decoder regresses multimodal trajectories via Laplacian Mixture Density Network from the aggregated encodings. Using cross-attention over graph-based goal proposals gives robust trajectory estimates since the model learns to attend to future goal-relevant scene elements for the intended agent. We evaluate our work on nuScenes motion prediction dataset, achieving state-of-the-art results.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 8","pages":"8316-8323"},"PeriodicalIF":4.6,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144597757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modular Actuator for Multimodal Proprioceptive and Kinesthetic Feedback of Robotic Hands","authors":"Sungwoo Park;Myo-Taeg Lim;Donghyun Hwang","doi":"10.1109/LRA.2025.3585714","DOIUrl":"https://doi.org/10.1109/LRA.2025.3585714","url":null,"abstract":"This study addresses the challenge of implementing proprioceptive and kinesthetic (PK) feedback in robotic hands, essential for grasping and manipulation tasks in unstructured environments. We developed a compact modular actuator featuring a low-module, high-transmission-ratio multistage gear mechanism that measures 25 × 10 × 24 mm, weighs only 10 grams, and maintains moderate backdrivability. The actuator provides multimodal PK feedback, capturing position, velocity, current, and torque data, which are critical for performing various grasping and manipulation tasks. To enable precise motion and force control, we introduced a new adaptive velocity estimator and a simplified Reaction Torque Observer (RTOB). Comprehensive experiments demonstrated the actuator's ability to accurately detect surface shape, roughness, and stiffness of target objects, eliminating the need for additional sensors or space. Experimental results confirmed the actuator's precision, achieving measurement errors of 5.8 mrad for position, 0.19 rad/s for velocity, and 0.011 N·m for torque. These findings highlight the actuator's ability to leverage proprioceptive information, significantly enhancing the functionality and adaptability of robotic hands in diverse and dynamic scenarios.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 8","pages":"8467-8474"},"PeriodicalIF":4.6,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144623984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SLIM: A Symmetric, Low-Inertia Manipulator for Constrained, Contact-Rich Spaces","authors":"Rachel Thomasson;Alessandra Bernardini;Hao Li;Chengyi Xing;Amar Hajj-Ahmad;Mark Cutkosky","doi":"10.1109/LRA.2025.3585712","DOIUrl":"https://doi.org/10.1109/LRA.2025.3585712","url":null,"abstract":"Operation in constrained and cluttered spaces poses a challenge for robotic manipulators, in part due to their bulky link geometry and kinematic limitations in comparison to human hands and arms. To address these limitations, we introduce SLIM, a custom end-effector consisting of a bidirectional hand and an integrated 2-axis wrist. With an opposing thumb that tucks alongside the palm and fingers that bend in both directions, the hand is shaped like an articulated paddle for reaching through gaps and maneuvering in clutter. Series elastic actuation decouples finger inertia from motor inertia, enabling use of small, highly-geared motors for forceful grasps while maintaining a low effective end-point mass. The thumb is mounted on a prismatic axis that adjusts grasp width for large or small objects. We illustrate advantages of the design over conventional solutions with a computed increase in grasp acquisition region, decrease in swept volume when reorienting objects, and reduced end-point mass. SLIM's thin form factor enables faster and more successful teleoperated task completion in constrained environments compared to a conventional parallel-jaw gripper. Additionally, its bidirectional fingers allow demonstrators to complete a sequential picking task more efficiently than with an anthropomorphic hand.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 9","pages":"8682-8689"},"PeriodicalIF":4.6,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144657490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Tip-Flexible Endoscope With Reconfigurable Baseline for Enhanced 3D Perception","authors":"Zhikang Ma;Jianchang Zhao;Xinan Sun;Lizhi Pan;Shuxin Wang;Jinhua Li","doi":"10.1109/LRA.2025.3585758","DOIUrl":"https://doi.org/10.1109/LRA.2025.3585758","url":null,"abstract":"Stereoscopic endoscopes are widely used in minimally invasive cardiac surgery, providing 3D information of the thoracic cavity through small incisions. However, current high-precision 3D perception methods often reduce the flexibility of the endoscope, limiting its field of view. This study proposes a reconfigurable-baseline tip-flexible endoscope specifically designed for cardiac surgery, offering enhanced 3D perception capability. An anti-symmetric constraint architecture and a depth-driven baseline control method are adopted for high-precision 3D perception. Notably, it can adapt to multi-degree-of-freedom tip-flexible structures and constrained surgical environments without increasing the complexity of algorithms or sensors, thereby providing surgeons with greater operational space. In phantom-based experiments, the experimental group achieved a lower RMSE of 0.41 mm at 60–110 mm, compared to 0.58 mm in the control group. Similar results were observed in ex vivo tests, with RMSEs of 0.40 mm and 0.57 mm, respectively, reinforcing its clinical potential. External parameters remained within acceptable ranges, with the dominant error factor, <inline-formula><tex-math>$\\Delta \\theta$</tex-math></inline-formula>, controlled to a RMSE of 0.01428<inline-formula><tex-math>$^{\\circ}$</tex-math></inline-formula>. These results validate the proposed method and offer a new approach for high-precision 3D perception in minimally invasive surgery.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 8","pages":"8324-8331"},"PeriodicalIF":4.6,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144597702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SICNav-Diffusion: Safe and Interactive Crowd Navigation With Diffusion Trajectory Predictions","authors":"Sepehr Samavi;Anthony Lem;Fumiaki Sato;Sirui Chen;Qiao Gu;Keijiro Yano;Angela P. Schoellig;Florian Shkurti","doi":"10.1109/LRA.2025.3585713","DOIUrl":"https://doi.org/10.1109/LRA.2025.3585713","url":null,"abstract":"To navigate crowds without collisions, robots must interact with humans by forecasting their future motion and reacting accordingly. While learning-based prediction models have shown success in generating likely human trajectory predictions, integrating these stochastic models into a robot controller presents several challenges. The controller needs to account for interactive coupling between planned robot motion and human predictions while ensuring both predictions and robot actions are safe (i.e. collision-free). To address these challenges, we present a receding horizon crowd navigation method for single-robot multi-human environments. We first propose a diffusion model to generate joint trajectory predictions for all humans in the scene. We then incorporate these multi-modal predictions into a SICNav Bilevel MPC problem that simultaneously solves for a robot plan (upper-level) and acts as a safety filter to refine the predictions for non-collision (lower-level). Combining planning and prediction refinement into one bilevel problem ensures that the robot plan and human predictions are coupled. We validate the open-loop trajectory prediction performance of our diffusion model on the commonly used ETH/UCY benchmark, and evaluate the closed-loop performance of our robot navigation method in simulation and in extensive real-robot experiments, demonstrating safe, efficient, and reactive robot motion.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 9","pages":"8738-8745"},"PeriodicalIF":4.6,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144671225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SAGA-SLAM: Scale-Adaptive 3D Gaussian Splatting for Visual SLAM","authors":"Kun Park;Seung-Woo Seo","doi":"10.1109/LRA.2025.3585756","DOIUrl":"https://doi.org/10.1109/LRA.2025.3585756","url":null,"abstract":"3D Gaussian Splatting (3DGS) has recently emerged as a powerful technique for representing 3D scenes. Its superior high-fidelity rendering quality and speed have driven its rapid adoption in many applications. Among them, Visual Simultaneous Localization and Mapping (VSLAM) is the most prominent application, as it requires real-time simultaneous mapping and position tracking of navigating objects. However, from our comprehensive study, we observed a fundamental hurdle in directly applying the current 3DGS technique to VSLAM, which we define as the scale adaptation problem. The scale adaptation problem refers to the inability of existing 3DGS-based SLAM methods to address varying scales, specifically the extent of camera pose difference from the perspective of tracking, and environmental size in terms of mapping and the addition of new 3D Gaussians. To overcome this limitation, we propose SAGA-SLAM, the first scale-adaptive RGB-D dense SLAM framework based on 3DGS. We optimize the tracking and mapping stages robustly over various scales by utilizing the Polyak step size and momentum. Additionally, we present a Gaussian fission method to address the scale problem during the addition of 3D Gaussians. Experiments show that our method robustly achieves state-of-the-art results on both large- and small-scale datasets such as KITTI, Replica, and TUM-RGBD. By adapting without the need for hyperparameter tuning, our method demonstrates both superior performance and practical applicability.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 8","pages":"8268-8275"},"PeriodicalIF":4.6,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144606217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}