Attention-Based Policy Network for Sensor-Free Robotic Arm Control With Deep Reinforcement Learning
Authors: Jin Wu, Yaqiao Zhu, Jinfu Li
Journal: Concurrency and Computation-Practice & Experience, volume 37, issue 23-24
Published: 2025-09-03
DOI: 10.1002/cpe.70250
URL: https://onlinelibrary.wiley.com/doi/10.1002/cpe.70250
Impact factor: 1.5; JCR: Q3 (COMPUTER SCIENCE, SOFTWARE ENGINEERING); open access: no
Citation count: 0
Abstract
This paper proposes a novel attention-based convolutional neural network (CNN) for sensor-free robotic arm control, aiming to improve six-dimensional (6D) pose estimation and end-effector operation in an end-to-end manner. Unlike traditional methods that rely on explicit feature engineering or sensor feedback, our approach uses an attention mechanism within the convolutional backbone to enhance spatial awareness. The proposed localization sub-module scores each prior regime through a weighted average of activation maps, allowing the network to focus on the most informative regions of the input. Additionally, we introduce a two-phase training methodology requiring only image-level annotations. In the first phase, the network learns to extract discriminative features from synthetic images, which are crucial for accurate 6D pose prediction. In the second phase, a reinforcement learning agent, equipped with the trained vision model as its sensory module, is optimized using a sparse reward function to refine action policies. Experimental evaluations in two virtual scenarios demonstrate that our method outperforms popular CNN-based approaches in both accuracy and efficiency. Specifically, our method improves task success rates by 52.9% and reduces position error by 72.3% compared to baseline models, showcasing its effectiveness in sensor-free robotic arm control.
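The abstract does not give formulas for the localization sub-module or the reward, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes the attention map is a softmax-weighted average of K activation maps and that the sparse reward is 1 only when the end effector is within a distance tolerance of the target. The names `attention_map`, `sparse_reward`, `channel_scores`, and `tolerance` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(activation_maps, channel_scores):
    """Combine K activation maps (each H x W, nested lists) into one H x W map.

    Assumption: the weights are a softmax over per-map scores; the abstract
    only says "a weighted average of activation maps".
    """
    weights = softmax(channel_scores)
    height = len(activation_maps[0])
    width = len(activation_maps[0][0])
    return [
        [
            sum(w * amap[i][j] for w, amap in zip(weights, activation_maps))
            for j in range(width)
        ]
        for i in range(height)
    ]


def sparse_reward(effector_pos, target_pos, tolerance=0.05):
    """Return 1.0 only when the end effector is within `tolerance` of the target.

    Assumption: success is a Euclidean-distance threshold; every other
    step yields 0.0, which is what makes the reward sparse.
    """
    return 1.0 if math.dist(effector_pos, target_pos) <= tolerance else 0.0


if __name__ == "__main__":
    maps = [
        [[1.0, 0.0], [0.0, 0.0]],  # map that responds at the top-left cell
        [[0.0, 0.0], [0.0, 1.0]],  # map that responds at the bottom-right cell
    ]
    print(attention_map(maps, [0.0, 0.0]))
    print(sparse_reward((0.0, 0.0, 0.0), (0.0, 0.0, 0.01)))
```

With equal scores the two maps are averaged with equal weight; raising one score shifts the combined map toward that map's active region, which is the sense in which such a mechanism could "focus" on informative regions.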
About the journal:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers, and authoritative research review papers, in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.