IET Cybersystems and Robotics: Latest Articles

Distributed field mapping for mobile sensor teams using a derivative-free optimisation algorithm
IET Cybersystems and Robotics Pub Date : 2024-03-31 DOI: 10.1049/csy2.12111
Tony X. Lin, Jia Guo, Said Al-Abri, Fumin Zhang
Abstract: The authors propose a distributed field mapping algorithm that drives a team of robots to explore and learn an unknown scalar field using a Gaussian Process (GP). The strategy balances exploration between areas of high error and areas of high variance. Because the scalar field is unknown, high-error regions cannot be computed directly, so a bio-inspired approach known as Speeding-Up and Slowing-Down is leveraged to track the gradient of the GP error. This approach achieves global field-learning convergence and is shown to be resistant to poor hyperparameter tuning of the GP. It is validated in simulations and experiments using 2D wheeled robots and 2D flying miniature autonomous blimps.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.12111
Citations: 0
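The variance side of the exploration trade-off described in the abstract above can be illustrated with a small GP regression sketch. This is not the authors' algorithm: it omits the Speeding-Up and Slowing-Down error tracking entirely, and the squared-exponential kernel, the hyperparameter values, and the names `rbf_kernel` and `gp_posterior` are assumptions made for illustration only.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two sets of 2D points (assumed kernel)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-4):
    """Return the GP posterior mean and variance at the query points."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)
    # Solve (K + noise * I) alpha = y using the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_qt @ alpha
    v = np.linalg.solve(chol, k_qt.T)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v**2, axis=0)
    return mean, var

# Three hypothetical measurements of the scalar field.
x_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y_train = np.array([1.0, 0.5, 0.2])
# One query at a visited location, one far from every measurement.
x_query = np.array([[0.0, 0.0], [5.0, 5.0]])
mean, var = gp_posterior(x_train, y_train, x_query)
# A purely variance-driven explorer would head for the most uncertain point.
next_target = x_query[np.argmax(var)]
```

The posterior variance collapses near visited locations and stays close to the prior far away, which is why variance alone pushes robots toward unvisited regions; the abstract's point is that this must be balanced against regions of high model error.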
ROSIC: Enhancing secure and accessible robot control through open-source instant messaging platforms
IET Cybersystems and Robotics Pub Date : 2024-03-29 DOI: 10.1049/csy2.12112
Rasoul Sadeghian, Shahrooz Shahin, Sina Sareh
Abstract: Ensuring secure communication and seamless accessibility remains a primary challenge in controlling robots remotely. The authors propose Robot Control System using Instant Communication (ROSIC), a novel approach that leverages open-source instant messaging platforms to reduce the complexity and cost of implementing a secure, user-centred communication system for remote robot control. By leveraging features such as real-time messaging, group chats, end-to-end encryption and cross-platform support, which are inherent in the majority of instant messenger platforms, the authors developed middleware that establishes a secure and efficient communication system over the Internet. By using instant messaging as the communication interface between users and robots, ROSIC caters to non-technical users, making it easier for them to control robots. The architecture of ROSIC enables various scenarios for robot control, including one user controlling multiple robots, multiple users controlling one robot, multiple robots controlled by multiple users, and one user controlling one robot. Furthermore, ROSIC facilitates the interaction of multiple robots, enabling them to interoperate and function collaboratively as a swarm system by providing a unified communication platform that allows for seamless exchange of data and commands. Telegram was chosen as the instant messaging platform due to its open-source nature, robust encryption, compatibility across multiple platforms and interactive communication capabilities through channels and groups. Notably, ROSIC is designed to communicate effectively with robot operating system (ROS)-based robots to enhance the ability to control them remotely.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.12112
Citations: 0
Digital twin-based multi-objective autonomous vehicle navigation approach as applied in infrastructure construction
IET Cybersystems and Robotics Pub Date : 2024-03-20 DOI: 10.1049/csy2.12110
Tingjun Lei, Timothy Sellers, Chaomin Luo, Lei Cao, Zhuming Bi
Abstract: The widespread adoption of autonomous vehicles has generated considerable interest in their autonomous operation, with path planning emerging as a critical aspect. However, existing road infrastructure confronts challenges due to prolonged use and insufficient maintenance. Previous research on autonomous vehicle navigation has focused on determining the trajectory with the shortest distance while neglecting road construction information, leading to potential time and energy inefficiencies in real-world scenarios involving infrastructure development. To address this issue, a digital twin-embedded multi-objective autonomous vehicle navigation approach is proposed for conditions of infrastructure construction. The authors propose an image processing algorithm that leverages captured images of the road construction environment to enable road extraction and modelling of the autonomous vehicle workspace. Additionally, a wavelet neural network is developed to predict real-time traffic flow, considering its inherent characteristics. Moreover, a multi-objective brainstorm optimisation (BSO)-based method for path planning is introduced, which optimises total time-cost and energy consumption objective functions. To ensure optimal trajectory planning during infrastructure construction, the algorithm incorporates a real-time updated digital twin throughout autonomous vehicle operations. The effectiveness and robustness of the proposed model are validated through simulation and comparative studies conducted in diverse scenarios involving road construction. The results highlight the improved performance and reliability of the autonomous vehicle system when equipped with the authors' approach, demonstrating its potential for enhancing efficiency and minimising disruptions caused by road infrastructure development.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.12110
Citations: 0
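The abstract above optimises two objectives at once, total time-cost and energy consumption. A common way to compare candidates under two objectives is Pareto dominance; the sketch below shows only that comparison step, not brainstorm optimisation itself, and the abstract does not say how the authors combine or rank the objectives. The function names and the candidate costs are hypothetical.

```python
def dominates(a, b):
    """True if cost tuple a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other is not c)]

# Hypothetical candidate paths as (time_cost, energy_cost) pairs; lower is better.
paths = [(10.0, 5.0), (8.0, 7.0), (12.0, 6.0), (8.0, 5.5)]
front = pareto_front(paths)
```

Here `(8.0, 7.0)` is dropped because `(8.0, 5.5)` is equally fast and uses less energy, and `(12.0, 6.0)` is dropped because `(10.0, 5.0)` is better on both counts; the two remaining paths represent a genuine time-versus-energy trade-off.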
An efficient and robust system for human following scenario using differential robot
IET Cybersystems and Robotics Pub Date : 2024-01-25 DOI: 10.1049/csy2.12108
Jiangchao Zhu, Changjia Ma, Chao Xu, Fei Gao
Abstract: A novel system for human following using a differential robot is proposed, including an accurate 3-D human position tracking module and a novel planning strategy that ensures safety and dynamic feasibility. The authors combine a gimbal camera and LiDAR for long-term accurate human detection. The planning module then takes the target's future trajectory as a reference to generate a coarse path that keeps the target visible while following. After that, the trajectory is optimised considering other constraints and the following distance. Experiments demonstrate the robustness and efficiency of the system in complex environments and indicate its potential in various applications.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.12108
Citations: 0
An autonomous Unmanned Aerial Vehicle exploration platform with a hierarchical control method for post-disaster infrastructures
IET Cybersystems and Robotics Pub Date : 2024-01-24 DOI: 10.1049/csy2.12107
Xin Peng, Gaofeng Su, Raja Sengupta
Abstract: Catastrophic natural disasters like earthquakes can cause infrastructure damage. Emergency response agencies need to assess damage precisely while repeating this process for infrastructures with different shapes and types. The authors aim for an autonomous Unmanned Aerial Vehicle (UAV) platform equipped with a 3D LiDAR sensor to comprehensively and accurately scan the infrastructure and map it with a predefined resolution r. During the inspection, the UAV needs to decide on the Next Best View (NBV) position to maximise the gathered information while avoiding collision at high speed. The authors propose solving this problem by implementing a hierarchical closed-loop control system consisting of a global planner and a local planner. The global NBV planner decides the general UAV direction based on a history of measurements from the LiDAR sensor, and the local planner considers the UAV dynamics and enables the UAV to fly at high speed with the latest LiDAR measurements. The proposed system is validated through the Regional Scale Autonomous Swarm Damage Assessment simulator, which is built by the authors. Through extensive testing in three unique and highly constrained infrastructure environments, the autonomous UAV inspection system successfully explored and mapped the infrastructures, demonstrating its versatility and applicability across various shapes of infrastructure.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.12107
Citations: 0
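The Next Best View decision in the abstract above amounts to scoring candidate positions by how much new information each would reveal. The sketch below is a deliberately simplified 2D version under stated assumptions: the map is an occupancy grid in which `UNKNOWN` marks unobserved cells, the sensor sees every cell within a circular range, and occlusion, UAV dynamics and collision avoidance are ignored. None of these details come from the paper, and all names are hypothetical.

```python
import numpy as np

UNKNOWN = -1  # assumed marker for cells the sensor has not observed yet

def information_gain(grid, view, sensor_range):
    """Count unknown cells within sensor_range (in cells) of a candidate view."""
    rows, cols = np.indices(grid.shape)
    in_range = (rows - view[0]) ** 2 + (cols - view[1]) ** 2 <= sensor_range ** 2
    return int(np.sum((grid == UNKNOWN) & in_range))

def next_best_view(grid, candidates, sensor_range=2):
    """Pick the candidate view that would reveal the most unknown cells."""
    return max(candidates, key=lambda v: information_gain(grid, v, sensor_range))

# Left half of the map is already observed (0), right half is unknown.
grid = np.zeros((6, 6), dtype=int)
grid[:, 3:] = UNKNOWN
candidates = [(0, 0), (3, 4), (5, 1)]
best = next_best_view(grid, candidates)
```

In the hierarchical scheme the abstract describes, a score of this kind would belong to the global planner, while the local planner handles dynamics and high-speed collision avoidance.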
Correction to Chinese personalised text-to-speech synthesis for robot human–machine interaction
IET Cybersystems and Robotics Pub Date : 2024-01-11 DOI: 10.1049/csy2.12109
Abstract: Pang, B., et al.: Chinese personalised text-to-speech synthesis for robot human-machine interaction. IET Cyber-Syst. Robot. e12098 (2023). https://doi.org/10.1049/csy2.12098
An incorrect grant number was used for the funder name "National Key Research and Development Plan of China" in the funding and acknowledgement sections. The correct grant number is 2020AAA0108900.
We apologize for this error.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.12109
Citations: 0
An audio-based risky flight detection framework for quadrotors
IET Cybersystems and Robotics Pub Date : 2024-01-11 DOI: 10.1049/csy2.12105
Wansong Liu, Chang Liu, Seyedomid Sajedi, Hao Su, Xiao Liang, Minghui Zheng
Abstract: Drones increasingly collaborate with human workers in workspaces such as warehouses. The failure of a drone flight may endanger nearby people during aerial tasks. One of the most common flight failures is triggered by damaged propellers. To quickly detect physical damage to propellers, recognise risky flights, and provide early warnings to surrounding human workers, a new and comprehensive fault diagnosis framework is presented that uses only the audio caused by propeller rotation, without accessing any flight data. The diagnosis framework comprises three components: convolutional neural networks, transfer learning, and Bayesian optimisation. In particular, the audio signal from an actual flight is collected and converted into time–frequency spectrograms. First, a convolutional neural network-based diagnosis model that utilises these spectrograms is developed to identify whether any broken propeller is involved in a specific drone flight. Additionally, the authors employ Monte Carlo dropout sampling to obtain the inconsistency of diagnostic results and compute the entropy (uncertainty) of the mean probability score vector as another factor to diagnose the drone flight. Second, to reduce data dependence on different drone types, the convolutional neural network-based diagnosis model is further augmented by transfer learning. That is, the knowledge of a well-trained diagnosis model is refined using a small set of data from a different drone, so that the modified diagnosis model can detect the broken propeller of the second drone. Third, to reduce hyperparameter tuning effort and reinforce the robustness of the network, Bayesian optimisation uses the observed diagnosis model performances to construct a Gaussian process model that allows the acquisition function to choose the optimal network hyperparameters. The proposed diagnosis framework is validated via real experimental flight tests and has a reasonably high diagnosis accuracy.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.12105
Citations: 0
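The uncertainty measure named in the abstract above, the entropy of the mean probability score vector over Monte Carlo dropout passes, can be written in a few lines. The sketch assumes the stochastic forward passes have already been run and their softmax outputs collected into an array; the network itself, the number of passes, and the example probabilities are hypothetical.

```python
import numpy as np

def predictive_entropy(prob_samples):
    """Entropy of the mean class-probability vector over stochastic forward passes.

    prob_samples has shape (num_passes, num_classes); each row sums to 1.
    """
    mean_probs = prob_samples.mean(axis=0)
    # Clip to avoid log(0) for classes that never receive probability mass.
    clipped = np.clip(mean_probs, 1e-12, 1.0)
    return mean_probs, float(-np.sum(clipped * np.log(clipped)))

# Hypothetical outputs for classes (healthy, broken propeller) over three dropout passes.
confident = np.array([[0.98, 0.02], [0.97, 0.03], [0.99, 0.01]])
inconsistent = np.array([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]])
_, h_confident = predictive_entropy(confident)
_, h_inconsistent = predictive_entropy(inconsistent)
```

When the passes agree, the entropy stays near zero; when they disagree, the mean vector flattens and the entropy approaches its maximum of ln 2 for two classes, which is the kind of signal that could flag a diagnosis as unreliable.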
Adaptive neural tracking control for upper limb rehabilitation robot with output constraints
IET Cybersystems and Robotics Pub Date : 2023-12-26 DOI: 10.1049/csy2.12104
Zibin Zhang, Pengbo Cui, Aimin An
Abstract: The authors investigate the trajectory tracking control problem of an upper limb rehabilitation robot system with unknown dynamics. To address the system's uncertainties and improve the tracking accuracy of the rehabilitation robot, an adaptive neural full-state feedback control is proposed. The neural network is utilised to approximate the dynamics that are not fully modelled and to adapt to the interaction between the upper limb rehabilitation robot and the patient. By incorporating a high-gain observer, unmeasurable state information is integrated into the output feedback control. To account for joint position constraints during actual rehabilitation training, an adaptive neural full-state and output feedback control scheme with output constraint is further designed. For safety in human–robot interaction during rehabilitation training, a log-type barrier Lyapunov function is introduced in the output constraint controller to ensure that the output remains within the predefined constraint region. The stability of the closed-loop system is proved by Lyapunov stability theory. The effectiveness of the proposed control scheme is validated by applying it to an upper limb rehabilitation robot through simulations.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.12104
Citations: 0
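For readers unfamiliar with the term, a log-type barrier Lyapunov function is commonly written in the scalar form below. The abstract does not give the paper's exact expression, so the symbols are assumptions: $z_1$ denotes the output tracking error and $k_b > 0$ the bound it must not reach.

```latex
V_1 = \frac{1}{2}\,\ln\frac{k_b^{2}}{k_b^{2} - z_1^{2}}, \qquad |z_1| < k_b,
\qquad\text{with}\qquad
\dot{V}_1 = \frac{z_1\,\dot{z}_1}{k_b^{2} - z_1^{2}} .
```

$V_1$ is zero at $z_1 = 0$ and grows without bound as $|z_1| \to k_b$, so a controller that keeps $V_1$ bounded also keeps the tracking error, and hence the output, strictly inside the constraint region.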
Lessons learned: Symbiotic autonomous robot ecosystem for nuclear environments
IET Cybersystems and Robotics Pub Date : 2023-12-26 DOI: 10.1049/csy2.12103
Daniel Mitchell, Paul Dominick Emor Baniqued, Abdul Zahid, Andrew West, Bahman Nouri Rahmat Abadi, Barry Lennox, Bin Liu, Burak Kizilkaya, David Flynn, David John Francis, Erwin Jose Lopez Pulgarin, Guodong Zhao, Hasan Kivrak, Jamie Rowland Douglas Blanche, Jennifer David, Jingyan Wang, Joseph Bolarinwa, Kanzhong Yao, Keir Groves, Liyuan Qi, Mahmoud A. Shawky, Manuel Giuliani, Melissa Sandison, Olaoluwa Popoola, Ognjen Marjanovic, Paul Bremner, Samuel Thomas Harper, Shivoh Nandakumar, Simon Watson, Subham Agrawal, Theodore Lim, Thomas Johnson, Wasim Ahmad, Xiangmin Xu, Zhen Meng, Zhengyi Jiang
Abstract: Nuclear facilities have a regulatory requirement to measure radiation levels within Post Operational Clean Out (POCO) around nuclear facilities each year, resulting in a trend towards robotic deployments to gain an improved understanding during nuclear decommissioning phases. The UK Nuclear Decommissioning Authority supports the view that human-in-the-loop (HITL) robotic deployments are a solution to improve procedures and reduce risks within radiation characterisation of nuclear sites. The authors present a novel implementation of a Cyber-Physical System (CPS) deployed in an analogue nuclear environment, comprising a multi-robot (MR) team coordinated by a HITL operator through a digital twin interface. The development of the CPS created efficient partnerships across systems including robots, digital systems and humans. This was presented as a multi-staged mission within an inspection scenario for the heterogeneous Symbiotic Multi-Robot Fleet (SMuRF). Symbiotic interactions were achieved across the SMuRF, where robots utilised automated collaborative governance to work together in cases where a single robot would face challenges in full characterisation of radiation. Key contributions include the demonstration of symbiotic autonomy and query-based learning of an autonomous mission supporting scalable autonomy and autonomy as a service. The coordination of the CPS was a success and revealed further challenges and improvements related to future MR fleets.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.12103
Citations: 0
Off-policy correction algorithm for double Q network based on deep reinforcement learning
IET Cybersystems and Robotics Pub Date : 2023-12-21 DOI: 10.1049/csy2.12102
Qingbo Zhang, Manlu Liu, Heng Wang, Weimin Qian, Xinglang Zhang
Abstract: A deep reinforcement learning (DRL) method based on the deep deterministic policy gradient (DDPG) algorithm is proposed to address three problems: the mismatch between the needed training samples and the actual training samples during agent training, the overestimation and underestimation of Q-values, and the insufficient dynamism of the agent's policy exploration. This method introduces the Actor-Critic Off-Policy Correction (AC-Off-POC) reinforcement learning framework and an improved double Q-value learning method, which enables the value function network in the target task to provide a more accurate evaluation of the policy network and to converge to the optimal policy more quickly and stably to obtain higher value returns. The method is applied to multiple MuJoCo tasks on the OpenAI Gym simulation platform. The experimental results show that it is better than the DDPG algorithm based solely on the off-policy correction framework (AC-Off-POC) and the conventional DRL algorithm. The returns and stability of the double-Q-network off-policy correction algorithm for the deep deterministic policy gradient (DCAOP-DDPG) proposed by the authors are significantly higher than those of other DRL algorithms.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.12102
Citations: 0
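As background for the double Q-value idea in the abstract above, the sketch below shows the widely used clipped double-Q target, which takes the smaller of two target-critic estimates to curb overestimation. It is a reference point only, not the authors' improved method: the abstract does not specify their update rule, and plain clipping is known to lean towards underestimation, which is one of the problems they say they address. The function name and the sample values are hypothetical.

```python
import numpy as np

def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD target using the smaller of two target-critic estimates."""
    min_q = np.minimum(q1_next, q2_next)
    # Terminal transitions (done == 1) do not bootstrap from the next state.
    return reward + gamma * (1.0 - done) * min_q

# Two hypothetical transitions: one ongoing, one terminal.
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])
q1_next = np.array([10.0, 4.0])
q2_next = np.array([8.0, 6.0])
target = clipped_double_q_target(reward, done, q1_next, q2_next)
```

For the first transition the target bootstraps from the lower estimate, 8.0 rather than 10.0; for the terminal transition the target is just the reward.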