Deep reinforcement learning-based trajectory planning for double pendulum cranes: Design and experiments

Impact Factor: 7.9 · JCR Q1, Engineering, Mechanical · CAS Tier 1 (Engineering & Technology)
Weili Ding, Heng Zhang, Changchun Hua, Biao Lu
{"title":"Deep reinforcement learning-based trajectory planning for double pendulum cranes: Design and experiments","authors":"Weili Ding ,&nbsp;Heng Zhang ,&nbsp;Changchun Hua ,&nbsp;Biao Lu","doi":"10.1016/j.ymssp.2025.112780","DOIUrl":null,"url":null,"abstract":"<div><div>Since cranes usually possess double pendulum dynamics, the mass of the payload often changes and there are frequently lifting/lowering operations simultaneously. Moreover, most crane systems are driven by motors, whose velocities and accelerations are often limited. To solve the above problems, this paper proposes a deep reinforcement learning (DRL) reference trajectory generation method based on virtual–physical joint training. Firstly, a DRL module based on deep deterministic policy gradient (DDPG), along with a double pendulum crane dynamic model and an adaptive controller, are established within the virtual environment for training the reference trajectory. When the reward reaches the threshold, the training is switched to the physical environment to further optimize the reference trajectory, realizing the swing suppression of the hook and the payload. In addition, to satisfy the performance of the drive motors, velocity and acceleration thresholds are set to constrain the performance of the drive motors. Finally, in view of the fact that the operation process of the double pendulum crane is divided into payload transportation and payload loading/unloading, an event-triggering mechanism is designed to switch different control policies in accordance with different operation processes, thus reducing the consumption of computing resources. Through experiments on the actual double pendulum crane and comparison with the existing reference trajectories and input shapers, the superiority of this method is demonstrated.</div></div>","PeriodicalId":51124,"journal":{"name":"Mechanical Systems and Signal Processing","volume":"234 ","pages":"Article 112780"},"PeriodicalIF":7.9000,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mechanical Systems and Signal Processing","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0888327025004819","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, MECHANICAL","Score":null,"Total":0}
Citations: 0

Abstract

Cranes usually exhibit double-pendulum dynamics, the payload mass often changes, and lifting/lowering operations frequently occur at the same time as transportation. Moreover, most crane systems are driven by motors whose velocities and accelerations are limited. To address these problems, this paper proposes a deep reinforcement learning (DRL) reference-trajectory generation method based on virtual-physical joint training. First, a DRL module based on the deep deterministic policy gradient (DDPG) algorithm, together with a double-pendulum crane dynamic model and an adaptive controller, is established in a virtual environment to train the reference trajectory. Once the reward reaches a threshold, training switches to the physical environment to further optimize the reference trajectory, suppressing the swing of both the hook and the payload. In addition, velocity and acceleration thresholds are imposed so that the trajectory remains within the capabilities of the drive motors. Finally, since the operation of a double-pendulum crane divides into payload transportation and payload loading/unloading, an event-triggering mechanism is designed to switch control policies according to the current operation phase, reducing the consumption of computing resources. Experiments on an actual double-pendulum crane, including comparisons with existing reference trajectories and input shapers, demonstrate the superiority of the method.
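The abstract outlines four mechanisms: DDPG-based trajectory training, a reward-threshold switch from the virtual to the physical environment, velocity/acceleration limits that keep commands motor-feasible, and event-triggered switching between a transportation policy and a loading/unloading policy. The sketch below shows one way these pieces could fit together. It is a minimal illustration only: the agent/environment interfaces, all threshold values, and the trigger condition are assumptions, not the authors' implementation.

```python
import numpy as np

# Minimal sketch of the virtual-physical joint training workflow described
# in the abstract. Interfaces, thresholds, and the trigger condition are
# illustrative assumptions, not the authors' code.

DT = 0.02              # control period in seconds (assumed)
V_MAX = 0.5            # motor velocity limit, m/s (assumed)
A_MAX = 0.3            # motor acceleration limit, m/s^2 (assumed)
REWARD_SWITCH = 900.0  # episode-reward threshold for the sim-to-real switch (assumed)

def constrain(v_cmd: float, v_prev: float) -> float:
    """Clip a commanded trolley velocity so that both the velocity and the
    implied acceleration respect the drive-motor thresholds."""
    v_cmd = float(np.clip(v_cmd, -V_MAX, V_MAX))
    a = float(np.clip((v_cmd - v_prev) / DT, -A_MAX, A_MAX))
    return v_prev + a * DT

def joint_train(agent, sim_env, real_env, episodes: int = 500) -> None:
    """Train in the virtual environment first; once an episode's reward
    reaches the threshold, continue refining on the physical crane.
    `agent` is any DDPG-style object with act/remember/update methods, and
    each env follows a reset()/step() convention (duck-typed assumptions)."""
    env = sim_env
    for _ in range(episodes):
        state, v_prev, ep_reward, done = env.reset(), 0.0, 0.0, False
        while not done:
            v_cmd = agent.act(state)            # raw actor output
            v_safe = constrain(v_cmd, v_prev)   # motor-feasible command
            next_state, reward, done = env.step(v_safe)
            agent.remember(state, v_safe, reward, next_state, done)
            agent.update()                      # one actor-critic update step
            state, v_prev = next_state, v_safe
            ep_reward += reward
        if env is sim_env and ep_reward >= REWARD_SWITCH:
            env = real_env                      # switch training to hardware

def select_policy(trolley_moving: bool, rope_changing: bool,
                  transport_policy, loading_policy):
    """Event-triggered policy switch (assumed trigger): run the transport
    policy while the trolley travels, and the loading/unloading policy when
    only the rope length is changing, so a single policy executes at a time
    and computing resources are saved."""
    if rope_changing and not trolley_moving:
        return loading_policy
    return transport_policy
```

Note that the acceleration clamp in `constrain` acts on the finite difference of consecutive velocity commands, so the agent can explore freely while every command sent to the plant stays motor-feasible; the constraint mechanism in the paper itself may differ.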
Source journal

Mechanical Systems and Signal Processing (Engineering: Mechanical)

CiteScore: 14.80
Self-citation rate: 13.10%
Annual publications: 1183
Review time: 5.4 months

Journal introduction: Mechanical Systems and Signal Processing (MSSP) is an interdisciplinary journal spanning mechanical, aerospace, and civil engineering. It reports scientific advancements of the highest quality arising from new techniques in sensing, instrumentation, signal processing, modelling, and control of dynamic systems.