Federated reinforcement learning for robot motion planning with zero-shot generalization

Zhenyuan Yuan, Siyuan Xu, Minghui Zhu
arXiv - CS - Systems and Control · Published 2024-03-20 · doi: arxiv-2403.13245

Abstract

This paper considers the problem of learning a control policy for robot motion planning with zero-shot generalization, i.e., no data collection or policy adaptation is needed when the learned policy is deployed in new environments. We develop a federated reinforcement learning framework that enables collaborative learning among multiple learners and a central server, i.e., the Cloud, without sharing their raw data. In each iteration, each learner uploads its local control policy and the corresponding estimated normalized arrival time to the Cloud, which then computes the global optimum among the learners and broadcasts the optimal policy to them. Each learner then selects between its local control policy and the one from the Cloud for the next iteration. The proposed framework leverages the derived zero-shot generalization guarantees on arrival time and safety. Theoretical guarantees on almost-sure convergence, almost consensus, Pareto improvement, and the optimality gap are also provided. A Monte Carlo simulation is conducted to evaluate the proposed framework.
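The upload/broadcast/select round described above can be sketched as follows. This is a toy illustration, not the paper's algorithm: the scalar policy representation, the local update rule, and the arrival-time estimator are placeholder assumptions; only the round structure (learners upload, the Cloud picks the optimum and broadcasts, learners select) follows the abstract.

```python
import random

# Toy sketch of one federated round. The policy is a scalar, the local
# update is a random perturbation, and the arrival-time estimate is a
# quadratic cost -- all placeholders standing in for the paper's
# reinforcement-learning components.

def local_update(policy, rng):
    """Placeholder for one round of local reinforcement learning."""
    return policy + rng.uniform(-0.1, 0.1)

def estimate_arrival_time(policy):
    """Placeholder normalized arrival-time estimate (lower is better)."""
    return (policy - 1.0) ** 2  # pretend the best policy sits at 1.0

def federated_round(local_policies, rng):
    # Each learner refines its policy locally, then uploads the policy
    # together with its estimated normalized arrival time to the Cloud.
    uploads = []
    for p in local_policies:
        p = local_update(p, rng)
        uploads.append((p, estimate_arrival_time(p)))

    # The Cloud computes the global optimum among the learners' uploads
    # and broadcasts the optimal policy back.
    cloud_policy, cloud_cost = min(uploads, key=lambda pc: pc[1])

    # Each learner selects between its local policy and the Cloud's for
    # the next iteration. (With a shared deterministic estimator, as
    # here, every learner adopts the Cloud policy; in the paper each
    # learner relies on its own estimate.)
    return [p if c <= cloud_cost else cloud_policy for p, c in uploads]

rng = random.Random(0)
policies = [0.0, 0.5, 2.0]
for _ in range(20):
    policies = federated_round(policies, rng)
```

Note that no raw trajectory data ever leaves a learner in this loop; only the policy and a scalar performance estimate are communicated, which is the privacy property the framework targets.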