Distributed optimal synchronization control of linear networked systems under unknown dynamics

Farzaneh Tatari, M. Naghibi-Sistani, K. Vamvoudakis
2017 American Control Conference (ACC), published 2017-05-24
DOI: 10.23919/ACC.2017.7963029 (https://doi.org/10.23919/ACC.2017.7963029)
Citations: 8

Abstract

This work proposes an online optimal distributed learning algorithm that finds the game-theoretic solution of systems on graphs with completely unknown dynamics. The algorithm learns online an approximate solution to the cooperative coupled Hamilton-Jacobi (HJ) equations. Each player employs an actor/critic network structure to learn its optimal cost and optimal policy, together with intelligent identifiers that remove the need for knowledge of the system dynamics. We use recorded experiences concurrently with current data to guarantee proper state exploration. The closed-loop system is proved to be stable, and the policies form a Nash equilibrium. Finally, simulation results verify the effectiveness of the proposed approach.
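The idea of using recorded experiences concurrently with current data can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a toy linear-in-parameters model y = phi^T theta, and the names `concurrent_update`, `theta_true`, `memory`, the step size, and all numbers are hypothetical. The current regressor `phi_now` is deliberately held in one fixed direction, so current data alone cannot identify both parameters, while two recorded samples that span the parameter space (an assumption made here) make the estimate converge.

```python
import numpy as np

def concurrent_update(theta_hat, phi_now, y_now, memory, lr=0.05):
    """One gradient step on the current sample plus every recorded sample."""
    # Gradient contribution of the current measurement.
    grad = phi_now * (y_now - phi_now @ theta_hat)
    # Gradient contributions of the recorded (phi, y) pairs.
    for phi_k, y_k in memory:
        grad += phi_k * (y_k - phi_k @ theta_hat)
    return theta_hat + lr * grad

# Hypothetical "unknown" parameters, used only to generate data.
theta_true = np.array([2.0, -1.0])

# Recorded experiences: regressors that span the 2-D parameter space.
memory = [(np.array([1.0, 0.0]), 2.0),
          (np.array([0.0, 1.0]), -1.0)]

# Current data: a regressor stuck in one direction (not exciting).
phi_now = np.array([1.0, 1.0])
y_now = phi_now @ theta_true

theta_cl = np.zeros(2)     # estimate using current + recorded data
theta_plain = np.zeros(2)  # estimate using current data only
for _ in range(2000):
    theta_cl = concurrent_update(theta_cl, phi_now, y_now, memory)
    theta_plain = concurrent_update(theta_plain, phi_now, y_now, [])

print("with recorded data:   ", theta_cl)
print("without recorded data:", theta_plain)
```

In this toy setting the estimate that uses only current data fits the single observed direction but leaves the orthogonal direction unidentified, whereas the estimate that also uses the recorded samples recovers `theta_true`. This is a loose analogue of the role that recorded experiences play in replacing a persistent exploration requirement.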