Distributed optimal synchronization control of linear networked systems under unknown dynamics

2017 American Control Conference (ACC) Pub Date: 2017-05-24 DOI: 10.23919/ACC.2017.7963029

Farzaneh Tatari, M. Naghibi-Sistani, K. Vamvoudakis


Citations: 8

Abstract

Distributed optimal synchronization control of linear networked systems under unknown dynamics

This work proposes an online optimal distributed learning algorithm to find the game-theoretic solution of systems on graphs with completely unknown dynamics. The proposed algorithm learns the approximate solution to the cooperative coupled Hamilton-Jacobi (HJ) equations online. Each player employs an actor/critic network structure to learn the optimal cost and the optimal policy, along with intelligent identifiers that obviate the need for knowledge of the system dynamics. Recorded experiences are used concurrently with current data to guarantee proper state exploration. The closed-loop system is proved to be stable, and the policies are shown to form a Nash equilibrium. Finally, simulation results verify the effectiveness of the proposed approach.
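
The idea of using recorded experiences concurrently with current data can be illustrated with a small sketch. The abstract does not give the paper's tuning laws, so the code below is not the authors' algorithm: it assumes a generic linear-in-parameters model `y = theta^T phi` and a plain gradient step over the current sample plus a stored memory. All names (`concurrent_update`, `memory`, `lr`) and all numbers are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only (assumed model, not the paper's update law):
# gradient learning of theta in y = theta^T phi, where each step uses the
# current sample together with a memory of recorded samples.

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def concurrent_update(theta, phi_now, y_now, memory, lr):
    """One gradient step on the squared error of the current sample
    plus every recorded (phi, y) pair in memory."""
    grad = [0.0] * len(theta)
    for phi, y in [(phi_now, y_now)] + memory:
        err = y - dot(theta, phi)
        for i, p in enumerate(phi):
            grad[i] += err * p
    return [t + lr * g for t, g in zip(theta, grad)]

# Hypothetical ground truth: theta = [2.0, -1.0].
# The recorded samples span both regressor directions.
memory = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]

# The current data only ever excites the first direction.
phi_now, y_now = [1.0, 0.0], 2.0

theta_with_memory = [0.0, 0.0]
theta_without_memory = [0.0, 0.0]
for _ in range(200):
    theta_with_memory = concurrent_update(
        theta_with_memory, phi_now, y_now, memory, lr=0.2)
    theta_without_memory = concurrent_update(
        theta_without_memory, phi_now, y_now, [], lr=0.2)

print(theta_with_memory)
print(theta_without_memory)
```

In this toy setting the estimate that reuses recorded data recovers both parameters, while the estimate driven only by the current sample never updates its second component, because the current data carries no information about it. This is the intuition behind replacing a persistent-excitation requirement on the live signal with a richness condition on the stored data.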
