Da Zhang, Junaid Anwar, Syed Ali Asad Rizvi, Yusheng Wei
{"title":"基于采样和深度神经网络的图形游戏连续时间前导同步深度学习","authors":"Da Zhang, Junaid Anwar, Syed Ali Asad Rizvi, Yusheng Wei","doi":"10.1115/1.4063607","DOIUrl":null,"url":null,"abstract":"Abstract We propose a novel deep learning-based approach for the problem of continuous-time leader synchronization in graphical games on large networks. The problem setup is to deploy a distributed and coordinated swarm to track the trajectory of a leader while minimizing local neighborhood tracking error and control cost for each agent. The goal of our work is to develop optimal control poli- cies for continuous-time leader synchronization in graphical games using deep neural networks. We discretize the agents model using sampling to facilitate the modification of gradient descent methods for learning optimal control policies. The distributed swarm is deployed for a certain amount of time while keeping the control input of each agent constant during each sampling period. After collecting state and input data at each sampling time during one iteration, we update the weights of a deep neural network for each agent using collected data to minimize a loss function that characterizes the agents local neighborhood tracking error and the control cost. A modified gradient descent method is presented to overcome existing limitations. The performance of the proposed method is compared with two reinforcement learning-based methods in terms of robustness to initial neural network weights and initial local neighbor- hood tracking errors, and the scalability to networks with a large number of agents. Our approach has been shown to achieve superior performance compared with the other two methods.","PeriodicalId":327130,"journal":{"name":"ASME Letters in Dynamic Systems and Control","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Deep Learning for Continuous-time Leader Synchronization in Graphical Games Using Sampling and Deep Neural Networks\",\"authors\":\"Da Zhang, Junaid Anwar, Syed Ali Asad Rizvi, Yusheng Wei\",\"doi\":\"10.1115/1.4063607\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract We propose a novel deep learning-based approach for the problem of continuous-time leader synchronization in graphical games on large networks. The problem setup is to deploy a distributed and coordinated swarm to track the trajectory of a leader while minimizing local neighborhood tracking error and control cost for each agent. The goal of our work is to develop optimal control poli- cies for continuous-time leader synchronization in graphical games using deep neural networks. We discretize the agents model using sampling to facilitate the modification of gradient descent methods for learning optimal control policies. The distributed swarm is deployed for a certain amount of time while keeping the control input of each agent constant during each sampling period. After collecting state and input data at each sampling time during one iteration, we update the weights of a deep neural network for each agent using collected data to minimize a loss function that characterizes the agents local neighborhood tracking error and the control cost. A modified gradient descent method is presented to overcome existing limitations. 
The performance of the proposed method is compared with two reinforcement learning-based methods in terms of robustness to initial neural network weights and initial local neighbor- hood tracking errors, and the scalability to networks with a large number of agents. Our approach has been shown to achieve superior performance compared with the other two methods.\",\"PeriodicalId\":327130,\"journal\":{\"name\":\"ASME Letters in Dynamic Systems and Control\",\"volume\":\"38 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-10-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ASME Letters in Dynamic Systems and Control\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1115/1.4063607\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ASME Letters in Dynamic Systems and Control","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1115/1.4063607","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Deep Learning for Continuous-time Leader Synchronization in Graphical Games Using Sampling and Deep Neural Networks
Abstract We propose a novel deep learning-based approach for the problem of continuous-time leader synchronization in graphical games on large networks. The problem setup is to deploy a distributed and coordinated swarm to track the trajectory of a leader while minimizing the local neighborhood tracking error and control cost for each agent. The goal of our work is to develop optimal control policies for continuous-time leader synchronization in graphical games using deep neural networks. We discretize the agents' model using sampling to facilitate the modification of gradient descent methods for learning optimal control policies. The distributed swarm is deployed for a fixed duration, with each agent's control input held constant during each sampling period. After collecting state and input data at each sampling time during one iteration, we update the weights of a deep neural network for each agent using the collected data to minimize a loss function that characterizes the agent's local neighborhood tracking error and control cost. A modified gradient descent method is presented to overcome existing limitations. The performance of the proposed method is compared with two reinforcement learning-based methods in terms of robustness to initial neural network weights and initial local neighborhood tracking errors, and scalability to networks with a large number of agents. Our approach is shown to achieve superior performance compared with the other two methods.
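The sketch below is a minimal illustration of the loop described in the abstract, not the authors' implementation: the network sizes, gains Q and R, sampling period, and the linear agent model x_dot = A_sys x + B u are illustrative assumptions. It shows one sampled iteration in which each agent's control input is held constant over the sampling period (a zero-order hold), the agent's state is advanced, and the agent's network weights are updated by a plain gradient step on a loss combining the local neighborhood tracking error and the control cost.

```python
# A minimal sketch (assumed names, shapes, and agent model; not the paper's exact method).
import torch
import torch.nn as nn

dt = 0.05              # sampling period (assumed)
Q, R = 1.0, 0.1        # weights on tracking error and control cost (assumed)

class PolicyNet(nn.Module):
    """Per-agent policy: maps the local neighborhood tracking error to a control input."""
    def __init__(self, n_state, n_input, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_state, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, n_input))
    def forward(self, delta):
        return self.net(delta)

def local_error(i, x, x_leader, adj, pin):
    """delta_i = sum_j a_ij (x_i - x_j) + g_i (x_i - x_leader)."""
    d = pin[i] * (x[i] - x_leader)
    for j in range(x.shape[0]):
        d = d + adj[i, j] * (x[i] - x[j])
    return d

def sampled_iteration(policies, optimizers, x, x_leader, adj, pin, A_sys, B):
    """One iteration: hold each u_i constant over [t, t + dt], step the dynamics,
    then update each agent's weights from the tracking error and control cost."""
    n = x.shape[0]
    x_next = torch.empty_like(x)
    for i in range(n):
        delta = local_error(i, x, x_leader, adj, pin)
        u = policies[i](delta)                          # constant over the sampling period
        # Zero-order-hold (Euler) step of the continuous-time model x_dot = A_sys x + B u
        xi_next = x[i] + dt * (A_sys @ x[i] + B @ u)
        # Tracking error at the next sampling instant, differentiable through u
        delta_next = pin[i] * (xi_next - x_leader)
        for j in range(n):
            delta_next = delta_next + adj[i, j] * (xi_next - x[j])
        loss = Q * delta_next.dot(delta_next) + R * u.dot(u)
        optimizers[i].zero_grad()
        loss.backward()
        optimizers[i].step()
        x_next[i] = xi_next.detach()
    return x_next
```

In the paper's setting the leader trajectory would also be advanced over each sampling period, and the modified gradient descent method the authors propose would replace the plain optimizer step used here.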