Non-Parametric Neuro-Adaptive Formation Control

IF 6.4 | CAS Tier 2 (Computer Science) | JCR Q1 (Automation & Control Systems)
Christos K. Verginis;Zhe Xu;Ufuk Topcu
{"title":"非参数神经自适应编队控制","authors":"Christos K. Verginis;Zhe Xu;Ufuk Topcu","doi":"10.1109/TASE.2025.3528501","DOIUrl":null,"url":null,"abstract":"We develop a learning-based algorithm for the distributed formation control of networked multi-agent systems governed by unknown, nonlinear dynamics. Most existing algorithms either assume certain parametric forms for the unknown dynamic terms or resort to unnecessarily large control inputs in order to provide theoretical guarantees. The proposed algorithm avoids these drawbacks by integrating neural network-based learning with adaptive control in a two-step procedure. In the first step of the algorithm, each agent learns a controller, represented as a neural network, using training data that correspond to a collection of formation tasks and agent parameters. These parameters and tasks are derived by varying the nominal agent parameters and a user-defined formation task to be achieved, respectively. In the second step of the algorithm, each agent incorporates the trained neural network into an online and adaptive control policy in such a way that the behavior of the multi-agent closed-loop system satisfies the user-defined formation task. Both the learning phase and the adaptive control policy are distributed, in the sense that each agent computes its own actions using only local information from its neighboring agents. The proposed algorithm does not use any a priori information on the agents’ unknown dynamic terms or any approximation schemes. We provide formal theoretical guarantees on the achievement of the formation task. Note to Practitioners—This paper is motivated by control of multi-agent systems, such as teams of robots, smart grids, or wireless sensor networks, with uncertain dynamic models. Existing works develop controllers that rely on unrealistic or impractical assumptions on these models. We propose an algorithm that integrates offline learning with neural networks and real-time feedback control to accomplish a multi-agent task. The task consists of the formation of a pre-defined geometric pattern by the multi-agent team. The learning module of the proposed algorithm aims to learn stabilizing controllers that accomplish the task from data that are obtained from offline runs of the system. However, the learned controller might result in poor performance owing to potential data inaccuracies and the fact that learning algorithms can only approximate the stabilizing controllers. Therefore, we complement the learned controller with a real-time feedback-control module that adapts on the fly to such discrepancies. In practise, the data can be collected from pre-recorded trajectories of the multi-agent system, but these trajectories do need to accomplish the task at hand. The real-time feedback-control is a closed-form function of the states of each agent and its neighbours and the trained neural networks and can be straightforwardly implemented. The experimental results show that the proposed algorithm achieves greater performance than algorithms that use only the trained neural networks or only the real-time feedback-control policy. 
Our future research will address the sensitivity of the algorithm to the quality and quantity of the employed data as well as to the learning performance of the neural networks.","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"22 ","pages":"10684-10697"},"PeriodicalIF":6.4000,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Non-Parametric Neuro-Adaptive Formation Control\",\"authors\":\"Christos K. Verginis;Zhe Xu;Ufuk Topcu\",\"doi\":\"10.1109/TASE.2025.3528501\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We develop a learning-based algorithm for the distributed formation control of networked multi-agent systems governed by unknown, nonlinear dynamics. Most existing algorithms either assume certain parametric forms for the unknown dynamic terms or resort to unnecessarily large control inputs in order to provide theoretical guarantees. The proposed algorithm avoids these drawbacks by integrating neural network-based learning with adaptive control in a two-step procedure. In the first step of the algorithm, each agent learns a controller, represented as a neural network, using training data that correspond to a collection of formation tasks and agent parameters. These parameters and tasks are derived by varying the nominal agent parameters and a user-defined formation task to be achieved, respectively. In the second step of the algorithm, each agent incorporates the trained neural network into an online and adaptive control policy in such a way that the behavior of the multi-agent closed-loop system satisfies the user-defined formation task. Both the learning phase and the adaptive control policy are distributed, in the sense that each agent computes its own actions using only local information from its neighboring agents. The proposed algorithm does not use any a priori information on the agents’ unknown dynamic terms or any approximation schemes. We provide formal theoretical guarantees on the achievement of the formation task. Note to Practitioners—This paper is motivated by control of multi-agent systems, such as teams of robots, smart grids, or wireless sensor networks, with uncertain dynamic models. Existing works develop controllers that rely on unrealistic or impractical assumptions on these models. We propose an algorithm that integrates offline learning with neural networks and real-time feedback control to accomplish a multi-agent task. The task consists of the formation of a pre-defined geometric pattern by the multi-agent team. The learning module of the proposed algorithm aims to learn stabilizing controllers that accomplish the task from data that are obtained from offline runs of the system. However, the learned controller might result in poor performance owing to potential data inaccuracies and the fact that learning algorithms can only approximate the stabilizing controllers. Therefore, we complement the learned controller with a real-time feedback-control module that adapts on the fly to such discrepancies. In practise, the data can be collected from pre-recorded trajectories of the multi-agent system, but these trajectories do need to accomplish the task at hand. The real-time feedback-control is a closed-form function of the states of each agent and its neighbours and the trained neural networks and can be straightforwardly implemented. 
The experimental results show that the proposed algorithm achieves greater performance than algorithms that use only the trained neural networks or only the real-time feedback-control policy. Our future research will address the sensitivity of the algorithm to the quality and quantity of the employed data as well as to the learning performance of the neural networks.\",\"PeriodicalId\":51060,\"journal\":{\"name\":\"IEEE Transactions on Automation Science and Engineering\",\"volume\":\"22 \",\"pages\":\"10684-10697\"},\"PeriodicalIF\":6.4000,\"publicationDate\":\"2025-01-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Automation Science and Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10843302/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Automation Science and Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10843302/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

We develop a learning-based algorithm for the distributed formation control of networked multi-agent systems governed by unknown, nonlinear dynamics. Most existing algorithms either assume certain parametric forms for the unknown dynamic terms or resort to unnecessarily large control inputs in order to provide theoretical guarantees. The proposed algorithm avoids these drawbacks by integrating neural network-based learning with adaptive control in a two-step procedure. In the first step, each agent learns a controller, represented as a neural network, using training data that correspond to a collection of formation tasks and agent parameters; these tasks and parameters are derived by varying a user-defined formation task to be achieved and the nominal agent parameters, respectively. In the second step, each agent incorporates the trained neural network into an online, adaptive control policy such that the behavior of the multi-agent closed-loop system satisfies the user-defined formation task. Both the learning phase and the adaptive control policy are distributed, in the sense that each agent computes its own actions using only local information from its neighboring agents. The proposed algorithm does not use any a priori information on the agents' unknown dynamic terms or any approximation schemes, and we provide formal theoretical guarantees on the achievement of the formation task.

Note to Practitioners—This paper is motivated by the control of multi-agent systems with uncertain dynamic models, such as teams of robots, smart grids, or wireless sensor networks. Existing works develop controllers that rely on unrealistic or impractical assumptions about these models. We propose an algorithm that integrates offline learning with neural networks and real-time feedback control to accomplish a multi-agent task, namely the formation of a pre-defined geometric pattern by the multi-agent team. The learning module of the proposed algorithm aims to learn stabilizing controllers that accomplish the task from data obtained from offline runs of the system. However, the learned controller might perform poorly owing to potential data inaccuracies and the fact that learning algorithms can only approximate the stabilizing controllers. Therefore, we complement the learned controller with a real-time feedback-control module that adapts on the fly to such discrepancies. In practice, the data can be collected from pre-recorded trajectories of the multi-agent system, but these trajectories do need to accomplish the task at hand. The real-time feedback control is a closed-form function of the states of each agent and its neighbors and of the trained neural networks, and it can be implemented straightforwardly. The experimental results show that the proposed algorithm outperforms algorithms that use only the trained neural networks or only the real-time feedback-control policy. Our future research will address the sensitivity of the algorithm to the quality and quantity of the employed data, as well as to the learning performance of the neural networks.
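To make the two-step procedure concrete, the sketch below is a minimal, hypothetical Python illustration of the idea described in the abstract, not the authors' implementation: each agent carries a small neural network standing in for the controller trained offline in step one, and at run time (step two) combines the network's output with an adaptive feedback term driven by the formation error computed only from its neighbors' states. The single-integrator dynamics, network architecture, gains, and adaptation law are all illustrative assumptions.

```python
# Minimal, hypothetical sketch of a "learned controller + adaptive feedback"
# formation scheme (illustrative only; not the paper's algorithm).
# Assumptions: single-integrator agent dynamics, a tiny numpy MLP standing in
# for the offline-trained controller, and a simple error-driven adaptation law.
import numpy as np


class OfflineTrainedController:
    """Stands in for the neural network of step one.

    Weights here are random placeholders; in practice they would come from
    offline training on recorded formation-task data.
    """

    def __init__(self, in_dim, out_dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = 0.1 * rng.standard_normal((hidden, in_dim))
        self.W2 = 0.1 * rng.standard_normal((out_dim, hidden))

    def __call__(self, z):
        return self.W2 @ np.tanh(self.W1 @ z)


class Agent:
    """Step two: online policy = trained network + adaptive local feedback."""

    def __init__(self, x0, desired_offsets, net, k_fb=1.0, k_adapt=0.02):
        self.x = np.asarray(x0, dtype=float)   # 2-D position (state)
        self.offsets = desired_offsets         # {neighbor id: desired x_i - x_j}
        self.net = net                         # controller learned offline
        self.k_fb = k_fb                       # fixed feedback gain (assumed)
        self.k_adapt = k_adapt                 # adaptation rate (assumed)
        self.theta = 0.0                       # adaptive gain estimate

    def formation_error(self, neighbor_states):
        # Relative-position error w.r.t. neighbors: local information only.
        e = np.zeros_like(self.x)
        for j, x_j in neighbor_states.items():
            e += (self.x - x_j) - self.offsets[j]
        return e

    def control(self, neighbor_states):
        e = self.formation_error(neighbor_states)
        u_nn = self.net(np.concatenate([self.x, e]))  # learned feedforward term
        self.theta += self.k_adapt * float(e @ e)     # crude adaptation law
        return u_nn - (self.k_fb + self.theta) * e    # learned + adaptive feedback

    def step(self, u, dt=0.05):
        self.x = self.x + dt * u               # single-integrator update (assumed)


# Two agents asked to hold a fixed offset of (1, 0) between them.
net0 = OfflineTrainedController(in_dim=4, out_dim=2, seed=0)
net1 = OfflineTrainedController(in_dim=4, out_dim=2, seed=1)
a0 = Agent([0.0, 0.0], {1: np.array([-1.0, 0.0])}, net0)
a1 = Agent([2.0, 1.0], {0: np.array([1.0, 0.0])}, net1)
for _ in range(400):
    u0 = a0.control({1: a1.x})                 # each agent uses only neighbor states
    u1 = a1.control({0: a0.x})
    a0.step(u0)
    a1.step(u1)
print("final relative position x0 - x1:", a0.x - a1.x)  # should approach (-1, 0)
```

In this toy setting the adaptive gain grows with the accumulated local error, so the feedback term compensates for whatever the (here untrained) network fails to cancel; in the paper, by contrast, the online adaptive policy is designed so that the closed-loop formation task is satisfied with formal guarantees.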
Source journal
IEEE Transactions on Automation Science and Engineering (Engineering & Technology - Automation & Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months
Journal description: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.