{"title":"An integrated decision-execution framework of cooperative control for multi-agent systems via reinforcement learning","authors":"Mai-Kao Lu , Ming-Feng Ge , Zhi-Chen Yan , Teng-Fei Ding , Zhi-Wei Liu","doi":"10.1016/j.sysconle.2024.105949","DOIUrl":"10.1016/j.sysconle.2024.105949","url":null,"abstract":"<div><div>Cooperative control is both a crucial and hot research topic for multi-agent systems (MASs). However, most existing cooperative control strategies guarantee tracking stability under various non-ideal conditions, while the path decision capability is often ignored. In this paper, the integrated decision-execution (IDE) framework is newly presented for cooperative control of multi-agent systems (MASs) to accomplish the integrated task of path decision and cooperative execution. This framework includes a decision layer and a control layer. The decision layer generates a continuous trajectory for the virtual leader to reach the target from its initial position in an unknown environment. To achieve the goal of this layer, (1) the Step-based Adaptive Search Q-learning (SASQ-learning) algorithm is proposed based on reinforcement learning to efficiently find the discrete path, (2) an Axis-based Trajectory Fitting (ATF) method is developed to convert the discrete path into a continuous trajectory for mobile agents. In the control layer, this trajectory is used to regulate the following MASs to achieve cooperative tracking control in the presence of input saturation. 
Simulation experiments are presented to demonstrate the effectiveness of this framework.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105949"},"PeriodicalIF":2.1,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142530583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spectrum computation and optimization for controllability Gramian of networked Laplacian systems with limited control placement","authors":"Yuexin Cao , Yibei Li , Zhuo Zou , Xiaoming Hu","doi":"10.1016/j.sysconle.2024.105945","DOIUrl":"10.1016/j.sysconle.2024.105945","url":null,"abstract":"<div><div>This paper investigates the problem of placing a given number of controls to optimize energy efficiency for a family of linear dynamical systems, whose structure is induced by the Laplacian of a square-grid network. To quantify the performance of control combinations, several metrics have been proposed based on the spectrum of the controllability Gramian. But commonly used algorithms to compute the spectrum are usually time-consuming. In this paper, we first classify five anchor symmetries of the network systems. Then motivated by various advantages of symmetric control combinations, we provide a method to compute the eigenvalues and eigenvectors of their controllability Gramians more efficiently. Specifically, we show that they can be expressed by those of two lower-dimensional matrices. Furthermore, our method can be applied for non-symmetric cases to provide upper and lower bounds for the spectrum of the controllability Gramians. 
Finally, by employing the sum of eigenvalues, i.e., the trace of controllability Gramian, as the objective function, we provide a closed-form algorithm to the spectrum optimization problem with a given number of controls subject to system controllability.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105945"},"PeriodicalIF":2.1,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142530584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stability analysis of systems with time-varying delays for conservatism and complexity reduction","authors":"Yu-Long Fan , Chuan-Ke Zhang , Yun-Fan Liu , Yong He , Qing-Guo Wang","doi":"10.1016/j.sysconle.2024.105948","DOIUrl":"10.1016/j.sysconle.2024.105948","url":null,"abstract":"<div><div>This paper is concerned with the stability analysis of systems with time-varying delays via the Lyapunov–Krasovskii functional (LKF) method. Unlike most existing works, which focus primarily on conservatism reduction, this paper aims to establish stability criteria with less conservatism as well as low complexity, based on a relatively simple LKF with improved derivative treatments. For this purpose, a fragmented-component-based integral inequality is developed through matrix-separation and mixed estimation of the augmented integral term, which tightens the estimation gap and contributes to conservatism reduction; and a novel linearized transformation method is proposed by stripping-simplification and matrix-injection, which handles nonlinear delay-itself-related terms at a low complexity cost. Then, a novel stability criterion as well as several comparative criteria are obtained for linear time-delay systems. Finally, the superiority of the proposed methods is demonstrated via two benchmark examples and a load frequency control system.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105948"},"PeriodicalIF":2.1,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142530581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Near optimality of Lipschitz and smooth policies in controlled diffusions","authors":"Somnath Pradhan , Serdar Yüksel","doi":"10.1016/j.sysconle.2024.105943","DOIUrl":"10.1016/j.sysconle.2024.105943","url":null,"abstract":"<div><div>For optimal control of diffusions under several criteria, due to computational or analytical reasons, many studies have a priori assumed control policies to be Lipschitz or smooth, often with no rigorous analysis on whether this restriction entails loss. While optimality of Markov/stationary Markov policies for expected finite horizon/infinite horizon (discounted/ergodic) cost and cost-up-to-exit time optimal control problems can be established under certain technical conditions, an optimal solution is typically only measurable in the state (and time, if the horizon is finite) with no a priori additional structural properties. In this paper, building on our recent work (Pradhan and Yüksel, 2024) establishing the regularity of optimal cost on the space of control policies under the Borkar control topology for a general class of controlled diffusions in <span><math><msup><mrow><mi>R</mi></mrow><mrow><mi>d</mi></mrow></msup></math></span>, we establish near optimality of smooth or Lipschitz continuous policies for optimal control under expected finite horizon, infinite horizon discounted, infinite horizon average, and up-to-exit time cost criteria. Under mild assumptions, we first show that smooth/Lipschitz continuous policies are dense in the space of Markov/stationary Markov policies under the Borkar topology. Then utilizing the continuity of optimal costs as a function of policies on the space of Markov/stationary policies under the Borkar topology, we establish that optimal policies can be approximated by smooth/Lipschitz continuous policies with arbitrary precision. 
While our results are extensions of our recent work, the practical significance of an explicit statement and accessible presentation dedicated to Lipschitz and smooth policies, given their prominence in the literature, motivates our current paper.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105943"},"PeriodicalIF":2.1,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142530582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Periodic event-triggered data-driven control for networked control systems with time-varying delays","authors":"Zi-Jie Wei , Kun-Zhi Liu , Yan-Wei Wang , Zhuo-Rui Pan , Si-Xin Wen , Xi-Ming Sun","doi":"10.1016/j.sysconle.2024.105951","DOIUrl":"10.1016/j.sysconle.2024.105951","url":null,"abstract":"<div><div>This article focuses on data-driven analysis and controller design for networked control systems (NCSs) with network-induced delays. The study considers a linear time-invariant (LTI) system controlled through a periodic event-triggering mechanism. First, by leveraging data-based representations, we establish data-based stability conditions for NCSs with time-varying delays. Furthermore, we propose the data-based method for co-designing the controller and the periodic event-triggering scheme. In addition, we present novel data-based conditions for verifying dissipativity properties of NCSs. The effectiveness of our proposed methods is validated through a simulation and a turbofan engine hardware-in-the-loop (HIL) experiment.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105951"},"PeriodicalIF":2.1,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142530580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint identification of system parameter and noise parameters in quantized systems","authors":"Jieming Ke, Yanlong Zhao, Ji-Feng Zhang","doi":"10.1016/j.sysconle.2024.105941","DOIUrl":"10.1016/j.sysconle.2024.105941","url":null,"abstract":"<div><div>This paper investigates the joint identification problem of unknown system parameter and noise parameters in quantized systems when the noises involved are Gaussian with unknown variance and mean value. Under such noises, previous investigations show that the unknown system parameter and noise parameters are not jointly identifiable in the single-threshold quantizer case. The joint identifiability in the multi-threshold quantizer case still remains an open problem. This paper proves that the unknown system parameter, the noise variance and the mean value are jointly identifiable if and only if there are at least two thresholds. Then, a decomposition-recombination identification algorithm is proposed to jointly identify the unknown system parameter and noise parameters. Firstly, a technique is designed to convert the identification problem with unknown noise parameters into an extended parameter identification problem with standard Gaussian noises. Secondly, the extended parameter is identified by a stochastic approximation method for quantized systems. To establish effectiveness, this paper proves the strong consistency and the <span><math><msup><mrow><mi>L</mi></mrow><mrow><mi>p</mi></mrow></msup></math></span> convergence of the algorithm under non-persistently exciting inputs and without any <em>a priori</em> knowledge on the range of the unknown system parameter. The almost sure convergence rate is also obtained. Furthermore, when the mean value is known, the unknown system parameter and noise variance can be jointly identified under weaker conditions on the inputs and the quantizer. 
Finally, the effectiveness of the proposed algorithm is demonstrated by simulation.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105941"},"PeriodicalIF":2.1,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142530579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust control of time-delayed stochastic switched systems with dwell","authors":"E. Gershon , L.I. Allerhand , U. Shaked","doi":"10.1016/j.sysconle.2024.105934","DOIUrl":"10.1016/j.sysconle.2024.105934","url":null,"abstract":"<div><div>Linear, state-delayed, discrete-time, stochastic, switched systems are considered, where the problems of stochastic <span><math><msub><mrow><mi>l</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span>-gain and state-feedback control designs are treated and solved. We first develop a special version of a bounded real lemma for the said systems for the nominal case.</div><div>Based on this lemma we derive state-feedback gains for nominal systems where in our solution method, to each subsystem of the switched system, a Lyapunov function is assigned that is non-increasing at the switching instants and where a dwell time constraint is imposed on the system. The assigned Lyapunov function is allowed to vary piecewise linearly in time, starting at the end of the previous switch instant, and it becomes time-invariant after the dwell. Based on the solution of the state-feedback control for nominal systems and exploiting the fact that this solution is affine in the system matrices, a state-feedback control is derived for the polytopic case. 
We present a numerical example that demonstrates the solvability and tractability of our solution method.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105934"},"PeriodicalIF":2.1,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142433853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-driven control of nonlinear systems: An online sequential approach","authors":"Minh Vu , Yunshen Huang , Shen Zeng","doi":"10.1016/j.sysconle.2024.105932","DOIUrl":"10.1016/j.sysconle.2024.105932","url":null,"abstract":"<div><div>While data-driven control has shown its potential for solving complex tasks, current algorithms such as reinforcement learning are still data-intensive and often limited to simulated environments. Model-based learning is a promising approach to reducing the amount of data required in practical implementations, yet it suffers from a critical issue known as model exploitation. In this paper, we present a sequential approach to model-based learning that avoids model exploitation and achieves stable system behaviors during learning with minimal exploration. The advocated control design utilizes estimates of the system’s local dynamics to step-by-step improve the control. During the process, when additional data is required, the program pauses the control synthesis to collect data in the surrounding area and updates the model accordingly. The local and sequential nature of this approach is the key component to <em>regulating the system’s exploration in the state–action space</em> and, at the same time, <em>avoiding the issue of model exploitation</em>, which are the main challenges in model-based learning control. 
Through simulated examples and physical experiments, we demonstrate that the proposed approach can quickly learn a desirable control from scratch, with just a small number of trials.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105932"},"PeriodicalIF":2.1,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142426246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal impulse control problems with time delays: An illustrative example","authors":"Giovanni Fusco , Monica Motta , Richard Vinter","doi":"10.1016/j.sysconle.2024.105940","DOIUrl":"10.1016/j.sysconle.2024.105940","url":null,"abstract":"<div><div>For impulse control systems described by a measure driven differential equation, depending linearly on the measure, it is customary to interpret the state trajectory corresponding to an impulse control, specified by a measure, as the limit of state trajectories associated with some sequence of conventional controls approximating the measure. It is known that, when the measure is vector valued, it is possible that different choices of approximating sequences for the measure give rise to different limiting state trajectories. If the measure is scalar valued, however, there is a unique limiting trajectory. Now consider impulse control systems, in which the right side of the measure driven differential equation depends on both the current and delayed states. In recent work by the authors it has been shown that, for such impulse control systems with time delay, the state trajectory corresponding to a given measure may be non-unique, even when the measure is scalar valued. It was also shown that each limiting state trajectory can be identified with the unique state trajectory associated with some measure together with a family of ‘attached controls’. (The attached controls capture the nature of the measure approximation.) The authors also derived a maximum principle governing minimizers for a general class of impulse optimal control problems with time delay, in which the domain of the optimization problem comprises measures coupled with a family of ‘attached controls’. 
The purpose of this paper is both to illustrate, by means of an example, this newly discovered non-uniqueness phenomenon and to provide the first application of the new maximum principle, to investigate minimizers for scalar input impulse optimal control problems with time delay, in circumstances when limiting state trajectories associated with a given measure control are not unique. The example is an optimal control problem, for which the underlying control system is a forced harmonic oscillator, with scalar impulse control, in which the control gain is a nonlinear function of the current and delayed states.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105940"},"PeriodicalIF":2.1,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142426245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inverse reinforcement learning methods for linear differential games","authors":"Hamed Jabbari Asl, Eiji Uchibe","doi":"10.1016/j.sysconle.2024.105936","DOIUrl":"10.1016/j.sysconle.2024.105936","url":null,"abstract":"<div><div>In this study, we considered the problem of inverse reinforcement learning or estimating the cost function of expert players in multi-player differential games. We proposed two online data-driven solutions for linear–quadratic games that are applicable to systems that fulfill a specific dimension criterion or whose unknown matrices in the cost function conform to a diagonal condition. The first method, which is partially model-free, utilizes the trajectories of expert agents to solve the problem. The second method is entirely model-free and employs the trajectories of both expert and learner agents. We determined the conditions under which the solutions are applicable and identified the necessary requirements for the collected data. We conducted numerical simulations to establish the effectiveness of the proposed methods.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105936"},"PeriodicalIF":2.1,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142426249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}