IEEE open journal of control systems: latest articles (page 10)

Distributed Data-Driven Control of Network Systems

IEEE open journal of control systems Pub Date : 2023-03-20 DOI: 10.1109/OJCSYS.2023.3259228

Federico Celi;Giacomo Baggio;Fabio Pasqualetti

Citations: 3

Policy Evaluation in Decentralized POMDPs With Belief Sharing

IEEE open journal of control systems Pub Date : 2023-03-18 DOI: 10.1109/OJCSYS.2023.3277760

Mert Kayaalp;Fatima Ghadieh;Ali H. Sayed

Citations: 0

Model-Based Reinforcement Learning via Stochastic Hybrid Models

IEEE open journal of control systems Pub Date : 2023-03-17 DOI: 10.1109/OJCSYS.2023.3277308

Hany Abdulsamad;Jan Peters

Abstract: Optimal control of general nonlinear systems is a central challenge in automation. Enabled by powerful function approximators, data-driven approaches to control have recently successfully tackled challenging applications. However, such methods often obscure the structure of dynamics and control behind black-box over-parameterized representations, thus limiting our ability to understand closed-loop behavior. This article adopts a hybrid-system view of nonlinear modeling and control that lends an explicit hierarchical structure to the problem and breaks down complex dynamics into simpler localized units. We consider a sequence modeling paradigm that captures the temporal structure of the data and derive an expectation-maximization (EM) algorithm that automatically decomposes nonlinear dynamics into stochastic piecewise affine models with nonlinear transition boundaries. Furthermore, we show that these time-series models naturally admit a closed-loop extension that we use to extract local polynomial feedback controllers from nonlinear experts via behavioral cloning. Finally, we introduce a novel hybrid relative entropy policy search (Hb-REPS) technique that incorporates the hierarchical nature of hybrid models and optimizes a set of time-invariant piecewise feedback controllers derived from a piecewise polynomial approximation of a global state-value function.

Volume 2, pages 155-170. PDF: https://ieeexplore.ieee.org/iel7/9552933/9973428/10128705.pdf

Citations: 0

Provably Safe Reinforcement Learning via Action Projection Using Reachability Analysis and Polynomial Zonotopes

IEEE open journal of control systems Pub Date : 2023-03-13 DOI: 10.1109/OJCSYS.2023.3256305

Niklas Kochdumper;Hanna Krasowski;Xiao Wang;Stanley Bak;Matthias Althoff

Abstract: While reinforcement learning produces very promising results for many applications, its main disadvantage is the lack of safety guarantees, which prevents its use in safety-critical systems. In this work, we address this issue by a safety shield for nonlinear continuous systems that solve reach-avoid tasks. Our safety shield prevents applying potentially unsafe actions from a reinforcement learning agent by projecting the proposed action to the closest safe action. This approach is called action projection and is implemented via mixed-integer optimization. The safety constraints for action projection are obtained by applying parameterized reachability analysis using polynomial zonotopes, which enables to accurately capture the nonlinear effects of the actions on the system. In contrast to other state-of-the-art approaches for action projection, our safety shield can efficiently handle input constraints and dynamic obstacles, eases incorporation of the spatial robot dimensions into the safety constraints, guarantees robust safety despite process noise and measurement errors, and is well suited for high-dimensional systems, as we demonstrate on several challenging benchmark systems.

Volume 2, pages 79-92. PDF: https://ieeexplore.ieee.org/iel7/9552933/9973428/10068193.pdf
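
The action-projection idea in the abstract above (replace a proposed action with the closest safe one) can be illustrated with a toy sketch. This is not the authors' method, which relies on mixed-integer optimization with constraints from polynomial-zonotope reachability analysis. Here the safe set is simply assumed to be given as a union of closed intervals on a scalar action, and the names `project_to_safe` and `safe_intervals` are hypothetical.

```python
from typing import List, Tuple


def project_to_safe(action: float, safe_intervals: List[Tuple[float, float]]) -> float:
    """Return the point of the safe set closest to `action`.

    Assumption (not from the paper): the safe set is a union of closed
    intervals [lo, hi] on a one-dimensional action. The union is generally
    non-convex, so each interval is checked separately.
    """
    if not safe_intervals:
        raise ValueError("no safe action available")
    best = None
    for lo, hi in safe_intervals:
        if lo > hi:
            raise ValueError("interval bounds must satisfy lo <= hi")
        # Closest point of this interval to the proposed action.
        candidate = min(max(action, lo), hi)
        if best is None or abs(candidate - action) < abs(best - action):
            best = candidate
    return best


# Hypothetical safe set: two disjoint intervals.
safe = [(-1.0, 0.0), (0.8, 1.0)]
safe_action = project_to_safe(0.5, safe)   # unsafe proposal, moved to 0.8
unchanged = project_to_safe(0.9, safe)     # already safe, left as is
```

An action that is already safe passes through unchanged, so a shield of this kind only intervenes when the proposal falls outside the safe set.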

Citations: 9

Model-Free Distributed Reinforcement Learning State Estimation of a Dynamical System Using Integral Value Functions

IEEE open journal of control systems Pub Date : 2023-02-27 DOI: 10.1109/OJCSYS.2023.3250089

Babak Salamat;Gerhard Elsbacher;Andrea M. Tonello;Lenz Belzner

Abstract: One of the challenging problems in sensor network systems is to estimate and track the state of a target point mass with unknown dynamics. Recent improvements in deep learning (DL) show a renewed interest in applying DL techniques to state estimation problems. However, the process noise is absent which seems to indicate that the point-mass target must be non-maneuvering, as process noise is typically as significant as the measurement noise for tracking maneuvering targets. In this paper, we propose a continuous-time (CT) model-free or model-building distributed reinforcement learning estimator (DRLE) using an integral value function in sensor networks. The DRLE algorithm is capable of learning an optimal policy from a neural value function that aims to provide the estimation of a target point mass. The proposed estimator consists of two high pass consensus filters in terms of weighted measurements and inverse-covariance matrices and a critic reinforcement learning mechanism for each node in the network. The efficiency of the proposed DRLE is shown by a simulation experiment of a network of underactuated vertical takeoff and landing aircraft with strong input coupling. The experiment highlights two advantages of DRLE: i) it does not require the dynamic model to be known, and ii) it is an order of magnitude faster than the state-dependent Riccati equation (SDRE) baseline.

Volume 2, pages 70-78. PDF: https://ieeexplore.ieee.org/iel7/9552933/9973428/10054475.pdf

Citations: 1

The Internal Model Principle for Biomolecular Control Theory

IEEE open journal of control systems Pub Date : 2023-02-10 DOI: 10.1109/OJCSYS.2023.3244089

Ankit Gupta;Mustafa Khammash

Citations: 1

Certifying Black-Box Policies With Stability for Nonlinear Control

IEEE open journal of control systems Pub Date : 2023-02-01 DOI: 10.1109/OJCSYS.2023.3241486

Tongxin Li;Ruixiao Yang;Guannan Qu;Yiheng Lin;Adam Wierman;Steven H. Low

Abstract: Machine-learned black-box policies are ubiquitous for nonlinear control problems. Meanwhile, crude model information is often available for these problems from, e.g., linear approximations of nonlinear dynamics. We study the problem of certifying a black-box control policy with stability using model-based advice for nonlinear control on a single trajectory. We first show a general negative result that a naive convex combination of a black-box policy and a linear model-based policy can lead to instability, even if the two policies are both stabilizing. We then propose an adaptive $\lambda$-confident policy, with a coefficient $\lambda$ indicating the confidence in a black-box policy, and prove its stability. With bounded nonlinearity, in addition, we show that the adaptive $\lambda$-confident policy achieves a bounded competitive ratio when a black-box policy is near-optimal. Finally, we propose an online learning approach to implement the adaptive $\lambda$-confident policy and verify its efficacy in case studies about the Cart-Pole problem and a real-world electric vehicle (EV) charging problem with covariate shift due to COVID-19.

Volume 2, pages 49-62. PDF: https://ieeexplore.ieee.org/iel7/9552933/9973428/10034859.pdf
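
The abstract above starts from a convex combination of a black-box policy and a linear model-based policy, weighted by a confidence coefficient λ. A minimal sketch of that fixed-λ combination only, assuming the linear policy has the state-feedback form u = -Kx (an assumption, as are the names `linear_policy` and `combined_action`). The adaptive rule for choosing λ online, which is the paper's actual contribution, is not reproduced here.

```python
from typing import Callable, List, Sequence


def linear_policy(K: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Model-based linear state feedback u = -K x (assumed form)."""
    return [-sum(k * xj for k, xj in zip(row, x)) for row in K]


def combined_action(
    lam: float,
    black_box: Callable[[Sequence[float]], List[float]],
    K: Sequence[Sequence[float]],
    x: Sequence[float],
) -> List[float]:
    """Convex combination lam * u_blackbox + (1 - lam) * u_linear.

    lam = 1 trusts the black-box policy fully; lam = 0 falls back to the
    linear policy. Per the abstract, a fixed lam chosen naively does not
    guarantee stability, so this is only the building block.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    u_bb = black_box(x)
    u_lin = linear_policy(K, x)
    return [lam * a + (1.0 - lam) * b for a, b in zip(u_bb, u_lin)]


# Hypothetical two-state, one-input example.
K = [[1.0, 2.0]]
x = [1.0, 1.0]
constant_policy = lambda state: [1.0]  # stand-in for a learned policy
u = combined_action(0.5, constant_policy, K, x)  # 0.5 * 1.0 + 0.5 * (-3.0)
```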

Citations: 4

Cross Apprenticeship Learning Framework: Properties and Solution Approaches

IEEE open journal of control systems Pub Date : 2023-01-09 DOI: 10.1109/OJCSYS.2023.3235248

Ashwin Aravind;Debasish Chatterjee;Ashish Cherukuri

Abstract: Apprenticeship learning is a framework in which an agent learns a policy to perform a given task in an environment using example trajectories provided by an expert. In the real world, one might have access to expert trajectories in different environments where system dynamics is different while the learning task is the same. For such scenarios, two types of learning objectives can be defined. One where the learned policy performs very well in one specific environment and another when it performs well across all environments. To balance these two objectives in a principled way, our work presents the cross apprenticeship learning (CAL) framework. This consists of an optimization problem where an optimal policy for each environment is sought while ensuring that all policies remain close to each other. This nearness is facilitated by one tuning parameter in the optimization problem. We derive properties of the optimizers of the problem as the tuning parameter varies. We identify conditions under which an agent prefers using the policy obtained from CAL over the traditional apprenticeship learning. Since the CAL problem is nonconvex, we provide a convex outer approximation. Finally, we demonstrate the attributes of our framework in the context of a navigation task in a windy gridworld environment.

Volume 2, pages 36-48. PDF: https://ieeexplore.ieee.org/iel7/9552933/9973428/10011555.pdf

Citations: 0

Modeling and Characterization of Pre-Charged Collapse-Mode CMUTs

IEEE open journal of control systems Pub Date : 2023-01-01 DOI: 10.1109/OJUFFC.2023.3240699

M. Saccher;Shinnosuke Kawasaki;J. Klootwijk;R. van Schaijk;Ronald Dekker

Abstract: Recently, the applications of ultrasound transducers expanded from high-end diagnostic tools to point of care diagnostic devices and wireless power receivers for implantable devices. These new applications additionally require that the transducer technology must comply to biocompatibility and manufacturing scalability. In this respect, Capacitive Micromachined Ultrasound Transducers (CMUTs) have a strong advantage compared to the conventional PZT based transducers. However, current CMUTs require a large DC bias voltage for their operation, which limits the miniaturizability of these devices. In this study, we propose a pre-charged collapse-mode CMUT for immersive applications that can operate without an external bias by means of a charge trapping Al2O3 layer embedded in the dielectrics between the top and bottom electrodes. The built-in charge layer was analytically modeled and four layer stack combinations were investigated and characterized. The measurement results of the CMUTs were then used to fit the model and to quantify the amount and type of trapped charge. It was found that these devices polarize due to the ferroelectric-like behavior of the Al2O3, and the amount of charge stored in the charge-trapping layer was estimated to be approximately 0.02 C/m². Their acoustic performance shows a transmit and receive sensitivity of 8.8 kPa/V and 13.1 V/MPa respectively. In addition, we show that increasing the charging temperature, the charging duration, and the charging voltage results in a higher amount of stored charge. Finally, results of ALT tests showed that these devices have a lifetime of more than 2.5 years at body temperature.

Volume 3, pages 14-28.

Citations: 3

Exact Decomposition of Optimal Control Problems via Simultaneous Block Diagonalization of Matrices

IEEE open journal of control systems Pub Date : 2022-12-22 DOI: 10.1109/OJCSYS.2022.3231553

Amirhossein Nazerian;Kshitij Bhatta;Francesco Sorrentino

Citations: 1