{"title":"Finite Sample Analysis of Minmax Variant of Offline Reinforcement Learning for General MDPs","authors":"Jayanth Reddy Regatti;Abhishek Gupta","doi":"10.1109/OJCSYS.2022.3198660","DOIUrl":"https://doi.org/10.1109/OJCSYS.2022.3198660","url":null,"abstract":"In this work, we analyze the finite sample complexity bounds for offline reinforcement learning with general state, general function space and state-dependent action sets. The algorithm analyzed does not require the knowledge of the data-collection policy as compared to earlier works. We show that one can compute an $\\epsilon$-optimal Q function (state-action value function) using $O(1/\\epsilon^{4})$ i.i.d. samples of state-action-reward-next state tuples.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"152-163"},"PeriodicalIF":0.0,"publicationDate":"2022-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09857559.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50348952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Network Interventions to Control the Spreading of Oscillations","authors":"Ahmed Allibhoy;Federico Celi;Fabio Pasqualetti;Jorge Cortés","doi":"10.1109/OJCSYS.2022.3193127","DOIUrl":"https://doi.org/10.1109/OJCSYS.2022.3193127","url":null,"abstract":"Oscillations are a prominent feature of neuronal activity and are associated with a variety of phenomena in brain tissue, both healthy and unhealthy. Characterizing how oscillations spread through regions of the brain is of particular interest when studying countermeasures to pathological brain synchronizations. This paper models neuronal activity using networks of interconnected excitatory-inhibitory pairs with linear threshold dynamics, and presents strategies to design networks with desired robustness properties. In particular, we develop a dynamical description of the brain through a network where the state of each node models the firing rate of a region of neurons and where edges capture the structural connectivity between the regions. We characterize the presence of oscillations and study conditions on their spreading. We also discuss strategies to optimally design networks which are robust to oscillation spreading. We demonstrate our results with numerical simulations.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"141-151"},"PeriodicalIF":0.0,"publicationDate":"2022-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09854194.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50349025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting","authors":"Shaoru Chen;Eric Wong;J. Zico Kolter;Mahyar Fazlyab","doi":"10.1109/OJCSYS.2022.3187429","DOIUrl":"https://doi.org/10.1109/OJCSYS.2022.3187429","url":null,"abstract":"Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem, for which several past works have proposed convex relaxations as a promising alternative. However, even for reasonably-sized neural networks, these relaxations are not tractable, and so must be replaced by even weaker relaxations in practice. In this work, we propose a novel operator splitting method that can directly solve a convex relaxation of the problem to high accuracy, by splitting it into smaller sub-problems that often have analytical solutions. The method is modular, scales to very large problem instances, and consists of operations that are amenable to fast parallelization with GPU acceleration. We demonstrate our method in bounding the worst-case performance of large convolutional networks in image classification and reinforcement learning settings, and in reachability analysis of neural network dynamical systems.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"126-140"},"PeriodicalIF":0.0,"publicationDate":"2022-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09811356.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50349024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dissipative Deep Neural Dynamical Systems","authors":"Ján Drgoňa;Aaron Tuor;Soumya Vasisht;Draguna Vrabie","doi":"10.1109/OJCSYS.2022.3186838","DOIUrl":"https://doi.org/10.1109/OJCSYS.2022.3186838","url":null,"abstract":"In this paper, we provide sufficient conditions for dissipativity and local asymptotic stability of discrete-time dynamical systems parametrized by deep neural networks. We leverage the representation of neural networks as pointwise affine maps, thus exposing their local linear operators and making them accessible to classical system analytic and design methods. This allows us to “crack open the black box” of the neural dynamical system’s behavior by evaluating their dissipativity, and estimating their stationary points and state-space partitioning. We relate the norms of these local linear operators to the energy stored in the dissipative system with supply rates represented by their aggregate bias terms. Empirically, we analyze the variance in dynamical behavior and eigenvalue spectra of these local linear operators with varying weight factorizations, activation functions, bias terms, and depths.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"100-112"},"PeriodicalIF":0.0,"publicationDate":"2022-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09809789.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50349022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Discrete Fractional Order Adaptive Law for Parameter Estimation and Adaptive Control","authors":"Mohamed Aburakhis;Raúl Ordóñez;Ouboti Djaneye-Boundjou","doi":"10.1109/OJCSYS.2022.3185002","DOIUrl":"https://doi.org/10.1109/OJCSYS.2022.3185002","url":null,"abstract":"In this article, a discrete fractional order adaptive law (DFOAL) is designed based on the Caputo fractional difference to perform parameter estimation of structured uncertainties. The paper provides a rigorous stability analysis of the DFOAL parameter estimation method. The DFOAL is then modified in order to improve parameter estimator performance to show that, under certain conditions, it provides asymptotic convergence to the true parameter values even when the regressor is not persistently exciting. A method to allow for practical implementation of the DFOAL and the modified DFOAL is developed. Finally, the modified DFOAL is used to identify the plant parameters in an indirect adaptive control law for a class of nonlinear discrete-time systems with structured uncertainty.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"113-125"},"PeriodicalIF":0.0,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09802697.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50349023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Lipschitz Feedback Policies From Expert Demonstrations: Closed-Loop Guarantees, Robustness and Generalization","authors":"Abed AlRahman Al Makdah;Vishaal Krishnan;Fabio Pasqualetti","doi":"10.1109/OJCSYS.2022.3181584","DOIUrl":"https://doi.org/10.1109/OJCSYS.2022.3181584","url":null,"abstract":"In this work, we propose a framework in which we use a Lipschitz-constrained loss minimization scheme to learn feedback control policies with guarantees on closed-loop stability, adversarial robustness, and generalization. These policies are learned directly from expert demonstrations, contained in a dataset of state-control input pairs, without any prior knowledge of the task and system model. Our analysis exploits the Lipschitz property of the learned policies to obtain closed-loop guarantees on stability, adversarial robustness, and generalization over scenarios unexplored by the expert. In particular, first, we establish robust closed-loop stability under the learned control policy, where we provide guarantees that the closed-loop trajectory under the learned policy stays within a bounded region around the expert trajectory and converges asymptotically to a bounded region around the origin. Second, we derive bounds on the closed-loop regret with respect to the expert policy and on the deterioration of the closed-loop performance under bounded (adversarial) disturbances to the state measurements. These bounds provide certificates for closed-loop performance and adversarial robustness for learned policies. Third, we derive a (probabilistic) bound on generalization error for the learned policies. Numerical results validate our analysis and demonstrate the effectiveness of our robust feedback policy learning framework. Finally, our results support the existence of a potential tradeoff between nominal closed-loop performance and adversarial robustness, and that improvements in nominal closed-loop performance can only be made at the expense of robustness to adversarial perturbations.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"85-99"},"PeriodicalIF":0.0,"publicationDate":"2022-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09798865.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50349021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Generalized Lyapunov Demodulator: High-Bandwidth, Low-Noise Amplitude and Phase Estimation","authors":"Michael R. P. Ragazzon;Saverio Messineo;Jan Tommy Gravdahl;David M. Harcombe;Michael G. Ruppert","doi":"10.1109/OJCSYS.2022.3181111","DOIUrl":"https://doi.org/10.1109/OJCSYS.2022.3181111","url":null,"abstract":"Effective demodulation of amplitude and phase is a requirement in a wide array of applications. Recent efforts have increased the demodulation performance; in particular, the Lyapunov demodulator allows bandwidths up to the carrier frequency of the signal. However, being inherently restricted to first-order filtering of the input signal, it is highly sensitive to frequency components outside its passband region. This makes it unsuitable for certain applications such as multifrequency atomic force microscopy (AFM). In this article, the structure of the Lyapunov demodulator is transformed to an equivalent form and generalized by exploiting the internal model principle. The resulting generalized Lyapunov demodulator structure allows for arbitrary filtering order and is easy to implement, requiring only a bandpass filter, a single integrator, and two nonlinear transformations. The generalized Lyapunov demodulator is implemented experimentally on a field-programmable gate array (FPGA). Then it is used for imaging in an AFM and benchmarked against the standard Lyapunov demodulator and the widely used lock-in amplifier. The lock-in amplifier achieves great noise attenuation capabilities and off-mode rejection at low bandwidths, whereas the standard Lyapunov demodulator is shown to be effective at high bandwidths. We demonstrate that the proposed demodulator combines the best from the two state-of-the-art demodulators, demonstrating high bandwidths, large off-mode rejection, and excellent noise attenuation simultaneously.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"69-84"},"PeriodicalIF":0.0,"publicationDate":"2022-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09790310.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50237536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Noninvasive Breathing Effort Estimation of Mechanically Ventilated Patients Using Sparse Optimization","authors":"Joey Reinders;Bram Hunnekens;Nathan van de Wouw;Tom Oomen","doi":"10.1109/OJCSYS.2022.3180002","DOIUrl":"https://doi.org/10.1109/OJCSYS.2022.3180002","url":null,"abstract":"Mechanical ventilators facilitate breathing for patients who cannot breathe (sufficiently) on their own. The aim of this paper is to estimate relevant lung parameters and the spontaneous breathing effort of a ventilated patient that help keep track of the patient’s clinical condition. A key challenge is that estimation using the available sensors for typical model structures results in a non-identifiable parametrization. A sparse optimization algorithm to estimate the lung parameters and the patient effort, without interfering with the patient’s treatment, using an $\\ell_{1}$-regularization approach is presented. It is confirmed that accurate estimates of the lung parameters and the patient effort can be retrieved through a simulation case study and an experimental case study.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"57-68"},"PeriodicalIF":0.0,"publicationDate":"2022-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09787788.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50255796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimally Biomimetic Passivity-Based Control of a Lower-Limb Exoskeleton Over the Primary Activities of Daily Life","authors":"Jianping Lin;Nikhil V. Divekar;Gray C. Thomas;Robert D. Gregg","doi":"10.1109/OJCSYS.2022.3165733","DOIUrl":"https://doi.org/10.1109/OJCSYS.2022.3165733","url":null,"abstract":"Task-specific, trajectory-based control methods commonly used in exoskeletons may be appropriate for individuals with paraplegia, but they overly constrain the volitional motion of individuals with remnant voluntary ability (representing a far larger population). Human-exoskeleton systems can be represented in the form of the Euler-Lagrange equations or, equivalently, the port-controlled Hamiltonian equations to design control laws that provide task-invariant assistance across a continuum of activities/environments by altering energetic properties of the human body. We previously introduced a port-controlled Hamiltonian framework that parameterizes the control law through basis functions related to gravitational and gyroscopic terms, which are optimized to fit normalized able-bodied joint torques across multiple walking gaits on different ground inclines. However, this approach did not have the flexibility to reproduce joint torques for a broader set of activities, including stair climbing and stand-to-sit, due to strict assumptions related to input-output passivity, which ensures the human remains in control of energy growth in the closed-loop dynamics. To provide biomimetic assistance across all primary activities of daily life, this paper generalizes this energy shaping framework by incorporating vertical ground reaction forces and global planar orientation into the basis set, while preserving passivity between the human joint torques and human joint velocities. We present an experimental implementation on a powered knee-ankle exoskeleton used by three able-bodied human subjects during walking on various inclines, ramp ascent/descent, and stand-to-sit, demonstrating the versatility of this control approach and its effect on muscular effort.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"15-28"},"PeriodicalIF":0.0,"publicationDate":"2022-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09756252.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50237549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-Stationary Representation Learning in Sequential Linear Bandits","authors":"Yuzhen Qin;Tommaso Menara;Samet Oymak;ShiNung Ching;Fabio Pasqualetti","doi":"10.1109/OJCSYS.2022.3178540","DOIUrl":"https://doi.org/10.1109/OJCSYS.2022.3178540","url":null,"abstract":"In this paper, we study representation learning for multi-task decision-making in non-stationary environments. We consider the framework of sequential linear bandits, where the agent performs a series of tasks drawn from different environments. The embeddings of tasks in each environment share a low-dimensional feature extractor called representation, and representations are different across environments. We propose an online algorithm that facilitates efficient decision-making by learning and transferring non-stationary representations in an adaptive fashion. We prove that our algorithm significantly outperforms the existing ones that treat tasks independently. We also conduct experiments using both synthetic and real data to validate our theoretical insights and demonstrate the efficacy of our algorithm.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"1 ","pages":"41-56"},"PeriodicalIF":0.0,"publicationDate":"2022-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/9552933/9683993/09783063.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50237535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}