{"title":"Revisiting the LQR Problem of Singular Systems","authors":"Komeil Nosrati;Juri Belikov;Aleksei Tepljakov;Eduard Petlenkov","doi":"10.1109/JAS.2024.124665","DOIUrl":"https://doi.org/10.1109/JAS.2024.124665","url":null,"abstract":"In the development of linear quadratic regulator (LQR) algorithms, the Riccati equation approach offers two important characteristics: it is recursive and readily meets the existence condition. However, these attributes are applicable only to transformed singular systems, and the efficiency of the regulator may be undermined if constraints are violated in nonsingular versions. To address this gap, we introduce a direct approach to the LQR problem for linear singular systems, avoiding the need for any transformations and eliminating the need for regularity assumptions. To achieve this goal, we begin by formulating a quadratic cost function to derive the LQR algorithm through a penalized and weighted regression framework and then connect it to a constrained minimization problem using Bellman's criterion. Then, we employ a dynamic programming strategy in a backward approach within a finite horizon to develop an LQR algorithm for the original system. 
To accomplish this, we address the stability and convergence analysis under the reachability and observability assumptions of a hypothetical system constructed by the pencil of augmented matrices and connected using the Hamiltonian diagonalization technique.","PeriodicalId":54230,"journal":{"name":"Ieee-Caa Journal of Automatica Sinica","volume":"11 11","pages":"2236-2252"},"PeriodicalIF":15.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probabilistic Automata-Based Method for Enhancing Performance of Deep Reinforcement Learning Systems","authors":"Min Yang;Guanjun Liu;Ziyuan Zhou;Jiacun Wang","doi":"10.1109/JAS.2024.124818","DOIUrl":"https://doi.org/10.1109/JAS.2024.124818","url":null,"abstract":"Deep reinforcement learning (DRL) has demonstrated significant potential in industrial manufacturing domains such as workshop scheduling and energy system management. However, due to the model's inherent uncertainty, rigorous validation is requisite for its application in real-world tasks. Specific tests may reveal inadequacies in the performance of pre-trained DRL models, while the “black-box” nature of DRL poses a challenge for testing model behavior. We propose a novel performance improvement framework based on probabilistic automata, which aims to proactively identify and correct critical vulnerabilities of DRL systems, so that the performance of DRL models in real tasks can be improved with minimal model modifications. First, a probabilistic automaton is constructed from the historical trajectory of the DRL system by abstracting the state to generate probabilistic decision-making units (PDMUs), and a reverse breadth-first search (BFS) method is used to identify the key PDMU-action pairs that have the greatest impact on adverse outcomes. This process relies only on the state-action sequence and final result of each trajectory. Then, under the key PDMU, we search for the new action that has the greatest impact on favorable results. Finally, the key PDMU, undesirable action and new action are encapsulated as monitors to guide the DRL system to obtain more favorable results through real-time monitoring and correction mechanisms. 
Evaluations in two standard reinforcement learning environments and three actual job scheduling scenarios confirmed the effectiveness of the method, providing certain guarantees for the deployment of DRL models in real-world applications.","PeriodicalId":54230,"journal":{"name":"Ieee-Caa Journal of Automatica Sinica","volume":"11 11","pages":"2327-2339"},"PeriodicalIF":15.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Boosting Adaptive Weighted Broad Learning System for Multi-Label Learning","authors":"Yuanxin Lin;Zhiwen Yu;Kaixiang Yang;Ziwei Fan;C. L. Philip Chen","doi":"10.1109/JAS.2024.124557","DOIUrl":"https://doi.org/10.1109/JAS.2024.124557","url":null,"abstract":"Multi-label classification is a challenging problem that has attracted significant attention from researchers, particularly in the domain of image and text attribute annotation. However, multi-label datasets are prone to serious intra-class and inter-class imbalance problems, which can significantly degrade the classification performance. To address the above issues, we propose the multi-label weighted broad learning system (MLW-BLS) from the perspective of label imbalance weighting and label correlation mining. Further, we propose the multi-label adaptive weighted broad learning system (MLAW-BLS) to adaptively adjust the specific weights and values of labels of MLW-BLS and construct an efficient imbalanced classifier set. Extensive experiments are conducted on various datasets to evaluate the effectiveness of the proposed model, and the results demonstrate its superiority over other advanced approaches.","PeriodicalId":54230,"journal":{"name":"Ieee-Caa Journal of Automatica Sinica","volume":"11 11","pages":"2204-2219"},"PeriodicalIF":15.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A State-Migration Particle Swarm Optimizer for Adaptive Latent Factor Analysis of High-Dimensional and Incomplete Data","authors":"Jiufang Chen;Kechen Liu;Xin Luo;Ye Yuan;Khaled Sedraoui;Yusuf Al-Turki;MengChu Zhou","doi":"10.1109/JAS.2024.124575","DOIUrl":"https://doi.org/10.1109/JAS.2024.124575","url":null,"abstract":"High-dimensional and incomplete (HDI) matrices are commonly generated in big-data-related practical applications. A latent factor analysis (LFA) model is capable of conducting efficient representation learning on an HDI matrix, and its hyper-parameter adaptation can be implemented through a particle swarm optimizer (PSO) to meet scalability requirements. However, conventional PSO is limited by premature convergence, which leads to accuracy loss in the resulting LFA model. To address this issue, this study merges the information of each particle's state migration into its evolution process following the principle of a generalized momentum method for improving its search ability, thereby building a state-migration particle swarm optimizer (SPSO), whose theoretical convergence is rigorously proved in this study. It is then incorporated into an LFA model for implementing efficient hyper-parameter adaptation without accuracy loss. Experiments on six HDI matrices indicate that an SPSO-incorporated LFA model outperforms state-of-the-art LFA models in terms of prediction accuracy for missing data of an HDI matrix with competitive computational efficiency. 
Hence, SPSO's use ensures efficient and reliable hyper-parameter adaptation in an LFA model, thus ensuring practicality and accurate representation learning for HDI matrices.","PeriodicalId":54230,"journal":{"name":"Ieee-Caa Journal of Automatica Sinica","volume":"11 11","pages":"2220-2235"},"PeriodicalIF":15.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two-Stage Approach for Targeted Knowledge Transfer in Self-Knowledge Distillation","authors":"Zimo Yin;Jian Pu;Yijie Zhou;Xiangyang Xue","doi":"10.1109/JAS.2024.124629","DOIUrl":"https://doi.org/10.1109/JAS.2024.124629","url":null,"abstract":"Knowledge distillation (KD) enhances student network generalization by transferring dark knowledge from a complex teacher network. To optimize computational expenditure and memory utilization, self-knowledge distillation (SKD) extracts dark knowledge from the model itself rather than an external teacher network. However, previous SKD methods performed distillation indiscriminately on full datasets, overlooking the analysis of representative samples. In this work, we present a novel two-stage approach to providing targeted knowledge on specific samples, named two-stage approach self-knowledge distillation (TOAST). We first soften the hard targets using class medoids generated based on logit vectors per class. Then, we iteratively distill the under-trained data with past predictions of half the batch size. The two-stage knowledge is linearly combined, efficiently enhancing model performance. Extensive experiments conducted on five backbone architectures show our method is model-agnostic and achieves the best generalization performance. Besides, TOAST is strongly compatible with existing augmentation-based regularization methods. 
Our method also obtains a speedup of up to 2.95x compared with a recent state-of-the-art method.","PeriodicalId":54230,"journal":{"name":"Ieee-Caa Journal of Automatica Sinica","volume":"11 11","pages":"2270-2283"},"PeriodicalIF":15.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-USV Formation Collision Avoidance via Deep Reinforcement Learning and COLREGs","authors":"Cheng-Cheng Wang;Yu-Long Wang;Li Jia","doi":"10.1109/JAS.2023.123846","DOIUrl":"https://doi.org/10.1109/JAS.2023.123846","url":null,"abstract":"Dear Editor, This letter focuses on collision avoidance for a multi-unmanned surface vehicle (multi-USV) system. A novel multi-USV collision avoidance (MUCA) algorithm is proposed. Firstly, in order to get a more reasonable collision avoidance policy, reward functions are constructed according to the International Regulations for Preventing Collisions at Sea (COLREGs) and USV dynamics. Secondly, to reduce data noise and the impact of outliers, an improved normalization method is proposed. States and rewards of USVs are normalized to avoid gradient vanishing and exploding. Thirdly, a novel $\\epsilon$-greedy method is proposed to help the optimal policy converge faster. It is easier for USVs to explore the optimal policy in the learning process. Finally, the proposed MUCA algorithm is tested in a multi-encounter situation including head-on, crossing, and overtaking. The experimental results demonstrate that the newly proposed MUCA algorithm can provide a collision-free marching policy for the USVs in formation.","PeriodicalId":54230,"journal":{"name":"Ieee-Caa Journal of Automatica Sinica","volume":"11 11","pages":"2349-2351"},"PeriodicalIF":15.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10707649","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Privacy Preserving Distributed Bandit Residual Feedback Online Optimization Over Time-Varying Unbalanced Graphs","authors":"Zhongyuan Zhao;Zhiqiang Yang;Luyao Jiang;Ju Yang;Quanbo Ge","doi":"10.1109/JAS.2024.124656","DOIUrl":"https://doi.org/10.1109/JAS.2024.124656","url":null,"abstract":"This paper considers the distributed online optimization (DOO) problem over time-varying unbalanced networks, where gradient information is explicitly unknown. To address this issue, a privacy-preserving distributed online one-point residual feedback (OPRF) optimization algorithm is proposed. This algorithm updates decision variables by leveraging one-point residual feedback to estimate the true gradient information. It can achieve the same performance as the two-point feedback scheme while only requiring a single function value query per iteration. Additionally, it effectively eliminates the effect of time-varying unbalanced graphs by dynamically constructing row stochastic matrices. Furthermore, compared to other distributed optimization algorithms that only consider explicitly unknown cost functions, this paper also addresses the issue of privacy information leakage of nodes. Theoretical analysis demonstrates that the method attains sublinear regret while protecting the privacy information of agents. 
Finally, numerical experiments on a distributed collaborative localization problem and federated learning confirm the effectiveness of the algorithm.","PeriodicalId":54230,"journal":{"name":"Ieee-Caa Journal of Automatica Sinica","volume":"11 11","pages":"2284-2297"},"PeriodicalIF":15.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Linear Programming-Based Reinforcement Learning Mechanism for Incomplete-Information Games","authors":"Baosen Yang;Changbing Tang;Yang Liu;Guanghui Wen;Guanrong Chen","doi":"10.1109/JAS.2024.124464","DOIUrl":"https://doi.org/10.1109/JAS.2024.124464","url":null,"abstract":"Dear Editor, Recently, with the development of artificial intelligence, game intelligence decision-making has attracted more and more attention. In particular, incomplete-information games (IIG) have gradually become a new research focus, where players make decisions without sufficient information, such as the opponent's strategies or preferences. In this case, a selfish player can only make reactive decisions based on the changes in environment and state. Thus, blind decisions by players may drift them away from the path of reward maximization, and may even hinder the health of the IIG environment. Therefore, it is necessary to design an effective mechanism to optimize decision-making for IIG players.","PeriodicalId":54230,"journal":{"name":"Ieee-Caa Journal of Automatica Sinica","volume":"11 11","pages":"2340-2342"},"PeriodicalIF":15.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10707689","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Zero Dynamics and Controllable Cyber-Attacks in Cyber-Physical Systems and Dynamic Coding Schemes as Their Countermeasures","authors":"Mahdi Taheri;Khashayar Khorasani;Nader Meskin","doi":"10.1109/JAS.2024.124692","DOIUrl":"https://doi.org/10.1109/JAS.2024.124692","url":null,"abstract":"In this paper, we study stealthy cyber-attacks on actuators of cyber-physical systems (CPS), namely zero dynamics and controllable attacks. In particular, under certain assumptions, we investigate and propose conditions under which one can execute zero dynamics and controllable attacks in the CPS. The above conditions are derived based on the Markov parameters of the CPS and elements of the system observability matrix. Consequently, in addition to outlining the number of required actuators to be attacked, these conditions provide one with the minimum system knowledge needed to perform zero dynamics and controllable cyber-attacks. As a countermeasure against the above stealthy cyber-attacks, we develop a dynamic coding scheme that increases the minimum number of the CPS required actuators to carry out zero dynamics and controllable cyber-attacks to its maximum possible value. It is shown that if at least one secure input channel exists, the proposed dynamic coding scheme can prevent adversaries from executing the zero dynamics and controllable attacks even if they have complete knowledge of the coding system. 
Finally, two illustrative numerical case studies are provided to demonstrate the effectiveness and capabilities of our derived conditions and proposed methodologies.","PeriodicalId":54230,"journal":{"name":"Ieee-Caa Journal of Automatica Sinica","volume":"11 11","pages":"2191-2203"},"PeriodicalIF":15.3,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142397143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pure State Feedback Switching Control Based on the Online Estimated State for Stochastic Open Quantum Systems","authors":"Shuang Cong;Zhixiang Dong","doi":"10.1109/JAS.2023.124071","DOIUrl":"https://doi.org/10.1109/JAS.2023.124071","url":null,"abstract":"For $n$-qubit stochastic open quantum systems, based on the Lyapunov stability theorem and LaSalle's invariant set principle, a pure state switching control based on on-line estimated state feedback (OQST-SFC for short) is proposed to realize the state transition to a target pure state, which may be an eigenstate or a superposition state. The proposed switching control consists of a constant control and a control law designed based on the Lyapunov method, in which the Lyapunov function is the state distance of the system. The constant control is used to drive the system state from an initial state to the convergence domain containing only the target state, and a Lyapunov-based control is used to make the state enter the convergence domain and then continue to converge to the target state. At the same time, continuous weak measurement of the quantum system and the quantum state tomography method based on the on-line alternating direction multiplier (QST-OADM) are used to obtain the system information and estimate the quantum state, which is used as the input of the quantum system controller. Then, the pure state feedback switching control method based on the on-line estimated state feedback is realized in an $n$-qubit stochastic open quantum system. The complete derivation of the $n$-qubit QST-OADM algorithm is given. Through strict theoretical proof and analysis, convergence conditions are given that ensure any initial state of the quantum system converges to the target pure state. The proposed control method is applied to a 2-qubit stochastic open quantum system for numerical simulation experiments. 
Four possible different position cases between the initial estimated state and that of the controlled system are studied and discussed, and the performances of the state transition under the corresponding cases are analyzed.","PeriodicalId":54230,"journal":{"name":"Ieee-Caa Journal of Automatica Sinica","volume":"11 10","pages":"2166-2178"},"PeriodicalIF":15.3,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142137610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}