Qirui Zhang;Siqi Meng;Wei Dai;Zhenxing Xia;Chunyu Yang;Xuesong Wang
Journal: IEEE Transactions on Systems Man Cybernetics-Systems, vol. 55, no. 7, pp. 5091-5101
Publication date: 2025-04-23 (Journal Article; not open access)
DOI: 10.1109/TSMC.2025.3559710
URL: https://ieeexplore.ieee.org/document/10974716/
Impact factor: 8.6; JCR: Q1 (Automation & Control Systems); Region 1 (Computer Science)
A Model-Free Stealthy Attack for Cyber-Physical Systems Based on Deep Reinforcement Learning
This article, written from the attacker’s standpoint, develops a model-free stealthy attack that steers the system state to a predefined target value while evading detection, without prior knowledge of the system dynamics. A constrained Markov decision process (CMDP) is first formulated to characterize the objective of the stealthy attack. Based on this CMDP, an actor-critic reinforcement learning algorithm is proposed to train the attacker’s policy. Furthermore, by introducing into the algorithm a Lyapunov function constructed from the action value function, convergence of the attacked system’s state to the target is theoretically guaranteed. Unlike existing model-free stealthy attacks, which are suitable only for linear systems, the proposed approach is also applicable to nonlinear systems. A linear numerical example and a nonlinear example of an industrial flotation system are provided to validate the effectiveness of the proposed stealthy attack.
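For readers unfamiliar with the term, a CMDP is usually written in the generic textbook form below. This is only a sketch of the standard definition, not the paper's own formulation: the symbols $r$, $c$, $d$, $\gamma$, and $\lambda$ are assumptions introduced here, and the abstract does not specify how the reward, the constraint cost, or the threshold are defined.

```latex
% Generic constrained MDP (textbook form; all symbols are illustrative assumptions).
% r: reward, c: constraint cost, d: cost budget, \gamma \in (0,1): discount factor.
\begin{aligned}
\max_{\pi}\quad & J_r(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right] \\
\text{s.t.}\quad & J_c(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\right] \le d
\end{aligned}

% A common way to handle the constraint is Lagrangian relaxation with multiplier \lambda \ge 0:
\min_{\lambda \ge 0}\; \max_{\pi}\; \mathcal{L}(\pi, \lambda)
  = J_r(\pi) - \lambda \bigl( J_c(\pi) - d \bigr)
```

The Lagrangian form is one widely used way to solve a CMDP with actor-critic methods; whether the article uses it or a different constraint-handling scheme cannot be determined from the abstract.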
About the journal:
The IEEE Transactions on Systems, Man, and Cybernetics: Systems encompasses the fields of systems engineering, covering issue formulation, analysis, and modeling throughout the systems engineering lifecycle phases. It addresses decision-making, issue interpretation, systems management, processes, and various methods such as optimization, modeling, and simulation in the development and deployment of large systems.