A Model-Free Stealthy Attack for Cyber-Physical Systems Based on Deep Reinforcement Learning

IF 8.6; Zone 1, Computer Science; JCR Q1, AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems, vol. 55, no. 7, pp. 5091-5101. Pub Date: 2025-04-23. DOI: 10.1109/TSMC.2025.3559710. URL: https://ieeexplore.ieee.org/document/10974716/

Qirui Zhang;Siqi Meng;Wei Dai;Zhenxing Xia;Chunyu Yang;Xuesong Wang


Citations: 0

Abstract

Written from the attacker’s standpoint, this article develops a model-free stealthy attack that steers the system state to a predefined target value while evading detection, without prior knowledge of the system dynamics. A constrained Markov decision process (CMDP) is first formulated to characterize the objective of the stealthy attack. Based on this CMDP, an actor-critic reinforcement learning algorithm is proposed to train the attacker’s policy. Furthermore, by introducing into the algorithm a Lyapunov function constructed from the action-value function, convergence of the attacked system’s state to the target is theoretically guaranteed. Unlike existing model-free stealthy attacks, which are suitable only for linear systems, the proposed approach is also applicable to nonlinear systems. A linear numerical example and a nonlinear example of an industrial flotation system are provided to validate the effectiveness of the proposed stealthy attack.
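
For readers unfamiliar with the term, a constrained Markov decision process is usually written in the generic textbook form sketched below. This is not the paper's specific model: the reward $r$, the cost $c$, the budget $d$, the discount factor $\gamma$, and the multiplier $\lambda$ are assumed symbols that the abstract does not define. In a stealthy-attack setting one might take $r$ to reward closeness of the state to the target and $c$ to penalize detector alarms, but that mapping is an assumption as well.

```latex
% Generic CMDP (assumed notation): maximize return subject to a cost budget
\begin{aligned}
\max_{\pi}\quad & J_r(\pi) = \mathbb{E}_{\pi}\Big[\sum_{k=0}^{\infty} \gamma^{k}\, r(s_k, a_k)\Big] \\
\text{s.t.}\quad & J_c(\pi) = \mathbb{E}_{\pi}\Big[\sum_{k=0}^{\infty} \gamma^{k}\, c(s_k, a_k)\Big] \le d
\end{aligned}

% One common way to handle the constraint (assumed here): Lagrangian relaxation
\min_{\lambda \ge 0}\ \max_{\pi}\ \mathcal{L}(\pi, \lambda) = J_r(\pi) - \lambda\,\big(J_c(\pi) - d\big)
```

In actor-critic methods built on this form, the critic typically estimates the action-value functions for $r$ and $c$, and the actor is updated against the relaxed objective. Whether the paper uses this relaxation or another constraint-handling scheme cannot be determined from the abstract.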


Source journal

IEEE Transactions on Systems Man Cybernetics-Systems (AUTOMATION & CONTROL SYSTEMS; COMPUTER SCIENCE, CYBERNETICS)

CiteScore: 18.50

Self-citation rate: 11.50%

Articles published: 812

Review time: 6 months

About the journal: The IEEE Transactions on Systems, Man, and Cybernetics: Systems encompasses the fields of systems engineering, covering issue formulation, analysis, and modeling throughout the systems engineering lifecycle phases. It addresses decision-making, issue interpretation, systems management, processes, and various methods such as optimization, modeling, and simulation in the development and deployment of large systems.