Zhilin Liu, Hengwei Zhang, Pengyu Sun, Yan Mi, Xiaoning Zhang, Jin-dong Wang
Published in: 2021 International Conference on Digital Society and Intelligent Systems (DSInS), 2021-12-03. DOI: https://doi.org/10.1109/dsins54396.2021.9670619
Timing Strategy for Active Detection of APT Attack Based on FlipIt Model and Q-learning Method
At present, research on APT attack detection focuses mainly on the implementation of specific detection techniques, and the timing strategy for active detection of APT attacks has received little attention. This paper addresses the problem of deciding when to perform active APT detection. We combine the FlipIt game model with the Q-learning reinforcement learning algorithm: APT attack events are modeled as an exponentially distributed random process, and a defender who observes the time of the attacker's last move (an LM-defender) learns its optimal timing strategy through Q-learning. The defender's reinforcement learning mechanism learns the timing of the APT attack actions and outputs a timing selection strategy. Simulation experiments show that, compared with the existing greedy algorithm, the timing strategy based on Q-learning requires no prior information about the attacker and still achieves a higher average benefit. This result provides a theoretical reference for designing timing strategies that defend against APT attacks intelligently and efficiently.
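The abstract does not state the paper's state, action, or reward definitions, so the following is only a minimal sketch of how tabular Q-learning could be applied to a FlipIt-style timing problem, not the authors' implementation. It assumes discrete time steps, an attacker whose moves arrive with exponentially distributed intervals (so the per-step attack probability is 1 - exp(-rate)), and a defender whose state is the number of steps since its own last detection, which is a simplification of the LM-defender's observation. The function name `train_defender` and all parameters (`rate`, `move_cost`, `max_wait`, and the learning constants) are hypothetical.

```python
import math
import random


def train_defender(rate=0.3, move_cost=0.5, max_wait=8, steps=50000,
                   alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning for a simplified, discrete-time FlipIt defender.

    Assumed model (not taken from the paper):
      state  = steps since the defender's last detection, capped at max_wait
      action = 0 (wait) or 1 (run a detection and retake the resource)
      reward = +1 for each step the defender owns the resource,
               minus move_cost whenever a detection is run
    """
    rng = random.Random(seed)
    # Exponential inter-arrival times imply this per-step attack probability.
    p_attack = 1.0 - math.exp(-rate)
    q = [[0.0, 0.0] for _ in range(max_wait + 1)]
    state = 0
    defender_owns = True  # hidden from the defender between detections
    total_reward = 0.0

    for _ in range(steps):
        # Epsilon-greedy choice between waiting and detecting.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = 0 if q[state][0] >= q[state][1] else 1

        # The attacker may take over the resource during this step.
        if rng.random() < p_attack:
            defender_owns = False

        reward = 0.0
        if action == 1:
            defender_owns = True
            reward -= move_cost
            next_state = 0
        else:
            next_state = min(state + 1, max_wait)
        if defender_owns:
            reward += 1.0

        # Standard Q-learning update toward the bootstrapped target.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])

        total_reward += reward
        state = next_state

    # Greedy policy: for each state, 0 = wait, 1 = detect.
    policy = [0 if row[0] >= row[1] else 1 for row in q]
    return q, policy, total_reward / steps


if __name__ == "__main__":
    _, policy, avg_benefit = train_defender()
    print("greedy action per state:", policy)
    print("average benefit per step:", round(avg_benefit, 3))
```

In this sketch the learner never uses `rate` directly; it is needed only to simulate the attacker, which mirrors the abstract's point that the learned strategy does not require prior information about the attacker. The reported average benefit includes exploration steps, so a fair comparison with a greedy baseline would evaluate the learned policy separately with `epsilon` set to zero.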