Timing Strategy for Active Detection of APT Attack Based on FlipIt Model and Q-learning Method

2021 International Conference on Digital Society and Intelligent Systems (DSInS) Pub Date : 2021-12-03 DOI:10.1109/dsins54396.2021.9670619

Zhilin Liu, Hengwei Zhang, Pengyu Sun, Yan Mi, Xiaoning Zhang, Jin-dong Wang



Abstract

At present, research on APT attack detection focuses mainly on the implementation of specific detection techniques, and there is little work on timing strategies for active detection of APT attacks. This paper addresses the problem of deciding when to perform active APT detection. We combine the FlipIt game model with the Q-learning reinforcement learning algorithm: APT attack events are modeled as an exponentially distributed random process, and a defender who observes the attacker's last move time (an LM-defender) learns its optimal timing strategy with Q-learning. The defender's reinforcement learning mechanism learns the timing of the APT attack actions and outputs a timing selection strategy. Simulation experiments show that, without requiring prior information about the attacker, the Q-learning-based timing strategy achieves a higher average benefit than the existing greedy strategy. This result provides a theoretical reference for designing timing strategies that defend against APT attacks intelligently and efficiently.
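The setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The state buckets, the candidate waiting times in `ACTIONS`, the move cost, the reward definition (net benefit rate over each interval), and all hyperparameters are assumptions chosen only for illustration. The attacker moves at exponentially distributed intervals, and at each of its own moves the defender observes the time since the attacker's last move and picks how long to wait before moving again.

```python
import random

# All constants below are hypothetical and chosen only for illustration.
ACTIONS = [0.5, 1.0, 2.0, 4.0]  # candidate waiting times until the next defender move
N_STATES = 5                    # buckets for "time since the attacker's last move"
BIN_WIDTH = 1.0                 # width of each bucket


def bucket(tau):
    """Map the time since the attacker's last move to a discrete state."""
    return min(int(tau / BIN_WIDTH), N_STATES - 1)


def train(steps=20000, lam=0.5, move_cost=0.2, alpha=0.1, gamma=0.9,
          epsilon=0.1, seed=0):
    """Tabular Q-learning for an LM-defender against an exponential attacker."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    t = 0.0                             # time of the current defender move
    last_attack = 0.0                   # attacker's last move time (LM feedback)
    next_attack = rng.expovariate(lam)  # attacker's next move time
    state = bucket(t - last_attack)
    for _ in range(steps):
        # Epsilon-greedy choice of the waiting time.
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        wait = ACTIONS[a]
        t_next = t + wait
        # The defender holds the resource from t until the attacker's first
        # move in the interval, or until t_next if the attacker does not move.
        control = min(next_attack, t_next) - t
        # Advance the attacker's exponential process past t_next.
        while next_attack < t_next:
            last_attack = next_attack
            next_attack += rng.expovariate(lam)
        # Assumed reward: net benefit per unit time over this interval.
        reward = (control - move_cost) / wait
        next_state = bucket(t_next - last_attack)
        # Standard Q-learning update.
        target = reward + gamma * max(q[next_state])
        q[state][a] += alpha * (target - q[state][a])
        t, state = t_next, next_state
    return q


def greedy_policy(q):
    """Return the learned waiting time for each state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The learner never reads the attack rate `lam`; it only uses observed rewards and last-move feedback, which mirrors the abstract's claim that no prior information about the attacker is needed. Because an exponential process is memoryless, the state carries little information in this toy version; it is kept to show where the LM feedback would enter.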



