The Bandit’s States: Modeling State Selection for Stateful Network Fuzzing as Multi-armed Bandit Problem

2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). Pub Date: 2023-07-01. DOI: 10.1109/EuroSPW59978.2023.00043

Anne Borcherding, Marc-Henri Giraud, Ian Fitzgerald, J. Beyerer


Citations: 0

Abstract


Network interfaces of Industrial Control Systems are a common entry point for attackers, and thus need to be thoroughly tested for vulnerabilities. One way to perform such tests is with network fuzzers, which randomly mutate network packets to induce unexpected behavior and vulnerabilities. Highly stateful network protocols pose a particular challenge to fuzzers, since a fuzzer needs to be aware of the states in order to find deep vulnerabilities. Even if a fuzzer is aware of the states of a stateful network protocol, there are still several challenges to overcome. The challenge we focus on is deciding which state to test next. To make this decision, the fuzzer needs to strike a balance between exploiting known states and exploring states not yet tested. We propose to model this exploration versus exploitation dilemma using a Multi-armed Bandit. In this work, we present two modeling approaches and preliminary experiments. We choose to model the state selection problem with (I) a stochastic Multi-armed Bandit, and (II) an adversarial Multi-armed Bandit. The latter takes into account that coverage can only be discovered once, and that the underlying reward probability therefore decreases over time. Although the adversarial Multi-armed Bandit models the state selection problem more accurately, our experiments show that both approaches lead to statistically indistinguishable fuzzer performance. Furthermore, we show that the baseline fuzzer AFLNet leads to significantly better results in terms of coverage. Building on these unintuitive preliminary results, we aim to investigate the behavior of the agents in more detail, to include additional modeling approaches, and to use additional Systems under Test for the evaluation.
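The two modeling approaches can be illustrated with a minimal sketch. The abstract does not name the concrete bandit algorithms, so UCB1 (stochastic) and Exp3 (adversarial) are used here purely as standard representatives, not as the authors' method. The reward definition (1.0 when a round finds new coverage), the toy environment `fuzz_state`, and all parameter values are assumptions made for illustration.

```python
import math
import random


class UCB1:
    """Stochastic bandit: assumes each state's reward distribution is stationary."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.sums = [0.0] * n_arms

    def select(self, rng):
        # rng is unused; it is accepted so both agents share one interface.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm  # play every arm once before using the bounds
        total = sum(self.counts)
        scores = [s / c + math.sqrt(2 * math.log(total) / c)
                  for s, c in zip(self.sums, self.counts)]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward


class Exp3:
    """Adversarial bandit: makes no stationarity assumption about rewards."""

    def __init__(self, n_arms, gamma=0.1):
        self.gamma = gamma
        self.weights = [1.0] * n_arms
        self._probs = None

    def probabilities(self):
        total = sum(self.weights)
        k = len(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / k
                for w in self.weights]

    def select(self, rng):
        self._probs = self.probabilities()
        return rng.choices(range(len(self._probs)), weights=self._probs)[0]

    def update(self, arm, reward):
        # Importance-weighted reward estimate; reward must lie in [0, 1].
        estimate = reward / self._probs[arm]
        self.weights[arm] *= math.exp(self.gamma * estimate / len(self.weights))
        # Rescale to avoid overflow; the probabilities are unchanged by this.
        largest = max(self.weights)
        self.weights = [w / largest for w in self.weights]


def fuzz_state(state, undiscovered, rng):
    """Hypothetical stand-in for one fuzzing round in `state`.

    Returns 1.0 if a new coverage edge is found, else 0.0. Each edge can be
    found only once, so the hit chance shrinks as a state is exhausted.
    """
    remaining = undiscovered[state]
    if remaining > 0 and rng.random() < remaining / (remaining + 20):
        undiscovered[state] -= 1
        return 1.0
    return 0.0


def run(agent, edges_per_state, rounds, seed=0):
    """Let `agent` pick a state each round; return the number of edges found."""
    rng = random.Random(seed)
    undiscovered = list(edges_per_state)
    found = 0
    for _ in range(rounds):
        state = agent.select(rng)
        reward = fuzz_state(state, undiscovered, rng)
        agent.update(state, reward)
        found += int(reward)
    return found


if __name__ == "__main__":
    edges = [5, 30, 10, 0]  # hypothetical undiscovered edges per protocol state
    print("UCB1 edges found:", run(UCB1(len(edges)), edges, rounds=500))
    print("Exp3 edges found:", run(Exp3(len(edges)), edges, rounds=500))
```

In this toy setting the reward probability of a state falls as its coverage is used up, which is the non-stationarity that the adversarial model is meant to capture. The sketch says nothing about real fuzzer performance.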
