Resampling to Classify Rare Attack Tactics in UWF-ZeekData22

Knowledge, Pub Date: 2024-03-14, DOI: 10.3390/knowledge4010006
S. Bagui, D. Mink, S. Bagui, Sakthivel Subramaniam
Citations: 0

Abstract

Resampling to Classify Rare Attack Tactics in UWF-ZeekData22
One of the major problems in classifying network attack tactics is class imbalance. Typical network datasets consist overwhelmingly of normal or benign traffic, and machine learners are skewed toward the classes with more data; hence, attack data are often misclassified. This paper addresses the class imbalance problem using resampling techniques on a newly created dataset, UWF-ZeekData22. This is the first dataset with tactic labels, labeled as per the MITRE ATT&CK framework. The dataset contains about half benign data and half attack tactic data, but specific tactics have very few occurrences within the attack data. Our objective in this paper was to use resampling techniques to classify two rare tactics, privilege escalation and credential access, which had never before been classified. The study also looks at the order of oversampling and undersampling. Varying resampling ratios were used with oversampling techniques such as BSMOTE (Borderline-SMOTE) and SVM-SMOTE, together with random undersampling without replacement. The results show that the order of oversampling and undersampling matters and that, in many cases, an oversampling ratio of as little as 10% of the majority data is enough to obtain the best results.
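To make the point about ordering concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the oversampler is a simplified SMOTE-style interpolation standing in for BSMOTE and SVM-SMOTE, and the ratio definitions are assumptions made for illustration (the oversampling ratio is taken as the minority target size relative to the current majority size, and `keep_fraction` is a hypothetical parameter for the share of majority samples retained). The toy data are synthetic and unrelated to UWF-ZeekData22.

```python
import random


def oversample(minority, target, rng):
    """Grow the minority class to `target` samples.

    Simplified SMOTE-style interpolation between two random minority points.
    This is a stand-in, NOT Borderline-SMOTE or SVM-SMOTE.
    """
    synthetic = []
    while len(minority) + len(synthetic) < target:
        a, b = rng.sample(minority, 2)
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return list(minority) + synthetic


def undersample(majority, target, rng):
    """Random undersampling without replacement."""
    return rng.sample(majority, min(target, len(majority)))


def resample(majority, minority, over_ratio, keep_fraction, oversample_first, seed=0):
    """Apply both steps in the requested order (ratio semantics are assumed)."""
    rng = random.Random(seed)
    if oversample_first:
        # Minority target is computed against the FULL majority class.
        minority = oversample(minority, round(over_ratio * len(majority)), rng)
        majority = undersample(majority, round(keep_fraction * len(majority)), rng)
    else:
        # Minority target is computed against the REDUCED majority class.
        majority = undersample(majority, round(keep_fraction * len(majority)), rng)
        minority = oversample(minority, round(over_ratio * len(majority)), rng)
    return majority, minority


# Toy 2-D data: 1000 "benign" points and 10 "rare tactic" points.
gen = random.Random(42)
benign = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(1000)]
rare = [(gen.gauss(3, 0.5), gen.gauss(3, 0.5)) for _ in range(10)]

maj_a, min_a = resample(benign, rare, 0.10, 0.5, oversample_first=True)
maj_b, min_b = resample(benign, rare, 0.10, 0.5, oversample_first=False)

print(len(maj_a), len(min_a))  # 500 100
print(len(maj_b), len(min_b))  # 500 50
```

Under these assumed definitions, the same 10% oversampling ratio yields a 5:1 class ratio when oversampling comes first and a 10:1 ratio when undersampling comes first, which is one simple way the order of the two steps can change what the classifier sees.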