Theory-Guided Feature Selection in Cybercrime Data Science

Shiven Naidoo, Rennie Naidoo
International Conference on Cyber Warfare and Security, published 2024-03-21
DOI: 10.34190/iccws.19.1.2009 (https://doi.org/10.34190/iccws.19.1.2009)

Abstract

Cybercrime data science is significantly hampered by 'noisy' features within vast and complex datasets. Drawing on theoretical insights from the behavioural sciences, we propose a feature selection model that enriches cybercrime intelligence datasets and improves their value and interpretability. We piloted this theory-guided feature selection approach on a subset of intelligence data feeds provided by a global fraud and cybercrime tracking firm. In an experimental setting, the proposed social influence feature selection model significantly improved the interpretability of machine learning-based exploratory analysis and advanced visualization techniques. The model yielded rich insights into cybercriminal psychological tactics from social engineering scam data and has potential applicability in cyberthreat response and cybersecurity awareness training. Our study shows the value of an interdisciplinary, theory-guided approach to cybercrime data analytics that integrates scientific knowledge from the behavioural sciences with data science expertise. The paper concludes by suggesting avenues for future research on theory-guided feature selection that incorporates behavioural science knowledge into cybercrime data science. In future research, we intend to refine, automate, evaluate, and scale our model to assess its effectiveness in producing insights about cybercriminal activities and informing decision-making in naturalistic, real-time settings. In particular, we aim to automate the encoding of features and apply a wider range of machine learning tools and evaluation metrics to extract more meaningful insights into cybercriminal psychological tactics. We also intend to refine our model on larger datasets to enhance its efficiency and responsiveness to real-time cybercrime data. We call on data scientists and cybercrime domain experts to work together to apply theory-guided feature selection to improve knowledge discovery processes that enhance our cybersecurity capabilities.
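To make the idea of theory-guided feature encoding concrete, the following is a minimal sketch and not the model described in the abstract. It assumes that each scam message is plain text and that a behavioural theory supplies a small set of named influence categories, each with a list of cue words. The category names, the cue words, and the helper functions `encode_message` and `tactic_frequencies` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical theory-derived categories and cue words. These are
# illustrative assumptions, not the features used in the paper.
INFLUENCE_CUES = {
    "authority": {"bank", "official", "police", "government"},
    "scarcity": {"urgent", "immediately", "expires", "limited"},
    "reciprocity": {"gift", "free", "reward", "bonus"},
}


def encode_message(text: str) -> Dict[str, int]:
    """Encode one message as binary features, one per influence category."""
    # Lowercase each whitespace-separated token and strip edge punctuation.
    tokens = {tok.strip(".,!?:;'\"()").lower() for tok in text.split()}
    # A feature is 1 if any cue word for that category appears in the message.
    return {name: int(bool(tokens & cues)) for name, cues in INFLUENCE_CUES.items()}


def tactic_frequencies(messages: Iterable[str]) -> Dict[str, int]:
    """Count how many messages show each hypothetical influence category."""
    counts: Counter = Counter()
    for message in messages:
        counts.update(encode_message(message))
    return dict(counts)


if __name__ == "__main__":
    sample = [
        "URGENT: your bank account expires today!",
        "Claim your free gift now",
        "Hello, how are you?",
    ]
    for message in sample:
        print(encode_message(message), message)
    print(tactic_frequencies(sample))
```

In this sketch, theory restricts the feature space to a handful of interpretable columns instead of every raw token. A real pipeline would need validated coding schemes and more robust text processing than keyword matching.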