Exploring the Evolution of Exploit-Sharing Hackers: An Unsupervised Graph Embedding Approach

2021 IEEE International Conference on Intelligence and Security Informatics (ISI). Pub Date: 2021-11-02. DOI: 10.1109/ISI53945.2021.9624846

Kaeli Otto, Benjamin Ampel, S. Samtani, Hongyi Zhu, Hsinchun Chen


Citations: 1

Abstract

Cybercrime was estimated to cost the global economy $945 billion in 2020. Increasingly, law enforcement agencies are using social network analysis (SNA) to identify key hackers in Dark Web hacker forums for targeted investigations. However, past approaches have primarily analyzed key hackers at a single point in time and used only a hacker’s structural features. In this study, we propose a novel Hacker Evolution Identification Framework to identify how hackers evolve within hacker forums. The proposed framework has two novelties in its design. First, the framework captures features such as user statistics, node-level metrics, lexical measures, and post style when representing each hacker with unsupervised graph embedding methods. Second, the framework incorporates mechanisms to align embedding spaces across multiple time-spells of data, which facilitates analysis of how hackers evolve over time. Two experiments were conducted to assess the performance of prevailing graph embedding algorithms and nodal feature variations on the task of graph reconstruction across five time-spells. The results indicate that Text-Associated DeepWalk (TADW) with all of the proposed nodal features outperforms methods without nodal features in terms of Mean Average Precision in each time-spell. We illustrate the potential practical utility of the proposed framework with a case study on an English forum with 51,612 posts. The results produced by the framework in this case study identified key hackers posting piracy assets.
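The abstract evaluates embeddings by graph reconstruction scored with Mean Average Precision (MAP) but does not spell out the protocol. The sketch below shows one common way to compute such a score and should be read as an illustration, not as the authors' implementation. It assumes that candidate neighbors are ranked by dot-product similarity between node embeddings, that the graph is undirected and unweighted, and that nodes without neighbors are skipped. The function names `average_precision` and `reconstruction_map` and the four-node toy graph are hypothetical.

```python
import numpy as np


def average_precision(scores, is_neighbor):
    """Average precision for one node.

    Candidates are ranked by descending score; precision is averaged over
    the rank positions at which a true neighbor appears.
    Returns None when the node has no true neighbors.
    """
    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = is_neighbor[order]
    n_hits = hits.sum()
    if n_hits == 0:
        return None
    ranks = np.arange(1, len(hits) + 1)
    precision_at_k = np.cumsum(hits) / ranks
    return float((precision_at_k * hits).sum() / n_hits)


def reconstruction_map(embeddings, adjacency):
    """Mean of per-node average precision over nodes with at least one neighbor.

    Assumption: similarity is the dot product of embeddings. Other
    similarity functions are equally plausible.
    """
    n = adjacency.shape[0]
    sims = embeddings @ embeddings.T
    aps = []
    for i in range(n):
        others = np.arange(n) != i  # exclude the node itself
        ap = average_precision(sims[i, others], adjacency[i, others].astype(bool))
        if ap is not None:
            aps.append(ap)
    return float(np.mean(aps))


# Hypothetical toy graph: two disconnected edges, 0-1 and 2-3.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])

# Embedding that places linked nodes together.
good = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
# Embedding that places linked nodes apart.
bad = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])

good_map = reconstruction_map(good, adjacency)
bad_map = reconstruction_map(bad, adjacency)
print(f"MAP (good embedding): {good_map:.3f}")
print(f"MAP (bad embedding):  {bad_map:.3f}")
```

Under these assumptions, an embedding that keeps linked forum users close together scores higher than one that separates them. Computing the score separately for each time-spell gives the kind of per-spell comparison between embedding methods that the abstract describes.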

