Deeply fused flow and topology features for botnet detection based on a pretrained GCN

IF 4.5 · Zone 3, Computer Science · Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

Computer Communications, Volume 233, Article 108084 · Pub Date: 2025-01-27 · DOI: 10.1016/j.comcom.2025.108084

Xiaoyuan Meng, Bo Lang, Yuhao Yan, Yanxi Liu


Citations: 0

Abstract


Deeply fused flow and topology features for botnet detection based on a pretrained GCN

The characteristics of botnets are mainly reflected in their network behaviors and the intercommunication relationships among their bots. The existing botnet detection methods typically use only one kind of feature, i.e., flow features or topological features; each feature type overlooks the other type of features and affects the resulting model performance. In this paper, for the first time, we propose a botnet detection model that uses a graph convolutional network (GCN) to deeply fuse flow features and topological features. We construct communication graphs from network traffic and represent node attributes with flow features. The extreme sample imbalance phenomenon exhibited by the existing public traffic datasets makes training a GCN model impractical. To address this problem, we propose a pretrained GCN framework that utilizes a public balanced artificial communication graph dataset to pretrain the GCN model, and the feature output obtained from the last hidden layer of the GCN model containing the flow and topology information is input into the Extra Tree classification model. Furthermore, our model can effectively detect command-and-control (C2) and peer-to-peer (P2P) botnets by simply adjusting the number of layers in the GCN. The experimental results obtained on public datasets demonstrate that our approach outperforms the current state-of-the-art botnet detection models. In addition, our model also performs well in real-world botnet detection scenarios.
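The pipeline outlined in the abstract (communication graph, flow features as node attributes, GCN hidden-layer output as the fused representation) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the `FLOWS` records, the three per-host features, the helper names, the log scaling, the randomly initialised weights standing in for pretrained ones, and the use of the standard symmetric-normalised GCN propagation rule.

```python
import numpy as np

# Hypothetical flow records (src, dst, bytes, packets); illustrative values only.
FLOWS = [
    ("10.0.0.1", "10.0.0.2", 1200, 10),
    ("10.0.0.1", "10.0.0.3", 800, 6),
    ("10.0.0.2", "10.0.0.3", 400, 4),
    ("10.0.0.4", "10.0.0.1", 300, 2),
]


def build_graph(flows):
    """Build an undirected communication graph with flow-derived node features."""
    hosts = sorted({ip for src, dst, *_ in flows for ip in (src, dst)})
    index = {ip: i for i, ip in enumerate(hosts)}
    n = len(hosts)
    adj = np.zeros((n, n))
    feats = np.zeros((n, 3))  # per host: flow count, total bytes, total packets
    for src, dst, nbytes, npkts in flows:
        i, j = index[src], index[dst]
        adj[i, j] = adj[j, i] = 1.0  # hosts that exchanged traffic share an edge
        for k in (i, j):
            feats[k] += (1.0, nbytes, npkts)
    return hosts, adj, feats


def normalize_adjacency(adj):
    """Symmetric normalisation with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def random_weights(in_dim, hidden_dim, num_layers, seed=0):
    """Stand-in for pretrained GCN weights (assumption: real ones come from pretraining)."""
    rng = np.random.default_rng(seed)
    dims = [in_dim] + [hidden_dim] * num_layers
    return [rng.normal(scale=0.5, size=(a, b)) for a, b in zip(dims[:-1], dims[1:])]


def gcn_embed(adj, feats, weights):
    """Run one GCN layer per weight matrix and return the last hidden-layer output."""
    a_norm = normalize_adjacency(adj)
    h = np.log1p(feats)  # compress heavy-tailed byte and packet counts
    for w in weights:
        h = np.maximum(a_norm @ h @ w, 0.0)  # aggregate neighbours, project, ReLU
    return h


hosts, adj, feats = build_graph(FLOWS)
embeddings = gcn_embed(adj, feats, random_weights(feats.shape[1], 8, num_layers=2))
print(embeddings.shape)
```

Each row of `embeddings` mixes a host's own flow statistics with those of hosts up to `num_layers` hops away, which is why the layer count controls how much topology is captured. In the approach the abstract describes, such embeddings would then be passed to an Extra Trees classifier (for example scikit-learn's `ExtraTreesClassifier`), a step omitted here because this toy graph has no labels.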


Source journal

Computer Communications (Engineering and Technology: Telecommunications)

CiteScore: 14.10
Self-citation rate: 5.00%
Articles published: 397
Review time: 66 days

About the journal: Computer and Communications networks are key infrastructures of the information society with high socio-economic value as they contribute to the correct operations of many critical services (from healthcare to finance and transportation). The Internet is the core of today's computer-communication infrastructures. This has transformed the Internet, from a robust network for data transfer between computers, to a global, content-rich, communication and information system where contents are increasingly generated by the users, and distributed according to human social relations. Next-generation network technologies, architectures and protocols are therefore required to overcome the limitations of the legacy Internet and add new capabilities and services. The future Internet should be ubiquitous, secure, resilient, and closer to human communication paradigms. Computer Communications is a peer-reviewed international journal that publishes high-quality scientific articles (both theory and practice) and survey papers covering all aspects of future computer communication networks (on all layers, except the physical layer), with special attention to the evolution of the Internet architecture, protocols, services, and applications.