Enhancing encrypted traffic analysis via source APIs: A robust approach for malicious traffic detection

IF 4.8 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Computers & Security Pub Date : 2025-05-23 DOI:10.1016/j.cose.2025.104529

Wanshuang Lin, Chunhe Xia, Tianbo Wang, Mengyao Liu, Yang Li

Computers & Security, Volume 156, Article 104529. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0167404825002184

Citations: 0

Abstract


Enhancing encrypted traffic analysis via source APIs: A robust approach for malicious traffic detection

The widespread adoption of encryption protocols has made malicious Android traffic harder to detect. By randomizing payload content, encryption obscures the semantically explicit features of network traffic and thereby conceals its behavioral intent. Existing methods mitigate this issue by expanding feature sets or extracting spatiotemporal patterns, but they do not fundamentally reconstruct the original payload semantics. In this paper, we propose RATD, a detection model that enhances the representation of encrypted traffic by introducing the semantics of source APIs. The approach leverages the correlation between the system API calls made prior to traffic transmission (referred to as source APIs) and the behavioral intent within encrypted traffic, thereby compensating for the semantic loss. First, we construct API-traffic association samples by monitoring network connection APIs. Then, we transform the API sequences into graphs and apply a Graph Convolutional Network (GCN) to learn their structural and semantic representations. These features are fused with the corresponding traffic features through a multi-source encoder module. Finally, to address the challenge of limited data availability in real-world deployment, we introduce a representation enhancement module that improves the model's robustness in scenarios with missing data. Experimental results show that RATD significantly outperforms state-of-the-art models across multiple datasets. In particular, in scenarios with missing API data, the accuracy of our model decreases by at most 2.9%, showing stronger environmental adaptability.
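The graph-based steps in the abstract can be illustrated with a minimal sketch. It is not the authors' RATD implementation: the graph construction rule (undirected edges between consecutive calls, plus self-loops), the one-hot node features, the mean pooling, and the concatenation used for fusion are all assumptions chosen for simplicity, and the API names and numeric values are illustrative placeholders. The propagation step is the standard symmetric-normalized GCN layer, ReLU(D^-1/2 A D^-1/2 X W), written in pure Python.

```python
import math

def sequence_to_graph(api_calls):
    """Map an ordered API call sequence to (nodes, adjacency matrix).

    Assumption: nodes are distinct API names in first-seen order, and an
    undirected edge links each pair of consecutive calls. Self-loops are
    added so every node keeps its own features during propagation.
    """
    nodes = list(dict.fromkeys(api_calls))
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0
    for a, b in zip(api_calls, api_calls[1:]):
        i, j = index[a], index[b]
        adj[i][j] = adj[j][i] = 1.0
    return nodes, adj

def normalize(adj):
    """Symmetric normalization: D^-1/2 A D^-1/2."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj_norm, features, weights):
    """One GCN propagation step: ReLU(A_norm @ X @ W)."""
    h = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in h]

def mean_pool(node_embeddings):
    """Average node embeddings into one graph-level vector."""
    n = len(node_embeddings)
    return [sum(col) / n for col in zip(*node_embeddings)]

def fuse(api_embedding, traffic_features):
    """Assumed fusion: concatenate API and traffic feature vectors."""
    return list(api_embedding) + list(traffic_features)

# Hypothetical source-API sequence observed before a connection is opened.
calls = ["getDeviceId", "openConnection", "getOutputStream", "openConnection"]
nodes, adj = sequence_to_graph(calls)
a_norm = normalize(adj)

# One-hot node features and a fixed 3x2 weight matrix (placeholder values;
# a trained model would learn the weights).
features = [[1.0 if i == j else 0.0 for j in range(len(nodes))]
            for i in range(len(nodes))]
weights = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.6]]

api_embedding = mean_pool(gcn_layer(a_norm, features, weights))
sample = fuse(api_embedding, [0.2, 0.7, 0.1])  # placeholder flow features
print(nodes)
print(sample)
```

In practice a library such as PyTorch Geometric would replace the hand-written matrix code, and the fusion and missing-data handling described in the abstract would be learned modules rather than a plain concatenation.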


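Source journal

Computers & Security (Engineering & Technology - Computer Science: Information Systems)

CiteScore: 12.40

Self-citation rate: 7.10%

Articles published: 365

Review time: 10.7 months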

About the journal: Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world. Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.