Reinforcement Learning Based Accurate Detection of Malicious URLs with Multi-Feature Analysis

2021 IEEE/CIC International Conference on Communications in China (ICCC). Pub Date: 2021-07-28. DOI: 10.1109/iccc52777.2021.9580433

Xiaoyue Wan, Pengmin Li, Yuhuan Wang, Wei Wei, Liang Xiao



Abstract

Malicious URLs result in malware installation, privacy leakage, and illegal funding on mobile devices and computers. However, attackers frequently change the domain names of URLs to evade static detection, so malicious URL detection has to cope with variation in the structure of domain names. This variation seriously degrades the detection accuracy of a fixed detection policy and impedes the selection of an optimal policy through theoretical analysis. In this paper, we propose an accurate malicious URL detection scheme that protects Internet users from accessing malicious URLs. The scheme uses a multi-feature analysis architecture to exploit lexical and content-based features and applies reinforcement learning (RL) to choose the detection mode and parameter. We provide a lightweight RL-based detection scheme with transfer learning, and a deep RL-based detection scheme that further improves the detection accuracy for servers with sufficient computation resources. We consider malicious URLs with specific domain name features, including a long numeric string, a high percentage of numeric characters, or an alphabetic string without syllables. Simulation results show that this scheme improves the detection accuracy and increases the utility compared with the benchmark scheme.
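The three domain name features named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact feature definitions, so everything here is an assumption made for illustration: the function name `domain_features`, the dictionary keys, and in particular the use of the longest run of consonants as a rough stand-in for "alphabetic string without syllables".

```python
import re
from urllib.parse import urlsplit


def domain_features(url: str) -> dict:
    """Compute three illustrative lexical features of a URL's host name.

    Hypothetical helper: the feature definitions are assumptions,
    not the definitions used in the paper.
    """
    # urlsplit only recognises the host when a scheme or "//" is present.
    parts = urlsplit(url if "://" in url else "//" + url)
    host = parts.hostname or ""  # hostname is returned in lower case

    # Feature 1: length of the longest run of consecutive digits.
    longest_digit_run = max((len(r) for r in re.findall(r"\d+", host)), default=0)

    # Feature 2: share of digits among the letters and digits of the host.
    alnum = [c for c in host if c.isalnum()]
    digit_ratio = sum(c.isdigit() for c in alnum) / len(alnum) if alnum else 0.0

    # Feature 3 (assumed proxy): longest run of ASCII consonants, since a
    # pronounceable syllable normally needs a vowel.
    consonant_runs = re.findall(r"[bcdfghjklmnpqrstvwxyz]+", host)
    longest_consonant_run = max((len(r) for r in consonant_runs), default=0)

    return {
        "longest_digit_run": longest_digit_run,
        "digit_ratio": digit_ratio,
        "longest_consonant_run": longest_consonant_run,
    }


if __name__ == "__main__":
    # Made-up host names used only to exercise the function.
    print(domain_features("http://12345678.xkcdqzrt.com/login"))
    print(domain_features("example.com/path"))
```

A detector along the lines described above would compare such values against thresholds. In the proposed scheme the detection mode and parameter are chosen by RL rather than fixed in advance, which this sketch does not attempt to reproduce.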


