URL generative recognition method based on generative countermeasure network

International Conference on Signal Processing and Communication Security. Pub Date: 2022-11-02. DOI: 10.1117/12.2655361

Zong-Rong Li, Denghui Ma, Nanfang Li, Xiang Li, Ning Zhang, H. Cao


Citations: 0

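Abstract

Current URL identification methods require a large number of labels. Malicious URLs change quickly, and it is difficult to collect a sufficiently comprehensive set of labeled URLs, which leads to unstable identification accuracy. After the boundary similarity of the URL string is calculated, a Skip-Gram model is used to embed the URL. The resulting word vectors are used as the generator input of a semi-supervised GAN to obtain the malicious URL identification result. Experimental results show that GAN-based URL recognition achieves an accuracy above 96%, that the F1 score fluctuates little, and that the recognition results are more reliable.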
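To make the embedding step more concrete, the following minimal sketch shows how (center, context) training pairs for a Skip-Gram model could be generated from a URL. The abstract does not describe the boundary-similarity segmentation in detail, so the delimiter-based `tokenize_url` helper, the `skipgram_pairs` function, the window size, and the example URL are all illustrative assumptions rather than the authors' method.

```python
import re
from typing import List, Tuple


def tokenize_url(url: str) -> List[str]:
    """Split a URL into lowercase tokens on common delimiters.

    Assumption: this simple delimiter split stands in for the paper's
    boundary-similarity segmentation, which the abstract does not specify.
    """
    return [t for t in re.split(r"[/:.?=&_\-#%@+]+", url.lower()) if t]


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for every token within `window` positions."""
    pairs: List[Tuple[str, str]] = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a token is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical example URL, chosen only for illustration.
tokens = tokenize_url("http://example.com/login?user=admin")
pairs = skipgram_pairs(tokens, window=1)
```

Pairs like these would be the training input for a Skip-Gram model that learns one vector per token. The semi-supervised GAN stage described in the abstract is not shown here.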


