Classification of malicious and benign websites by network features using supervised machine learning algorithms

2021 5th Cyber Security in Networking Conference (CSNet) | Pub Date: 2021-10-12 | DOI: 10.1109/CSNet52717.2021.9614273

S. Kaddoura


Citations: 4

Abstract



Internet usage has grown over the past several years, and cyberattacks have increased rapidly with it, causing substantial losses of personal information and money. Cyberattacks include phishing, spamming, and malware. Because websites are the most common element of the Internet and are widely used, attackers readily find targets among them. Detecting malicious websites is therefore critical for organizations and individuals seeking to improve their security: the earlier a malicious website is detected, the sooner it can be defended against. In this paper, a dataset is analyzed and applied to multiple supervised machine learning models such as Random Forest, Artificial Neural Network, K-Nearest Neighbors, and Support Vector Machine. The dataset attributes are extracted from the application layer and from various network characteristics. Experiments on many benign and malicious websites collected from real-life Internet sources show high prediction performance. Because the dataset studied in this paper is imbalanced, the F1-score was measured instead of accuracy. The Support Vector Machine achieved the highest performance of all the algorithms studied, with an F1-score of 92%.
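The abstract's choice of F1-score over accuracy can be illustrated with a small sketch. The labels below are hypothetical toy data (90 benign, 10 malicious), not the paper's dataset, and the helper names `confusion_counts`, `accuracy`, and `f1_score` are introduced here for illustration only. The sketch shows why accuracy can look good on imbalanced data even when no malicious site is detected.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the malicious class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical imbalanced labels: 0 = benign, 1 = malicious.
y_true = [0] * 90 + [1] * 10

# A trivial "classifier" that labels every website as benign.
always_benign = [0] * 100

# A hypothetical detector: 4 false alarms, 8 of 10 malicious sites caught.
detector = [0] * 86 + [1] * 4 + [1] * 8 + [0] * 2

for name, pred in [("always benign", always_benign), ("detector", detector)]:
    print(f"{name}: accuracy={accuracy(y_true, pred):.2f}, "
          f"F1={f1_score(y_true, pred):.2f}")
```

On this toy data the always-benign baseline reaches 90% accuracy while its F1-score is 0, whereas the detector's F1-score reflects both its missed malicious sites and its false alarms.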
