Classification of malicious and benign websites by network features using supervised machine learning algorithms

S. Kaddoura
Published in: 2021 5th Cyber Security in Networking Conference (CSNet)
Publication date: 2021-10-12
DOI: 10.1109/CSNet52717.2021.9614273 (https://doi.org/10.1109/CSNet52717.2021.9614273)
Citations: 4

Abstract

Internet usage has grown steadily in recent years, and cyberattacks have grown with it, causing substantial loss of personal information and money. Such attacks include phishing, spam, and malware. Because websites are the most common element of the Internet and are widely used, attackers frequently target them. Detecting malicious websites is therefore critical for organizations and individuals who want to improve their security: the earlier a malicious website is detected, the sooner it can be defended against. In this paper, a dataset is analyzed and applied to multiple supervised machine learning models such as Random Forest, Artificial Neural Network, K-Nearest Neighbors, and Support Vector Machine. The dataset attributes are extracted from the application layer and from various network characteristics. Experiments on many benign and malicious websites collected from real-life Internet resources show high prediction performance. Because the dataset is imbalanced, the F1-score was measured instead of accuracy. The Support Vector Machine achieved the highest performance of all the algorithms studied, with a score of 92%.
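The abstract's choice of F1-score over accuracy for an imbalanced dataset can be illustrated with a minimal sketch. The labels below are hypothetical (90 benign sites, 10 malicious) and are not taken from the paper's dataset; the helper names are likewise illustrative. The sketch compares a classifier that always predicts "benign" with a detector that catches most malicious sites.

```python
# Hypothetical illustration: accuracy vs. F1 on an imbalanced label set.
# 0 = benign, 1 = malicious. None of these numbers come from the paper.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Assumed class balance: 90 benign sites followed by 10 malicious sites.
y_true = [0] * 90 + [1] * 10

# Baseline that labels every site as benign.
always_benign = [0] * 100

# Assumed detector: 5 false alarms on benign sites, 8 of 10 malicious caught.
detector = [0] * 85 + [1] * 5 + [1] * 8 + [0] * 2

for name, preds in [("always_benign", always_benign), ("detector", detector)]:
    print(f"{name}: accuracy={accuracy(y_true, preds):.2f}, "
          f"F1={f1_score(y_true, preds):.2f}")
```

Under these assumed labels, the always-benign baseline reaches 90% accuracy while detecting no malicious site at all, so its F1-score for the malicious class is zero. The detector's accuracy is only slightly higher, but its F1-score separates it clearly from the baseline, which is the usual motivation for reporting F1 on imbalanced data.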