DaE2: Unmasking malicious URLs by leveraging diverse and efficient ensemble machine learning for online security

IF 4.8 · Zone 2 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Abiodun Esther Omolara, Moatsum Alawida
DOI: 10.1016/j.cose.2024.104170
Journal: Computers & Security
Published: 2024-10-28 (Journal Article)
Source: https://www.sciencedirect.com/science/article/pii/S0167404824004759
Citations: 0

Abstract

Over 5.44 billion people now use the Internet, making it a vital part of daily life that enables communication, e-commerce, education, and more. However, this widespread connectivity also raises concerns about online privacy and security, particularly with the rise of malicious Uniform Resource Locators (URLs). Recently, conventional ensemble models have attracted attention for reducing model variance, improving predictive performance and accuracy, and generalizing well. However, their application to the challenge of malicious URLs is still an open problem. These URLs often hide behind static links in emails or web pages, posing a threat to individuals and organizations. Despite blacklisting services, many harmful sites evade detection because they are inadequately scrutinized or were only recently created. Hence, to improve URL detection, a Diverse and Efficient Ensemble (DaE2) machine learning algorithm was developed that uses four ensemble models (AdaBoost, Bagging, Stacking, and Voting) to classify URLs. After preprocessing, the experimental results showed that all models achieved over 80% accuracy, with AdaBoost reaching 98.5% and Stacking offering the fastest runtime. AdaBoost and Bagging also delivered strong performance, with F1 scores of 0.980 and 0.976, respectively.
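To make the idea of ensemble-based URL classification concrete, the following is a minimal sketch of hard majority voting over a few weak rule-based voters, together with the F1 score used to report results. It is not the authors' DaE2 implementation: the lexical features, the three voter rules, and all thresholds are hypothetical choices made for illustration, and the abstract does not state which features or base learners were used.

```python
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract a few lexical features (illustrative, not the paper's feature set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "length": len(url),
        "digits": sum(ch.isdigit() for ch in url),
        "dots_in_host": host.count("."),
        "has_at": "@" in url,
        "is_https": parsed.scheme == "https",
    }


# Three hypothetical weak voters; each returns 1 (malicious) or 0 (benign).
# The thresholds are arbitrary assumptions, not values from the paper.
def vote_length(f: dict) -> int:
    return 1 if f["length"] > 60 else 0


def vote_host(f: dict) -> int:
    return 1 if f["dots_in_host"] >= 3 or f["has_at"] else 0


def vote_digits(f: dict) -> int:
    return 1 if f["digits"] > 8 and not f["is_https"] else 0


VOTERS = [vote_length, vote_host, vote_digits]


def predict(url: str) -> int:
    """Hard voting: label a URL malicious when a strict majority of voters agree."""
    features = url_features(url)
    votes = sum(voter(features) for voter in VOTERS)
    return 1 if votes * 2 > len(VOTERS) else 0


def f1_score(y_true: list, y_pred: list) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    samples = [
        "https://example.com/index.html",
        "http://login.secure.update.example.com.evil.test/verify?id=1234567890&session=9876543210",
    ]
    for url in samples:
        print(predict(url), url)
```

In practice, trained learners would replace the hand-written rules. scikit-learn, for example, provides `AdaBoostClassifier`, `BaggingClassifier`, `StackingClassifier`, and `VotingClassifier` in `sklearn.ensemble`, which correspond to the four ensemble families named in the abstract; whether the authors used that library is not stated here.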
Source journal
Computers & Security
Category: Engineering & Technology - Computer Science: Information Systems
CiteScore: 12.40
Self-citation rate: 7.10%
Articles published: 365
Review time: 10.7 months
Journal description: Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world. Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as the primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.