An Improved Dandelion Optimizer Algorithm for Spam Detection: Next-Generation Email Filtering System

Impact factor: 2.6 · JCR Q2 (Computer Science, Interdisciplinary Applications)

Computers · Published: 2023-09-28 · DOI: 10.3390/computers12100196

Mohammad Tubishat, Feras Al-Obeidat, Ali Safaa Sadiq, Seyedali Mirjalili


Citations: 0

Abstract

Spam emails have become a pervasive issue in recent years, as internet users receive increasing amounts of unwanted or fake emails. To combat this, automatic spam detection methods have been proposed that classify emails as spam or non-spam, and machine learning techniques have been applied to this task with considerable success. In this paper, we present an approach to spam email detection built on improvements to the Dandelion Optimizer (DO) algorithm. The DO is a relatively new nature-inspired optimization algorithm modeled on the flight of dandelion seeds. While the DO shows promise, it faces challenges, especially in high-dimensional problems such as feature selection for spam detection. Our primary contributions focus on enhancing the DO algorithm. First, we introduce a new local search algorithm based on flipping (LSAF), designed to improve the DO's ability to find the best solutions. Second, we propose a reduction equation that shrinks the population size during algorithm execution, reducing computational complexity. To demonstrate the effectiveness of the modified algorithm, which we refer to as the Improved DO (IDO), we conduct a comprehensive evaluation using the Spambase dataset from the UCI repository. We emphasize, however, that our primary objective is to advance the DO algorithm, with spam email detection serving as a case study application. Comparative analysis against several popular algorithms, including Particle Swarm Optimization (PSO), the Genetic Algorithm (GA), Generalized Normal Distribution Optimization (GNDO), the Chimp Optimization Algorithm (ChOA), the Grasshopper Optimization Algorithm (GOA), the Ant Lion Optimizer (ALO), and the Dragonfly Algorithm (DA), demonstrates the superior performance of the proposed IDO algorithm in accuracy, fitness, and the number of selected features, among other metrics. Our results indicate that the IDO overcomes the local optima problem commonly associated with the standard DO algorithm, owing to the incorporation of LSAF and the reduction equation. In summary, the IDO algorithm represents a promising approach for solving high-dimensional optimization problems, with a focus on practical applications in real-world systems. While we employ spam email detection as a case study, our primary contribution lies in the improved DO algorithm, which is efficient and accurate and outperforms several state-of-the-art algorithms on various metrics. This work opens avenues for enhancing optimization techniques and their applications in machine learning.
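The abstract names two mechanisms, a flipping-based local search and a population reduction equation, but does not give their formulas. The Python sketch below illustrates the general idea of each under stated assumptions; it is not the authors' LSAF or their reduction equation. It assumes a binary feature mask (1 = feature kept), a fitness function where lower is better, a greedy single-bit flip rule, and a linear shrink schedule. The names flip_local_search, reduced_population_size, and toy_fitness are hypothetical.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]  # 1 = feature kept, 0 = feature dropped


def flip_local_search(
    mask: Mask,
    fitness: Callable[[Mask], float],
    max_flips: int,
    rng: random.Random,
) -> Tuple[Mask, float]:
    """Greedy bit-flip search (assumed form, not the paper's LSAF).

    Flips one randomly chosen bit at a time and keeps the flip only
    when the fitness (lower is better) strictly improves.
    """
    best = list(mask)
    best_fit = fitness(best)
    for _ in range(max_flips):
        i = rng.randrange(len(best))
        best[i] ^= 1  # flip the selected feature bit
        candidate_fit = fitness(best)
        if candidate_fit < best_fit:
            best_fit = candidate_fit
        else:
            best[i] ^= 1  # no improvement: undo the flip
    return best, best_fit


def reduced_population_size(
    n_init: int, n_min: int, iteration: int, max_iterations: int
) -> int:
    """Linear shrink from n_init to n_min (assumed schedule, not the paper's equation)."""
    fraction = iteration / max_iterations
    return max(n_min, round(n_init - (n_init - n_min) * fraction))


def toy_fitness(mask: Mask) -> float:
    """Hypothetical fitness: features 0-2 are informative, the rest are noise.

    Each missing informative feature costs 1; each selected feature costs 0.01.
    """
    missed = sum(1 for i in range(3) if mask[i] == 0)
    return missed + 0.01 * sum(mask)


if __name__ == "__main__":
    rng = random.Random(0)
    start = [1] * 10
    best, fit = flip_local_search(start, toy_fitness, max_flips=200, rng=rng)
    print("selected mask:", best, "fitness:", fit)
    for t in (0, 50, 100):
        print("iteration", t, "population", reduced_population_size(30, 10, t, 100))
```

In a real feature-selection setting the fitness would typically combine a classifier's error rate with the fraction of selected features; the toy function here only stands in for that so the sketch runs without a dataset.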


Source journal