Author: Yunus Korkmaz
Journal: Information Processing & Management, 63(2), Article 104416
Published: 2025-09-27
DOI: 10.1016/j.ipm.2025.104416
URL: https://www.sciencedirect.com/science/article/pii/S0306457325003577
Publication type: Journal Article
Impact factor: 6.9 (JCR Q1, Computer Science, Information Systems)
Fraud E-mail detection using intelligent algorithms: Comparison of traditional approaches with deep learning techniques
Fraud emails pose a persistent cybersecurity threat by tricking recipients into disclosing sensitive information. This study evaluates and compares the performance of traditional machine learning and deep learning techniques for fraud email detection using a publicly available dataset containing 17,538 emails. Features were extracted using Term Frequency-Inverse Document Frequency (TF-IDF). Traditional models including Naive Bayes, Logistic Regression, XGBoost, and Random Forest achieved up to 98.52 % accuracy, while deep learning models like Bi-LSTM and GRU reached a maximum accuracy of 97.71 %. Evaluation metrics such as confusion matrices, ROC curves, and AUC scores were used for comprehensive performance comparison. Results demonstrate that traditional models can outperform deep learning models on text-based email data with proper feature engineering, offering efficient and scalable solutions for fraud detection systems.
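To make the TF-IDF plus traditional-classifier idea concrete, here is a minimal sketch in plain Python. It is not the study's pipeline: the four training emails and two test emails are invented toy data, the tokenizer is a bare whitespace split, the smoothed IDF formula is one common variant chosen for illustration, and every function name is hypothetical. It pairs TF-IDF weights with a multinomial Naive Bayes classifier and then tallies confusion-matrix counts, one of the evaluation tools the abstract mentions.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed IDF per term: ln((1 + N) / (1 + df)) + 1 (assumed variant)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf(doc, idf):
    """Raw term count times IDF; terms unseen during fitting are dropped."""
    counts = Counter(tokenize(doc))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with additive smoothing."""
    class_counts = Counter(labels)
    weight_sums = {c: defaultdict(float) for c in class_counts}
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            weight_sums[label][term] += weight
    model = {}
    for c, n_c in class_counts.items():
        total = sum(weight_sums[c].values()) + alpha * len(vocab)
        log_prior = math.log(n_c / len(labels))
        log_lik = {t: math.log((weight_sums[c].get(t, 0.0) + alpha) / total) for t in vocab}
        model[c] = (log_prior, log_lik)
    return model


def predict(model, vec):
    """Return the class with the highest log-posterior score."""
    scores = {
        c: log_prior + sum(w * log_lik[t] for t, w in vec.items())
        for c, (log_prior, log_lik) in model.items()
    }
    return max(scores, key=scores.get)


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


# Invented toy emails: label 1 = fraud, label 0 = legitimate.
train = [
    ("verify your account password now", 1),
    ("urgent transfer money to claim prize", 1),
    ("meeting agenda for monday attached", 0),
    ("lunch tomorrow with the project team", 0),
]
test = [("claim your prize money now", 1), ("project meeting agenda", 0)]

train_texts = [text for text, _ in train]
train_labels = [label for _, label in train]
idf = fit_idf(train_texts)
model = train_nb([tfidf(t, idf) for t in train_texts], train_labels, vocab=set(idf))

test_labels = [label for _, label in test]
predictions = [predict(model, tfidf(text, idf)) for text, _ in test]
counts = confusion_counts(test_labels, predictions)
accuracy = (counts["tp"] + counts["tn"]) / len(test_labels)
print(predictions, counts, accuracy)
```

On a real corpus one would more likely reach for a library vectorizer and classifier, hold out a proper test split, and report ROC curves and AUC alongside accuracy; the toy data here only shows how the pieces fit together.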
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.