An Intelligent System for Detecting Fake Materials on the Internet
Aya S. Noah, Naglaa E. Ghannam, Gaber A. Elsharawy, Abeer S. Desuky
International Journal of Modern Education and Computer Science, published 2023-10-08
DOI: https://doi.org/10.5815/ijmecs.2023.05.04
Abstract
Internet usage has risen significantly in recent years, and with it data theft and the variety of counterfeit material online. This has resulted in the proliferation of cybercrime and the theft of personal data via social media, e-mail, and phishing websites, which imitate commonly used websites in order to capture user details such as credit card numbers or login IDs. Phishing, a prevalent form of cybercrime, endangers online security through the theft of personal information. The emergence of COVID-19 drew people and organizations towards the Internet and forced many people and companies to work remotely, which increased the existing phishing threats. Hackers took advantage of the situation to infiltrate the devices of people and companies in numerous ways, causing huge financial losses and damage to organizations. Based on previous results and research, researchers have selected Machine Learning (ML) as an efficient method for distinguishing malicious web pages from legitimate ones. This paper presents 30 website characteristics, which are analyzed using a correlation matrix to determine the relationships between variables. Feature selection is performed with a wrapper method and Extra Tree Classifiers (ETC) to identify the top-ranked characteristics (features) for website classification. To evaluate web pages, several machine learning techniques are used: Random Forest (RF), Multilayer Perceptron (MLP), Decision Tree (DT), and Support Vector Machine (SVM). The results indicate that MLP, a deep neural network, outperforms the other techniques.
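The pipeline outlined in the abstract (correlation matrix, Extra Trees feature ranking, then a comparison of RF, MLP, DT, and SVM) might be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is synthetic (generated with `make_classification` as a stand-in for a 30-feature phishing dataset), the cutoff `TOP_K`, the train/test split, and all hyperparameters are hypothetical, and the wrapper-method step is omitted because the abstract does not specify it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for a phishing dataset with 30 website
# features and a binary label (phishing vs. legitimate).
X, y = make_classification(
    n_samples=1000, n_features=30, n_informative=10, n_redundant=5,
    random_state=0,
)

# Pairwise Pearson correlations between the 30 features (30 x 30 matrix).
corr = np.corrcoef(X, rowvar=False)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Rank features by Extra Trees importance, fitted on the training split only
# so that the held-out data does not influence feature selection.
ranker = ExtraTreesClassifier(n_estimators=100, random_state=0)
ranker.fit(X_train, y_train)
TOP_K = 15  # ASSUMPTION: the abstract does not say how many features are kept
top_features = np.argsort(ranker.feature_importances_)[::-1][:TOP_K]

# The four classifier families named in the abstract; hyperparameters are
# illustrative. MLP and SVM are scaled first because both are scale-sensitive.
models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "MLP": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
    ),
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}

# Train each model on the selected features and record held-out accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train[:, top_features], y_train)
    scores[name] = model.score(X_test[:, top_features], y_test)

for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

The ranking printed here reflects only the synthetic data and the illustrative settings, so it says nothing about the paper's reported results; on a real dataset, cross-validation and metrics beyond accuracy (precision, recall, F1) would usually be preferable for comparing the models.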