Investigating the effect of feature selection and dimensionality reduction on phishing website classification problem
Authors: Pradeep Singh, Niti Jain, Ambar Maini
Venue: 2015 1st International Conference on Next Generation Computing Technologies (NGCT)
Published: 2015-09-01
DOI: 10.1109/NGCT.2015.7375147
Citations: 11
Abstract
Phishing is the practice of gaining unauthorized access to a person's private information, such as passwords, account details, or credit card details. It is a deception technique that combines social engineering and technology to convince a victim to provide personal information, usually for monetary gain. Phishing attacks have become frequent and carry the risk of identity theft and financial loss. Detecting phishing websites has therefore become very important for online banking and e-commerce users. We propose an effective model based on preprocessing (feature selection and dimensionality reduction) and data mining classification algorithms. These algorithms were used to characterize and identify the factors that distinguish a phishing website. We implemented five classification algorithms and four preprocessing techniques to classify a website as legitimate or phishing. We also compared their respective performance in terms of accuracy and AUC.
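The abstract compares classifiers by accuracy and AUC but does not say how either was computed. As a minimal sketch, assuming binary labels (1 = phishing, 0 = legitimate) and a real-valued score per website, the two metrics can be written with the Python standard library alone. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions, not values or code from the paper.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of labels predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (phishing = 1)
    receives a higher score than a randomly chosen negative (legitimate = 0);
    ties count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and scores (hypothetical, not data from the paper).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
# Assumed decision threshold of 0.5 to turn scores into hard labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred))
print(auc(y_true, scores))
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report both when comparing classifiers.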