Pelican: Continual Adaptation for Phishing Detection
Wernsen Wong, G. Dobbie
2020 International Conference on Data Mining Workshops (ICDMW), published 2020-11-01
DOI: 10.1109/ICDMW51313.2020.00067 (https://doi.org/10.1109/ICDMW51313.2020.00067)
Citation count: 0
Abstract
As more people use social media services, these services become a more attractive outlet for phishing attacks. Our initial focus is Twitter, one of the most popular social media services. Phishers on Twitter craft tweets that lead users to websites that download malware. This is a major issue because phishers can then gain access to the user's digital identity and perform malicious acts. Phishing attacks can be similar across different regions, possibly at different times. We have developed a novel semi-supervised machine learning algorithm, called Pelican, that detects potential phishing attacks on Twitter in real time. Pelican can be used for early detection and can detect potential new attacks without prior assumptions about the type of data or knowledge of the attacks' characteristics. It uses ensembles and sampling methods to handle the class imbalance found in real-world applications. It continuously monitors Twitter for unusual behaviour or changes, and we have investigated changes in trends across Twitter to detect changes in the online behaviour of potential phishing links. A change detector triggers automatic retraining when unusual behaviour is detected, so Pelican adapts to changes within phishing attacks in real time. On real-world data that we collected over a nine-month period, the technique detects 93.94% of the phishing tweets, which is higher than the benchmark algorithms.
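The abstract does not say which change detector Pelican uses, so the following is only an illustrative sketch of the general idea of a detector that triggers retraining. It swaps in the standard Page-Hinkley test, applied to a stream of 0/1 signals (for example, prediction errors on tweets whose labels are known). The names `PageHinkley`, `monitor`, and `retrain`, and the values of `delta` and `threshold`, are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream.

    Stand-in detector for illustration; not the detector from the paper.
    """
    delta: float = 0.005     # tolerated drift magnitude (assumed value)
    threshold: float = 5.0   # alarm threshold (assumed value)
    n: int = 0               # number of observations seen
    mean: float = 0.0        # running mean of the observations
    cum: float = 0.0         # cumulative deviation from the running mean
    cum_min: float = 0.0     # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Consume one observation; return True if a change is flagged."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold

    def reset(self) -> None:
        """Forget all statistics, e.g. after the model was retrained."""
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.cum_min = 0.0


def monitor(signals, detector, retrain):
    """Feed a stream to the detector and call `retrain` on each alarm.

    `retrain` is a hypothetical callback that receives the stream index
    at which the change was flagged. Returns the list of alarm indices.
    """
    alarms = []
    for i, value in enumerate(signals):
        if detector.update(float(value)):
            alarms.append(i)
            retrain(i)
            detector.reset()  # start fresh statistics for the new regime
    return alarms


if __name__ == "__main__":
    # Synthetic stream: no errors for 200 steps, then constant errors.
    stream = [0] * 200 + [1] * 200
    flagged = monitor(stream, PageHinkley(), lambda i: print("retrain at", i))
    print("alarms:", flagged)
```

Resetting the detector after each alarm is one possible design choice: it keeps statistics from the old regime from influencing detection once the model has been retrained.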