Title: Exploratory experiments to identify fake websites by using features from the network stack
Authors: J. Koepke, S. Kaza, A. Abbasi
Venue: 2012 IEEE International Conference on Intelligence and Security Informatics
Published: 2012-06-11
DOI: https://doi.org/10.1109/ISI.2012.6284144
Citations: 6
Abstract
Users on the web are unknowingly becoming more susceptible to scams from cyber deviants and malicious websites. There has been much work in the identification of malicious websites using application layer features based on content (HTML, images, links, etc.) and a plethora of classification techniques. However, there has been little work on using features from the other layers in the Open Systems Interconnection (OSI) network stack. Capturing features from the transport and Internet layers of the network stack based on responses to various Hypertext Transfer Protocol (HTTP) requests may allow for increased classification accuracy. In this paper, we use learning techniques (Winnow, Logit Regression, Naïve Bayes, J48, and Bayesian) utilizing these new features to identify fake pharmacy websites. The results show that using transport and Internet layer features yields an accuracy of 80% to 95% for detecting fake websites using standard machine learning algorithms. The results suggest that many organizations may be hosting multiple websites using shared code and hosting services to enable them to produce the maximum number of fraudulent websites.
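As a rough illustration of the approach the abstract describes, the sketch below trains a small Gaussian Naïve Bayes classifier on network-layer features. Naïve Bayes is one of the learners named in the abstract, but the Gaussian variant, the three feature names (IP TTL, TCP window size, TCP maximum segment size), the function names, and every number in the toy data are assumptions made for illustration. The abstract does not list the features or data the authors used, so this is not their method or their results.

```python
import math
from collections import defaultdict

# Hypothetical feature vector per site: (IP TTL, TCP window size in bytes,
# TCP maximum segment size). These features are assumptions for illustration;
# the abstract does not say which features the authors captured.

def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small variance floor avoids division by zero on constant features.
        variances = [
            max(sum((x - m) ** 2 for x in col) / n, 1e-6)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior for one feature row."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            # Log of the Gaussian density for this feature under this class.
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up toy values, NOT data from the paper.
train_rows = [
    (64, 5840, 1460), (64, 5792, 1460), (63, 5840, 1460),
    (128, 65535, 1380), (127, 64240, 1380), (128, 65535, 1400),
]
train_labels = ["fake"] * 3 + ["legitimate"] * 3

model = fit_gaussian_nb(train_rows, train_labels)
print(predict(model, (64, 5800, 1460)))
print(predict(model, (128, 65000, 1380)))
```

In a real experiment the rows would come from packet captures of each site's responses to HTTP requests, and accuracy would be measured on held-out sites rather than on points chosen to sit close to the training data.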