{"title":"基于多视图逻辑回归的可疑URL过滤","authors":"Ke-Wei Su, Kuo-Ping Wu, Hahn-Ming Lee, Te-En Wei","doi":"10.1109/ASIAJCIS.2013.19","DOIUrl":null,"url":null,"abstract":"The current malicious URLs detecting techniques based on whole URL information are hard to detect the obfuscated malicious URLs. The most precise way to identify a malicious URL is verifying the corresponding web page contents. However, it costs very much in time, traffic and computing resource. Therefore, a filtering process that detecting more suspicious URLs which should be further verified is required in practice. In this work, we propose a suspicious URL filtering approach based on multi-view analysis in order to reduce the impact from URL obfuscation techniques. URLs are composed of several portions, each portion has a specific use. The proposed method intends to learn the characteristics from multiple portions (multi-view) of URLs for giving the suspicion level of each portion. Adjusting the suspicion threshold of each portion, the proposed system would select the most suspicious URLs. This work uses the real dataset from T. Co. to evaluate the proposed system. The requests from T. Co. are (1) detection rate should be less than 25%, (2) missing rate should be lower than 25%, and (3) the process with one hour data should be end in an hour. The experiment results show that our approach is effective, is capable to reserve more malicious URLs in the selected suspicious ones and satisfy the requests given by practical environment, such as T. Co. daily works.","PeriodicalId":286298,"journal":{"name":"2013 Eighth Asia Joint Conference on Information Security","volume":"78 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":"{\"title\":\"Suspicious URL Filtering Based on Logistic Regression with Multi-view Analysis\",\"authors\":\"Ke-Wei Su, Kuo-Ping Wu, Hahn-Ming Lee, Te-En Wei\",\"doi\":\"10.1109/ASIAJCIS.2013.19\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The current malicious URLs detecting techniques based on whole URL information are hard to detect the obfuscated malicious URLs. The most precise way to identify a malicious URL is verifying the corresponding web page contents. However, it costs very much in time, traffic and computing resource. Therefore, a filtering process that detecting more suspicious URLs which should be further verified is required in practice. In this work, we propose a suspicious URL filtering approach based on multi-view analysis in order to reduce the impact from URL obfuscation techniques. URLs are composed of several portions, each portion has a specific use. The proposed method intends to learn the characteristics from multiple portions (multi-view) of URLs for giving the suspicion level of each portion. Adjusting the suspicion threshold of each portion, the proposed system would select the most suspicious URLs. This work uses the real dataset from T. Co. to evaluate the proposed system. The requests from T. Co. are (1) detection rate should be less than 25%, (2) missing rate should be lower than 25%, and (3) the process with one hour data should be end in an hour. The experiment results show that our approach is effective, is capable to reserve more malicious URLs in the selected suspicious ones and satisfy the requests given by practical environment, such as T. Co. daily works.\",\"PeriodicalId\":286298,\"journal\":{\"name\":\"2013 Eighth Asia Joint Conference on Information Security\",\"volume\":\"78 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-07-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"19\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 Eighth Asia Joint Conference on Information Security\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASIAJCIS.2013.19\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 Eighth Asia Joint Conference on Information Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASIAJCIS.2013.19","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Suspicious URL Filtering Based on Logistic Regression with Multi-view Analysis
The current malicious URLs detecting techniques based on whole URL information are hard to detect the obfuscated malicious URLs. The most precise way to identify a malicious URL is verifying the corresponding web page contents. However, it costs very much in time, traffic and computing resource. Therefore, a filtering process that detecting more suspicious URLs which should be further verified is required in practice. In this work, we propose a suspicious URL filtering approach based on multi-view analysis in order to reduce the impact from URL obfuscation techniques. URLs are composed of several portions, each portion has a specific use. The proposed method intends to learn the characteristics from multiple portions (multi-view) of URLs for giving the suspicion level of each portion. Adjusting the suspicion threshold of each portion, the proposed system would select the most suspicious URLs. This work uses the real dataset from T. Co. to evaluate the proposed system. The requests from T. Co. are (1) detection rate should be less than 25%, (2) missing rate should be lower than 25%, and (3) the process with one hour data should be end in an hour. The experiment results show that our approach is effective, is capable to reserve more malicious URLs in the selected suspicious ones and satisfy the requests given by practical environment, such as T. Co. daily works.