Authors: Khalid Mahmood, Hironao Takahashi, Asif Raza, Asma Qaiser, Aadil Farooqui
Venue: 2015 IEEE Twelfth International Symposium on Autonomous Decentralized Systems
Published: 2015-03-25
DOI: 10.1109/ISADS.2015.34 (https://doi.org/10.1109/ISADS.2015.34)
Semantic Based Highly Accurate Autonomous Decentralized URL Classification System for Web Filtering
Cyberspace currently holds about one billion registered websites, so it is imperative to accurately categorize this very large number of websites and URLs for URL filtering and marketing segmentation. This paper presents an autonomous decentralized, semantics-based, large-scale URL/web classification system for web filtering that uses the Yago2s and DS-onto knowledge bases. Because many predefined categories overlap heavily or are semantically similar, the proposed word sense disambiguation algorithm, together with the inference engine design, yields high accuracy when classifying URLs into 120 different categories. Evaluation results show that the system achieves 90-93% accuracy, which is much higher than that of currently used URL classification systems.
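The abstract does not describe the disambiguation algorithm itself. As a rough illustration only of why word sense disambiguation matters when categories overlap, the sketch below scores a page against each category by counting shared words, in the spirit of a simplified Lesk-style overlap. It is not the authors' method: the three categories, their glosses, and the names `CATEGORY_GLOSSES`, `tokenize`, and `classify` are all hypothetical, and no Yago2s or DS-onto data is used.

```python
import re
from collections import Counter

# Hypothetical category glosses for illustration only. The paper's 120
# categories and its Yago2s / DS-onto knowledge bases are not reproduced here.
CATEGORY_GLOSSES = {
    "finance": "bank loan money account interest credit investment",
    "geography": "river bank water shore valley mountain lake",
    "sports": "team match score league player goal tournament",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(page_text, glosses=CATEGORY_GLOSSES):
    """Return (best_category, scores) using a simplified overlap score.

    Each category is scored by how many page tokens also occur in its
    gloss. Ties are resolved in favour of the category listed first.
    """
    token_counts = Counter(tokenize(page_text))
    scores = {}
    for category, gloss in glosses.items():
        gloss_words = set(tokenize(gloss))
        scores[category] = sum(
            count for word, count in token_counts.items() if word in gloss_words
        )
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # "bank" appears in two glosses; the surrounding words decide the sense.
    print(classify("Open a bank account and compare loan interest rates"))
    print(classify("The river bank flooded the valley near the lake"))
```

In both sample sentences the ambiguous word "bank" matches two categories, and the surrounding context words decide which one wins. A production system of the kind the abstract describes would presumably draw senses and category relations from a knowledge base rather than from hand-written glosses.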