{"title":"A survey in semantic web technologies-inspired focused crawlers","authors":"Hai Dong, F. Hussain, E. Chang","doi":"10.1109/ICDIM.2008.4746736","journal":{"name":"2008 Third International Conference on Digital Information Management","volume":"11 1","pages":"0"},"publicationDate":"2008-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"26","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ICDIM.2008.4746736"}
A survey in semantic web technologies-inspired focused crawlers
Crawlers are software programs that traverse the Internet and retrieve Web pages by following hyperlinks. Faced with an abundance of spam Websites, traditional Web crawlers do not cope well. Semantic focused crawlers use semantic web technologies to analyze the semantics of hyperlinks and Web documents. This paper briefly reviews recent studies on one category of semantic focused crawlers, ontology-based focused crawlers, which use ontologies to link fetched Web documents with ontological concepts (topics). The purpose of this linking is to organize and categorize Web documents, or to filter out Web pages that are irrelevant to the topics. A brief comparison is made among these crawlers from six perspectives: domain, working environment, special functions, technologies utilized, evaluation metrics and evaluation results. The conclusion drawn from this comparison is presented in the final section.
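To make the idea of linking fetched documents to ontological concepts more concrete, the following is a minimal sketch and not the method of any crawler covered by the survey. It assumes a hypothetical toy ontology (`ONTOLOGY`) in which each concept is just a set of terms, and it substitutes plain term-frequency matching for real ontology reasoning. The names `tokenize`, `concept_scores`, `classify` and the `threshold` value are all illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical toy ontology: each concept (topic) maps to a set of terms.
# A real ontology-based crawler would use a richer model (e.g. OWL/RDF).
ONTOLOGY = {
    "crawler": {"crawler", "crawling", "spider", "hyperlink"},
    "ontology": {"ontology", "concept", "semantic", "rdf"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def concept_scores(text, ontology=ONTOLOGY):
    """Return, per concept, the fraction of tokens that match its terms."""
    tokens = tokenize(text)
    if not tokens:
        return {concept: 0.0 for concept in ontology}
    counts = Counter(tokens)
    return {
        concept: sum(counts[term] for term in terms) / len(tokens)
        for concept, terms in ontology.items()
    }


def classify(text, threshold=0.1, ontology=ONTOLOGY):
    """Return the concepts whose score reaches the threshold.

    An empty list means the page matched no topic and would be
    filtered out as irrelevant in this simplified model.
    """
    scores = concept_scores(text, ontology)
    return sorted(c for c, score in scores.items() if score >= threshold)
```

In this sketch, a page whose `classify` result is empty would be discarded, while a non-empty result gives the topics under which the page would be categorized. The threshold is an arbitrary choice here and would need tuning against whatever evaluation metric a given crawler uses.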