O. Vikas, Nitin Chiluka, Purushottam K. Ray, Girraj Meena, A. Meshram, Amit Gupta, Abhishek Sisodia
Sixth International Conference on Networking (ICN'07), published 2007-04-22. DOI: 10.1109/ICN.2007.104 (https://doi.org/10.1109/ICN.2007.104)
WebMiner--Anatomy of Super Peer Based Incremental Topic-Specific Web Crawler
This paper introduces "WebMiner", a super-peer based P2P system for building an incremental topic-specific Web crawler. The system builds a topic-based repository of Web pages that will later be used to construct ontologies. Current crawlers suffer from a centralized architecture, which creates a single point of failure and concentrates heavy load. Super-peer systems strike a balance between the inherent efficiency of centralized search and the autonomy, load balancing, and robustness to attacks provided by distributed search, while taking advantage of the heterogeneity of capabilities across peers. In this paper, we discuss the architecture of WebMiner in detail, including the construction of the super-peer overlay network and the working of the system, which includes a feature for crawling the hidden Web.
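The super-peer idea mentioned in the abstract can be illustrated with a minimal sketch. This is not WebMiner's actual design, which the abstract does not specify. It only shows the general pattern of a super-peer that indexes the topics handled by its client peers and consults neighbouring super-peers on a local miss. The class `SuperPeer`, its methods `register` and `route`, and the peer and topic names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SuperPeer:
    """Hypothetical super-peer: indexes the topics its client peers crawl."""

    name: str
    # topic -> ids of client peers that crawl pages on that topic
    topic_index: dict[str, set[str]] = field(default_factory=dict)
    # other super-peers in the overlay (assumed to be one hop away)
    neighbours: list[SuperPeer] = field(default_factory=list)

    def register(self, peer_id: str, topics: list[str]) -> None:
        """Record that an ordinary peer handles the given topics."""
        for topic in topics:
            self.topic_index.setdefault(topic, set()).add(peer_id)

    def route(self, topic: str) -> list[str]:
        """Return peers for `topic`; ask neighbours only on a local miss."""
        local = self.topic_index.get(topic)
        if local:
            return sorted(local)
        found: set[str] = set()
        for sp in self.neighbours:
            found |= sp.topic_index.get(topic, set())
        return sorted(found)


# Two super-peers linked to each other, each serving one client peer.
sp_a, sp_b = SuperPeer("A"), SuperPeer("B")
sp_a.neighbours.append(sp_b)
sp_b.neighbours.append(sp_a)
sp_a.register("peer1", ["physics"])
sp_b.register("peer2", ["biology"])

local_hit = sp_a.route("physics")   # answered from A's own index
remote_hit = sp_a.route("biology")  # answered via neighbour B
```

In this sketch a lookup is served centrally within one super-peer's cluster, while the set of super-peers stays distributed, which is the trade-off the abstract describes.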