Title: Using mobile crawlers to search the Web efficiently
Authors: J. Hammer, Jan Fiedler
Journal: ACIS Int. J. Comput. Inf. Sci., volume 7 1
Publication date: 2000-11-01
Publication type: Journal Article
DOI: https://doi.org/10.5555/543101.543105
Citations: 52
Abstract
Due to the enormous growth of the World Wide Web, search engines have become indispensable tools for Web navigation. To provide powerful search facilities, search engines maintain comprehensive indices of documents and their contents on the Web by continuously downloading Web pages for processing. In this paper, we demonstrate an alternative, more efficient approach to the “download-first process-later” strategy of existing search engines by using mobile crawlers. The major advantage of the mobile approach is that the analysis portion of the crawling process is done locally, where the data resides, rather than remotely inside the Web search engine. This can significantly reduce network load, which, in turn, can improve the performance of the crawling process. We provide a detailed description of our architecture supporting mobile Web crawling and report on its novel features as well as the rationale behind some of the important design decisions that drove our development. To demonstrate the viability of our approach and to validate our mobile crawling architecture, we have implemented a prototype that uses the UF intranet as its testbed. Based on this experimental prototype, we conducted a detailed evaluation of the benefits of mobile Web crawling.
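The difference in network load between the two strategies can be illustrated with a toy comparison. The sketch below is not the prototype described in the paper: the function names, the in-memory page store standing in for a remote host, and the relevance predicate are all assumptions made for illustration, and the cost of shipping the crawler code to the host is ignored.

```python
# Illustrative only: a toy model of the two strategies, not the authors' system.
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for one remote host: URL -> page text.
Pages = Dict[str, str]


def download_first(pages: Pages,
                   is_relevant: Callable[[str], bool]) -> Tuple[List[str], int]:
    """Transfer every page to the search engine, then analyze it there."""
    bytes_sent = sum(len(text.encode("utf-8")) for text in pages.values())
    kept = [url for url, text in pages.items() if is_relevant(text)]
    return kept, bytes_sent


def mobile_crawl(pages: Pages,
                 is_relevant: Callable[[str], bool]) -> Tuple[List[str], int]:
    """Analyze pages at the host, then transfer only the relevant ones."""
    relevant = {url: text for url, text in pages.items() if is_relevant(text)}
    bytes_sent = sum(len(text.encode("utf-8")) for text in relevant.values())
    return list(relevant), bytes_sent


if __name__ == "__main__":
    # Made-up sample data; the predicate is an assumed relevance filter.
    host_pages = {
        "/db.html": "database systems",
        "/misc.html": "x" * 1000,
        "/index.html": "database",
    }

    def wants(text: str) -> bool:
        return "database" in text

    print("download-first:", download_first(host_pages, wants))
    print("mobile crawl:  ", mobile_crawl(host_pages, wants))
```

Both functions select the same pages; they differ only in how many bytes cross the network. The saving depends entirely on how selective the analysis step is, so the sample data here says nothing about the gains measured in the paper's evaluation.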