Relevant document crawling with usage pattern and domain profile based page ranking
A. Gupta, A. Dixit, A. Sharma
2013 International Conference on Information Systems and Computer Networks, 2013-03-09
DOI: 10.1109/ICISCON.2013.6524186 (https://doi.org/10.1109/ICISCON.2013.6524186)
Citations: 10
Abstract
The WWW is a distributed, heterogeneous information resource. With its exponential growth, it has become difficult to find information that matches a user's needs and interests. Despite strong crawling, indexing and page ranking techniques, the result sets returned by search engines lack accuracy and precision. A large number of irrelevant links, topic drift, and load on servers are further issues that must be addressed in developing an efficient search engine. This paper proposes a crawling technique that attempts to reduce server load by using migrants to download only the relevant pages, that is, those pertaining to a specific topic. The downloaded documents are then ranked according to user preferences and the past usage patterns of each web page, thereby improving the quality of the returned result sets.
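
The abstract gives no formulas, so the following is only a minimal sketch of the general idea: keep pages that pertain to a topic, then rank them using a usage signal. Everything in it is an assumption made for illustration and is not taken from the paper: the `Page` fields, the term-overlap relevance measure, the `threshold` and `usage_weight` parameters, and the linear blend of the two scores.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    visits: int  # hypothetical usage signal: how often users opened this page


def topic_relevance(text: str, topic_terms: set) -> float:
    """Fraction of topic terms found in the page text (assumed relevance measure)."""
    if not topic_terms:
        return 0.0
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def rank_pages(pages, topic_terms, threshold=0.5, usage_weight=0.5):
    """Drop pages below the relevance threshold, then sort by a blended score."""
    scored = [(p, topic_relevance(p.text, topic_terms)) for p in pages]
    relevant = [(p, rel) for p, rel in scored if rel >= threshold]
    max_visits = max((p.visits for p, _ in relevant), default=0)

    def score(item):
        page, rel = item
        # Normalise visits to [0, 1] so both signals share a scale.
        usage = page.visits / max_visits if max_visits else 0.0
        return (1 - usage_weight) * rel + usage_weight * usage

    return [p.url for p, _ in sorted(relevant, key=score, reverse=True)]


if __name__ == "__main__":
    pages = [
        Page("a", "web crawler ranking search", 10),
        Page("b", "crawler ranking basics", 100),
        Page("c", "cooking recipes", 1000),
        Page("d", "crawler design notes", 50),
    ]
    print(rank_pages(pages, {"crawler", "ranking"}))
```

In this sketch the off-topic page is discarded before ranking regardless of how often it was visited, which mirrors the abstract's emphasis on downloading only topic-relevant pages. How the paper actually weighs user preferences against usage patterns is not stated in the abstract.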