Online Library Content Generation Using Focused Crawling Based Upon Meta Tags and Tf-Idf
Mukesh Kumar, R. Vig
2013 International Symposium on Computational and Business Intelligence, 2013-08-24
DOI: 10.1109/ISCBI.2013.73 (https://doi.org/10.1109/ISCBI.2013.73)
Citation count: 0
Abstract
An electronic library is a collection of digital information related to an individual domain and, in turn, to all domains. A focused crawler traverses the Web looking for the pages most relevant to a domain while discarding irrelevant pages, and is therefore helpful for generating e-content for a digital library devoted to a particular domain. In this paper, a focused crawling technique for generating online content for an e-library is proposed. The applicability of the proposed approach is shown by retrieving documents that are highly related to a single domain. The quality of the pages included in the library is derived from a relevancy measure of each page against the content of domain-related pages.
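
The abstract does not spell out the relevancy measure, so the following is only a minimal sketch of one plausible reading: the title and the keywords/description meta tags of a candidate page are turned into a TF-IDF vector and compared, by cosine similarity, with a vector built from a few domain-related documents. The function names (`meta_text`, `build_idf`, `relevance`), the smoothed IDF formula, and the use of cosine similarity are all assumptions made for illustration and are not taken from the paper.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class MetaTextParser(HTMLParser):
    """Collects the <title> text and keywords/description meta content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.parts.append(attrs.get("content") or "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def meta_text(html):
    """Return the title and meta-tag text of a page as one string."""
    parser = MetaTextParser()
    parser.feed(html)
    return " ".join(parser.parts)


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def build_idf(token_lists):
    """Smoothed IDF over the domain documents (an assumed variant)."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf_vector(tokens, idf):
    """Term frequency times IDF; terms unknown to the domain get weight 0."""
    counts = Counter(tokens)
    total = sum(counts.values()) or 1
    return {t: (c / total) * idf.get(t, 0.0) for t, c in counts.items()}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relevance(page_html, domain_docs):
    """Score a page's meta text against the pooled domain documents."""
    domain_tokens = [tokenize(d) for d in domain_docs]
    idf = build_idf(domain_tokens)
    pooled = [t for tokens in domain_tokens for t in tokens]
    domain_vec = tfidf_vector(pooled, idf)
    page_vec = tfidf_vector(tokenize(meta_text(page_html)), idf)
    return cosine(page_vec, domain_vec)
```

In a crawler built along these lines, a page would be added to the library, and its out-links queued, only when `relevance` exceeds some threshold; the threshold value is a tuning choice that this sketch does not fix.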