Similarity Computation of Web Pages of Focused Crawler
Authors: Hu Yu, Liu Bingwu, Yan Fang
Published in: 2010 International Forum on Information Technology and Applications
Publication date: 2010-07-16
DOI: 10.1109/IFITA.2010.308 (https://doi.org/10.1109/IFITA.2010.308)
Citations: 6
Abstract
Because of the dynamic nature of the Web, finding relevant and recent information is becoming harder. More and more people now use focused crawlers to gather information in their specialized fields. However, similarity computation based on text alone is inadequate, because a page consists not only of text but also of multimedia content such as images, audio, and video. For a focused crawler, the page structure also plays a key role in similarity computation. In this paper we introduce a new method that computes similarity according to both page structure and content. The method makes web page similarity computation more accurate and crawling more efficient, which benefits Web analysis and makes it easier for users to obtain information.
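
The abstract does not spell out the formula, so the following is only an illustrative sketch of what combining content and structure could look like, not the authors' method. It assumes cosine similarity over word counts for content, cosine similarity over HTML tag counts as a stand-in for structure, and a hypothetical weight `alpha` that blends the two scores. The names `parse_page`, `cosine`, and `page_similarity` are invented for this example.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class _PageParser(HTMLParser):
    """Collects text nodes and the names of opening tags."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        # Simplification: text inside <script>/<style> is not filtered out.
        self.text_parts.append(data)


def parse_page(html):
    """Return (word counts, tag counts) for one HTML document."""
    parser = _PageParser()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z0-9]+", " ".join(parser.text_parts).lower())
    return Counter(words), Counter(parser.tags)


def cosine(a, b):
    """Cosine similarity between two count vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def page_similarity(html_a, html_b, alpha=0.5):
    """Blend content and structure similarity.

    alpha is an assumed weight: 1.0 uses content only, 0.0 structure only.
    """
    words_a, tags_a = parse_page(html_a)
    words_b, tags_b = parse_page(html_b)
    content_score = cosine(words_a, words_b)
    structure_score = cosine(tags_a, tags_b)
    return alpha * content_score + (1 - alpha) * structure_score
```

A focused crawler could compare each fetched page against a set of on-topic seed pages with `page_similarity` and prioritize links from pages that score above a chosen threshold. Tag counts ignore nesting and multimedia attributes, so a fuller structural measure would need a richer representation such as tag paths or tree comparison.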