Similarity Computation of Web Pages
Peng Shi, Lianhong Ding, Bingwu Liu
2008 IEEE International Symposium on Knowledge Acquisition and Modeling Workshop, 2008-12-01
DOI: 10.1109/KAMW.2008.4810606 (https://doi.org/10.1109/KAMW.2008.4810606)
Citations: 8
Abstract
Web pages are the main content of the World Wide Web, and measuring their similarity is very helpful for Web content analysis. Text similarity, usually referred to as similarity computation, has been studied for decades in artificial intelligence, and some of these methods have been used to compare Web pages. However, text-based similarity methods are inadequate for comparing Web pages, because a Web page consists not only of text but also of multimedia content such as images, audio, and video. This paper proposes a new approach that evaluates the similarity of Web pages by considering all of the content on them. The approach can make Web page similarity computation more precise and benefit Web analysis.