Webpage importance analysis using conditional Markov random walk

Tie-Yan Liu, Wei-Ying Ma
The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05). Published 2005-09-19. DOI: 10.1109/WI.2005.161 (https://doi.org/10.1109/WI.2005.161)
Citations: 15

Abstract

In this paper, we propose a novel method for calculating Web page importance based on a conditional Markov random walk model. The main assumption of the model is that, given the hyperlinks on a Web page, users do not click one of them uniformly at random. Instead, many factors may bias their behavior, for example the anchor text, the content relevance, and their previous experience of the Web site to which a destination page belongs. As one result, users may tend to visit pages in high-quality Web sites with higher probability. To implement this idea, we reformulate the Web graph as a two-layer structure and calculate Web page importance by a conditional random walk on this new graph. Experiments on the topic distillation task of the TREC 2003 Web track showed that the new method achieves an improvement of about 18% in mean average precision (MAP) and 16% in precision at 10 (P@10) over the PageRank algorithm.
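The abstract does not state the transition probabilities or how site quality is measured, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical upper layer in which a site's quality is the number of links it receives from other sites (plus one), and it biases each click toward destination pages in proportion to that quality. The function name `biased_rank`, the `use_bias` flag, and the toy graph are all illustrative assumptions. With `use_bias=False` the walk reduces to the usual uniform-click PageRank iteration.

```python
def biased_rank(links, site_of, damping=0.85, iters=100, use_bias=True):
    """Power iteration for a random walk whose clicks favor high-quality sites.

    links:   dict mapping a page to the list of pages it links to
    site_of: dict mapping every page to its site (assumed upper layer)
    """
    pages = sorted(site_of)
    n = len(pages)

    # Assumed site-quality measure: 1 + number of links arriving from
    # pages on other sites. The source does not specify this choice.
    quality = {site: 1.0 for site in set(site_of.values())}
    if use_bias:
        for src, dsts in links.items():
            for dst in dsts:
                if site_of[src] != site_of[dst]:
                    quality[site_of[dst]] += 1.0

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Teleportation: with probability (1 - damping) jump to any page.
        nxt = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if not outs:
                # Dangling page: spread its mass uniformly over all pages.
                for q in pages:
                    nxt[q] += damping * rank[p] / n
                continue
            # Click probability is proportional to the destination site's quality.
            total = sum(quality[site_of[q]] for q in outs)
            for q in outs:
                nxt[q] += damping * rank[p] * quality[site_of[q]] / total
        rank = nxt
    return rank


# Toy graph (hypothetical): site A hosts two pages, sites B and C one each.
links = {
    "a1": ["a2"],
    "a2": ["a1"],
    "b1": ["a1", "c1"],
    "c1": ["a2", "b1"],
}
site_of = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}

uniform = biased_rank(links, site_of, use_bias=False)  # plain PageRank
biased = biased_rank(links, site_of, use_bias=True)    # site-biased walk

for page in sorted(site_of):
    print(page, round(uniform[page], 4), round(biased[page], 4))
```

In this toy graph site A receives more cross-site links than B or C, so the biased walk sends a larger share of each click from `b1` and `c1` to A's pages than the uniform walk does, and the pages on A end up with higher scores.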