Consensus-based ranking of Wikipedia topics

Waleed Nema, Yinshan Tang
{"title":"Consensus-based ranking of wikipedia topics","authors":"Waleed Nema, Yinshan Tang","doi":"10.1145/3106426.3106529","DOIUrl":null,"url":null,"abstract":"To improve the effectiveness of users' information seeking experience in interactive web search we hypothesize how people might be influenced when making relevance judgment decisions by introducing the Consensus Theory & Relevance Judgment Model (CT&M). This is combined with a practical path to assess the extent of difference between suggestions of current search engines versus user expectations. A user-centered, evidence-based, phenomenology approach is used to improve on Google PageRank (GPR) in two ways. The first by biasing GPR's equal navigation probability assumption using (f)actual usage stats as implicit user consensus which leads to the StatsRank (SR) algorithm. Secondly, we aggregate users' explicit ranking to derive Consensus Rank (CR) which is shown to predict individual user ranking significantly better than GPR and meta-search of modern search engines Google and Yahoo/Bing real-time. CT&M contextualizes CR, SR, and a live open online web experiment, called The Ranking Game, which is based on the August-2016 English Wikipedia corpus (12.7 million pages) and Page View Statistics for May to July 2016. Limiting this work to Wikipedia makes GPR topic-based since any Wikipedia page is focused on one topic. TREC's pooling is used to merge top 20 results from major search engines and present an alphabetized list for users' explicit ranking via drag and drop. The same platform captures implicit data for future research and can be used for controlled experiments. Our contributions are: CT&M, SR, CR, and the open online user feedback web experiment research platform.","PeriodicalId":20685,"journal":{"name":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","volume":"14 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2017-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3106426.3106529","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

To improve the effectiveness of users' information-seeking experience in interactive web search, we introduce the Consensus Theory & Relevance Judgment Model (CT&M), which hypothesizes how people may be influenced when making relevance judgment decisions. This is combined with a practical method for assessing how far the suggestions of current search engines diverge from user expectations. A user-centered, evidence-based, phenomenological approach is used to improve on Google PageRank (GPR) in two ways. First, GPR's equal navigation probability assumption is biased using (f)actual usage statistics as implicit user consensus, which yields the StatsRank (SR) algorithm. Second, we aggregate users' explicit rankings to derive Consensus Rank (CR), which is shown to predict an individual user's ranking significantly better than GPR and better than a real-time meta-search of the modern search engines Google and Yahoo/Bing. CT&M contextualizes CR, SR, and a live open online web experiment, called The Ranking Game, which is based on the August 2016 English Wikipedia corpus (12.7 million pages) and Page View Statistics for May to July 2016. Limiting this work to Wikipedia makes GPR topic-based, since each Wikipedia page focuses on a single topic. TREC-style pooling is used to merge the top 20 results from major search engines and present an alphabetized list that users rank explicitly via drag and drop. The same platform captures implicit data for future research and can be used for controlled experiments. Our contributions are CT&M, SR, CR, and the open online user-feedback web experiment research platform.
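The abstract does not give the exact formulations of StatsRank or Consensus Rank, so the following is only a minimal, illustrative sketch of two plausible readings: a PageRank variant whose out-link choice is biased by page-view counts (the "implicit user consensus" idea behind SR), and a Borda-count aggregation of explicit user rankings standing in for CR. The function names, the `views` dictionary, and the choice of Borda counting are assumptions made for illustration, not the authors' stated method.

```python
# Illustrative sketch only -- not the paper's exact algorithms.
# statsrank_like(): PageRank where the usual equal-probability choice among a page's
#   out-links is reweighted by page-view counts (hypothetical `views` dict).
# borda_consensus(): one common rank-aggregation rule; the paper does not state which
#   rule Consensus Rank uses, so Borda counting is an assumption here.

from collections import defaultdict


def statsrank_like(links, views, damping=0.85, iters=50):
    """PageRank variant: follow an out-link with probability proportional to its views."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            total_views = sum(views.get(q, 1) for q in outs)
            if not outs or total_views == 0:
                # Dangling page: spread its rank uniformly over all pages.
                for q in pages:
                    new[q] += damping * rank[p] / n
                continue
            for q in outs:
                # Bias: weight each out-link by its page-view count instead of 1/out-degree.
                new[q] += damping * rank[p] * views.get(q, 1) / total_views
        rank = new
    return rank


def borda_consensus(user_rankings):
    """Aggregate explicit user rankings (each list is one user's ordering, best first)."""
    scores = defaultdict(float)
    for ranking in user_rankings:
        m = len(ranking)
        for position, page in enumerate(ranking):
            scores[page] += m - position  # earlier positions earn more points
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    views = {"A": 500, "B": 100, "C": 900}  # hypothetical page-view counts
    print(statsrank_like(links, views))
    print(borda_consensus([["C", "A", "B"], ["A", "C", "B"]]))
```

In this reading, the only change relative to standard GPR is that the transition probability along an out-link is proportional to the target page's view count rather than uniform across out-links, which is one way to encode usage statistics as implicit consensus.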