基于评论的维基百科文章排名

Y. Ganjisaffar, S. Javanmardi, C. Lopes
{"title":"基于评论的维基百科文章排名","authors":"Y. Ganjisaffar, S. Javanmardi, C. Lopes","doi":"10.1109/CASON.2009.14","DOIUrl":null,"url":null,"abstract":"Wikipedia, the largest encyclopedia on the Web, is often seen as the most successful example of crowdsourcing. The encyclopedic knowledge it accumulated over the years is so large that one often uses search engines, to find information in it. In contrast to regular Web pages, Wikipedia is fairly structured, and articles are usually accompanied with history pages, categories and talk pages. The meta-data available in these pages can be analyzed to gain a better understanding of the content and quality of the articles. We discuss how the rich meta-data available in wiki pages can be used to provide better search results in Wikipedia. Built on the studies on \"Wisdom of Crowds\" and the effectiveness of the knowledge collected by a large number of people, we investigate the effect of incorporating the extent of review of an article in the quality of rankings of the search results. The extent of review is measured by the number of distinct editors contributed to the articles and is extracted by processing Wikipedia's history pages. We compare different ranking algorithms that explore combinations of text-relevancy, PageRank, and extent of review. The results show that the review-based ranking algorithm which combines the extent of review and text-relevancy outperforms the rest; it is more accurate and less computationally expensive compared to PageRank-based rankings.","PeriodicalId":425748,"journal":{"name":"2009 International Conference on Computational Aspects of Social Networks","volume":"99 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Review-Based Ranking of Wikipedia Articles\",\"authors\":\"Y. Ganjisaffar, S. Javanmardi, C. Lopes\",\"doi\":\"10.1109/CASON.2009.14\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Wikipedia, the largest encyclopedia on the Web, is often seen as the most successful example of crowdsourcing. The encyclopedic knowledge it accumulated over the years is so large that one often uses search engines, to find information in it. In contrast to regular Web pages, Wikipedia is fairly structured, and articles are usually accompanied with history pages, categories and talk pages. The meta-data available in these pages can be analyzed to gain a better understanding of the content and quality of the articles. We discuss how the rich meta-data available in wiki pages can be used to provide better search results in Wikipedia. Built on the studies on \\\"Wisdom of Crowds\\\" and the effectiveness of the knowledge collected by a large number of people, we investigate the effect of incorporating the extent of review of an article in the quality of rankings of the search results. The extent of review is measured by the number of distinct editors contributed to the articles and is extracted by processing Wikipedia's history pages. We compare different ranking algorithms that explore combinations of text-relevancy, PageRank, and extent of review. The results show that the review-based ranking algorithm which combines the extent of review and text-relevancy outperforms the rest; it is more accurate and less computationally expensive compared to PageRank-based rankings.\",\"PeriodicalId\":425748,\"journal\":{\"name\":\"2009 International Conference on Computational Aspects of Social Networks\",\"volume\":\"99 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-06-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 International Conference on Computational Aspects of Social Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CASON.2009.14\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 International Conference on Computational Aspects of Social Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CASON.2009.14","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

摘要

维基百科是网络上最大的百科全书,经常被视为众包最成功的例子。它多年来积累的百科知识是如此之多,以至于人们经常使用搜索引擎来查找其中的信息。与普通网页相比,维基百科的结构相当合理,文章通常附有历史页面、分类和讨论页面。可以对这些页面中可用的元数据进行分析,以便更好地了解文章的内容和质量。我们将讨论如何使用wiki页面中可用的丰富元数据在Wikipedia中提供更好的搜索结果。基于对“群体智慧”和大量人收集的知识的有效性的研究,我们调查了将文章的评论程度纳入搜索结果排名质量的影响。评论的程度是由不同编辑对文章做出贡献的数量来衡量的,并通过处理维基百科的历史页面来提取。我们比较了探索文本相关性、PageRank和评论程度组合的不同排名算法。结果表明,结合评论程度和文本相关性的基于评论的排名算法优于其他算法;与基于pagerank的排名相比,它更准确,计算成本更低。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Review-Based Ranking of Wikipedia Articles
Wikipedia, the largest encyclopedia on the Web, is often seen as the most successful example of crowdsourcing. The encyclopedic knowledge it accumulated over the years is so large that one often uses search engines, to find information in it. In contrast to regular Web pages, Wikipedia is fairly structured, and articles are usually accompanied with history pages, categories and talk pages. The meta-data available in these pages can be analyzed to gain a better understanding of the content and quality of the articles. We discuss how the rich meta-data available in wiki pages can be used to provide better search results in Wikipedia. Built on the studies on "Wisdom of Crowds" and the effectiveness of the knowledge collected by a large number of people, we investigate the effect of incorporating the extent of review of an article in the quality of rankings of the search results. The extent of review is measured by the number of distinct editors contributed to the articles and is extracted by processing Wikipedia's history pages. We compare different ranking algorithms that explore combinations of text-relevancy, PageRank, and extent of review. The results show that the review-based ranking algorithm which combines the extent of review and text-relevancy outperforms the rest; it is more accurate and less computationally expensive compared to PageRank-based rankings.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信