Information Evolution in Wikipedia

Andrea Ceroni, Mihai Georgescu, U. Gadiraju, Kaweh Djafari Naini, M. Fisichella
Proceedings of The International Symposium on Open Collaboration, published 2014-08-27
DOI: 10.1145/2641580.2641612 (https://doi.org/10.1145/2641580.2641612)
Citations: 14

Abstract

The Web of data is constantly evolving based on the dynamics of its content. Current Web search engine technologies consider static collections and do not factor in explicitly or implicitly available temporal information that can be leveraged to gain insights into the dynamics of the data. In this paper, we hypothesize that by employing the temporal aspect as the primary means for capturing the evolution of entities, it is possible to provide entity-based accessibility to Web archives. We empirically show that the edit activity on Wikipedia can be exploited to provide evidence of the evolution of Wikipedia pages over time, both in terms of their content and in terms of their temporally defined relationships, classified in the literature as events. Finally, we present results from our extensive analysis of a dataset consisting of 31,998 Wikipedia pages describing politicians, and observations from in-depth case studies. Our findings reflect the usefulness of leveraging temporal information to study the evolution of entities, and they lay promising ground for further research.