A hybrid method for detecting outdated information in Wikipedia infoboxes

Thong Tran, T. Cao
{"title":"A hybrid method for detecting outdated information in Wikipedia infoboxes","authors":"Thong Tran, T. Cao","doi":"10.1109/RIVF.2013.6719874","DOIUrl":null,"url":null,"abstract":"Wikipedia has grown fast and become a major information resource for users as well as for many knowledge bases derived from it. However it is still edited manually while the world is changing rapidly. In this paper, we propose a method to detect outdated attribute values in Wikipedia infoboxes by using facts extracted from the general Web. Our proposed method extracts new information by combining pattern-based approach with entity-search-based approach to deal with the diversity of natural language presentation forms of facts on the Web. Our experimental results show that the achieved accuracies of the proposed method are 70% and 82% respectively on the chief-executive-officer attribute and the number-of-employees attribute in company infoboxes. It significantly improves the accuracy of the single pattern-based or entity-search-based method. The results also reveal the striking truth about the outdated status of Wikipedia.","PeriodicalId":121216,"journal":{"name":"The 2013 RIVF International Conference on Computing & Communication Technologies - Research, Innovation, and Vision for Future (RIVF)","volume":"2018 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The 2013 RIVF International Conference on Computing & Communication Technologies - Research, Innovation, and Vision for Future (RIVF)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RIVF.2013.6719874","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Wikipedia has grown fast and become a major information resource for users as well as for many knowledge bases derived from it. However it is still edited manually while the world is changing rapidly. In this paper, we propose a method to detect outdated attribute values in Wikipedia infoboxes by using facts extracted from the general Web. Our proposed method extracts new information by combining pattern-based approach with entity-search-based approach to deal with the diversity of natural language presentation forms of facts on the Web. Our experimental results show that the achieved accuracies of the proposed method are 70% and 82% respectively on the chief-executive-officer attribute and the number-of-employees attribute in company infoboxes. It significantly improves the accuracy of the single pattern-based or entity-search-based method. The results also reveal the striking truth about the outdated status of Wikipedia.
一种检测维基百科信息框中过时信息的混合方法
维基百科发展迅速,已成为用户的主要信息资源,也成为许多知识库的主要来源。然而,它仍然是手动编辑,而世界正在迅速变化。在本文中,我们提出了一种使用从一般Web中提取的事实来检测维基百科信息框中过时属性值的方法。本文提出的方法将基于模式的方法与基于实体搜索的方法相结合来提取新信息,以处理Web上事实的自然语言表示形式的多样性。实验结果表明,该方法对公司信息框中首席执行官属性和员工人数属性的识别准确率分别达到70%和82%。它大大提高了基于单一模式或基于实体搜索的方法的准确性。调查结果也揭示了一个惊人的事实:维基百科已经过时了。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信