XLORE 3: A Large-scale Multilingual Knowledge Graph from Heterogeneous Wiki Knowledge Resources

Impact factor: 5.4 · Zone 2 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Kaisheng Zeng, Hailong Jin, Xin Lv, Fangwei Zhu, Lei Hou, Yi Zhang, Fan Pang, Yu Qi, Dingxiao Liu, Juanzi Li, Ling Feng
Journal: ACM Transactions on Information Systems
Publication date: 2024-05-16
Publication type: Journal Article
DOI: https://doi.org/10.1145/3660521
Citations: 0

Abstract

In recent years, knowledge graphs (KGs) have attracted significant attention from academia and industry, resulting in the development of numerous technologies for KG construction, completion, and application. XLORE is one of the largest multilingual KGs, built from Baidu Baike and Wikipedia via a series of knowledge modelling and acquisition methods. In this paper, we use systematic methods to improve XLORE's data quality and present its latest version, XLORE 3, which enables the effective integration and management of heterogeneous knowledge from diverse resources. Compared with previous versions, XLORE 3 has three major advantages: 1) We design a comprehensive and reasonable schema, namely the XLORE ontology, which can effectively organize and manage entities from various resources. 2) We merge equivalent entities in different languages to facilitate knowledge sharing, and we provide a large-scale entity linking system to establish associations between unstructured text and the structured KG. 3) We design a multi-strategy knowledge completion framework, which leverages pre-trained language models and vast amounts of unstructured text to discover missing and new facts. The resulting KG contains 446 concepts, 2,608 properties, 66 million entities, and more than 2 billion facts. It is available and downloadable online, providing a valuable resource for researchers and practitioners in various fields.
Source journal
ACM Transactions on Information Systems (Engineering & Technology, Computer Science: Information Systems)
CiteScore: 9.40
Self-citation rate: 14.30%
Articles published: 165
Review time: >12 weeks
About the journal: The ACM Transactions on Information Systems (TOIS) publishes papers on information retrieval (such as search engines and recommender systems) that contain: new principled information retrieval models or algorithms with sound empirical validation; observational, experimental, and/or theoretical studies yielding new insights into information retrieval or information seeking; accounts of applications of existing information retrieval techniques that shed light on the strengths and weaknesses of the techniques; formalization of new information retrieval or information seeking tasks and of methods for evaluating performance on those tasks; development of content (text, image, speech, video, etc.) analysis methods to support information retrieval and information seeking; development of computational models of user information preferences and interaction behaviors; creation and analysis of evaluation methodologies for information retrieval and information seeking; or surveys of existing work that propose a significant synthesis. The information retrieval scope of TOIS appeals to industry practitioners for its wealth of creative ideas, and to academic researchers for its descriptions of their colleagues' work.