The Microsoft Academic Knowledge Graph enhanced: Author name disambiguation, publication classification, and embeddings

IF 4.1 Q1 INFORMATION SCIENCE & LIBRARY SCIENCE
Michael Färber, Lin Ao
Quantitative Science Studies, vol. 3, no. 1, pp. 51-98. Journal article, published 2022-02-22. DOI: https://doi.org/10.1162/qss_a_00183
Citations: 17

Abstract

Although several large knowledge graphs have been proposed in the scholarly field, such graphs are limited with respect to several data quality dimensions such as accuracy and coverage. In this article, we present methods for enhancing the Microsoft Academic Knowledge Graph (MAKG), a recently published large-scale knowledge graph containing metadata about scientific publications and associated authors, venues, and affiliations. Based on a qualitative analysis of the MAKG, we address three aspects. First, we adopt and evaluate unsupervised approaches for large-scale author name disambiguation. Second, we develop and evaluate methods for tagging publications by their discipline and by keywords, facilitating enhanced search and recommendation of publications and associated entities. Third, we compute and evaluate embeddings for all 239 million publications, 243 million authors, 49,000 journals, and 16,000 conference entities in the MAKG based on several state-of-the-art embedding techniques. Finally, we provide statistics for the updated MAKG. Our final MAKG is publicly available at https://makg.org and can be used for the search or recommendation of scholarly entities, as well as enhanced scientific impact quantification.
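The abstract does not say which unsupervised disambiguation algorithm is used, so the following is only a minimal illustrative sketch of one common pattern: block author mentions by a coarse name key, then merge mentions inside a block that share at least one coauthor. The record layout, the `block_key` and `disambiguate` helpers, and the sample data are all assumptions made for illustration and are not taken from the paper or from the MAKG.

```python
from collections import defaultdict


def block_key(name: str) -> str:
    """Hypothetical blocking key: lowercase last name plus first initial."""
    parts = name.lower().replace(".", " ").split()
    return f"{parts[-1]}_{parts[0][0]}"


def disambiguate(records):
    """Cluster author mentions without labeled training data.

    records: list of (record_id, author_name, set_of_coauthor_names).
    Returns a list of clusters, each a set of record ids.
    """
    # Step 1: blocking, so only mentions with similar names are compared.
    blocks = defaultdict(list)
    for rec in records:
        blocks[block_key(rec[1])].append(rec)

    clusters = []
    for recs in blocks.values():
        # Step 2: union-find over the mentions in this block.
        parent = {rid: rid for rid, _, _ in recs}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path compression
                x = parent[x]
            return x

        # Step 3: merge two mentions when they share a coauthor.
        for i, (id_a, _, co_a) in enumerate(recs):
            for id_b, _, co_b in recs[i + 1:]:
                if co_a & co_b:
                    parent[find(id_a)] = find(id_b)

        groups = defaultdict(set)
        for rid in parent:
            groups[find(rid)].add(rid)
        clusters.extend(groups.values())
    return clusters


# Made-up sample mentions (not real MAKG data).
records = [
    ("p1", "Wei Wang", {"Li Chen"}),
    ("p2", "W. Wang", {"Li Chen", "Ana Silva"}),
    ("p3", "Wei Wang", {"John Doe"}),
    ("p4", "Michael Färber", {"Lin Ao"}),
]
clusters = disambiguate(records)
```

In this toy input, `p1` and `p2` end up in one cluster because they share a coauthor, while `p3` stays separate despite the identical name. At the scale the abstract mentions (hundreds of millions of authors), blocking is what keeps the pairwise comparison tractable, since comparisons happen only within each block.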
Source journal: Quantitative Science Studies (Information Science & Library Science)
CiteScore: 12.10
Self-citation rate: 12.50%
Articles published: 46
Review time: 22 weeks