S2AND: A Benchmark and Evaluation System for Author Name Disambiguation

2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL), Pub Date: 2021-03-12, DOI: 10.1109/JCDL52503.2021.00029

Shivashankar Subramanian, Daniel King, Doug Downey, Sergey Feldman


Citations: 21

Abstract

Author Name Disambiguation (AND) is the task of resolving which author mentions in a bibliographic database refer to the same real-world person, and is a critical ingredient of digital library applications such as search and citation analysis. While many AND algorithms have been proposed, comparing them is difficult because they often employ distinct features and are evaluated on different datasets. In response to this challenge, we present S2AND, a unified benchmark dataset for AND on scholarly papers, as well as an open-source reference model implementation. Our dataset harmonizes eight disparate AND datasets into a uniform format, with a single rich feature set drawn from the Semantic Scholar (S2) database. Our evaluation suite for S2AND reports performance split by facets like publication year and number of papers, allowing researchers to track both global performance and measures of fairness across facet values. Our experiments show that because previous datasets tend to cover idiosyncratic and biased slices of the literature, algorithms trained to perform well on one of them may generalize poorly to others. By contrast, we show how training on a union of datasets in S2AND results in more robust models that perform well even on datasets unseen in training. The resulting AND model also substantially improves over the production algorithm in S2, reducing error by over 50% in terms of B³ F1. We release our unified dataset, model code, trained models, and evaluation suite to the research community at https://github.com/allenai/S2AND/.
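The abstract reports results in terms of B³ F1, a standard clustering metric that averages per-mention precision and recall. The following is a minimal sketch of how that metric is commonly computed, not the S2AND evaluation code itself; the function name `b3_scores`, the dictionary-based input format, and the toy data are assumptions made for illustration, and both inputs are assumed to cover the same set of mentions.

```python
from collections import defaultdict


def b3_scores(predicted, gold):
    """Compute B-cubed precision, recall, and F1.

    predicted, gold: dicts mapping each mention id to a cluster id.
    Illustrative sketch only; assumes both dicts have the same keys.
    """
    def invert(assignment):
        # Build cluster id -> set of mention ids.
        clusters = defaultdict(set)
        for mention, cluster in assignment.items():
            clusters[cluster].add(mention)
        return clusters

    pred_clusters = invert(predicted)
    gold_clusters = invert(gold)

    precision_sum = 0.0
    recall_sum = 0.0
    for mention in gold:
        pred_set = pred_clusters[predicted[mention]]
        gold_set = gold_clusters[gold[mention]]
        overlap = len(pred_set & gold_set)
        # Per-mention precision: share of the predicted cluster that is correct.
        precision_sum += overlap / len(pred_set)
        # Per-mention recall: share of the true cluster that was recovered.
        recall_sum += overlap / len(gold_set)

    n = len(gold)
    precision = precision_sum / n
    recall = recall_sum / n
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example (hypothetical data): five author mentions, two real people.
gold = {"a": "A", "b": "A", "c": "A", "d": "B", "e": "B"}
predicted = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 2}
print(b3_scores(predicted, gold))
```

In the toy example, mention `c` is wrongly merged with the second author's mentions, which lowers both precision (the second predicted cluster is impure) and recall (the first author's cluster is split).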
