Generalizing Cross-Document Event Coreference Resolution Across Multiple Corpora

IF 5.3 · Zone 2, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics Pub Date : 2020-11-24 DOI:10.1162/coli_a_00407

M. Bugert, Nils Reimers, Iryna Gurevych

Volume 47, Issue 1, pages 1-40 · Journal Article

Citations: 9

Abstract

Cross-document event coreference resolution (CDCR) is an NLP task in which mentions of events need to be identified and clustered throughout a collection of documents. CDCR aims to benefit downstream multidocument applications, but despite recent progress on corpora and system development, downstream improvements from applying CDCR have not yet been shown. We observe that every CDCR system to date was developed, trained, and tested only on a single respective corpus. This raises strong concerns about their generalizability, a must-have for downstream applications where the magnitude of domains or event mentions is likely to exceed those found in a curated corpus. To investigate this assumption, we define a uniform evaluation setup involving three CDCR corpora: ECB+, the Gun Violence Corpus, and the Football Coreference Corpus (which we reannotate on token level to make our analysis possible). We compare a corpus-independent, feature-based system against a recent neural system developed for ECB+. Although inferior in absolute numbers, the feature-based system shows more consistent performance across all corpora, whereas the neural system is hit-or-miss. Via model introspection, we find that the importance of event actions, event time, and so forth, for resolving coreference in practice varies greatly between the corpora. Additional analysis shows that several systems overfit on the structure of the ECB+ corpus. We conclude with recommendations on how to achieve generally applicable CDCR systems in the future, the most important being that evaluation on multiple CDCR corpora is strongly necessary. To facilitate future research, we release our dataset, annotation guidelines, and system implementation to the public.
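To make the task definition concrete, the following is a minimal sketch of what "clustering event mentions throughout a collection of documents" can look like as a data structure. It is not the feature-based or neural system from the paper: the function name `cluster_mentions`, the `(document id, mention id)` representation, and the toy links are all assumptions made for illustration, and the pairwise coreference decisions are taken as given rather than predicted.

```python
from collections import defaultdict


def cluster_mentions(mentions, coreferent_pairs):
    """Group event mentions into clusters via union-find over pairwise links.

    mentions: list of hypothetical (doc_id, mention_id) tuples.
    coreferent_pairs: pairs of mentions assumed to refer to the same event.
    """
    parent = {m: m for m in mentions}

    def find(m):
        # Walk up to the root, halving the path along the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in coreferent_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for m in mentions:
        clusters[find(m)].add(m)
    # Sort for a deterministic result; unlinked mentions become singletons.
    return sorted(sorted(c) for c in clusters.values())


# Toy input (invented): four mentions spread over three documents.
mentions = [("doc1", "m1"), ("doc1", "m2"), ("doc2", "m1"), ("doc3", "m1")]
links = [
    (("doc1", "m1"), ("doc2", "m1")),
    (("doc2", "m1"), ("doc3", "m1")),
]
clusters = cluster_mentions(mentions, links)
```

Here the two links chain three mentions from different documents into one cross-document cluster, while the unlinked mention stays in a cluster of its own. A real CDCR system would have to produce the links itself, which is where the corpus-specific behavior discussed in the abstract comes in.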


Source journal

Computational Linguistics (Engineering and Technology - Computer Science: Interdisciplinary Applications)

CiteScore

15.80

Self-citation rate

0.00%

Number of articles published

Review time

>12 weeks

About the journal: Computational Linguistics, the longest-running publication dedicated solely to the computational and mathematical aspects of language and the design of natural language processing systems, provides university and industry linguists, computational linguists, AI and machine learning researchers, cognitive scientists, speech specialists, and philosophers with the latest insights into the computational aspects of language research.