Leveraging syntactic parsing to improve event annotation matching

Camiel Colruyt, Orphée De Clercq, Veronique Hoste
{"title":"Leveraging syntactic parsing to improve event annotation matching","authors":"Camiel Colruyt, Orphée De Clercq, Veronique Hoste","doi":"10.18653/v1/D19-5903","DOIUrl":null,"url":null,"abstract":"Detecting event mentions is the first step in event extraction from text and annotating them is a notoriously difficult task. Evaluating annotator consistency is crucial when building datasets for mention detection. When event mentions are allowed to cover many tokens, annotators may disagree on their span, which means that overlapping annotations may then refer to the same event or to different events. This paper explores different fuzzy-matching functions which aim to resolve this ambiguity. The functions extract the sets of syntactic heads present in the annotations, use the Dice coefficient to measure the similarity between sets and return a judgment based on a given threshold. The functions are tested against the judgment of a human evaluator and a comparison is made between sets of tokens and sets of syntactic heads. The best-performing function is a head-based function that is found to agree with the human evaluator in 89% of cases.","PeriodicalId":129206,"journal":{"name":"Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/D19-5903","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Detecting event mentions is the first step in event extraction from text, and annotating them is a notoriously difficult task. Evaluating annotator consistency is crucial when building datasets for mention detection. When event mentions are allowed to cover many tokens, annotators may disagree on their spans, which means that overlapping annotations may refer either to the same event or to different events. This paper explores different fuzzy-matching functions that aim to resolve this ambiguity. The functions extract the sets of syntactic heads present in the annotations, use the Dice coefficient to measure the similarity between the sets, and return a judgment based on a given threshold. The functions are tested against the judgment of a human evaluator, and a comparison is made between sets of tokens and sets of syntactic heads. The best-performing function is a head-based function that agrees with the human evaluator in 89% of cases.
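
As a rough illustration of the matching procedure described above, the sketch below (not the authors' implementation) extracts syntactic heads from two overlapping annotation spans, computes the Dice coefficient between the head sets, and applies a threshold. The use of spaCy for dependency parsing, the particular head-extraction heuristic, and the 0.5 threshold are all assumptions made for illustration, not details taken from the paper.

```python
# Minimal sketch of a head-based fuzzy-matching function: extract the set of
# syntactic heads in each annotation span, compute the Dice coefficient
# between the two sets, and judge "same event" if it meets a threshold.
# spaCy, the head heuristic, and the threshold value are illustrative choices.

import spacy

nlp = spacy.load("en_core_web_sm")


def syntactic_heads(doc, start, end):
    """Return the tokens in span [start, end) whose dependency head lies
    outside the span (or which are the sentence root): the span's heads."""
    return {tok.text for tok in doc[start:end]
            if tok.head.i < start or tok.head.i >= end or tok.head == tok}


def dice(a, b):
    """Dice coefficient between two sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def same_event(doc, span1, span2, threshold=0.5):
    """Judge whether two overlapping annotations refer to the same event."""
    heads1 = syntactic_heads(doc, *span1)
    heads2 = syntactic_heads(doc, *span2)
    return dice(heads1, heads2) >= threshold


# Example: two annotators mark overlapping spans in the same sentence.
doc = nlp("The company announced the acquisition of its rival on Monday.")
print(same_event(doc, (1, 6), (2, 9)))  # True when the head sets are similar enough
```

A token-based variant of the same scheme would simply replace `syntactic_heads` with the full set of tokens in the span; the abstract reports that the head-based version agrees better with the human evaluator.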