Deep Cross-Lingual Coreference Resolution for Less-Resourced Languages: The Case of Basque

Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference. Pub Date: 1900-01-01. DOI: 10.18653/v1/W19-2806

Gorka Urbizu, A. Soraluze, Olatz Arregi

Citations: 11

Abstract

In this paper, we present a cross-lingual neural coreference resolution system for Basque, a less-resourced language. To begin with, we build the first neural coreference resolution system for Basque, training it on the relatively small EPEC-KORREF corpus (45,000 words). Next, we design a cross-lingual coreference resolution system. With this approach, the system learns from a larger English corpus, using cross-lingual embeddings, and then performs coreference resolution for Basque. The cross-lingual system obtains slightly better results (40.93 CoNLL F1) than the monolingual system (39.12 CoNLL F1), without using any Basque corpus for training.
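The scores above are reported as CoNLL F1, which is conventionally the unweighted mean of the MUC, B-cubed, and CEAF-e F1 scores. The following is a minimal sketch of that averaging step only. The function name `conll_f1` and the component values are assumptions made for illustration. They are not taken from the paper, which reports only the final averages in this abstract.

```python
from statistics import mean


def conll_f1(muc_f1: float, b_cubed_f1: float, ceaf_e_f1: float) -> float:
    """Return the CoNLL score: the unweighted mean of three coreference F1 scores."""
    return mean([muc_f1, b_cubed_f1, ceaf_e_f1])


# Hypothetical component scores, chosen only for illustration (not from the paper).
example = conll_f1(muc_f1=40.0, b_cubed_f1=45.0, ceaf_e_f1=35.0)
print(f"CoNLL F1: {example:.2f}")
```

Because the score is a plain average, a gain of about 1.8 points in CoNLL F1, as between the two systems above, can come from any mix of improvements across the three underlying metrics.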



