Deep Cross-Lingual Coreference Resolution for Less-Resourced Languages: The Case of Basque
Gorka Urbizu, A. Soraluze, Olatz Arregi
Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference
DOI: 10.18653/v1/W19-2806 (https://doi.org/10.18653/v1/W19-2806)
Citations: 11
Abstract
In this paper, we present a cross-lingual neural coreference resolution system for Basque, a less-resourced language. To begin with, we build the first neural coreference resolution system for Basque, training it on the relatively small EPEC-KORREF corpus (45,000 words). Next, we design a cross-lingual coreference resolution system that learns from a larger English corpus and uses cross-lingual embeddings to perform coreference resolution for Basque. Without using any Basque corpus for training, the cross-lingual system obtains slightly better results (40.93 CoNLL F1) than the monolingual system (39.12 CoNLL F1).
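For readers unfamiliar with the metric, the CoNLL F1 score is conventionally the unweighted mean of the MUC, B-cubed, and CEAF-e F1 scores. The sketch below shows only that arithmetic. The abstract does not report the per-metric breakdown, so the component values are made-up placeholders and the function names are hypothetical; only the two final scores (40.93 and 39.12) come from the abstract.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def conll_f1(muc_f1: float, b_cubed_f1: float, ceaf_e_f1: float) -> float:
    """Unweighted mean of the three component F1 scores."""
    return (muc_f1 + b_cubed_f1 + ceaf_e_f1) / 3


# Placeholder component scores, NOT taken from the paper.
example = conll_f1(f1(50.0, 50.0), 40.0, 30.0)

# Gap between the two systems, using the scores given in the abstract.
gain = round(40.93 - 39.12, 2)

print(example, gain)
```

Under this convention, the 1.81-point gap between the two systems is a difference in the averaged score and does not by itself say which of the three component metrics improved.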