Using Cross-Lingual Part of Speech Tagging for Partially Reconstructing the Classic Language Family Tree Model

Workshop on Computational Approaches to Historical Language Change Pub Date : 1900-01-01 DOI:10.18653/v1/2022.lchange-1.8

Anat Samohi, Daniel Weisberg Mitelman, Kfir Bar


Abstract

The tree model is well known for expressing the historical evolution of languages and has long been used to describe genetic relationships between them. Nevertheless, some researchers question the model’s ability to predict the proximity between two languages, since it represents genetic relatedness rather than linguistic resemblance. Defining alternative language proximity models has been an active research area for many years. In this paper we explore a part-of-speech model for defining proximity between languages, using a multilingual language model that was fine-tuned on the task of cross-lingual part-of-speech tagging. We train the model on one language and evaluate it on another; the measured performance is then used to define the proximity between the two languages. By further developing the model, we show that it can reconstruct some parts of the tree model.
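The idea of turning train-on-one, evaluate-on-another scores into a proximity measure can be illustrated with a small sketch. This is not the authors' implementation. The accuracy values are invented placeholders rather than results from the paper, and the symmetrization step, the `1 - accuracy` distance, and the average-linkage clustering are all assumptions chosen for illustration; the abstract does not say how the tree is recovered.

```python
from itertools import combinations

# Hypothetical cross-lingual POS-tagging accuracies (NOT results from the paper).
# scores[(train_lang, eval_lang)] is the accuracy obtained when the tagger is
# trained on train_lang and evaluated on eval_lang.
scores = {
    ("es", "it"): 0.90, ("it", "es"): 0.88,
    ("de", "nl"): 0.89, ("nl", "de"): 0.91,
    ("es", "de"): 0.70, ("de", "es"): 0.72,
    ("es", "nl"): 0.69, ("nl", "es"): 0.71,
    ("it", "de"): 0.68, ("de", "it"): 0.66,
    ("it", "nl"): 0.65, ("nl", "it"): 0.67,
}


def distance(a, b):
    """Assumed symmetric distance: 1 minus the mean of both transfer directions."""
    return 1.0 - (scores[(a, b)] + scores[(b, a)]) / 2.0


def leaves(node):
    """Return the language codes under a tree node (a code or a nested tuple)."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def cluster_distance(c1, c2):
    """Average-linkage distance between two clusters."""
    pairs = [(a, b) for a in leaves(c1) for b in leaves(c2)]
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


def build_tree(languages):
    """Greedy agglomerative clustering; returns a nested tuple of language codes."""
    clusters = list(languages)
    while len(clusters) > 1:
        # Merge the two closest clusters under average linkage.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append((c1, c2))
    return clusters[0]


tree = build_tree(["es", "it", "de", "nl"])
print(tree)
```

With these placeholder scores, the two high-transfer pairs are merged first and joined at the root, which is the kind of partial family grouping the abstract describes. Real scores are generally asymmetric, so averaging the two directions is only one of several reasonable choices.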
