Using Cross-Lingual Part of Speech Tagging for Partially Reconstructing the Classic Language Family Tree Model

Anat Samohi, Daniel Weisberg Mitelman, Kfir Bar
Workshop on Computational Approaches to Historical Language Change. DOI: 10.18653/v1/2022.lchange-1.8 (https://doi.org/10.18653/v1/2022.lchange-1.8)

Abstract

The tree model is well known for expressing the historical evolution of languages and has long been used to describe genetic relationships between them. Nevertheless, some researchers question the model's ability to predict the proximity between two languages, since it represents genetic relatedness rather than linguistic resemblance. Defining alternative models of language proximity has been an active research area for many years. In this paper, we explore a part-of-speech model for defining proximity between languages, using a multilingual language model fine-tuned on the task of cross-lingual part-of-speech tagging. We train the model on one language and evaluate it on another; the measured performance is then used to define the proximity between the two languages. By further developing the model, we show that it can reconstruct some parts of the tree model.
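The idea of turning train-on-one, evaluate-on-another tagging performance into a proximity measure, and then into a tree, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the accuracy values in `ACC` are invented, the distance is defined as one minus the mean of the two transfer directions, and the tree is built with plain average-linkage agglomerative clustering. The abstract does not state which distance definition or tree-building procedure the authors use.

```python
from itertools import combinations

# Hypothetical cross-lingual POS tagging accuracies, ACC[train][test].
# These numbers are invented for illustration; they are NOT results from the paper.
ACC = {
    "es": {"es": 0.97, "it": 0.90, "de": 0.78, "nl": 0.76},
    "it": {"es": 0.89, "it": 0.97, "de": 0.77, "nl": 0.75},
    "de": {"es": 0.76, "it": 0.75, "de": 0.96, "nl": 0.88},
    "nl": {"es": 0.75, "it": 0.74, "de": 0.87, "nl": 0.96},
}


def distance(acc, a, b):
    """Assumed symmetric distance: 1 minus the mean of both transfer directions."""
    return 1.0 - (acc[a][b] + acc[b][a]) / 2.0


def leaves(node):
    """Return the language codes under a node (a string leaf or a 2-tuple)."""
    if isinstance(node, str):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def cluster_distance(acc, c1, c2):
    """Average linkage: mean pairwise distance between the leaves of two clusters."""
    pairs = [(a, b) for a in leaves(c1) for b in leaves(c2)]
    return sum(distance(acc, a, b) for a, b in pairs) / len(pairs)


def build_tree(acc):
    """Repeatedly merge the two closest clusters until a single tree remains."""
    clusters = sorted(acc)
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(acc, clusters[p[0]], clusters[p[1]]),
        )
        merged = (clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]


tree = build_tree(ACC)
print(tree)
```

With these invented values, the two Romance languages (Spanish, Italian) merge first and the two Germanic languages (German, Dutch) merge second, which is the kind of partial agreement with the family tree that the abstract refers to. Real accuracies would come from fine-tuning a multilingual model on one language's tagged data and scoring it on another's.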