Modelling admixture across language levels to evaluate deep history claims

IF 2.1 0 LANGUAGE & LINGUISTICS

Journal of Language Evolution. Pub Date: 2023-03-29. DOI: 10.1093/jole/lzad002

Nataliia Hübler, Simon J. Greenhill



Abstract

The so-called ‘Altaic’ languages have been the subject of debate for over 200 years. An array of different data sets has been used to investigate the genealogical relationships between them, but the controversy persists. Structural features are a new type of data with high potential for such cases in historical linguistics; they are sometimes declared prone to borrowing and discarded from the outset, and at other times considered to carry an especially precise historical signal that reaches further back in time than other types of linguistic data. We investigate the performance of typological features across different domains of language by using an admixture model from genetics. As implemented in the software STRUCTURE, this model allows us to account for both a genealogical and an areal signal in the data. Our analysis shows that morphological features have the strongest genealogical signal and syntactic features diffuse most easily. When using only morphological structural data, the model is able to correctly identify three language families: Turkic, Mongolic, and Tungusic, whereas Japonic and Koreanic languages are assigned the same ancestry.
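As a rough illustration of the idea behind an admixture model applied to typological data, the sketch below treats each language as a mixture of K ancestral clusters. It assumes binary (present/absent) features; the function names, the toy numbers, and the two-cluster setup are all hypothetical and are not taken from the paper. It is not the inference procedure used by STRUCTURE, which estimates ancestry proportions and cluster frequencies by Bayesian sampling. It only shows the mixing likelihood that such a model builds on.

```python
import math

def feature_probability(q, p_feature):
    """Probability that a feature is present in one language.

    q: ancestry proportions over K clusters (assumed to sum to 1)
    p_feature: frequency of the feature being present in each cluster
    The result is the ancestry-weighted average of the cluster frequencies.
    """
    return sum(q_k * p_k for q_k, p_k in zip(q, p_feature))

def log_likelihood(features, q, p):
    """Log-likelihood of one language's binary features under the mixture.

    features: list of 0/1 values, with None marking missing data
    p: p[l][k] is the frequency of feature l in cluster k
    Assumes every mixed probability lies strictly between 0 and 1.
    """
    total = 0.0
    for x, p_feature in zip(features, p):
        if x is None:  # skip missing feature values
            continue
        prob = feature_probability(q, p_feature)
        total += math.log(prob if x == 1 else 1.0 - prob)
    return total

# Hypothetical toy data: 3 features, K = 2 clusters.
p = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.7]]
language = [1, 1, 0]

# Compare a language drawn entirely from cluster 0 with an even mixture.
pure = log_likelihood(language, [1.0, 0.0], p)
mixed = log_likelihood(language, [0.5, 0.5], p)
```

In this toy case the feature profile matches cluster 0 closely, so the single-ancestry assignment scores a higher likelihood than the even mixture. A language whose features had partly diffused from a neighbouring group would instead be better explained by mixed ancestry, which is how a model of this kind can separate genealogical from areal signal.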


Source journal

Journal of Language Evolution (Social Sciences: Linguistics and Language)

CiteScore

4.50

Self-citation rate

7.70%

Number of articles published