Authors: Ilya M Egorov, Anna V Dybo, Alexei S Kassian
Journal: Journal of Language Evolution (category: Language & Linguistics; impact factor 2.1)
Publication date: 2022-07-06
Publication type: Journal Article
DOI: 10.1093/jole/lzac006 (https://doi.org/10.1093/jole/lzac006)
Source platform: Semanticscholar
Citations: 0
Phylogeny of the Turkic Languages Inferred from Basic Vocabulary: Limitations of the Lexicostatistical Methods in an Intensive Contact Situation
This article attempts to revise the phylogenetic structure of the Turkic family using a computational lexicostatistical approach. The methodological framework of the research is characterized by the following features: (1) wordlists with strictly controlled semantics; (2) step-by-step reconstruction using Swadesh wordlists for proto-languages; (3) three stages of post-processing of the input data (analysis of root cognacy, elimination of derivational drift, and optimization of homoplasy); (4) application of several computational algorithms (Starling neighbor-joining, Bayesian MCMC, and maximum parsimony). The analysis confirms the status of Chuvash as the first outlier and suggests a subsequent multifurcation of Proto-Nuclear-Turkic into eight branches. The Siberian Turkic group is a purely areal unity; that is, Yakut-Dolgan, Tofa-Tuvinian, Khakas-Mrassu, Sarygh Yugur, and Altai do not form a clade. Altai is grouped together with the Kipchak languages as a separate taxon; it does not show a particularly close relationship with Kirghiz, which belongs to another Kipchak subgroup. Karluk is a low-level taxon inside the Kipchak clade.
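As a rough illustration of the lexicostatistical step that distance-based methods such as neighbor-joining take as input, the sketch below computes pairwise lexical distances from cognate-coded wordlists. It is not the authors' pipeline: the language names, concepts, and cognate-class codes are invented toy data, the function names are hypothetical, and none of the article's post-processing stages or tree-building algorithms are reproduced here.

```python
from itertools import combinations

# Hypothetical toy data: language -> {concept: cognate-class id}.
# All names and codes are invented for illustration, not taken from the article.
# None marks a concept with no usable entry in that wordlist.
WORDLISTS = {
    "LangA": {"hand": 1, "water": 1, "stone": 1, "fire": 1},
    "LangB": {"hand": 1, "water": 1, "stone": 2, "fire": 1},
    "LangC": {"hand": 2, "water": 1, "stone": 2, "fire": None},
}


def lexical_distance(a, b):
    """Share of non-cognate items among concepts attested in both wordlists."""
    shared = [c for c in a if a[c] is not None and b.get(c) is not None]
    if not shared:
        raise ValueError("no comparable concepts")
    mismatches = sum(1 for c in shared if a[c] != b[c])
    return mismatches / len(shared)


def distance_matrix(wordlists):
    """Symmetric pairwise distances, keyed by (language, language) tuples."""
    dist = {}
    for x, y in combinations(sorted(wordlists), 2):
        d = lexical_distance(wordlists[x], wordlists[y])
        dist[(x, y)] = dist[(y, x)] = d
    return dist


if __name__ == "__main__":
    matrix = distance_matrix(WORDLISTS)
    for x, y in combinations(sorted(WORDLISTS), 2):
        print(f"{x} - {y}: {matrix[(x, y)]:.3f}")
```

In this toy set, LangA and LangB differ in one of four shared concepts, giving a distance of 0.25. A matrix of this kind is what a distance-based tree builder would consume; character-based methods such as Bayesian MCMC and maximum parsimony work from the cognate codes directly instead.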