{"title":"Phylogeny of the Turkic Languages Inferred from Basic Vocabulary: Limitations of the Lexicostatistical Methods in an Intensive Contact Situation","authors":"Ilya M Egorov, Anna V Dybo, Alexei S Kassian","doi":"10.1093/jole/lzac006","DOIUrl":"https://doi.org/10.1093/jole/lzac006","url":null,"abstract":"This article provides an attempt to revise the phylogenetic structure of the Turkic family using a computational lexicostatistical approach. The methodological framework of the present research is characterized by the following features: (1) wordlists with strictly controlled semantics; (2) step-by-step reconstruction using Swadesh wordlists for proto-languages; (3) three stages of post-processing of the input data (analysis of root cognacy, elimination of derivational drift, and optimization of homoplasy); (4) application of several computational algorithms (Starling neighbor-joining, Bayesian MCMC, and maximum parsimony). The analysis provided confirms the status of Chuvash as the first outlier and suggests a subsequent multifurcation of Proto-Nuclear-Turkic into eight branches. The Siberian Turkic group is a purely areal unity, that is, Yakut-Dolgan, Tofa-Tuvinian, Khakas-Mrassu, Sarygh Yugur and Altai do not form a clade. Altai is grouped together with the Kipchak languages as a separate taxon; it does not show a particularly close relationship with Kirghiz, which belongs to another Kipchak subgroup. Karluk is a low-level taxon inside the Kipchak clade.","PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2022-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138519790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian methods for ancestral state reconstruction in morphosyntax: Exploring the history of argument marking strategies in a large language family","authors":"Joshua L. Phillips, Claire Bowern","doi":"10.1093/jole/lzac002","DOIUrl":"https://doi.org/10.1093/jole/lzac002","url":null,"abstract":"Bayesian phylogenetic methods have been gaining traction and currency in historical linguistics, as their potential for uncovering elements of language change is increasingly understood. Here, we demonstrate a proof of concept for using ancestral state reconstruction methods to reconstruct changes in morphology. We use a simple Brownian motion model of character evolution to test how splits in ergative marking evolve across Pama-Nyungan, a large family of Australian languages. We are able to recover linguistically plausible paths of change, as well as rejecting implausible paths. The results of these analyses elucidate constraints on changes that have led to extensive synchronic variation in an interlocking morphological system. They further provide evidence of an ergative–accusative split traceable to Proto-Pama-Nyungan.","PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2022-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44792758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BayesVarbrul: a unified multidimensional analysis of language change in a speaker community","authors":"Xia Hua","doi":"10.1093/jole/lzac004","DOIUrl":"https://doi.org/10.1093/jole/lzac004","url":null,"abstract":"Exchange in ideas between language evolution and biological evolution has a long history, due to a shared theoretical foundation between language and biology as two evolving systems. Both systems evolve in terms of the frequency of a variant in a population for each of a large number of variables, that is how often a particular variant of a language variable is used in a speaker community and how many individuals in a biological population carry a particular variant of a gene. The way these frequencies change has been modelled under a similar mathematical framework. Here, I show how we can use concepts from genome wide association studies that identify the source of natural selection and the genes under selection in a biological population to study how social factors affect the usage of language variables in a speaker community or how some social groups use some language variables differently from other groups. Using the Gurindji Kriol language as a case study, I show how this approach unifies existing mathematical and statistical tools in studying language evolution over a large number of speakers and a large number of language variables, which provides a promising link between micro- and macro-evolution in language. The approach is named BayesVarbrul and is ready to apply to datasets other than the Gurindji Kriol dataset, including existing corpus data. The code and the instructions are available at https://github.com/huaxia1985/BayesVarbrul.","PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2022-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48421279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The evolution of color naming reflects pressure for efficiency: Evidence from the recent past","authors":"Noga Zaslavsky, Karee Garvin, Charles Kemp, Naftali Tishby, Terry Regier","doi":"10.1093/jole/lzac001","DOIUrl":"https://doi.org/10.1093/jole/lzac001","url":null,"abstract":"It has been proposed that semantic systems evolve under pressure for efficiency. This hypothesis has so far been supported largely indirectly, by synchronic cross-language comparison, rather than directly by diachronic data. Here, we directly test this hypothesis in the domain of color naming, by analyzing recent diachronic data from Nafaanra, a language of Ghana and Côte d’Ivoire, and comparing it with quantitative predictions derived from the mathematical theory of efficient data compression. We show that color naming in Nafaanra has changed over the past four decades while remaining near-optimally efficient, and that this outcome would be unlikely under a random drift process that maintains structured color categories without pressure for efficiency. To our knowledge, this finding provides the first direct evidence that color naming evolves under pressure for efficiency, supporting the hypothesis that efficiency shapes the evolution of the lexicon.","PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2022-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138519794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Methodological Problems in Quantitative Research on Environmental Effects in Phonology","authors":"F. Hartmann","doi":"10.1093/jole/lzac003","DOIUrl":"https://doi.org/10.1093/jole/lzac003","url":null,"abstract":"This paper engages with the quantitative methodology underlying studies proposing a link between environment and phonology by replicating three prominent studies on ejectives and altitude, vowels and humidity, and sonority and ambient temperature. It argues that there are several issues regarding the methodological footing of such correlational studies. Further, the paper finds that the problems of statistically analyzing environmental datasets in phonology run deeper than the focus on individual phonetic features suggests: there are several overarching patterns of correlations to be found in these datasets that, if not understood and accounted for, render mistaking spurious correlations for real effects inevitable. This paper further makes concrete suggestions for what is needed to move beyond pairwise correlational studies between environmental and phonological variables in future investigations.","PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2022-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43189859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A simulation on coevolution between language and multiple cognitive abilities","authors":"T. Gong, L. Shuai, Xiaolong Yang","doi":"10.1093/jole/lzab006","DOIUrl":"https://doi.org/10.1093/jole/lzab006","url":null,"abstract":"We propose a coevolution scenario between language and two cognitive abilities, namely shared intentionality and lexical memory, under a conceptual framework that integrates biological evolution of language learners and cultural evolution of communal language among language users. Piggybacking on a well-attested agent-based model on the origin of simple lexicon and constituent word order out of holistic utterances, we demonstrate: (1) once adopted by early hominins to handle preliminary linguistic materials, along with the origin of an evolving communal language having a high mutual understandability among language users, the initially low levels of the two cognitive abilities are boosted and get ratcheted at sufficiently high levels in language users for proficient language learning and use; (2) the socio-cultural environment is indispensable for the coevolution, and natural selection (selecting highly understandable adults to produce offspring), not cultural selection (choosing highly understandable adults to teach offspring), drives the coevolution. This work modifies existing models and theories of coevolution between language and human cognition and clarifies theoretical controversies regarding the roles of natural and cultural selections on language evolution.","PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2022-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48347890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correcting a bias in TIGER rates resulting from high amounts of invariant and singleton cognate sets","authors":"Johann-Mattis List","doi":"10.1093/jole/lzab007","DOIUrl":"https://doi.org/10.1093/jole/lzab007","url":null,"abstract":"In a recent issue of the Journal of Language Evolution, Syrjänen et al. (2021) investigate the suitability of computing Cummins and McInerney’s (2011) TIGER rates for estimating the tree-likeness of linguistic datasets compiled for phylogenetic reconstruction. The authors test the TIGER rates on a diverse sample of simulated data, which by and large confirms the usefulness of TIGER rates as an analytic tool for investigating linguistic data, but they test them only on one real-world dataset of Uralic languages which turns out to behave quite differently from the simulated data. When testing the TIGER rates on additional datasets, I detected a bias in the computation which leads to an unnatural increase in those cases where a dataset contains many characters with invariant or singleton states. To overcome this problem, I suggest a modified variant of TIGER rates, which is provided in the form of a freely available Python package. Testing the modified TIGER scores on the simulated data of Syrjänen et al. shows that the corrected TIGER rates still readily distinguish between different degrees of tree-likeness. Testing them on a dataset in which the number of singletons and invariants was artificially increased further shows that the corrected TIGER rates are not influenced by the bias. A final test on seven linguistic datasets shows the usefulness of the corrected TIGER rates on a larger variety of linguistic datasets and illustrates the importance of taking specific aspects of linguistic data into account when using biological methods in the domain of language evolution.","PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2022-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43164472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Crouching TIGER, hidden structure: Exploring the nature of linguistic data using TIGER values","authors":"K. Syrjänen, L. Maurits, Unni Leino, T. Honkola, J. Rota, O. Vesakoski","doi":"10.1093/jole/lzab004","DOIUrl":"https://doi.org/10.1093/jole/lzab004","url":null,"abstract":"In recent years, techniques such as Bayesian inference of phylogeny have become a standard part of the quantitative linguistic toolkit. While these tools successfully model the tree-like component of a linguistic dataset, real-world datasets generally include a combination of tree-like and nontree-like signals. Alongside developing techniques for modeling nontree-like data, an important requirement for future quantitative work is to build a principled understanding of this structural complexity of linguistic datasets. Some techniques exist for exploring the general structure of a linguistic dataset, such as NeighborNets, δ scores, and Q-residuals; however, these methods are not without limitations or drawbacks. In general, the question of what kinds of historical structure a linguistic dataset can contain and how these might be detected or measured remains critically underexplored from an objective, quantitative perspective. In this article, we propose TIGER values, a metric that estimates the internal consistency of a genetic dataset, as an additional metric for assessing how tree-like a linguistic dataset is. We use TIGER values to explore simulated language data ranging from very tree-like to completely unstructured, and also use them to analyze a cognate-coded basic vocabulary dataset of Uralic languages. As a point of comparison for the TIGER values, we also explore the same data using δ scores, Q-residuals, and NeighborNets. Our results suggest that TIGER values are capable of both ranking tree-like datasets according to their degree of treelikeness, as well as distinguishing datasets with tree-like structure from datasets with a nontree-like structure. Consequently, we argue that TIGER values serve as a useful metric for measuring the historical heterogeneity of datasets. Our results also highlight the complexities in measuring treelikeness from linguistic data, and how the metrics approach this question from different perspectives.","PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2021-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46292851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The philosophical interpretation of language game theory","authors":"Nick Zangwill","doi":"10.1093/jole/lzab003","DOIUrl":"https://doi.org/10.1093/jole/lzab003","url":null,"abstract":"I give an informal presentation of the evolutionary game theoretic approach to the conventions that constitute linguistic meaning. The aim is to give a philosophical interpretation of the project, which accounts for the role of game theoretic mathematics in explaining linguistic phenomena. I articulate the main virtue of this sort of account, which is its psychological economy, and I point to the causal mechanisms that are the ground of the application of evolutionary game theory to linguistic phenomena. Lastly, I consider the objection that the account cannot explain predication, logic, and compositionality.","PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2021-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46907149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian phylogenetic analysis of linguistic data using BEAST","authors":"Konstantin Hoffmann, R. Bouckaert, Simon J. Greenhill, D. Kühnert","doi":"10.1093/jole/lzab005","DOIUrl":"https://doi.org/10.1093/jole/lzab005","url":null,"abstract":"Bayesian phylogenetic methods provide a set of tools to efficiently evaluate large linguistic datasets by reconstructing phylogenies (family trees) that represent the history of language families. These methods provide a powerful way to test hypotheses about prehistory, regarding the subgrouping, origins, expansion, and timing of the languages and their speakers. Through phylogenetics, we gain insights into the process of language evolution in general and into how fast individual features change in particular. This article introduces Bayesian phylogenetics as applied to languages. We describe substitution models for cognate evolution, molecular clock models for the evolutionary rate along the branches of a tree, and tree generating processes suitable for linguistic data. We explain how to find the best-suited model using path sampling or nested sampling. The theoretical background of these models is supplemented by a practical tutorial describing how to set up a Bayesian phylogenetic analysis using the software tool BEAST2.","PeriodicalId":37118,"journal":{"name":"Journal of Language Evolution","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2021-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44792974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}