T. Honkola, Jenni Santaharju, K. Syrjänen, K. Pajusalu
Clustering Lexical Variation of Finnic Languages Based on Atlas Linguarum Fennicarum
Linguistica Uralica. Published 2019-09-28. DOI: 10.3176/LU.2019.3.01 (https://doi.org/10.3176/LU.2019.3.01)
Abstract
The article focuses on lexical relations of the Finnic languages. Here we studied whether lexical data is suitable for detecting the coarse-grained and fine-grained substructure within the Finnic group. We evaluated this by clustering old lexical variation from a dialectal dataset covering the whole Finnic speaker area (Atlas Linguarum Fennicarum; ALFE) using quantitative methods adopted from population genetics, and by comparing our results to groups suggested by earlier linguistic literature. We found that the main lexical division runs between north-eastern and south-western Finnic. According to our lexical analysis, the Finnic languages are Finnish, North Estonian, South Estonian, Livonian, Karelian, Veps, and Votic-Ingrian. These groups matched well with the earlier suggested divisions, and we concluded that lexical data could be utilised more often in defining linguistic sub-structures, especially in linguistic situations that involve dialect continua.