Large-scale computerized forward reconstruction yields new perspectives in French diachronic phonology

Impact factor: 0.6. Region 2 (Literature). JCR: 0, LANGUAGE & LINGUISTICS.
Diachronica. Pub Date: 2022-11-28. DOI: 10.1075/dia.20027.mar
Clayton Marr, David R. Mortensen
Citations: 1

Abstract

Large-scale computerized forward reconstruction yields new perspectives in French diachronic phonology
Traditionally, historical phonologists have relied on tedious manual derivations to sequence the sound changes that have shaped the phonological evolution of languages. However, humans are prone to errors, and cannot track thousands of parallel derivations in any efficient manner. We demonstrate computerized forward reconstruction (CFR), deriving each etymon in parallel, as a task with metrics to optimize, and as a tool which drastically facilitates inquiry. To this end we present DiaSim, an application which simulates “cascades” of diachronic developments over a language’s lexicon and provides various diagnostics for “debugging” those cascades. We test our method on a Latin-to-French reflex prediction task, using a newly compiled, publicly available dataset FLLex consisting of 1368 paired Latin and Modern French forms. We also introduce a second dataset, FLLAPS, which maps 310 reflexes from Latin through five attested intermediate stages up to Modern French, derived from Pope’s (1934) periodic development tables. We present publicly available rule cascades: the baseline BaseCLEF and BaseCLEF* cascades, based on Pope’s (1934) widely-cited view of French development, and DiaCLEF, made from incremental corrections to BaseCLEF aided by DiaSim’s diagnostics. DiaCLEF outperforms the baselines by large margins, improving raw accuracy on FLLex from 3.2% to 84.9% of etyma, with similarly large improvements for each of FLLAPS’ periods. Changes were made to build DiaCLEF considering only the baseline and DiaSim’s diagnostics, but they often independently reproduced past work in French diachronic phonology, corroborating both our procedure and past endeavors; we discuss the implications of some of our findings in detail.
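The idea of forward reconstruction described in the abstract can be illustrated with a minimal sketch: an ordered cascade of rewrite rules is applied to every etymon, each predicted reflex is compared with the attested form, and the mismatches are listed for "debugging". This is not DiaSim's rule format or interface. The rules, the forms, and the names `TOY_CASCADE`, `TOY_LEXICON`, `derive`, `accuracy`, and `mismatches` are all invented for illustration, and plain regular expressions stand in for a real phonological rule notation.

```python
import re
from typing import Dict, List, Tuple

# A rule is a (regex pattern, replacement) pair; a cascade is an ordered list of rules.
Rule = Tuple[str, str]

# Hypothetical toy cascade. These rules are invented for illustration and are
# NOT taken from BaseCLEF, DiaCLEF, or any published account of French.
TOY_CASCADE: List[Rule] = [
    (r"(?<=[aeiou])t(?=[aeiou])", "d"),  # intervocalic t > d
    (r"(?<=[aeiou])d(?=[aeiou])", ""),   # intervocalic d > zero (fed by the rule above)
    (r"a$", "e"),                        # word-final a > e
]

# Hypothetical toy lexicon: invented etymon -> invented "attested" reflex.
TOY_LEXICON: Dict[str, str] = {
    "pata": "pae",
    "kita": "kie",
    "mola": "mole",
    "tapa": "tape",
    "sada": "sade",  # deliberately not predicted by the toy cascade
}


def derive(etymon: str, cascade: List[Rule]) -> List[str]:
    """Apply each rule in order and return the form after every stage."""
    stages = [etymon]
    for pattern, replacement in cascade:
        stages.append(re.sub(pattern, replacement, stages[-1]))
    return stages


def accuracy(lexicon: Dict[str, str], cascade: List[Rule]) -> float:
    """Share of etyma whose final derived form equals the attested reflex."""
    hits = sum(derive(etymon, cascade)[-1] == reflex
               for etymon, reflex in lexicon.items())
    return hits / len(lexicon)


def mismatches(lexicon: Dict[str, str],
               cascade: List[Rule]) -> List[Tuple[str, str, str]]:
    """List (etymon, predicted, attested) for every wrong prediction."""
    errors = []
    for etymon, reflex in lexicon.items():
        predicted = derive(etymon, cascade)[-1]
        if predicted != reflex:
            errors.append((etymon, predicted, reflex))
    return errors
```

In this toy setting, `mismatches(TOY_LEXICON, TOY_CASCADE)` singles out the one form the cascade gets wrong. That output suggests a rule or ordering worth revisiting, and it is a small-scale analogue of the diagnostic loop the abstract describes, not a reproduction of it.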
Source journal: Diachronica
CiteScore: 1.60
Self-citation rate: 0.00%
Articles published: 23
About the journal: Diachronica provides a forum for the presentation and discussion of information concerning all aspects of language change in any and all languages of the globe. Contributions which combine theoretical interest and philological acumen are especially welcome. Diachronica appears three times per year, publishing articles, review articles, book reviews, and a miscellanea section including notes, reports and discussions.