Beyond the tree of texts: Building an empirical model of scribal variation through graph analysis of texts and stemmata

Lit. Linguistic Comput. Pub Date: 2013-12-01. DOI: 10.1093/llc/fqt032

T. Andrews, Caroline Macé


Citations: 40

Abstract

Stemmatology, or the reconstruction of the transmission history of texts, is a field that stands particularly to gain from digital methods. Many scholars already take stemmatic approaches that rely heavily on computational analysis of the collated text (e.g. Robinson and O’Hara 1996; Salemans 2000; Heikkila 2005; Windram et al. 2008, among many others). Although there is great value in computationally assisted stemmatology, providing as it does a reproducible result and allowing access to the relevant methodological process in related fields such as evolutionary biology, computational stemmatics is not without its critics. The current state of the art effectively forces scholars to choose between making a preconceived judgment of the significance of textual differences (the Lachmannian or neo-Lachmannian approach, and the weighted phylogenetic approach) and making no judgment at all (the unweighted phylogenetic approach). Some basis for judgment of the significance of variation is sorely needed for medieval text criticism in particular. By this, we mean that there is a need for a statistical, empirical profile of the text-genealogical significance of the different sorts of variation in different sorts of medieval texts. The rules that apply to copies of Greek and Latin classics may not apply to copies of medieval Dutch story collections; the practices of copying authoritative texts such as the Bible will most likely have been different from the practices of copying the Lives of local saints and other commonly adapted texts. It is nevertheless imperative that we have a consistent, flexible, and analytically tractable model for capturing these phenomena of transmission.

In this article, we present a computational model that captures most of the phenomena of text variation, and a method for analysis of one or more stemma hypotheses against the variation model. We apply this method to three ‘artificial traditions’ (i.e. texts copied under laboratory conditions by scholars to study the properties of text variation) and four genuine medieval traditions whose transmission history is known or deduced in varying degrees. Although our findings are necessarily limited by the small number of texts at our disposal, we demonstrate here some of the wide variety of calculations that can be made using our model. Certain of our results call sharply into question the utility of excluding ‘trivial’ variation such as orthographic and spelling changes from stemmatic analysis.



