Peculiarities of Avestan Manuscripts for Computational Linguistics
Thomas Jügel
J. Lang. Technol. Comput. Linguistics, published 2012-07-01
DOI: 10.21248/jlcl.27.2012.161 (https://doi.org/10.21248/jlcl.27.2012.161)
Citation count: 1
Abstract
This paper discusses several computational tools for creating a stemma of Avestan manuscripts: a letter similarity matrix, a morphological expander, and co-occurrence networks. After a short introduction to Avestan and the Avestan manuscripts and a presentation of the Avestan peculiarities that bear on the creation of stemmata, the operability of these tools for this text corpus is discussed. Finally, I give a brief outlook on the complexity of a database structure for Avestan texts.

Introduction
The Avesta, as represented by the edition of GELDNER (1886-96), appears to be a sort of Bible containing several books or chapters, cf. SKJÆRVØ’s “sacred book of the Zoroastrians” (2009: 44); and, indeed, in Middle Iranian times (i.e., before 600 AD) there existed a kind of text corpus, rather than ‘a book’, of holy texts (CANTERA 2004). However, GELDNER’s edition disguises the actual texts of the manuscripts, because what we have today is not a book but a collection of ceremonies attested in various manuscripts.

Avestan is the term for an Old Iranian language and, as such, a member of the Indo-European language family. The actual name of the language is not known to us. The name ‘Avestan’ is taken from Middle Persian texts, which refer to their religious text corpus as the “abestā(g)”. When manuscripts containing these religious texts came to light for European research, they were referred to as “Avesta” and the language as “Avestan”.

Avestan is known to us in two varieties, called “Old Avestan” and “Young Avestan”, because they display two different chronological layers of Avestan. However, they also differ in some linguistic respects, so that they represent two different dialects of the same language (e.g., the genitive singular of xratu- “wisdom” is xratə̄uš in Old Avestan but xraθβō in Young Avestan; for further examples see DE VAAN 2003: 8ff.).

The Avestan manuscripts (henceforth MS) can be sorted into several groups. The main grouping is: 1) the ‘Pahlavi-MSs’, and 2) the ‘Sade-MSs’. The Pahlavi-MSs contain the Avestan text plus its translation and commentaries, generally in Middle Persian, but there are translations into Sanskrit, Gujarati and/or New Persian as well. The Sade-MSs (i.e., the “pure” MSs) contain, besides the Avestan text, only ritual instructions in Middle Persian, etc. The Pahlavi-MSs served as exegetical texts written for scholarly use only. The Sade-MSs, by contrast, were for daily use in the ceremonies. These different purposes had an influence on the copying process (cf. Section 1).

The aforementioned grouping can be made at first glance at an MS because of the various writings these MSs do or do not contain. Besides the grouping into Pahlavi- and Sade-MSs, the MSs are further classified according to different ceremonies. There are four of them: the Yasna Rapihwin, Vīsprad, Yašt, and Vīdēvdād ceremony. Depending on the season or on the deity who is invoked, there are further differences in what is otherwise the same