Late Latin Charter Treebank: contents and annotation

Impact factor: 0.8 · JCR: Q3 (Linguistics)
Journal: Corpora · Publication date: 2021-01-01 · DOI: 10.3366/cor.2021.0217
Timo Korkiakangas
Citations: 5

Abstract

This paper describes the construction and annotation of the Late Latin Charter Treebank, a set of three dependency treebanks (LLCT1, LLCT2 and LLCT3) which together contain 1,261 Early Medieval Latin documentary texts (i.e., original charters) written in Italy between AD 714 and 1000 (about 594,000 tokens). The paper focusses on matters which a linguistically or philologically inclined user of LLCT needs to know: the criteria on which the charters were selected, the special characteristics of the annotation types utilised, and the geographical and chronological distribution of the data. In addition to normal queries on forms, lemmas, morphology and syntax, complex philological research settings are enabled by the textual annotation layer of LLCT, which indicates abbreviated and damaged words, as well as the formulaic and non-formulaic passages of each charter.
Source journal: Corpora (Linguistics)
CiteScore: 1.70 · Self-citation rate: 0.00% · Articles published: 20