Corpora: latest articles, page 8

Front matter

IF 0.5

Corpora Pub Date : 2020-11-01 DOI: 10.3366/cor.2020.0198

Citations: 0

Multilingualism in Greater Poland court records (1386–1448): tagging discourse boundaries and code-switching

IF 0.5

Corpora Pub Date : 2020-11-01 DOI: 10.3366/cor.2020.0200

M. Włodarczyk, J. Kopaczyk, M. Kozák

Abstract: This paper introduces the Electronic Repository of Greater Poland Oaths, eROThA (1386–1446), a digitisation project of a diplomatic edition of mediaeval land court oaths recorded in Latin and Old Polish, resulting in a small, lightly tagged specialised bilingual corpus. We present the background, aims, design and methodology of the project. We also discuss the problems and limitations entrenched in turning a printed diplomatic edition into a machine-readable diplomatic edition equipped with a new interpretative layer that is sensitive to the switches between Latin and Old Polish. In addition to the automatic annotation of code-switched items on the basis of typographic characteristics of the printed edition, flexible coding of recurrent language and discourse boundary phenomena has been introduced manually to account for linguistically ambiguous or neutral forms. The project offers a fully multilingual corpus, as well as customised Polish-only and Latin-only datasets, and enables filtered metadata searches in the online front-end. Overall, the report presents a methodology for constructing multilingual corpora in the context of legal cultures in medieval Central Europe that may be extrapolated to datasets originating in other periods and regions.

Citations: 1

A historical characterisation of American and Brazilian cultures based on lexical representations

IF 0.5

Corpora Pub Date : 2020-08-27 DOI: 10.3366/cor.2020.0194

Tony Berber Sardinha

Citations: 2

Calculating and displaying key labels: the texts, sections, authors and neighbourhoods where words and collocations are likely to be prominent

IF 0.5

Corpora Pub Date : 2020-08-27 DOI: 10.3366/cor.2020.0193

Stephen Jeaco

Citations: 1

Review: McIntyre and Walker. 2019. Corpus Stylistics