Paula Rodríguez-Puente, Cristina Blanco-García, Iván Tamaredo
Title: Mark-up and Annotation in the Corpus of Historical English Law Reports (CHELAR): Potential for Historical Genre Analysis
Journal: Atlantis-Journal of the Spanish Association of Anglo-American Studies, pp. 63-84
Published: 2019-12-23
DOI: https://doi.org/10.28914/atlantis-2019-41.2.03
Citation count: 1
Abstract
Adding annotation and mark-up to linguistic corpora has become standard practice in corpus building over the past few decades, both to facilitate data extraction and to ensure that new corpora are compatible with existing and future tools. The purpose of this article is twofold. First, we provide an overview of the main forms of annotation and mark-up available to the research community and show how they have been applied to the Corpus of Historical English Law Reports 1535-1999 (CHELAR), a specialized corpus consisting of law reports, that is, records of judicial decisions. Second, we give an account of preliminary research based on the annotated versions of CHELAR, which so far has aimed primarily at identifying the distinctive linguistic characteristics of law reports and at investigating how their language has evolved over a time span of almost five centuries. Our article illustrates the multiple advantages of applying a simple annotation schema to a corpus and how this can enhance the corpus's potential for historical genre analysis.

Keywords: corpus annotation; corpus mark-up; law reports; TEI-XML; legal English