Latest articles from ICAME journal : computers in English linguistics

Supporting the corpus-based study of Shakespeare’s language: Enhancing a corpus of the First Folio
ICAME journal : computers in English linguistics Pub Date : 2021-05-01 DOI: 10.2478/icame-2021-0002
Jonathan Culpeper, A. Hardie, J. Demmen, Jennifer Hughes, Matt Timperley
Abstract: This article explores challenges in the corpus linguistic analysis of Shakespeare’s language, and Early Modern English more generally, with particular focus on elaborating possible solutions and the benefits they bring. An account of work that took place within the Encyclopedia of Shakespeare’s Language Project (2016–2019) is given, which discusses the development of the project’s data resources, specifically, the Enhanced Shakespearean Corpus. Topics covered include the composition of the corpus and its subcomponents; the structure of the XML markup; the design of the extensive character metadata; and the word-level corpus annotation, including spelling regularisation, part-of-speech tagging, lemmatisation and semantic tagging. The challenges that arise from each of these undertakings are not exclusive to a corpus-based treatment of Shakespeare’s plays but it is in the context of Shakespeare’s language that they are so severe as to seem almost insurmountable. The solutions developed for the Enhanced Shakespearean Corpus – often combining automated manipulation with manual interventions, and always principled – offer a way through.
Pages: 37-86
Citations: 3
Better data for more researchers – using the audio features of BNCweb
ICAME journal : computers in English linguistics Pub Date : 2021-05-01 DOI: 10.2478/icame-2021-0004
S. Hoffmann, Sabine Arndt-Lappe
Abstract: In spite of the wide agreement among linguists as to the significance of spoken language data, actual speech data have not formed the basis of empirical work on English as much as one would think. The present paper is intended to contribute to changing this situation, on a theoretical and on a practical level. On a theoretical level, we discuss different research traditions within (English) linguistics. Whereas speech data have become increasingly important in various linguistic disciplines, major corpora of English developed within the corpus-linguistic community, carefully sampled to be representative of language usage, are usually restricted to orthographic transcriptions of spoken language. As a result, phonological phenomena have remained conspicuously understudied within traditional corpus linguistics. At the same time, work with current speech corpora often requires a considerable level of specialist knowledge and tailor-made solutions. On a practical level, we present a new feature of BNCweb (Hoffmann et al. 2008), a user-friendly interface to the British National Corpus, which gives users access to audio and phonemic transcriptions of more than five million words of spontaneous speech. With the help of a pilot study on the variability of intrusive r we illustrate the scope of the new possibilities.
Pages: 125-154
Citations: 4
Comparing written Indian Englishes with the new Corpus of Regional Indian Newspaper Englishes (CORINNE)
ICAME journal : computers in English linguistics Pub Date : 2021-05-01 DOI: 10.2478/icame-2021-0006
Asya Yurchenko, S. Leuckert, C. Lange
Abstract: This article introduces the new Corpus of Regional Indian Newspaper Englishes (CORINNE). The current version of CORINNE contains news and other text types from regional Indian newspapers published between 2015 and 2020, covering 13 states and regions so far. The corpus complements previous corpora, such as the Indian component of the International Corpus of English (ICE) as well as the Indian section of the South Asian Varieties of English (SAVE) corpus, by giving researchers the opportunity to analyse and compare regional (written) Englishes in India. In the first sections of the paper we discuss the rationale for creating CORINNE as well as the development of the corpus. We stress the potential of CORINNE and go into detail about selection criteria for the inclusion of newspapers as well as corpus compilation and the current word count. In order to show the potential of the corpus, the paper presents a case study of ‘intrusive as’, a syntactic feature that has made its way into formal registers of Indian English. Based on two subcorpora covering newspapers from Tamil Nadu and Uttarakhand, we compare frequencies and usage patterns of call (as) and term (as). The case study lends further weight to the hypothesis that the presence or absence of a quotative in the majority language spoken in an Indian state has an impact on the frequency of ‘intrusive as’. Finally, we foreshadow the next steps in the development of CORINNE as well as potential studies that can be carried out using the corpus.
Pages: 179-205
Citations: 1
Evidentiality in gendered styles in spoken English
ICAME journal : computers in English linguistics Pub Date : 2020-03-01 DOI: 10.2478/icame-2020-0001
E. Söderqvist
Pages: 5-35
Citations: 5
Corpus linguistics and the description of English
ICAME journal : computers in English linguistics Pub Date : 2020-03-01 DOI: 10.2478/icame-2020-0006
Stefan Diemer
Pages: 105-109
Citations: 0
Applications of pattern-driven methods in corpus linguistics
ICAME journal : computers in English linguistics Pub Date : 2020-03-01 DOI: 10.2478/icame-2020-0005
C. Geisler
Pages: 102-104
Citations: 0
Corpus linguistics and African Englishes
ICAME journal : computers in English linguistics Pub Date : 2020-03-01 DOI: 10.2478/icame-2020-0004
Frederic Zähres
Pages: 97-101
Citations: 1
Issues and challenges in compiling a corpus of Early Modern English plays for comparison with those of William Shakespeare
ICAME journal : computers in English linguistics Pub Date : 2020-03-01 DOI: 10.2478/icame-2020-0002
J. Demmen
Abstract: In this article I discuss the issues and challenges of compiling a corpus of historical plays by a range of playwrights that is highly suitable for use in comparative, corpus-based research into language style in Shakespeare’s plays. In discussing sources for digitised historical play-texts and criteria for making a selection for the present study, I argue that not just any set of Early Modern English plays constitutes a suitable basis upon which to make reliable claims about language style in Shakespeare’s plays relative to those of his peers. I point out factors outside of authorial choice which potentially have bearing on language style, such as sub-genre features and change over time. I also highlight some particular difficulties in compiling a corpus of historical texts, notably dating and spelling variation, and I explain how these were addressed. The corpus detailed in this article extends the prospects for investigating Shakespeare’s language style by providing a context into which it can be set and, as I indicate, is a valuable new publicly accessible resource for future research.
Pages: 37-68
Citations: 8
There’s more to alternations than the main diagonal of a 2×2 confusion matrix: Improvements of MuPDAR and other classificatory alternation studies
ICAME journal : computers in English linguistics Pub Date : 2020-03-01 DOI: 10.2478/icame-2020-0003
S. Gries, Sandra C. Deshors
Abstract: Corpus-based studies of learner language and (especially) English varieties have become more quantitative in nature and increasingly use regression-based methods and classifiers such as classification trees, random forests, etc. One recent development more widely used is the MuPDAR (Multifactorial Prediction and Deviation Analysis using Regressions) approach of Gries and Deshors (2014) and Gries and Adelman (2014). This approach attempts to improve on traditional regression- or tree-based approaches by, firstly, training a model on the reference speakers (often native speakers (NS) in learner corpus studies or British English speakers in variety studies), then, secondly, using this model to predict what such a reference speaker would produce in the situation the target speaker is in (often non-native speakers (NNS) or indigenized-variety speakers). Crucially, the third step then consists of determining whether the target speakers made a canonical choice or not and exploring that variability with a second regression model or classifier. Both regression-based modeling in general and MuPDAR in particular have led to many interesting results, but we want to propose two changes in perspective on the results they produce. First, we want to focus attention on the middle ground of the prediction space, i.e. the predictions of a regression/classifier that, essentially, are made non-confidently and translate into a statement such as ‘in this context, both/all alternants would be fine’. Second, we want to make a plug for a greater attention to misclassifications/-predictions and propose a method to identify those as well as discuss what we can learn from studying them. We exemplify our two suggestions based on a brief case study, namely the dative alternation in native and learner corpus data.
Pages: 69-96
Citations: 3
Early Modern Multiloquent Authors (EMMA): Designing a large-scale corpus of individuals’ languages
ICAME journal : computers in English linguistics Pub Date : 2019-03-01 DOI: 10.2478/icame-2019-0004
P. Petré, Lynn Anthonissen, Sara Budts, Enrique Manjavacas, Emma-Louise Silva, William H. Standing, Odile A. O. Strik
Abstract: The present article provides a detailed description of the corpus of Early Modern Multiloquent Authors (EMMA), as well as two small case studies that illustrate its benefits. As a large-scale specialized corpus, EMMA tries to strike the right balance between big data and sociolinguistic coverage. It comprises the writings of 50 carefully selected authors across five generations, mostly taken from 17th-century London society. EMMA enables the study of language as both a social and cognitive phenomenon and allows us to explore the interaction between the individual and aggregate levels. The first part of the article is a detailed description of EMMA’s first release as well as the sociolinguistic and methodological principles that underlie its design and compilation. We cover the conceptual decisions and practical implementations at various stages of the compilation process: from text-markup, encoding and data preprocessing to metadata enrichment and verification. In the second part, we present two small case studies to illustrate how rich contextualization can guide the interpretation of quantitative corpus-linguistic findings. The first case study compares the past tense formation of strong verbs in writers without access to higher education to that of writers with an extensive training in Latin. The second case study relates s/th-variation in the language of a single writer, Margaret Cavendish, to major shifts in her personal life.
Pages: 83-122
Citations: 21