Corpora: Latest Articles

Introducing the Swedish Learner English Corpus: a corpus that enables investigations of the impact of extramural activities on L2 writing
IF 0.5
Corpora Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0296
Henrik Kaatari, Ying Wang, Tove Larsson
Abstract: This paper introduces the Swedish Learner English Corpus (SLEC), which consists of argumentative texts in English that are written by Swedish junior and senior high school students. SLEC includes rich metadata, enabling empirical studies of various extra-linguistic variables. Most noteworthy is the inclusion of detailed information on students’ extramural English activities (EE), such as reading, watching, conversing, gaming and engaging in social media in English. In addition, a sub-set of texts from SLEC have been assessed for proficiency using the Common European Framework of Reference for Languages (CEFR). This paper provides an overview of the corpus compilation process, the metadata, and the available versions of SLEC. Researchers, teachers and students can access this resource to investigate various aspects of second language use and development, such as the impact of extramural language activities on linguistic complexity.
Citations: 1
Introducing the Single Player Offline Game Corpus (SPOC): a corpus of seven registers from digital role-playing games
IF 0.5
Corpora Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0300
Daniel H. Dixon
Abstract: This paper describes the compilation and design of the Single Player Offline Game Corpus (SPOC), which is being made freely available for research and educational purposes. The SPOC was compiled by extracting the localisation files from the digital directories of four popular commercial digital role-playing games: Divinity: Original Sin II, Fallout 4, The Elder Scrolls V: Skyrim, and The Witcher 3: Wild Hunt. The 3.7 million word corpus contains more than 30,000 texts and is unique compared with other game corpora in that it has the following three characteristics: (1) the texts are categorised into seven registers using Biber and Conrad’s (2019) register framework, (2) texts are systematically parsed into the smallest meaningful units of observation, and (3) all texts were compiled from the data files of the games themselves. Nearly all language use in the four games is accounted for and parsed into register categories based on their underlying situational characteristics – in particular, the communicative purposes and the associated contexts in which the texts appear in the games.
Citations: 0
The Video Game Dialogue Corpus
IF 0.5
Corpora Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0299
Stephanie Rennick, Seán Roberts
Abstract: This paper presents the Video Game Dialogue Corpus, the first large-scale, consistently coded, open source corpus of dialogue from video games. It contains over 6.2 million words of English dialogue from fifty games in the Role Playing Game (RPG) genre. This includes games produced between 1985 and 2020, rated for children, teenagers and adults, and in both ‘Western’ and ‘Japanese’ sub-genres. The corpus design is described, including custom data formats for representing branching dialogue. We demonstrate the use of the corpus by comparing the dialogue of female and male characters, where we find reflections of gendered language in other media as well as patterns that seem specific to video games. We provide the source code for a ‘self-inflating corpus’ – a pipeline that obtains the data then processes and parses it into a standard format. This makes the corpus available for teaching and research purposes, providing the first such resource for empirical analysis of video game dialogue.
Citations: 2
Developing a multimodal corpus of L2 academic English from an English medium of instruction university in China
IF 0.5
Corpora Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0295
Yu-Hua Chen, Simon Harrison, Michaël Stevens, Qianqian Zhou
Abstract: This paper describes the rationale for and design of a new multimodal corpus of L2 academic English from a Sino-British university in China: the Corpus of Chinese Academic Written and Spoken English (CAWSE). The unique context for this corpus provides language samples from Chinese students who use English as a second language (L2) in a preliminary-year programme, which prepares students for academic studies at university level, at a campus where English is used as the Medium of Instruction (EMI). Data were collected from a variety of settings, including written (i.e., exam scripts and essays) and spoken assessments (i.e., interviews and presentations), covering the full range of grades awarded to those language samples, as well as from student group interactions during teaching and learning activities. The multimodal nature of the corpus is realised through the availability of selected audio/video recordings accompanied by the orthographically transcribed text. This open-access corpus is designed to help shed light on Chinese students' academic L2 English language use in a variety of written, spoken and multimodal discourses.
Citations: 0
Review: Barth and Schnell. 2022. Understanding Corpus Linguistics. New York: Routledge
IF 0.5
Corpora Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0302
Mohsen Shirazizadeh, Narges Moeini
Citations: 0
Exploring part of speech (POS) tag sequences in a large-scale learner corpus of L2 English: a developmental perspective
IF 0.5
Corpora Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0297
Joyce Dong Ok Lim, Geraldine Mark, P. Pérez-Paredes, Anne O’Keeffe
Abstract: This research explores the POS tag sequences that shape the transition from upper intermediate (B2 CEFR) to near-native proficiency (C2 CEFR) in a corpus of essays (n = 32,410) from the Cambridge Learner Corpus. Gilquin (2018) and others have shown that POS tag sequences offer a holistic approach to extracting the most commonly used patterns without a starting point of an a priori set of words and word sequences. Using corpus linguistics informed by usage-based theories of language learning, this paper examines the frequency and distribution of 4-slot POS-tag sequences in L2 English writing, drawing on the taxonomy of pattern grammar (Francis et al., 1996, 1998; and Hunston and Francis, 2000). Findings point to the presence of both core and emergent POS-tag sequences in learner language in the two proficiency levels analysed. These sequences point to the presence of dynamic language restructuring processes as learners become more proficient and re-evaluate their understanding of frequency and distribution in English. This paper shows evidence of how language competence increases with proficiency. The research offers new evidence in our understanding of the development of L2 writing in EFL contexts.
Citations: 1
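Illustration: the abstract above refers to counting the frequency of 4-slot POS-tag sequences. As a rough sketch only, and not the authors' actual pipeline, the basic counting step might look like the following. The input format (sentences as lists of word/tag pairs), the tag labels, the function names `pos_ngrams` and `per_thousand`, and the toy data are all assumptions made for illustration; a real study would obtain tags from a tagger and a real learner corpus.

```python
from collections import Counter


def pos_ngrams(tagged_sentences, n=4):
    """Count n-slot POS-tag sequences without crossing sentence boundaries.

    `tagged_sentences` is assumed to be a list of sentences, each a list of
    (word, tag) pairs. This input format is a hypothetical convenience.
    """
    counts = Counter()
    for sentence in tagged_sentences:
        tags = [tag for _word, tag in sentence]
        # Slide a window of width n over the tag list of this sentence only.
        for i in range(len(tags) - n + 1):
            counts[tuple(tags[i:i + n])] += 1
    return counts


def per_thousand(counts):
    """Normalise raw counts to a rate per 1,000 sequences."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: 1000 * c / total for seq, c in counts.items()}


# Hypothetical, hand-tagged toy sample (not taken from any real corpus).
b2_sample = [
    [("the", "DET"), ("use", "NOUN"), ("of", "ADP"),
     ("the", "DET"), ("internet", "NOUN")],
    [("it", "PRON"), ("is", "VERB"), ("good", "ADJ")],  # too short for a 4-gram
]

b2_counts = pos_ngrams(b2_sample)
b2_rates = per_thousand(b2_counts)
for sequence, rate in sorted(b2_rates.items()):
    print(" ".join(sequence), rate)
```

Normalising to a rate rather than comparing raw counts is one common way to compare sub-corpora of different sizes, such as two proficiency levels; whether the study used this particular normalisation is not stated in the abstract.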
Triangulating visual and textual corpus-assisted discourse analysis to study social actor representations: the case of Saudi women in the British and Saudi news media
IF 0.5
Corpora Pub Date : 2024-04-01 DOI: 10.3366/cor.2024.0298
Dina Sibai, Sylvia Jaworska
Abstract: Investigations of social actor representations across media present a large and important body of research in corpus-assisted discourse studies (CADS). However, most studies focus exclusively on one mode, the text, whilst other modes of communication (for example, visuals) are either considered partially or not at all. Whilst insights from textual analyses are invaluable in revealing salient and nuanced patterns of social actor representations in the media, visual accompaniments can reinforce particular ‘angles’, creating lasting perceptions for readers and viewers. Though some approaches exist to study considerable numbers of images, visual media data can be complex, rendering them difficult to study alongside textual CADS. This paper uses a triangulation of visual and textual CADS analysis to explore social actor representations in media texts and images. It does so by focussing on the representations of Saudi women in the UK and Saudi news media within the context of evolving women’s rights in Saudi Arabia. The study shows how such triangulation can be conducted in a doable and systematic way and how it can enrich CADS research on discursive representations of social actors across contexts.
Citations: 0
Review: Islentyeva. 2020. Corpus-based Analysis of Ideological Bias: Migration in the British Press. London: Routledge
IF 0.5
Corpora Pub Date : 2023-08-01 DOI: 10.3366/cor.2023.0285
A. Black
Citations: 0
Twenty-first century ideological discourses about US migrant education that transcend registers
IF 0.5
Corpora Pub Date : 2023-08-01 DOI: 10.3366/cor.2023.0280
Shannon Fitzsimmons‐Doolan
Abstract: Widely distributed and often repeated discursive patterns which represent migrants can influence the education of migrant students (Calavita, 1996; Santa Ana, 2002; Cutler, 2017; and Dabach et al., 2017). Ideological discourses (e.g., ‘immigrants are threats’) are particularly potent structures that mediate language, cognition and social life. Whilst there has been a recent increase in studies of texts on the topic of migration generally, there are few that focus on the intersection of migration and education or on discursive patterns that transcend registers. This study introduces a multi-dimensional analysis approach for the identification of ideological discourses from a 9 million-word corpus of twenty-first century, US texts about migrant education from multiple registers (online comments, national and regional newspaper texts, and federal and state government webpages) using the distribution of lexical variables that characterise variants of migrant/migration. Eleven ideological discourses (e.g., ‘US immigration policies are problematic, but there is no consensus for solutions’) were found. Of these, several had not been previously identified, one confirmed a previously identified discourse, and several complemented and extended previously identified discursive patterns on this topic. Together, these findings reveal the highly naturalised ideologically discursive landscape that shapes educational opportunities for US migrant students.
Citations: 0
Towards increased reliability and transparency in projects with manual linguistic coding
IF 0.5
Corpora Pub Date : 2023-08-01 DOI: 10.3366/cor.2023.0284
Nicole Hober, Tülay Dixon, Tove Larsson
Abstract: Manually coded data form the basis of many of our analyses in corpus linguistics. It is thus imperative that we work towards increased reliability and enhanced transparency in our coding practices, since failing to do so may ultimately lead us to draw erroneous conclusions about language. Using spoken data from a study on adverb usage for illustration, this methods paper discusses some strategies for identifying threats to the reliability of our coding and offers suggestions for how to mitigate these and ensure that our coding can be assessed and replicated. The paper also includes suggestions for best practices for manual linguistic coding and concludes with a discussion of the benefits of such practices. With this paper, we expand on the ongoing discussions in the field on issues of reliability and transparency as they relate to manual coding. We argue that while tests of inter-rater reliability offer a helpful starting point, further steps are needed to ensure increased reliability and transparency.
Citations: 1
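Illustration: the abstract above mentions tests of inter-rater reliability as a starting point for manual coding. One widely used statistic of this kind is Cohen's kappa, which corrects the raw agreement between two coders for the agreement expected by chance. The abstract does not say which statistic the authors used, so the sketch below is only an example of the general idea; the function name `cohens_kappa`, the adverb category labels and the codings are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same items in order."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty item list.")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal label proportions.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codings of eight adverb tokens (labels invented for illustration).
coder_a = ["manner", "manner", "degree", "degree",
           "stance", "manner", "degree", "stance"]
coder_b = ["manner", "degree", "degree", "degree",
           "stance", "manner", "degree", "manner"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

In this toy data the coders agree on six of eight items, but kappa is lower than the raw agreement of 0.75 because some of that agreement would be expected by chance. As the abstract argues, a single coefficient like this does not by itself make the coding transparent or replicable.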