Applied Corpus Linguistics: Latest Articles

Replication as a means of assessing corpus representativeness and the generalizability of specialized word lists
Applied Corpus Linguistics Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100027
Don Miller
Abstract: Considerable energy has gone into designing lists of words that are salient in discourse domains of varying breadth. Over the past two decades, most efforts in designing and validating corpus-based frequency lists have focused on three areas: corpus compilation, item selection criteria, and coverage-based demonstrations of list robustness. As a result, modern corpora are now often much larger and better balanced; the application of additional dispersion statistics allows for better targeting of items with desired distributions; and contemporary lexical frequency lists are proving increasingly efficient, providing ever higher coverage of target texts or achieving such coverage with fewer words. However, despite these important advances, relatively little attention has been paid to word list reliability: the extent to which lists can be generalized to the wider discourse domain represented by the corpora upon which they are based.
This study begins to address this gap, demonstrating via two word list development case studies (one for Environmental Science and one for Applied Linguistics) that adding iterative reliability analysis (via methodological replication with corpora of increasing size and comparison of items on resulting lists) can be used to: 1) inform corpus design beyond what Biber (1991) terms “situational” parameters, allowing us to see whether corpora are adequately representative of lexical distributions in target discourse domains; and 2) provide valuable insight into the degree of generalizability of word lists we have developed.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666799122000120/pdfft?md5=99bdd61e7345f961aa3e0dbbbda0d186&pid=1-s2.0-S2666799122000120-main.pdf
Citations: 1
Corpus-aided EAP writing workshops to support international scholarly publication
Applied Corpus Linguistics Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100029
Ana Frankenberg-Garcia, Paula Tavares Pinto, Ana Eliza Pereira Bocorny, Simone Sarmento
Abstract: Writing for international scholarly publication is hard, and arguably harder for researchers with English as an additional language. English teachers could help them, but most teachers have little or no experience of research writing or the specialized languages researchers use. This study trialled and evaluated workshops for Brazilian researchers and English teachers learning together to use corpora and corpus-based tools to develop autonomy in writing and teaching academic English writing for scholarly publication.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666799122000144/pdfft?md5=fa1c82c2ee110a621abaa295dc402598&pid=1-s2.0-S2666799122000144-main.pdf
Citations: 1
Review of Durrant, Brenchley, and McCallum (2021) Understanding development and proficiency in writing: Quantitative corpus linguistic approaches
Applied Corpus Linguistics Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100024
Ashleigh Cox
Citations: 1
Usable Amharic text corpus for natural language processing applications
Applied Corpus Linguistics Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100033
Michael Melese Woldeyohannis, Million Meshesha
Abstract: In this paper, we describe the preparation of a usable Amharic text corpus for different Natural Language Processing (NLP) applications. Natural language applications, such as document classification, topic modeling, machine translation, speech recognition, and others, suffer greatly from a lack of digital resources. This is especially true for Amharic, a resource-constrained, morphologically rich, and complex language. In response, a total of 67,739 Amharic news documents in 8 different categories were collected from online sources. The collected corpus passes through a number of pre-processing steps, including data cleaning, text normalization, and punctuation correction. To validate the usability of the collected corpora from different domains, a baseline document classification experiment was conducted. Experimental results show that 84.53% accuracy is achieved using deep learning in the absence of linguistic information. The findings indicate that it is possible to use the prepared corpora for different natural language applications in the absence of linguistic resources such as a stemmer and dictionary, despite the complexity of the Amharic language.
We are further working towards Amharic news document classification by incorporating linguistically independent stop-word detection, stemming, and unsupervised morphological segmentation of Amharic documents.
Citations: 0
Teaching, learning, and researching with corpora
Applied Corpus Linguistics Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100025
Tove Larsson, Shelley Staples, Jesse Egbert
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666799122000107/pdfft?md5=f51d5341aae2c12e60f6219cf05a08ee&pid=1-s2.0-S2666799122000107-main.pdf
Citations: 0
Principled pattern curation to guide data-driven learning design
Applied Corpus Linguistics Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100028
Anne O'Keeffe, Geraldine Mark
Abstract: Insights from corpus linguistics (CL) have informed language learning and materials design, among many other areas. An important nexus between CL and language learning is the use of Data-Driven Learning (DDL), which draws on the use of corpus data in the classroom and which brings opportunities for inductive language discovery.
Within the ethos of DDL, learners are encouraged to discover patterns of language and, in so doing, foster more complex cognitive processes such as making inferences. While many studies on DDL concur on the success of this approach, it is still perceived as a marginal practice. Its success so far has been largely limited to intermediate to advanced level learners in higher education settings (Boulton and Cobb 2017). This paper aims to offer guiding principles for how DDL might have wider application across all levels (not just at Intermediate and above) and to set out exemplars for their application at different levels of proficiency. Based on insights from second language acquisition (SLA) and learner corpus research (LCR), the focus of this paper will be on identifying principles for the curation of language patterns that are differentiated for stage of learning.
In particular, we are keen to build on recent and important work which looks at SLA through the lens of usage-based (UB) models (that is, models that view language as being acquired through the use of and exposure to language).
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666799122000132/pdfft?md5=f53afdebc49d6e7b54500fd05f50d11b&pid=1-s2.0-S2666799122000132-main.pdf
Citations: 3
Classification and identification level ambiguity in error annotation
Applied Corpus Linguistics Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100035
Alexandros Tantos, Nikolaos Amvrazis
Abstract: The vast majority of corpus annotation projects go through a piloting phase in which the annotation scheme is gradually shaped through iterative annotation cycles until its final version is produced and applied to the collected data. The differences in annotators’ choices are usually recorded and reflected by the ‘Inter-annotator Agreement’ (IAA), which serves as a proxy for understanding and resolving the issues raised. However, little has been reported on how to formulate a systematic approach to: (i) tracing the source of the differences in the annotators’ choices and (ii) providing attainable solutions that would considerably increase IAA. In this paper, the ‘Greek Learner Corpus II’ (GLCII), the largest online Greek learner corpus, will serve as a basis to shed light on two commonly met types of ambiguity in error annotation that are closely related to target languages in which syncretism is ubiquitous in grammar (e.g., Greek and Romanian): a classification level and an identification level ambiguity.
Citations: 1
A tutorial on norming linguistic stimuli for clinical populations
Applied Corpus Linguistics Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100022
Oliver Delgaram-Nejad, Gerasimos Chatzidamianos, Dawn Archer, Alex Bartha, Louise Robinson
Abstract: Stimuli norming (the process of controlling experimental items to minimise bias) is important for the validity of psycholinguistic experiments. Survey norming (asking large numbers of people to rate or otherwise define the items) is typically used for this purpose but requires large samples. Clinical populations are not always large, nor easy to reach. Clinical participants often have ongoing symptomatology, and some cohorts experience language and communication difficulties. We present a corpus-linguistic method suitable for clinical populations for which survey norming is difficult or inappropriate. We also include the experiment generated, which measures metaphor-creation behaviour in schizophrenia to test Cognitive Constraint Theory (CCT) in clinical and nonclinical populations (see S2.1). We describe the design rationale before outlining the design stages in tutorial form. This allows us to show readers why the approach was needed and support them to consider and respond to the challenges that we encountered. We conclude that it is easier to consider norming and design practices in parallel when experimental units are defined linguistically.
Corpus stimuli norming provides a versatile alternative when survey norming is prohibitive, especially in speech pathology.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666799122000077/pdfft?md5=40b8aaab346c1faa805c35598a6254f4&pid=1-s2.0-S2666799122000077-main.pdf
Citations: 0
Nicole Mockler (2022) Constructing Teacher Identities: How the Print Media Define and Represent Teachers and Their Work
Applied Corpus Linguistics Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100034
Jamie McKeown
Citations: 0
Comparing the situational and linguistic characteristics of first year writing and engineering writing
Applied Corpus Linguistics Pub Date : 2022-12-01 DOI: 10.1016/j.acorp.2022.100031
Shelley Staples, Ashley JoEtta
Abstract: First year writing (FYW) courses aim to prepare students for disciplinary writing. However, research suggests that FYW often fails to provide sufficient preparation for writing across genres and disciplines (Leki, 2007). A register-functional approach to corpus linguistics has elucidated key differences across disciplines and genres for both published and student academic writing (Biber and Gray, 2016; Staples et al., 2016; Staples and Reppen, 2016). To date, however, no studies have compared these features across FYW and First Year Engineering (FYE) writing.
This research uses a corpus of FYE and FYW texts developed by the authors. The subset for this study includes papers written by undergraduate students majoring in Engineering and taking FYE and FYW courses in the same semester. Technical Briefs (TB) and Design Reports (DR) were selected from the FYE corpus and Rhetorical Analysis (RA) and Research Reports (RR) from the FYW corpus. We investigated the situational context and normed frequencies of linguistic features hypothesized to show similarities and differences.
Our situational analysis shows key differences in characteristics of the RA and TB, particularly regarding audiences (clients for the TB, and instructors for the RA) and the object of analysis (advertisements for the RA and mathematical models for the TB). There were more similarities between the RR and DR, including a shared focus on a solution to a problem and the presence of both a methods and results section. Results from the linguistic analysis show the impact of the situational characteristics.
For example, conditional clauses and premodifying nouns were used at similar rates of occurrence in the DR and RR, reflecting their inclusion of research questions and their sharing detailed information about the problem and solution. Implications of these findings for teaching in these contexts will be discussed.
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666799122000168/pdfft?md5=495e055e62e32825e71ff86704ea1eec&pid=1-s2.0-S2666799122000168-main.pdf
Citations: 0