Applied Corpus Linguistics: Latest Articles

AI-generated vs human-authored texts: A multidimensional comparison
Applied Corpus Linguistics Pub Date : 2023-12-20 DOI: 10.1016/j.acorp.2023.100083
Tony Berber Sardinha
Abstract: The goal of this study is to assess the degree of resemblance between texts generated by artificial intelligence (GPT) and (written and spoken) texts produced by human individuals in real-world settings. A comparative analysis was conducted along the five main dimensions of variation that Biber (1988) identified. The findings revealed significant disparities between AI-generated and human-authored texts, with the AI-generated texts generally failing to exhibit resemblance to their human counterparts. Furthermore, a linear discriminant analysis, performed to measure the predictive potential of dimension scores for identifying the authorship of texts, demonstrated that AI-generated texts could be identified with relative ease based on their multidimensional profile. Collectively, the results underscore the current limitations of AI text generation in emulating natural human communication. This finding counters popular fears that AI will replace humans in textual communication. Rather, our findings suggest that, at present, AI's ability to capture the intricate patterns of natural language remains limited.
Volume 4, Issue 1, Article 100083. PDF: https://www.sciencedirect.com/science/article/pii/S2666799123000436/pdfft?md5=eec63f0662cd28b0d80ac041ac33eae7&pid=1-s2.0-S2666799123000436-main.pdf
Citations: 0
Prototype-by-component analysis: A corpus-based, intensional approach to ordinary meaning in statutory interpretation
Applied Corpus Linguistics Pub Date : 2023-12-20 DOI: 10.1016/j.acorp.2023.100078
Jesse Egbert , Thomas R. Lee
Abstract: When faced with a word or phrase that is not defined in a statute, judges generally interpret the language of the law as it is likely to be understood by an ordinary user of the language. However, there is little agreement about what ordinary meaning is and how it can be determined. Proponents of corpus-based legal interpretation argue that corpora provide scientific rigor and increased validity and transparency, but there is currently no consensus on best practices for legal corpus linguistics. Our objective in this paper is to propose some refinements to the theory of ordinary meaning and corpus-based methods of analyzing it. We argue that the scope of legal language is established by conceptual (intensional) meaning, and not limited to attested referents. Yet, most current corpus-based approaches are purely referential (extensional). Therefore, we introduce a new methodology, prototype by component (PBC) analysis, in which we bring together aspects of the componential approach and prototype theory by assuming that categories are gradient entities that are characterized by gradient semantic components. We introduce the analytical steps in PBC analysis and apply them to Nix v. Hedden (1893) to determine whether tomato is a member of the category vegetable. We conclude that conceptual categories have a prototypical reality and a componential reality. As a result, attested referents in a corpus can provide insights into the conceptual meaning of terms and the degree to which concepts are members of categories.
Volume 4, Issue 1, Article 100078. PDF: https://www.sciencedirect.com/science/article/pii/S2666799123000382/pdfft?md5=f402bdd08e64a2ca946fa7003eabe040&pid=1-s2.0-S2666799123000382-main.pdf
Citations: 0
Corpus-linguistic approaches to lexical statutory meaning: Extensionalist vs. intensionalist approaches
Applied Corpus Linguistics Pub Date : 2023-12-19 DOI: 10.1016/j.acorp.2023.100079
Stefan Th. Gries, Brian G. Slocum, Kevin Tobia
Abstract: Scholars and practitioners interested in legal interpretation have become increasingly interested in corpus-linguistic methodology. Lee and Mouritsen (2018) developed and helped popularize the use of concordancing and collocate displays (of mostly COCA and COHA) to operationalize a central notion in legal interpretation, the ordinary meaning of expressions. This approach provides a good first approximation but is ultimately limited. Here, we outline an approach to ordinary meaning that is intensionalist (i.e., 'feature-based'), top-down, and informed by the notion of cue validity in prototype theory. The key advantages of this approach are that (i) it avoids the which-value-on-a-dimension problem of extensionalist approaches, (ii) it provides quantifiable prototypicality values for things whose membership status in a category is in question, and (iii) it can be extended even to cases for which no textual data are yet available. We exemplify the approach with two case studies that offer the option of utilizing survey data and/or word embeddings trained on corpora by deriving cue validities from word similarities. We exemplify this latter approach with the word vehicle on the basis of (i) an embedding model trained on 840 billion words crawled from the web, but now also with the more realistic application (in terms of corpus size and time frame) of (ii) an embedding model trained on the 1950s time slice of COHA to address the question to what degree Segways, which didn't exist in the 1950s, qualify as vehicles in this intensional approach.
Volume 4, Issue 1, Article 100079. PDF: https://www.sciencedirect.com/science/article/pii/S2666799123000394/pdfft?md5=fffa64c5cf04e01a22d462ddb9e4441e&pid=1-s2.0-S2666799123000394-main.pdf
Citations: 0
Generative AI for corpus approaches to discourse studies: A critical evaluation of ChatGPT
Applied Corpus Linguistics Pub Date : 2023-12-19 DOI: 10.1016/j.acorp.2023.100082
Niall Curry , Paul Baker , Gavin Brookes
Abstract: This paper explores the potential of generative artificial intelligence technology, specifically ChatGPT, for advancing corpus approaches to discourse studies. The contribution of artificial intelligence technologies to linguistics research has been transformational, both in the contexts of corpus linguistics and discourse analysis. However, shortcomings in the efficacy of such technologies for conducting automated qualitative analysis have limited their utility for corpus approaches to discourse studies. Acknowledging that new technologies in data analysis can replace and supplement existing approaches, and in view of the potential affordances of ChatGPT for automated qualitative analysis, this paper presents three replication case studies designed to investigate the applicability of ChatGPT for supporting automated qualitative analysis within studies using corpus approaches to discourse analysis.
The findings indicate that, generally, ChatGPT performs reasonably well when semantically categorising keywords; however, as the categorisation is based on decontextualised keywords, the categories can appear quite generic, limiting the value of such an approach for analysing corpora representing specialised genres and/or contexts. For concordance analysis, ChatGPT performs poorly, as the results include false inferences about the concordance lines and, at times, modifications of the input data. Finally, for function-to-form analysis, ChatGPT also performs poorly, as it fails to identify and analyse direct and indirect questions. Overall, the results raise questions about the affordances of ChatGPT for supporting automated qualitative analysis within corpus approaches to discourse studies, signalling issues of repeatability and replicability, ethical challenges surrounding data integrity, and the challenges associated with using non-deterministic technology for empirical linguistic research.
Volume 4, Issue 1, Article 100082. PDF: https://www.sciencedirect.com/science/article/pii/S2666799123000424/pdfft?md5=ae9708bc5113ac915574372c9ad6a9d7&pid=1-s2.0-S2666799123000424-main.pdf
Citations: 0
Erratum regarding missing Declaration of Competing Interest statements in previously published articles
Applied Corpus Linguistics Pub Date : 2023-12-01 DOI: 10.1016/j.acorp.2023.100071
Volume 3, Issue 3, Article 100071. PDF: https://www.sciencedirect.com/science/article/pii/S266679912300031X/pdfft?md5=b062715ba46158ca342b354088c8e319&pid=1-s2.0-S266679912300031X-main.pdf
Citations: 0
Exploring early L2 writing development through the lens of grammatical complexity
Applied Corpus Linguistics Pub Date : 2023-10-30 DOI: 10.1016/j.acorp.2023.100077
Tove Larsson , Tony Berber Sardinha , Bethany Gray , Douglas Biber
Abstract: The present study explores the development of grammatical complexity in L2 English writing at the beginner, lower intermediate, and upper intermediate levels to see (i) to what extent the developmental stages proposed in Biber et al. (2011) are evident in low-proficiency L2 writing, and if so, what the patterns of progression are, and (ii) whether students gradually move away from speech-like production toward more advanced written production. We use data from COBRA, a corpus of L1 Brazilian Portuguese learner production, along with BR-ICLE and BR-LINDSEI. All the data were tagged using the Biber tagger (Biber, 1988) and the Developmental Complexity tagger (Gray et al., 2019), and subsequently analyzed using a technique developed in Staples et al. (2022) to quantify developmental profiles across levels. The technique considers not only overall change in frequency across levels, but also the incremental variation across each adjacent level (based on % frequency changes). The results show that the features were infrequent overall, with a majority of both clausal and phrasal features exhibiting an increase in frequency across the levels, albeit to varying degrees. This general pattern is contrary to predictions based on findings from previous studies, which found phrasal features increasing in use and clausal features decreasing in use. Nonetheless, for the features associated with each developmental stage, the frequencies generally increased, becoming more similar to advanced written production and more dissimilar to spoken production, as hypothesized in Biber et al. (2011).
Volume 3, Issue 3, Article 100077.
Citations: 0
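The abstract above mentions tracking both the overall change in a feature's frequency and the incremental change between adjacent proficiency levels, based on percent frequency changes. The sketch below illustrates only that arithmetic. It is not the procedure from Staples et al. (2022), whose details the abstract does not give; the function names and the normalized rates are hypothetical.

```python
def adjacent_percent_changes(rates):
    """Percent change between each pair of adjacent levels.

    Assumes every rate except possibly the last is non-zero,
    since each earlier rate is used as a denominator.
    """
    return [
        (later - earlier) / earlier * 100
        for earlier, later in zip(rates, rates[1:])
    ]


def overall_percent_change(rates):
    """Percent change from the first level to the last."""
    return (rates[-1] - rates[0]) / rates[0] * 100


# Hypothetical rates per 1,000 words for one grammatical feature at the
# beginner, lower intermediate, and upper intermediate levels.
rates = [2.0, 3.0, 4.5]

print(adjacent_percent_changes(rates))  # [50.0, 50.0]
print(overall_percent_change(rates))    # 125.0
```

Reporting the adjacent-level changes alongside the overall change shows whether growth is steady or concentrated at one transition, which a single first-to-last figure would hide.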
Effective corpus use in second language learning: A meta-analytic approach
Applied Corpus Linguistics Pub Date : 2023-10-21 DOI: 10.1016/j.acorp.2023.100076
Shotaro Ueno , Osamu Takeuchi
Abstract: Data-driven learning (DDL) refers to the use of corpora by second and foreign language (L2) learners to explore and inductively discover patterns of their target language use from authentic language data without interventions from others. Although previous meta-analyses have demonstrated the positive effects of DDL on L2 learning (Boulton and Cobb, 2017), the number of empirical studies has been increasing since then. Therefore, this study included more recent studies and used meta-analyses to examine the extent to which: (1) DDL exerts an effect on L2 learning; and (2) moderator variables affect DDL's influence on L2 learning. The results demonstrated small to medium effect sizes for experimental/control group comparisons and pre/post and pre/delayed designs. Moreover, the moderator analyses found that moderator variables, such as publication types, learners’ factors, and research designs, influence the magnitude of DDL effectiveness in L2 learning.
Volume 3, Issue 3, Article 100076.
Citations: 0
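The abstract above reports effect sizes for experimental/control comparisons without naming the statistic used. As an illustration only, the sketch below computes a standardized mean difference (Cohen's d with a pooled standard deviation) and the small-sample correction that yields Hedges' g. Treating this as the study's metric is an assumption, and the group statistics are hypothetical.

```python
from math import sqrt


def cohens_d(mean_exp, sd_exp, n_exp, mean_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = (
        (n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2
    ) / (n_exp + n_ctrl - 2)
    return (mean_exp - mean_ctrl) / sqrt(pooled_var)


def hedges_g(d, n_exp, n_ctrl):
    """Apply the usual small-sample correction, 1 - 3 / (4 * df - 1),
    where df = n_exp + n_ctrl - 2."""
    correction = 1 - 3 / (4 * (n_exp + n_ctrl) - 9)
    return d * correction


# Hypothetical post-test scores: DDL group vs. control group.
d = cohens_d(mean_exp=75.0, sd_exp=10.0, n_exp=20,
             mean_ctrl=70.0, sd_ctrl=10.0, n_ctrl=20)
g = hedges_g(d, n_exp=20, n_ctrl=20)

print(round(d, 3))  # 0.5
print(round(g, 3))  # 0.49
```

A meta-analysis would then combine such per-study values with weights, a step this sketch leaves out.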
Using corpus linguistics to create tasks for teaching and assessing Aeronautical English
Applied Corpus Linguistics Pub Date : 2023-10-11 DOI: 10.1016/j.acorp.2023.100075
Aline Pacheco , Angela Carolina de Moraes Garcia , Ana Lúcia Tavares Monteiro , Malila Carvalho de Almeida Prado , Patrícia Tosqui-Lucks
Abstract: This article presents the theoretical basis for corpus linguistics applied to Aeronautical English teaching and assessment followed by practical examples on how to use corpora to develop tasks for both purposes. It originates from the design of two webinars held remotely at the end of 2020, and promoted by the International Civil Aviation English Association. The webinars were targeted at Aeronautical English teachers, material designers, and test developers with little or no previous knowledge of corpus linguistics with the aim of guiding the audience in preparing step-by-step tasks using corpora. We share the work involved in the task design suggested, bridging the gap between research and practice. We conclude by outlining limitations, and suggesting prospects for future research.
Volume 3, Issue 3, Article 100075.
Citations: 0
Lexical change and stability in 100 years of English in US newspapers
Applied Corpus Linguistics Pub Date : 2023-09-08 DOI: 10.1016/j.acorp.2023.100073
Robert Poole , Qudus Ayinde Adebayo
Abstract: This study explores diachronic variation across approximately one hundred years of the newspaper register in US American English from 1920 to 2019 as captured in the Corpus of Historical American English (Davies, 2010). Informed by a similar study of lexical change in British English (Baker, 2011), the analysis identified high-frequency words exhibiting the greatest increases and decreases in use as well as those words demonstrating stability across the four sampling periods: 1920–29, 1950–59, 1980–89, 2010–19. The process to identify words of change and stability began first with the application of a cumulative frequency threshold; coefficient of variance and Kendall's Tau correlation coefficient were then calculated to aid in identification. In other words, the process targeted high-frequency words whose use has demonstrated the greatest change or stability. The discussion presents the three resulting word lists (increasing, decreasing, stable) and reports concordance and collocation analysis of select words from each list to gain insight into the underlying factors informing lexical change and stability.
Volume 3, Issue 3, Article 100073.
Citations: 0
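The abstract above names two statistics computed per word across the four sampling periods: a coefficient of variation (the abstract's "coefficient of variance") and Kendall's Tau. The sketch below shows one plausible way to compute both with the standard library. The frequencies are hypothetical, and the choice of the population standard deviation and of the tau-a variant is an assumption, not something the abstract specifies.

```python
from itertools import combinations
from statistics import mean, pstdev


def coefficient_of_variation(freqs):
    """Population standard deviation divided by the mean."""
    return pstdev(freqs) / mean(freqs)


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Period indices stand for 1920-29, 1950-59, 1980-89, 2010-19.
periods = [1, 2, 3, 4]

# Hypothetical per-million-word frequencies for three words.
rising = [10.0, 20.0, 30.0, 40.0]
stable = [25.0, 25.0, 25.0, 25.0]
falling = [40.0, 30.0, 20.0, 10.0]

print(kendall_tau_a(periods, rising))    # 1.0
print(kendall_tau_a(periods, falling))   # -1.0
print(coefficient_of_variation(stable))  # 0.0
```

Read together, a tau near +1 or -1 points to a consistent rise or fall across periods, while a low coefficient of variation points to stability.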
Data-driven Learning Meets Generative AI: Introducing the Framework of Metacognitive Resource Use
Applied Corpus Linguistics Pub Date : 2023-09-07 DOI: 10.1016/j.acorp.2023.100074
Atsushi Mizumoto
Abstract: This paper explores the intersection of data-driven learning (DDL) and generative AI (GenAI), represented by technologies like ChatGPT, in the realm of language learning and teaching. It presents two complementary perspectives on how to integrate these approaches. The first viewpoint advocates for a blended methodology that synergizes DDL and GenAI, capitalizing on their complementary strengths while offsetting their individual limitations. The second introduces the Metacognitive Resource Use (MRU) framework, a novel paradigm that positions DDL within an expansive ecosystem of language resources, which also includes GenAI tools. Anchored in the foundational principles of metacognition, the MRU framework centers on two pivotal dimensions: metacognitive knowledge and metacognitive regulation. The paper proposes pedagogical recommendations designed to enable learners to strategically utilize a wide range of language resources, from corpora to GenAI technologies, guided by their self-awareness, the specifics of the task, and relevant strategies. The paper concludes by highlighting promising avenues for future research, notably the empirical assessment of both the integrated DDL-GenAI approach and the MRU framework.
Volume 3, Issue 3, Article 100074.
Citations: 0