Vocabulary in digital science resources for middle school learners

Rebeca Arndt
Journal: Applied Corpus Linguistics
Publication type: Journal Article
Publication date: 2022-12-01
DOI: 10.1016/j.acorp.2022.100023
URL: https://www.sciencedirect.com/science/article/pii/S2666799122000089
Citations: 0

Abstract


This corpus-based study examined the vocabulary in a 2.7-million-token corpus of digital science resources for middle school (grades 6–8) students in the United States. The findings show that to reach the 95% and 98% lexical coverage thresholds of the Digital Science Corpus (DSC), which are conventionally deemed to facilitate minimal and optimal reading comprehension (Laufer, 2020), middle school (MS) students must recognize the first 6,000 and 14,000 most frequent word families in the BNC/COCA (Nation, 2012), respectively, plus proper nouns and marginal words. The lexical analysis across the three sub-corpora in the DSC suggests that the Life Science sub-corpus has a considerably larger vocabulary load than the Physical Science and Earth and Space Science sub-corpora. Additionally, while 98.60% of the most frequent 1,000 BNC/COCA word families occurred at least six times in the DSC, the 2,000–7,000 BNC/COCA word families provided significantly fewer opportunities for repeated occurrence. Since more than half of the words in the 5,000–7,000 BNC/COCA bands occurred five times or fewer in the overall corpus, most words in these bands are not frequent enough in the digital science resources for MS students to learn them incidentally through reading. Several pedagogically relevant suggestions for middle school science teachers are discussed.
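
The two measures the abstract relies on, cumulative lexical coverage by frequency band and the share of a band's items that recur at least six times, can be illustrated with a minimal Python sketch. This is not the study's procedure or tooling: the function names, the `band_of` lookup, and the toy data are all hypothetical, and plain tokens stand in for BNC/COCA word families.

```python
from collections import Counter


def cumulative_coverage(tokens, band_of):
    """Return [(band, cumulative % of tokens covered)] in ascending band order.

    tokens  : list of word tokens from a corpus
    band_of : hypothetical dict mapping a word to its frequency band
              (1 = first 1,000 items, 2 = second 1,000, ...)
    Tokens that are missing from band_of count as uncovered.
    """
    total = len(tokens)
    per_band = Counter(band_of[t] for t in tokens if t in band_of)
    running, out = 0, []
    for band in sorted(per_band):
        running += per_band[band]
        out.append((band, 100.0 * running / total))
    return out


def bands_needed(coverage, threshold):
    """First band whose cumulative coverage meets the threshold, else None."""
    for band, pct in coverage:
        if pct >= threshold:
            return band
    return None


def share_repeated(tokens, band_of, band, min_occurrences=6):
    """Fraction of all items listed in a band that occur at least
    min_occurrences times in the corpus."""
    members = [w for w, b in band_of.items() if b == band]
    if not members:
        return 0.0
    counts = Counter(tokens)
    repeated = sum(1 for w in members if counts[w] >= min_occurrences)
    return repeated / len(members)


# Toy data, invented purely for illustration.
toy_band = {"the": 1, "cell": 1, "energy": 2, "mitosis": 3}
toy_tokens = ["the"] * 12 + ["cell"] * 6 + ["energy"] + ["mitosis"]

coverage = cumulative_coverage(toy_tokens, toy_band)
band_for_95 = bands_needed(coverage, 95.0)
band_for_98 = bands_needed(coverage, 98.0)
repeated_band_1 = share_repeated(toy_tokens, toy_band, band=1)
repeated_band_2 = share_repeated(toy_tokens, toy_band, band=2)
```

In the toy corpus, band 1 alone covers 90% of the tokens, so the 95% threshold is only met once band 2 is included and the 98% threshold requires band 3. The same logic, applied to word families rather than raw tokens, underlies statements such as "the first 6,000 word families reach 95% coverage".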

Source journal: Applied Corpus Linguistics
Subject area: Linguistics and Language
CiteScore: 1.30
Self-citation rate: 0.00%
Articles published: 0
Review time: 70 days