Frequency, Dispersion and Abstractness in the Lexical Sophistication Analysis of A Learner-Based Word Bank: Dimensionality Reduction and Identification

Impact factor: 0.7; Zone 2 (Literature); JCR category: LANGUAGE & LINGUISTICS
H. Zhang, Yuting Han, Xingzi Zhang, Liuran Cui
Journal of Quantitative Linguistics, vol. 29, no. 1, pp. 195-211. Journal article, published 2020-07-09. DOI: 10.1080/09296174.2020.1782716 (https://doi.org/10.1080/09296174.2020.1782716)
Citations: 1

Abstract

The current study incorporated a number of lexical sophistication indices, including the frequency, dispersion and abstractness of words. A learner-based word bank (comprising a Chinese middle-school vocabulary list, a Chinese high-school vocabulary list and a Chinese college-English-test vocabulary list) was manually coded based on two existing corpora: the Corpus of Contemporary American English (COCA) and the British National Corpus (BNC). Indices of frequency, dispersion and abstractness of the word bank were analysed to shed light on the predetermined categorization of lexical sophistication among second language learners. A principal component analysis showed that dispersion was a unique factor loaded on all eight entered variables, while word frequency and abstractness were extracted by the same factor in the learner-based word bank. Moreover, a follow-up MANOVA with post hoc comparisons showed that the lexical sophistication indices in general produced pronounced differences among the three levels of word lists. More critically, dispersion was found to be the only significant indicator differentiating the three levels of word lists. The discussion centred on the uniqueness of dispersion in lexical sophistication and the shared algorithm in frequency and abstractness.
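The abstract does not say which dispersion statistic was used. As a minimal sketch of how dispersion differs from raw frequency, the example below assumes Juilland's D computed over equally sized corpus parts; that choice, the function name `juilland_d`, and the counts are illustrative assumptions, not the authors' procedure or data.

```python
import math
import statistics


def juilland_d(part_counts):
    """Juilland's D for one word, given its counts in equally sized corpus parts.

    D = 1 - V / sqrt(n - 1), where V is the coefficient of variation
    (population standard deviation divided by the mean) and n is the
    number of parts. D is 1 for a perfectly even spread and 0 when all
    occurrences fall in a single part.
    """
    n = len(part_counts)
    if n < 2:
        raise ValueError("need at least two corpus parts")
    mean = statistics.fmean(part_counts)
    if mean == 0:
        raise ValueError("word does not occur in any part")
    cv = statistics.pstdev(part_counts) / mean
    return 1 - cv / math.sqrt(n - 1)


# Hypothetical counts, invented purely for illustration.
evenly_spread = [20, 20, 20, 20, 20]   # total frequency 100
clumped = [100, 0, 0, 0, 0]            # total frequency 100

print(juilland_d(evenly_spread))
print(juilland_d(clumped))
```

Both hypothetical words have the same total frequency, yet the first scores 1 and the second 0. A frequency index alone cannot separate them, which is the kind of information a dispersion index adds.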
Source journal
CiteScore: 2.90
Self-citation rate: 7.10%
Articles published: 7
About the journal: The Journal of Quantitative Linguistics is an international forum for the publication and discussion of research on the quantitative characteristics of language and text in an exact mathematical form. This approach, which is of growing interest, opens up important and exciting theoretical perspectives, as well as solutions for a wide range of practical problems such as machine learning or statistical parsing, by introducing into linguistics the methods and models of advanced scientific disciplines such as the natural sciences, economics, and psychology.