Developing fine-grained sense-aware lexical sophistication indices based on the CEFR levels of word senses.

Impact factor 3.9 | Zone 2 (Psychology) | JCR Q1 (Psychology, Experimental)
Nan Hu, Xiaofei Lu, Renfen Hu
Journal: Behavior Research Methods, volume 57, issue 8, page 226
Publication type: Journal Article
Published: 2025-07-16
DOI: 10.3758/s13428-025-02741-z
URL: https://doi.org/10.3758/s13428-025-02741-z
Citations: 0

Abstract

Lexical sophistication has garnered attention across diverse research domains in which language production and text complexity are relevant areas of study. Nevertheless, among the myriad existing lexical sophistication measures, the vast majority do not systematically differentiate different senses of polysemous words but rather treat all senses of a polysemous word as equally sophisticated. To address this limitation, the current study introduces a system that automatically assigns the words in a text to CEFR (i.e., the Common European Framework of Reference for Languages) levels based on their senses used in context, using the English Vocabulary Profile as a reference. We further propose a set of fine-grained sense-aware lexical sophistication indices based on the CEFR levels of word senses and evaluate the extent to which these indices can predict holistic scores of second language (L2) English writing quality using 1,236 exam scripts from the CLC-FCE dataset (Yannakoudakis et al., 2011). The results show that these fine-grained sense-aware indices are more strongly correlated with scores than existing lexical sophistication measures, with three significant predictors explaining 11.8% of the variance in holistic scores. A regression model that combines the new indices with existing ones achieves substantially greater predictive power than models built with either set of indices alone. We discuss the potential implications of our findings for future research in L2 lexical sophistication.
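
As a rough illustration of what a sense-aware index might look like, the sketch below computes the proportion of tokens at each CEFR level and a mean level from tokens that have already been tagged with the CEFR level of their contextual sense. It is not the authors' implementation: the function name `sense_aware_indices`, the ordinal mapping of levels, the two index types, and the example levels are all assumptions made for illustration, and the levels are hand-assigned rather than taken from the English Vocabulary Profile.

```python
from collections import Counter

# Ordinal scale for CEFR levels (an assumption made for this sketch).
CEFR_ORDINAL = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6}


def sense_aware_indices(tagged_tokens):
    """Compute illustrative indices from (word, sense_gloss, cefr_level) triples.

    Tokens whose sense has no recognised CEFR level are skipped.
    Returns an empty dict when no token carries a level.
    """
    levels = [lvl for _, _, lvl in tagged_tokens if lvl in CEFR_ORDINAL]
    if not levels:
        return {}
    counts = Counter(levels)
    total = len(levels)
    # Proportion of levelled tokens at each CEFR level.
    indices = {f"prop_{lvl}": counts.get(lvl, 0) / total for lvl in CEFR_ORDINAL}
    # Mean level on the ordinal scale defined above.
    indices["mean_level"] = sum(CEFR_ORDINAL[lvl] for lvl in levels) / total
    return indices


# Hypothetical, hand-assigned levels: the same word form can receive
# a different level depending on the sense used in context.
example = [
    ("run", "move quickly on foot", "A1"),
    ("run", "manage or operate a business", "B1"),
    ("bank", "financial institution", "A2"),
    ("the", "function word without a listed level", None),
]
result = sense_aware_indices(example)
```

A sense-unaware measure would give both occurrences of "run" the same value; here they contribute different levels, which is the distinction the abstract describes.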

Source journal: Behavior Research Methods
CiteScore: 10.30
Self-citation rate: 9.30%
Articles published: 266
About the journal: Behavior Research Methods publishes articles concerned with the methods, techniques, and instrumentation of research in experimental psychology. The journal focuses particularly on the use of computer technology in psychological research. An annual special issue is devoted to this field.