Evaluating the impact of nonverbal behavior on language ability ratings

Impact factor: 2.4 · Zone 1 (Literature) · Category: LANGUAGE & LINGUISTICS

Language Testing Pub Date : 2024-08-08 DOI:10.1177/02655322241255709

J. Dylan Burton



Abstract

Nonverbal behavior can impact language proficiency scores in speaking tests, but there is little empirical information on the size or consistency of its effects or on whether language proficiency may be a moderating variable. In this study, 100 novice raters watched and scored 30 recordings of test takers taking an international, high-stakes proficiency test. The speech samples were each 2 minutes long and spanned a range of proficiency levels. The raters scored each sample on fluency, vocabulary, grammar, and comprehensibility using 7-point semantic differential scales. Nonverbal behavior was extracted using iMotions, an automated machine-learning software package, and the data were analyzed with ordinal mixed-effects regression. Results showed that attentional variance predicted fluency, vocabulary, and grammar scores, but only when accounting for proficiency. Higher standard deviations of attention corresponded with lower scores for the lower-proficiency group, but not for the mid/higher-proficiency group. Comprehensibility scores were predicted by mean valence only when proficiency was included as an interaction term. Higher mean valence, or positive emotional behavior, corresponded with higher scores in the lower-proficiency group, but not in the mid/higher-proficiency group. Effect sizes for these predictors were quite small, with small amounts of variance explained. These results have implications for construct representation and test fairness.
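The two predictors named in the abstract, the standard deviation of attention and mean valence, are per-recording summaries of frame-level measurements. The sketch below illustrates how such summaries could be computed. It is a minimal illustration under stated assumptions, not the study's pipeline: the recording names, field names, and values are invented, the data layout is not the iMotions export format, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

# Hypothetical frame-level measurements for two recordings.
# Names and values are invented for illustration only; they are
# not taken from the study or from the iMotions export format.
frames = {
    "recording_01": {
        "attention": [90.0, 92.0, 88.0, 90.0],
        "valence": [10.0, 20.0, 30.0, 20.0],
    },
    "recording_02": {
        "attention": [60.0, 95.0, 70.0, 99.0],
        "valence": [-5.0, 0.0, 5.0, 0.0],
    },
}


def summarize(series):
    """Reduce one recording's frame-level series to two predictors."""
    return {
        # Sample standard deviation (n - 1 denominator); an assumption.
        "attention_sd": stdev(series["attention"]),
        "valence_mean": mean(series["valence"]),
    }


predictors = {name: summarize(series) for name, series in frames.items()}

for name, row in predictors.items():
    print(name, round(row["attention_sd"], 2), round(row["valence_mean"], 2))
```

In a design like the one described, each recording's summaries would then enter an ordinal mixed-effects model alongside a proficiency-group interaction term; that modeling step is not sketched here.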




Source journal

Language Testing

CiteScore: 6.70

Self-citation rate: 9.80%

Journal description: Language Testing is a fully peer-reviewed international journal that publishes original research and review articles on language testing and assessment. It provides a forum for the exchange of ideas and information among people working in first and second language testing and assessment. This includes researchers and practitioners in EFL and ESL testing, as well as in the assessment of child language acquisition and language pathology. The journal also gives special attention to testing theory, experimental investigations, and the follow-up of practical implications.