What Makes Children's Responses to Creativity Assessments Difficult to Judge Reliably?

Impact Factor 2.8 · CAS Tier 2 (Psychology) · JCR Q2 (Psychology, Educational)
Denis Dumas, Selcuk Acar, Kelly Berthiaume, Peter Organisciak, David Eby, Katalin Grajzel, Theadora Vlaamster, Michele Newman, Melanie Carrera
{"title":"What Makes Children's Responses to Creativity Assessments Difficult to Judge Reliably?","authors":"Denis Dumas,&nbsp;Selcuk Acar,&nbsp;Kelly Berthiaume,&nbsp;Peter Organisciak,&nbsp;David Eby,&nbsp;Katalin Grajzel,&nbsp;Theadora Vlaamster,&nbsp;Michele Newman,&nbsp;Melanie Carrera","doi":"10.1002/jocb.588","DOIUrl":null,"url":null,"abstract":"<p>Open-ended verbal creativity assessments are commonly administered in psychological research and in educational practice to elementary-aged children. Children's responses are then typically rated by teams of judges who are trained to identify original ideas, hopefully with a degree of inter-rater agreement. Even in cases where the judges are reliable, some residual disagreement on the originality of the responses is inevitable. Here, we modeled the predictors of inter-rater disagreement in a large (i.e., 387 elementary school students and 10,449 individual item responses) dataset of children's creativity assessment responses. Our five trained judges rated the responses with a high degree of consistency reliability (<i>α</i> = 0.844), but we undertook this study to predict the residual disagreement. We used an adaptive LASSO model to predict 72% of the variance in our judges' residual disagreement and found that there were certain types of responses on which our judges tended to disagree more. The main effects in our model showed that responses that were less original, more elaborate, prompted by a Uses task, from younger children, or from male students, were all more difficult for the judges to rate reliably. Among the interaction effects, we found that our judges were also more likely to disagree on highly original responses from Gifted/Talented students, responses from Latinx students who were identified as English Language Learners, or responses from Asian students who took a lot of time on the task. Given that human judgments such as these are currently being used to train artificial intelligence systems to rate responses to creativity assessments, we believe understanding their nuances is important.</p>","PeriodicalId":39915,"journal":{"name":"Journal of Creative Behavior","volume":"57 3","pages":"419-438"},"PeriodicalIF":2.8000,"publicationDate":"2023-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/jocb.588","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Creative Behavior","FirstCategoryId":"102","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/jocb.588","RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PSYCHOLOGY, EDUCATIONAL","Score":null,"Total":0}
Citations: 1

Abstract

Open-ended verbal creativity assessments are commonly administered in psychological research and in educational practice to elementary-aged children. Children's responses are then typically rated by teams of judges who are trained to identify original ideas, hopefully with a degree of inter-rater agreement. Even in cases where the judges are reliable, some residual disagreement on the originality of the responses is inevitable. Here, we modeled the predictors of inter-rater disagreement in a large (i.e., 387 elementary school students and 10,449 individual item responses) dataset of children's creativity assessment responses. Our five trained judges rated the responses with a high degree of consistency reliability (α = 0.844), but we undertook this study to predict the residual disagreement. We used an adaptive LASSO model to predict 72% of the variance in our judges' residual disagreement and found that there were certain types of responses on which our judges tended to disagree more. The main effects in our model showed that responses that were less original, more elaborate, prompted by a Uses task, from younger children, or from male students, were all more difficult for the judges to rate reliably. Among the interaction effects, we found that our judges were also more likely to disagree on highly original responses from Gifted/Talented students, responses from Latinx students who were identified as English Language Learners, or responses from Asian students who took a lot of time on the task. Given that human judgments such as these are currently being used to train artificial intelligence systems to rate responses to creativity assessments, we believe understanding their nuances is important.
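Adaptive LASSO is not packaged as a single routine in common statistics libraries, so a minimal sketch of the standard two-stage procedure is given below, in Python with scikit-learn on synthetic placeholder data. The variable names, penalty settings, and simulated X/y are illustrative assumptions, not the authors' pipeline: a pilot ridge regression supplies per-coefficient weights, and an ordinary LASSO on rescaled features then applies the weighted penalty.

```python
# Minimal adaptive-LASSO sketch (illustrative only; synthetic data stand in
# for the response-level predictors and the judges' residual disagreement).
import numpy as np
from sklearn.linear_model import Ridge, Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_responses, n_features = 10449, 20             # sizes echo the dataset scale
X = rng.normal(size=(n_responses, n_features))  # placeholder predictors
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_responses)

X = StandardScaler().fit_transform(X)

# Stage 1: pilot estimates set the adaptive penalty scale for each feature.
beta_init = Ridge(alpha=1.0).fit(X, y).coef_
scale = np.abs(beta_init) ** 1.0 + 1e-8         # gamma = 1; guard against zeros

# Stage 2: an ordinary LASSO on X * scale is equivalent to penalizing each
# original coefficient by lambda / scale_j, i.e., the adaptive LASSO objective.
lasso = Lasso(alpha=0.01).fit(X * scale, y)
beta_adaptive = lasso.coef_ * scale             # map back to the original scale

print(f"R^2 = {lasso.score(X * scale, y):.2f}, "
      f"{np.count_nonzero(beta_adaptive)} predictors retained")
```

The paper's reported 72% of variance explained corresponds to the R-squared of such a fit; the value produced by this simulated example will of course differ.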


Source Journal

Journal of Creative Behavior (Arts and Humanities: Visual Arts and Performing Arts)
CiteScore: 7.50
Self-citation rate: 7.70%
Articles per year: 44
About the journal: The Journal of Creative Behavior is a quarterly academic journal publishing current research on creative thinking. For nearly four decades, JCB has been the benchmark scientific periodical in the field, offering up-to-date, cutting-edge ideas about creativity in education, psychology, business, the arts, and more.