Constructing a norm for children's scientific drawing: Distribution features based on semantic similarity of large language models.

Impact factor: 1.3 | JCR Q3 (Biochemical Research Methods)
Biology Methods and Protocols | Pub Date: 2025-08-11 | eCollection Date: 2025-01-01 | DOI: 10.1093/biomethods/bpaf062
Yi Zhang, Fan Wei, Jingyi Li, Yan Wang, Yanyan Yu, Jianli Chen, Zipo Cai, Xinyu Liu, Wei Wang, Sensen Yao, Peng Wang, Zhong Wang
Biology Methods and Protocols, volume 10, issue 1, article bpaf062. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12380450/pdf/
Citations: 0

Abstract

Using children's drawings to examine their conceptual understanding has proven to be an effective method, but previous research has two major problems: (i) the content of the drawings depends heavily on the task, so the ecological validity of the conclusions is low; and (ii) the interpretation of the drawings relies too much on the researchers' subjective impressions. To address these problems, this study uses a large language model (LLM) to identify 1420 children's scientific drawings (covering nine scientific themes/concepts) and uses the word2vec algorithm to calculate their semantic similarity. The study explores whether children produce consistent drawing representations of the same theme and attempts to establish a norm for children's scientific drawings, providing a baseline reference for follow-up research on children's drawings. The results show that the representations in most drawings are consistent, with most semantic similarity values >0.8. At the same time, the consistency of the representations was found to be independent of the accuracy of the LLM's recognition, indicating the existence of a consistency bias. In the subsequent exploration of influencing factors, we used the Kendall rank correlation coefficient to investigate the effects of "sample size," "abstract degree," and "focus points" on drawings, and used word frequency statistics to explore whether children represented abstract themes/concepts by reproducing what was taught in class. Accuracy of the LLM's recognition was found to be the most sensitive indicator, and measures such as sample size and semantic similarity are related to it. The consistency between classroom experiments and teaching purpose is also an important factor: many students focus more on the experiments themselves than on what they explain. In addition, most children tend to use examples they have seen in class to represent more abstract themes/concepts, indicating that they may need concrete examples to understand abstract things.
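
The two quantitative steps named in the abstract, pairwise semantic similarity within a theme and a Kendall rank correlation between per-theme measures, can be illustrated with a minimal sketch. This is not the authors' pipeline. The vectors are hypothetical toy values standing in for word2vec embeddings of LLM-assigned labels. The sample sizes and accuracies are invented for illustration. The helper names (`pairwise_similarities`, `share_above`, `kendall_tau_a`) are assumptions, and the simple tau-a variant is used, which may differ from the variant used in the study.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_similarities(vectors):
    """Cosine similarity for every unordered pair of label vectors."""
    return [cosine(u, v) for u, v in combinations(vectors, 2)]


def share_above(sims, threshold=0.8):
    """Fraction of pairwise similarities strictly above the threshold."""
    return sum(s > threshold for s in sims) / len(sims)


def kendall_tau_a(x, y):
    """Kendall rank correlation (tau-a; tied pairs count as neither)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical toy embeddings standing in for word2vec vectors of the labels
# an LLM assigned to four drawings on one theme (not data from the study).
theme_vectors = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.85, 0.15, 0.05],
    [0.1, 0.9, 0.3],
]
sims = pairwise_similarities(theme_vectors)
consistency = share_above(sims)

# Hypothetical per-theme summaries: sample size and recognition accuracy.
sample_sizes = [50, 120, 200, 310]
accuracies = [0.40, 0.55, 0.52, 0.80]
tau = kendall_tau_a(sample_sizes, accuracies)

print(f"share of pairs above 0.8: {consistency:.2f}")
print(f"Kendall tau-a (sample size vs. accuracy): {tau:.3f}")
```

In this toy setup, three of the four drawings share nearly the same label vector, so half of the six pairs exceed the 0.8 threshold. With real data, the vectors would come from a trained word2vec model and the threshold would follow the study's own definition of consistency.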


Source journal: Biology Methods and Protocols (Agricultural and Biological Sciences: Agricultural and Biological Sciences, all)
CiteScore: 3.80
Self-citation rate: 2.80%
Articles published: 28
Review time: 19 weeks