Exercise-Aware higher-order Thinking skills Assessment via fine-tuned large language model

IF 7.6 | Zone 1 (Computer Science) | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge-Based Systems, vol. 324, Article 113808. Pub Date: 2025-06-03. DOI: 10.1016/j.knosys.2025.113808

Xiuling He, Xiong Xiao, Jing Fang, Yue Li, Yangyang Li, Ruijie Zhou


Citations: 0


Exercise-Aware higher-order Thinking skills Assessment via fine-tuned large language model

Higher-order thinking skills (HOTS) are complex cognitive skills that go beyond the basic levels of memory and comprehension. They are critical to developing an individual's critical thinking and problem-solving abilities. Current methods for assessing HOTS typically rely on expert judgment or specially designed assessment tasks. These paradigms are well established and reliable, but they are time-consuming and difficult to transfer. Accordingly, this study aims to develop an automated HOTS assessment model that reduces reliance on experts and improves efficiency and accuracy. Large language models (LLMs) are models pre-trained with deep learning techniques to handle natural language processing tasks. They have extensive knowledge and strong reasoning capabilities, and researchers have applied them to a range of domains. In this paper, we propose the Exercise-Aware higher-order Thinking skills Assessment (EATA) model, which is based on fine-tuning an LLM. EATA comprises the Exercise Awareness (EA) and HOTS Assessment (HA) modules. The EA module includes a pre-trained Text2Vec and a multilayer perceptron (MLP). It integrates the exercise text information with the HOTS labeling information to generate the higher-order exercise embedding matrix. The HA module employs a pre-trained LLM as the underlying network, which takes the student's learning records together with the higher-order exercise embeddings as inputs and automatically assesses the student's HOTS through fine-tuning. In this way, EATA emulates traditional assessment methods but replaces the experts with an LLM, which reduces the interference of human factors and thus improves efficiency. To verify the validity of EATA, we collected 43,070 online exercise records from 181 undergraduate students at one university. The experiments show that EATA can effectively assess students' HOTS, indicating the potential value of LLMs in HOTS assessment tasks. The implementations are available at https://github.com/xxiongGG/EATA.
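The data flow of the EA module described in the abstract (exercise text vector plus HOTS label, passed through an MLP to give one embedding row per exercise) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `toy_text_vector` is a hypothetical hash-based stand-in for the pre-trained Text2Vec encoder, the MLP weights are random and untrained, and the dimensions, the number of HOTS label categories, and the sample exercises are all invented for the example. The HA module (LLM fine-tuning) is not covered here; the repository linked in the abstract holds the actual code.

```python
import hashlib
import math
import random

# All sizes below are assumptions chosen for illustration only.
TEXT_DIM = 8      # size of the stand-in text vector
NUM_LABELS = 3    # assumed number of HOTS label categories
HIDDEN_DIM = 6    # MLP hidden width
EMBED_DIM = 4     # size of each exercise embedding


def toy_text_vector(text, dim=TEXT_DIM):
    """Hypothetical stand-in for Text2Vec: hash the text into `dim` floats in [0, 1]."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def one_hot(label, num_labels=NUM_LABELS):
    """Encode an integer HOTS label as a one-hot vector."""
    return [1.0 if i == label else 0.0 for i in range(num_labels)]


def make_layer(rng, n_in, n_out):
    """Create one dense layer with random (untrained) weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x, activation=None):
    """Apply a dense layer to vector x, optionally followed by an activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def build_exercise_embeddings(exercises, seed=0):
    """Map (text, hots_label) pairs to an embedding matrix, one row per exercise."""
    rng = random.Random(seed)
    hidden = make_layer(rng, TEXT_DIM + NUM_LABELS, HIDDEN_DIM)
    output = make_layer(rng, HIDDEN_DIM, EMBED_DIM)
    matrix = []
    for text, label in exercises:
        fused = toy_text_vector(text) + one_hot(label)  # concatenate text and label
        matrix.append(dense(output, dense(hidden, fused, math.tanh)))
    return matrix


# Hypothetical exercises with hypothetical HOTS labels.
exercises = [
    ("Explain why the loop terminates.", 1),
    ("Design an algorithm to merge two sorted lists.", 2),
]
embedding_matrix = build_exercise_embeddings(exercises)
print(len(embedding_matrix), len(embedding_matrix[0]))  # prints "2 4"
```

In a real pipeline the hash-based vector would be replaced by the output of a pre-trained sentence encoder and the MLP would be trained, so the rows would carry semantic information rather than arbitrary values.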


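Source journal

Knowledge-Based Systems (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 14.80

Self-citation rate: 12.50%

Articles published: 1245

Review time: 7.8 months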

Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.