Using text mining and machine learning to predict reasoning activities from think-aloud transcripts in computer assisted learning

Impact factor 4.5; Zone 2 (Education); JCR Q1 (Education & Educational Research)
Shan Li, Xiaoshan Huang, Tingting Wang, Juan Zheng, Susanne P. Lajoie
Journal of Computing in Higher Education. Published 2024-06-04. DOI: 10.1007/s12528-024-09404-6 (https://doi.org/10.1007/s12528-024-09404-6)
Citations: 0

Abstract

Coding think-aloud transcripts is time-consuming and labor-intensive. In this study, we examined the feasibility of predicting students’ reasoning activities based on their think-aloud transcripts by leveraging the affordances of text mining and machine learning techniques. We collected the think-aloud data of 34 medical students as they diagnosed virtual patients in an intelligent tutoring system. The think-aloud data were transcribed and segmented into 2,792 meaningful units. We used a text mining tool to analyze the linguistic features of think-aloud segments. Meanwhile, we manually coded the think-aloud segments using a medical reasoning coding scheme. We then trained eight types of supervised machine learning algorithms to predict reasoning activities based on the linguistic features of students’ think-aloud transcripts. We further investigated if the performance of prediction models differed between high and low performers. The results suggested that students’ reasoning activities could be predicted relatively accurately by the linguistic features of their think-aloud transcripts. Moreover, training the predictive models using the data instances of either high or low performers did not lower the models’ performance. This study has significant methodological and practical implications regarding the automatic analysis of think-aloud protocols and real-time assessment of students’ reasoning activities.
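
The pipeline described in the abstract (linguistic features extracted from each think-aloud segment, then a supervised classifier trained against manual codes, then a comparison across performer groups) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the text mining tool, the feature set, the coding categories, or the eight algorithms, so the word lists, the two labels (`collect_evidence`, `hypothesize`), the example segments, the high/low group assignments, and the nearest-centroid classifier used here are all hypothetical stand-ins.

```python
import math
from collections import defaultdict

# Hypothetical coded segments: (text, reasoning-activity label, performer group).
# Texts, labels, and groups are invented for illustration; none come from the study.
SEGMENTS = [
    ("what is the patient's blood glucose level", "collect_evidence", "high"),
    ("let me order a thyroid test", "collect_evidence", "high"),
    ("i want to check the vital signs first", "collect_evidence", "low"),
    ("i think this could be diabetes because of the thirst", "hypothesize", "high"),
    ("maybe it is hyperthyroidism since the heart rate is high", "hypothesize", "high"),
    ("this might be anemia because she is tired", "hypothesize", "low"),
]

# Assumed word lists standing in for the output of a real text mining tool.
HEDGES = {"think", "maybe", "might", "could", "probably"}
CAUSALS = {"because", "since", "so", "therefore"}
ACTIONS = {"order", "check", "test", "ask", "what"}


def features(text):
    """Turn one segment into a small vector of linguistic features."""
    words = text.lower().split()
    n = max(len(words), 1)  # avoid division by zero on empty segments
    return [
        len(words) / 10.0,                    # scaled segment length
        sum(w in HEDGES for w in words) / n,  # share of hedging words
        sum(w in CAUSALS for w in words) / n, # share of causal connectives
        sum(w in ACTIONS for w in words) / n, # share of action/query words
    ]


def train(examples):
    """Nearest-centroid training: average the feature vectors per label."""
    sums = {}
    counts = defaultdict(int)
    for text, label, _group in examples:
        vec = features(text)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids, text):
    """Assign the label whose centroid is closest in Euclidean distance."""
    vec = features(text)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


def accuracy(centroids, examples):
    """Fraction of segments whose predicted label matches the manual code."""
    hits = sum(predict(centroids, text) == label for text, label, _g in examples)
    return hits / len(examples)


if __name__ == "__main__":
    model = train(SEGMENTS)
    print(predict(model, "it could be pneumonia because of the cough"))

    # Train on one performer group and evaluate on the other.
    high = [s for s in SEGMENTS if s[2] == "high"]
    low = [s for s in SEGMENTS if s[2] == "low"]
    print(accuracy(train(high), low))
```

The final lines mirror the group comparison mentioned in the abstract: a model is fitted on segments from one performer group and scored on the other. With real data, a held-out test set or cross-validation would be needed before drawing any conclusion from such a score.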

Source journal: Journal of Computing in Higher Education (Education & Educational Research)
CiteScore: 15.10
Self-citation rate: 3.60%
Articles published: 40
About the journal: Journal of Computing in Higher Education (JCHE) contributes to our understanding of the design, development, and implementation of instructional processes and technologies in higher education. JCHE publishes original research, literature reviews, implementation and evaluation studies, and theoretical, conceptual, and policy papers that provide perspectives on instructional technology’s role in improving access, affordability, and outcomes of postsecondary education. Priority is given to well-documented original papers that demonstrate a strong grounding in learning theory and/or rigorous educational research design.