SelfCode: An Annotated Corpus and a Model for Automated Assessment of Self-Explanation During Source Code Comprehension

The International FLAIRS Conference Proceedings. Pub Date: 2023-05-08. DOI: 10.32473/flairs.36.133385

Jeevan Chapagain, Zak Risha, Rabin Banjade, P. Oli, L. Tamang, Peter Brusilovsky, V. Rus


Citations: 0

Abstract

The ability to automatically assess learners' activities is key to user modeling and personalization in adaptive educational systems. The work presented in this paper extends automated assessment from traditional programming problems to code comprehension tasks in which students are asked to explain the critical steps of a program. Automatically assessing these self-explanations offers a unique opportunity to understand the current state of student knowledge, recognize possible misconceptions, and provide feedback. Training Artificial Intelligence/Machine Learning approaches for this assessment requires annotated datasets. To answer this need, we present a novel corpus called SelfCode, which consists of 1,770 sentence pairs of student and expert self-explanations of Java code examples, along with semantic similarity judgments provided by experts. We also present a baseline automated assessment model that relies on textual features. The corpus is available in the GitHub repository (https://github.com/jeevanchaps/SelfCode).
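The abstract does not say which textual features the baseline model uses. As a rough illustration of the kind of features such a model could compute for a student/expert sentence pair, the sketch below derives two common lexical similarity scores. The function names, the tokenizer, and the example sentences are all assumptions made for illustration; they are not taken from the paper or the SelfCode repository.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(a, b):
    """Share of distinct tokens that the two sentences have in common."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity between bag-of-words count vectors."""
    count_a, count_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(count_a[t] * count_b[t] for t in count_a)
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sentence pair, not drawn from the corpus.
student = "The loop adds each number to sum"
expert = "The loop adds every element of the array to sum"

features = {
    "jaccard": jaccard(student, expert),
    "cosine": cosine(student, expert),
}
print(features)
```

Scores like these could serve as inputs to a regressor or classifier trained against the expert similarity judgments. Purely lexical overlap misses paraphrases, which is one reason a textual-feature model is typically treated as a baseline.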

