Ultra-Lightweight Early Prediction of At-Risk Students in CS1

Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1. Pub Date: 2023-03-02. DOI: 10.1145/3545945.3569764

Chelsea Gordon, Stan Zhao, Frank Vahid


Citations: 1

Abstract


Ultra-Lightweight Early Prediction of At-Risk Students in CS1

Early prediction of students at risk of doing poorly in CS1 can enable early interventions or class adjustments. Preferably, prediction methods would be lightweight, not requiring much extra activity or data-collection work from instructors beyond what they already do. Previous methods included giving surveys, collecting (potentially sensitive) demographic data, introducing clicker questions into lectures, or using locally-developed systems that analyze programming behavior, each requiring some effort by instructors. Today, a widely used textbook / learning system in CS1 classes is zyBooks, used by several hundred thousand students annually. The system automatically collects data related to reading, homework, and programming assignments. For a 300+ student CS1 class, we found that three data metrics, auto-collected by that system in early weeks (1-4), were good at predicting performance on the week-6 midterm exam: non-earnest completion of the assigned readings, struggle on the coding homework, and low scores on the programming assignments, with correlation magnitudes of 0.44, 0.58, and 0.72, respectively. We combined those metrics in a decision tree model to predict students at-risk of failing the midterm exam (<70%, meaning D or F), and achieved 85% prediction accuracy with 82% sensitivity and 89% specificity, which is higher than previously-published early-prediction approaches. The approach may mean that thousands of instructors already using zyBooks (or a similar system) can get a more accurate early prediction of at-risk students, without requiring extra effort or activities, and avoiding collection of sensitive demographic data.
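The reported accuracy, sensitivity, and specificity follow from the standard confusion-matrix definitions, which can be made concrete with a minimal sketch. Everything below is illustrative rather than the authors' model: the `EarlyMetrics` field names, their scales, and every threshold in `predict_at_risk` are hypothetical, and "at risk" (midterm score below 70%) is assumed to be the positive class.

```python
from dataclasses import dataclass


@dataclass
class EarlyMetrics:
    """Hypothetical per-student metrics from weeks 1-4 (names and scales are assumptions)."""
    non_earnest_reading: float  # share of readings completed non-earnestly, 0..1
    homework_struggle: float    # normalized struggle on coding homework, 0..1
    programming_score: float    # mean programming-assignment score, 0..100


def predict_at_risk(m: EarlyMetrics) -> bool:
    """Tiny hand-written decision tree; thresholds are made up, not taken from the paper."""
    if m.programming_score < 70:
        return True
    if m.homework_struggle > 0.5:
        return m.non_earnest_reading > 0.3
    return False


def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity, with at-risk as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # at-risk, flagged
    tn = sum(1 for t, p in pairs if not t and not p)  # not at-risk, not flagged
    fp = sum(1 for t, p in pairs if not t and p)      # not at-risk, flagged
    fn = sum(1 for t, p in pairs if t and not p)      # at-risk, missed

    def ratio(num, den):
        # Avoid dividing by zero when a class is absent from the data.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    # Made-up students: (metrics, actually scored below 70% on the midterm).
    students = [
        (EarlyMetrics(0.6, 0.8, 55), True),
        (EarlyMetrics(0.1, 0.2, 95), False),
        (EarlyMetrics(0.4, 0.7, 85), True),
        (EarlyMetrics(0.0, 0.6, 90), False),
    ]
    y_true = [failed for _, failed in students]
    y_pred = [predict_at_risk(m) for m, _ in students]
    print(evaluate(y_true, y_pred))
```

Under these definitions, sensitivity is the share of students who actually failed that the model flagged, and specificity is the share of students who passed that the model left unflagged. A learned tree would choose its own splits and thresholds from the data instead of the fixed values used here.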
