{"title":"How do professors format exams?: an analysis of question variety at scale","authors":"Paul Laskowski, Sergey Karayev, Marti A. Hearst","doi":"10.1145/3231644.3231667","DOIUrl":null,"url":null,"abstract":"This study analyzes the use of paper exams in college-level STEM courses. It leverages a unique dataset of nearly 1,800 exams, which were scanned into a web application, then processed by a team of annotators to yield a detailed snapshot of the way instructors currently structure exams. The focus of the investigation is on the variety of question formats, and how they are applied across different course topics. The analysis divides questions according to seven top-level categories, finding significant differences among these in terms of positioning, use across subjects, and student performance. The analysis also reveals a strong tendency within the collection for instructors to order questions from easier to harder. A linear mixed effects model is used to estimate the reliability of different question types. Long writing questions stand out for their high reliability, while binary and multiple choice questions have low reliability. The model suggests that over three multiple choice questions, or over five binary questions, are required to attain the same reliability as a single long writing question. A correlation analysis across seven response types finds that student abilities for different questions types exceed 70 percent for all pairs, although binary and multiple-choice questions stand out for having unusually low correlations with all other question types.","PeriodicalId":20634,"journal":{"name":"Proceedings of the Fifth Annual ACM Conference on Learning at Scale","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2018-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Fifth Annual ACM Conference on Learning at Scale","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3231644.3231667","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
This study analyzes the use of paper exams in college-level STEM courses. It leverages a unique dataset of nearly 1,800 exams, which were scanned into a web application and then processed by a team of annotators to yield a detailed snapshot of how instructors currently structure exams. The investigation focuses on the variety of question formats and how they are applied across different course topics. The analysis divides questions into seven top-level categories and finds significant differences among them in terms of positioning, use across subjects, and student performance. It also reveals a strong tendency within the collection for instructors to order questions from easier to harder. A linear mixed effects model is used to estimate the reliability of different question types. Long writing questions stand out for their high reliability, while binary and multiple choice questions have low reliability. The model suggests that more than three multiple choice questions, or more than five binary questions, are required to attain the same reliability as a single long writing question. A correlation analysis across seven response types finds that correlations between student abilities on different question types exceed 70 percent for all pairs, although binary and multiple choice questions stand out for having unusually low correlations with all other question types.
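The abstract's reliability claim can be illustrated with a short sketch. This is not the authors' code: it assumes a hypothetical long-format table with columns student, question_type, and score, estimates a single-item reliability per question type as an intraclass correlation from a linear mixed effects model (random intercept per student), and then uses the Spearman-Brown prophecy formula to ask how many questions of one type are needed to match the reliability of another.

```python
# Minimal sketch (assumed data layout, not the paper's pipeline) of estimating
# per-question-type reliability with a linear mixed effects model.
import pandas as pd
import statsmodels.formula.api as smf


def single_item_reliability(df: pd.DataFrame, qtype: str) -> float:
    """Intraclass correlation for one question type: the share of score
    variance attributable to the student, treated as the reliability of a
    single question of that type."""
    sub = df[df["question_type"] == qtype]
    # Random intercept per student; residual variance captures per-question noise.
    model = smf.mixedlm("score ~ 1", data=sub, groups=sub["student"])
    result = model.fit()
    var_student = float(result.cov_re.iloc[0, 0])
    var_residual = float(result.scale)
    return var_student / (var_student + var_residual)


def items_needed(r_single: float, r_target: float) -> float:
    """Spearman-Brown prophecy: how many questions with single-item
    reliability r_single are needed to reach aggregate reliability r_target."""
    return (r_target * (1 - r_single)) / (r_single * (1 - r_target))


# Hypothetical usage:
# df = pd.read_csv("exam_scores_long.csv")
# r_long = single_item_reliability(df, "long_writing")
# r_mc = single_item_reliability(df, "multiple_choice")
# print(items_needed(r_mc, r_long))  # > 3 would match the abstract's finding
```

Under this reading, the "more than three multiple choice questions per long writing question" result corresponds to the Spearman-Brown ratio exceeding 3 when the single-item reliability of multiple choice questions is substituted for r_single and that of long writing questions for r_target.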