Investigating Static Analysis Errors in Student Java Programs

Proceedings of the 2017 ACM Conference on International Computing Education Research. Pub Date: 2017-08-14. DOI: 10.1145/3105726.3106182

S. Edwards, Nischel Kandru, Mukund B. M. Rajagopal


Citations: 73

Abstract

Research on students learning to program has produced studies on both compile-time errors (syntax errors) and run-time errors (exceptions). Both of these types of errors are natural targets, since detection is built into the programming language. In this paper, we present an empirical investigation of static analysis errors present in syntactically correct code. Static analysis errors can be revealed by tools that examine a program's source code, but this error detection is typically not built into common programming languages and instead requires separate tools. Static analysis can be used to check formatting or commenting expectations, but it also can be used to identify problematic code or to find some kinds of conceptual or logic errors. We study nearly 10 million static analysis errors found in over 500 thousand program submissions made by students over a five-semester period. The study includes data from four separate courses, including a non-majors introductory course as well as the CS1/CS2/CS3 sequence for CS majors. We examine the differences between the error rates of CS major and non-major beginners, and also examine how these patterns change over time as students progress through the CS major course sequence. Our investigation shows that while formatting and Javadoc issues are the most common, static checks that identify coding flaws that are likely to be errors are strongly correlated with producing correct programs, even when students eventually fix the problems. With experience, students produce fewer errors, but the errors that are most frequent are consistent between both computer science majors and non-majors, and across experience levels. These results can highlight student struggles or misunderstandings that have escaped past analyses focused on syntax or run-time errors.
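To make the idea of a static analysis error in syntactically correct code concrete, here is a minimal Java sketch. It is an illustration only: the class name, the method names, and the choice of flaw (comparing strings with `==`) are assumptions made for this example and are not taken from the paper's data or from any specific tool's rule set. The code compiles and raises no exception, so neither the compiler nor the run-time system reports a problem, yet a static checker could plausibly flag the comparison as a likely logic error.

```java
public class StaticCheckDemo {
    // Hypothetical flawed method: compiles cleanly, but == compares
    // object references rather than the characters in the strings.
    public static boolean flawedEquals(String a, String b) {
        return a == b;
    }

    // Corrected version: equals() compares the string contents.
    public static boolean fixedEquals(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // new String(...) forces a distinct object, much as text read
        // from user input would be distinct from a literal.
        String typed = new String("yes");
        String expected = "yes";

        System.out.println(flawedEquals(typed, expected)); // prints false
        System.out.println(fixedEquals(typed, expected));  // prints true
    }
}
```

Because the flawed version produces a wrong answer silently, an analysis limited to compiler messages or exceptions would not observe it, which is the kind of gap the abstract describes.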

