S. Edwards, Nischel Kandru, Mukund B. M. Rajagopal
{"title":"Investigating Static Analysis Errors in Student Java Programs","authors":"S. Edwards, Nischel Kandru, Mukund B. M. Rajagopal","doi":"10.1145/3105726.3106182","DOIUrl":null,"url":null,"abstract":"Research on students learning to program has produced studies on both compile-time errors (syntax errors) and run-time errors (exceptions). Both of these types of errors are natural targets, since detection is built into the programming language. In this paper, we present an empirical investigation of static analysis errors present in syntactically correct code. Static analysis errors can be revealed by tools that examine a program's source code, but this error detection is typically not built into common programming languages and instead requires separate tools. Static analysis can be used to check formatting or commenting expectations, but it also can be used to identify problematic code or to find some kinds of conceptual or logic errors. We study nearly 10 million static analysis errors found in over 500 thousand program submissions made by students over a five-semester period. The study includes data from four separate courses, including a non-majors introductory course as well as the CS1/CS2/CS3 sequence for CS majors. We examine the differences between the error rates of CS major and non-major beginners, and also examine how these patterns change over time as students progress through the CS major course sequence. Our investigation shows that while formatting and Javadoc issues are the most common, static checks that identify coding flaws that are likely to be errors are strongly correlated with producing correct programs, even when students eventually fix the problems. With experience, students produce fewer errors, but the errors that are most frequent are consistent between both computer science majors and non-majors, and across experience levels. These results can highlight student struggles or misunderstandings that have escaped past analyses focused on syntax or run-time errors.","PeriodicalId":267640,"journal":{"name":"Proceedings of the 2017 ACM Conference on International Computing Education Research","volume":"60 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"73","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM Conference on International Computing Education Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3105726.3106182","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 73
Abstract
Research on students learning to program has produced studies on both compile-time errors (syntax errors) and run-time errors (exceptions). Both of these types of errors are natural targets, since detection is built into the programming language. In this paper, we present an empirical investigation of static analysis errors present in syntactically correct code. Static analysis errors can be revealed by tools that examine a program's source code, but this error detection is typically not built into common programming languages and instead requires separate tools. Static analysis can be used to check formatting or commenting expectations, but it also can be used to identify problematic code or to find some kinds of conceptual or logic errors. We study nearly 10 million static analysis errors found in over 500 thousand program submissions made by students over a five-semester period. The study includes data from four separate courses, including a non-majors introductory course as well as the CS1/CS2/CS3 sequence for CS majors. We examine the differences between the error rates of CS major and non-major beginners, and also examine how these patterns change over time as students progress through the CS major course sequence. Our investigation shows that while formatting and Javadoc issues are the most common, static checks that identify coding flaws that are likely to be errors are strongly correlated with producing correct programs, even when students eventually fix the problems. With experience, students produce fewer errors, but the errors that are most frequent are consistent between both computer science majors and non-majors, and across experience levels. These results can highlight student struggles or misunderstandings that have escaped past analyses focused on syntax or run-time errors.