Test Coverage in Python Programs

2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR). Pub Date: 2019-05-26. DOI: 10.1109/MSR.2019.00027

Hongyu Zhai, Casey Casalnuovo, Premkumar T. Devanbu

Pages: 116-120

Citations: 13

Abstract

We study code coverage in several popular Python projects: flask, matplotlib, pandas, scikit-learn, and scrapy. Coverage data for these projects is gathered and hosted on the Codecov website, from which it can be mined. Using this data and a syntactic parse of the code, we examine the effect of control flow structure, statement type (e.g., if, for), and code age on test coverage. We find that coverage depends on control flow structure: more deeply nested statements are significantly less likely to be covered. This effect is clear and holds in every project, even when controlling for the age of the line (as determined by git blame). The age of a line per se has a small (but statistically significant) positive effect on coverage. Finally, the kind of statement (try, if, except, raise, etc.) has varying effects on coverage, with exception-handling statements covered much less often. These results suggest that developers in Python projects have difficulty writing test sets that cover deeply nested and error-handling statements, and might need assistance covering such code.
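To make the idea of relating nesting depth to coverage concrete, the following is a minimal sketch in Python. It is not the authors' tooling: the choice of which constructs count as nesting, the helper names `statement_depths` and `coverage_by_depth`, and the sample covered-line set are all assumptions made for illustration. It uses only the standard `ast` module to assign each statement a type and a depth, then computes the fraction of covered statements at each depth.

```python
import ast
from collections import defaultdict

# Compound statements whose bodies count as one extra level of nesting.
# This set is an assumption for the sketch; the paper may define depth differently.
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def statement_depths(source):
    """Map each statement's first line to (statement type, nesting depth)."""
    result = {}

    def visit(node, depth):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.stmt):
                result[child.lineno] = (type(child).__name__, depth)
            # Anything inside a control-flow statement sits one level deeper.
            child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
            visit(child, child_depth)

    visit(ast.parse(source), 0)
    return result


def coverage_by_depth(source, covered_lines):
    """Return {depth: fraction of statements at that depth that are covered}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for lineno, (_kind, depth) in statement_depths(source).items():
        totals[depth] += 1
        if lineno in covered_lines:
            hits[depth] += 1
    return {depth: hits[depth] / totals[depth] for depth in sorted(totals)}


# Hypothetical input: a small function and the set of lines a test run executed.
SAMPLE = '''\
def parse(text):
    if text:
        try:
            return int(text)
        except ValueError:
            raise RuntimeError("bad input")
    return None
'''
COVERED = {1, 2, 3, 4, 7}  # made-up data; line 6 (the raise) was never executed

print(coverage_by_depth(SAMPLE, COVERED))
```

In this toy input the only uncovered statement is the `raise` inside the exception handler, which is also one of the most deeply nested. In a real analysis the covered-line set would come from a coverage report, and line age from `git blame`, rather than from a hand-written set.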


