DyPyBench: A Benchmark of Executable Python Software

ArXiv. Pub Date: 2024-03-01. DOI: 10.1145/3643742
Islem Bouzenia, Bajaj Piyush Krishan, Michael Pradel
Citations: 1

Abstract

Python has emerged as one of the most popular programming languages, extensively used in domains such as machine learning, data analysis, and web applications. Python's dynamic nature and extensive usage make it an attractive candidate for dynamic program analysis. However, unlike for other popular languages, there is currently no comprehensive benchmark suite of executable Python projects, which hinders the development of dynamic analyses. This work addresses this gap by presenting DyPyBench, the first benchmark of Python projects that is large-scale, diverse, ready to run (i.e., with fully configured and prepared test suites), and ready to analyze (by integrating with the DynaPyt dynamic analysis framework). The benchmark encompasses 50 popular open-source projects from various application domains, with a total of 681k lines of Python code and 30k test cases. DyPyBench enables various applications in testing and dynamic analysis, of which we explore three in this work: (i) Gathering dynamic call graphs and empirically comparing them to statically computed call graphs, which exposes and quantifies limitations of existing call graph construction techniques for Python. (ii) Using DyPyBench to build a training data set for LExecutor, a neural model that learns to predict values that otherwise would be missing at runtime. (iii) Using dynamically gathered execution traces to mine API usage specifications, which establishes a baseline for future work on specification mining for Python. We envision DyPyBench providing a basis for other dynamic analyses and for studying the runtime behavior of Python code.
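To make application (i) more concrete, the following is a minimal sketch of how dynamic call edges can be recorded during an execution and compared against a statically computed edge set. It is not the DynaPyt-based setup used in DyPyBench: it relies on the standard library hook `sys.setprofile` instead, and the names `collect_dynamic_call_graph`, `compare_call_graphs`, `helper`, `dispatch`, and `main` are hypothetical and chosen only for illustration. The static edge set is assumed to come from some external call graph generator.

```python
import sys


def collect_dynamic_call_graph(entry_point):
    """Run entry_point() and return the observed (caller, callee) name pairs.

    Assumption: function names (co_name) are unique enough for this toy
    example; a real analysis would use fully qualified names.
    """
    edges = set()
    own_code = collect_dynamic_call_graph.__code__

    def profiler(frame, event, arg):
        # The "call" event fires whenever a Python-level function is entered.
        if event != "call" or frame.f_back is None:
            return
        # Skip the edge from this collector into the entry point itself.
        if frame.f_back.f_code is own_code:
            return
        edges.add((frame.f_back.f_code.co_name, frame.f_code.co_name))

    sys.setprofile(profiler)
    try:
        entry_point()
    finally:
        sys.setprofile(None)
    return edges


def compare_call_graphs(static_edges, dynamic_edges):
    """Split edges into those seen only statically, only dynamically, or both."""
    return {
        "only_static": static_edges - dynamic_edges,
        "only_dynamic": dynamic_edges - static_edges,
        "both": static_edges & dynamic_edges,
    }


# Toy program under analysis (hypothetical).
def helper(x):
    return x + 1


def dispatch(name, x):
    # The callee is looked up in a table at runtime, which is the kind of
    # call a static call graph generator may fail to resolve.
    table = {"helper": helper}
    return table[name](x)


def main():
    return dispatch("helper", 1)


if __name__ == "__main__":
    dynamic = collect_dynamic_call_graph(main)
    # Assumed output of a static analysis that misses the table-based call.
    static = {("main", "dispatch")}
    print(compare_call_graphs(static, dynamic))
```

Edges that appear only in the dynamic graph point to calls the static analysis missed, while edges that appear only in the static graph are either over-approximations or calls that the executed tests never exercised.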