DynaPyt: Python的动态分析框架

A. Eghbali, Michael Pradel
{"title":"DynaPyt: Python的动态分析框架","authors":"A. Eghbali, Michael Pradel","doi":"10.1145/3540250.3549126","DOIUrl":null,"url":null,"abstract":"Python is a widely used programming language that powers important application domains such as machine learning, data analysis, and web applications. For many programs in these domains it is consequential to analyze aspects like security and performance, and with Python’s dynamic nature, it is crucial to be able to dynamically analyze Python programs. However, existing tools and frameworks do not provide the means to implement dynamic analyses easily and practitioners resort to implementing an ad-hoc dynamic analysis for their own use case. This work presents DynaPyt, the first general-purpose framework for heavy-weight dynamic analysis of Python programs. Compared to existing tools for other programming languages, our framework provides a wider range of analysis hooks arranged in a hierarchical structure, which allows developers to concisely implement analyses. DynaPyt features selective instrumentation and execution modification as well. We evaluate our framework on test suites of 9 popular open-source Python projects, 1,268,545 lines of code in total, and show that it, by and large, preserves the semantics of the original execution. The running time of DynaPyt is between 1.2x and 16x times the original execution time, which is in line with similar frameworks designed for other languages, and 5.6%–88.6% faster than analyses using a built-in tracing API offered by Python. We also implement multiple analyses, show the simplicity of implementing them and some potential use cases of DynaPyt. Among the analyses implemented are: an analysis to detect a memory blow up in Pytorch programs, a taint analysis to detect SQL injections, and an analysis to warn about a runtime performance anti-pattern.","PeriodicalId":68155,"journal":{"name":"软件产业与工程","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"DynaPyt: a dynamic analysis framework for Python\",\"authors\":\"A. Eghbali, Michael Pradel\",\"doi\":\"10.1145/3540250.3549126\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Python is a widely used programming language that powers important application domains such as machine learning, data analysis, and web applications. For many programs in these domains it is consequential to analyze aspects like security and performance, and with Python’s dynamic nature, it is crucial to be able to dynamically analyze Python programs. However, existing tools and frameworks do not provide the means to implement dynamic analyses easily and practitioners resort to implementing an ad-hoc dynamic analysis for their own use case. This work presents DynaPyt, the first general-purpose framework for heavy-weight dynamic analysis of Python programs. Compared to existing tools for other programming languages, our framework provides a wider range of analysis hooks arranged in a hierarchical structure, which allows developers to concisely implement analyses. DynaPyt features selective instrumentation and execution modification as well. We evaluate our framework on test suites of 9 popular open-source Python projects, 1,268,545 lines of code in total, and show that it, by and large, preserves the semantics of the original execution. The running time of DynaPyt is between 1.2x and 16x times the original execution time, which is in line with similar frameworks designed for other languages, and 5.6%–88.6% faster than analyses using a built-in tracing API offered by Python. We also implement multiple analyses, show the simplicity of implementing them and some potential use cases of DynaPyt. Among the analyses implemented are: an analysis to detect a memory blow up in Pytorch programs, a taint analysis to detect SQL injections, and an analysis to warn about a runtime performance anti-pattern.\",\"PeriodicalId\":68155,\"journal\":{\"name\":\"软件产业与工程\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"软件产业与工程\",\"FirstCategoryId\":\"1089\",\"ListUrlMain\":\"https://doi.org/10.1145/3540250.3549126\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"软件产业与工程","FirstCategoryId":"1089","ListUrlMain":"https://doi.org/10.1145/3540250.3549126","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

Python是一种广泛使用的编程语言,支持重要的应用领域,如机器学习、数据分析和web应用程序。对于这些领域中的许多程序来说,分析安全性和性能等方面是重要的,并且由于Python的动态特性,能够动态分析Python程序是至关重要的。然而,现有的工具和框架并不能很容易地提供实现动态分析的方法,从业者只能为他们自己的用例实现特别的动态分析。本文介绍了DynaPyt,第一个用于Python程序的重量级动态分析的通用框架。与其他编程语言的现有工具相比,我们的框架以分层结构提供了更广泛的分析钩子,这允许开发人员简明地实现分析。DynaPyt还具有选择性插装和执行修改功能。我们在9个流行的开源Python项目(总共1,268,545行代码)的测试套件上评估了我们的框架,并表明它基本上保留了原始执行的语义。DynaPyt的运行时间是原始执行时间的1.2倍到16倍,与为其他语言设计的类似框架一致,比使用Python提供的内置跟踪API进行分析快5.6%到88.6%。我们还实现了多个分析,展示了实现它们的简单性以及DynaPyt的一些潜在用例。实现的分析包括:检测Pytorch程序中内存爆炸的分析,检测SQL注入的污染分析,以及对运行时性能反模式发出警告的分析。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
DynaPyt: a dynamic analysis framework for Python
Python is a widely used programming language that powers important application domains such as machine learning, data analysis, and web applications. For many programs in these domains it is consequential to analyze aspects like security and performance, and with Python’s dynamic nature, it is crucial to be able to dynamically analyze Python programs. However, existing tools and frameworks do not provide the means to implement dynamic analyses easily and practitioners resort to implementing an ad-hoc dynamic analysis for their own use case. This work presents DynaPyt, the first general-purpose framework for heavy-weight dynamic analysis of Python programs. Compared to existing tools for other programming languages, our framework provides a wider range of analysis hooks arranged in a hierarchical structure, which allows developers to concisely implement analyses. DynaPyt features selective instrumentation and execution modification as well. We evaluate our framework on test suites of 9 popular open-source Python projects, 1,268,545 lines of code in total, and show that it, by and large, preserves the semantics of the original execution. The running time of DynaPyt is between 1.2x and 16x times the original execution time, which is in line with similar frameworks designed for other languages, and 5.6%–88.6% faster than analyses using a built-in tracing API offered by Python. We also implement multiple analyses, show the simplicity of implementing them and some potential use cases of DynaPyt. Among the analyses implemented are: an analysis to detect a memory blow up in Pytorch programs, a taint analysis to detect SQL injections, and an analysis to warn about a runtime performance anti-pattern.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
676
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信