Detection of Variable Misuse Using Static Analysis Combined with Machine Learning

Gleb Morgachev, V. Ignatyev, A. Belevantsev
2019 Ivannikov Ispras Open Conference (ISPRAS), published 2019-12-01
DOI: 10.1109/ISPRAS47671.2019.00009 (https://doi.org/10.1109/ISPRAS47671.2019.00009)
Citations: 1

Abstract

Industrial static analyzers can detect only a few narrow classes of algorithmic errors, for example actual arguments passed in an order swapped relative to the formal parameters, or a variable left unrenamed after copy-paste. Even within these categories, a substantial share of errors is missed because the checkers rely on heuristics. We generalize these errors as the variable misuse problem and address it with machine learning. The proposed method propagates messages through a graph model of the program that combines data from multiple levels of analysis, including the AST and dataflow. We introduce several error criteria and evaluate them on a set of open source projects totaling millions of lines of code. Testing in close-to-industrial conditions shows good false positive and missed error rates, comparable to those of the existing detectors, which allows the developed checker to be included, after minor rework, in a general-purpose production static analyzer.
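The abstract does not spell out the model, so the following is only a minimal sketch of the general idea of message propagation over a program graph, not the authors' implementation. Everything in it is an assumption made for illustration: the toy graph, the hand-picked two-dimensional feature vectors, the plain averaging update (a real model would use learned weights), and the names `propagate` and `rank_candidates`. The graph models a copy-paste scenario in which `w` is computed from `width` and a `slot` node stands for the variable used to compute `h`.

```python
from collections import defaultdict


def propagate(features, edges, rounds=2):
    """Blend each node's vector with the mean of its neighbours' vectors.

    `edges` holds (source, target, kind) triples; `kind` (e.g. "ast" or
    "dataflow") is kept only for readability, since this sketch treats
    all edge kinds identically and as undirected.
    """
    neighbours = defaultdict(list)
    for src, dst, _kind in edges:
        neighbours[src].append(dst)
        neighbours[dst].append(src)

    state = {node: list(vec) for node, vec in features.items()}
    for _ in range(rounds):
        new_state = {}
        for node, vec in state.items():
            nbrs = neighbours[node]
            if not nbrs:
                # Isolated nodes receive no messages and keep their vector.
                new_state[node] = vec
                continue
            mean = [
                sum(state[m][i] for m in nbrs) / len(nbrs)
                for i in range(len(vec))
            ]
            new_state[node] = [(a + b) / 2 for a, b in zip(vec, mean)]
        state = new_state
    return state


def rank_candidates(state, slot, candidates):
    """Order candidate variables by dot-product similarity to the slot."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return sorted(candidates, key=lambda c: dot(state[slot], state[c]),
                  reverse=True)


# Hypothetical features: first axis ~ "width-like", second ~ "height-like".
features = {
    "width": [1.0, 0.0],
    "w": [1.0, 0.0],
    "height": [0.0, 1.0],
    "h": [0.0, 1.0],
    "slot": [0.0, 0.0],  # the variable use under inspection
}
edges = [
    ("width", "w", "dataflow"),  # w is computed from width
    ("slot", "h", "dataflow"),   # h is computed from the inspected use
]

state = propagate(features, edges, rounds=2)
ranking = rank_candidates(state, "slot", ["width", "height"])
print(ranking)
```

If the source code actually uses `width` at the inspected position while the top-ranked candidate is `height`, a checker built along these lines would report a possible variable misuse. How such disagreements are turned into warnings corresponds to the error criteria mentioned in the abstract, which are not detailed here.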