A Survey on Source Code Review Using Machine Learning

2018 3rd International Conference on Information Systems Engineering (ICISE) | Pub Date: 2018-05-04 | DOI: 10.1109/ICISE.2018.00018

Wang Xiaomeng, Zhang Tao, Xin Wei, Hou Changyu


Citations: 3

Abstract

A Survey on Source Code Review Using Machine Learning

Source code review contributes substantially to software system security. Scalability and precision are important for deploying code review tools. However, traditional tools can detect only some security flaws automatically, with high false positive and false negative rates, and they require tedious review of large-scale source code. Various flaws and vulnerabilities exhibit specific characteristics in source code. Machine learning systems take feature matrices built from source code (variables, functions, and files) as input and generate ad hoc labels using discriminative or generative methods, so that source code can be reviewed automatically and intelligently. Source code, whatever the programming language, is text in nature. Both secure and vulnerable features can be extracted from source code. Fortunately, a variety of machine learning approaches have been developed to learn and detect flaws and vulnerabilities in intelligent source code security review. Combining semantic and syntactic code features helps reduce false positives and false negatives during source code review. In this paper, we review the literature on intelligent source code security review using machine learning methods. The review presents the primary evidence for applying ML to source code security review. We believe machine learning and its branches will become prominent in source code review.
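The pipeline the abstract outlines (treat source code as text, turn it into a feature matrix, then assign labels with a discriminative model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy C-like snippets, their hand-assigned labels, the function names, and the choice of a nearest-centroid classifier are not taken from the paper or from any tool it surveys.

```python
import re

# Toy, made-up snippets with hand-assigned labels (1 = vulnerable, 0 = secure).
# These are illustrative assumptions, not data from the surveyed literature.
SAMPLES = [
    ("strcpy(buf, user_input);", 1),
    ("gets(line);", 1),
    ("strncpy(buf, user_input, sizeof(buf) - 1);", 0),
    ("fgets(line, sizeof(line), stdin);", 0),
]


def tokenize(code):
    """Treat source code as plain text: extract identifiers and integer literals."""
    return re.findall(r"[A-Za-z_]\w*|\d+", code)


def build_vocabulary(snippets):
    """Map every token seen in the training snippets to a column index."""
    tokens = sorted({tok for snippet in snippets for tok in tokenize(snippet)})
    return {tok: index for index, tok in enumerate(tokens)}


def featurize(code, vocab):
    """Return one row of the feature matrix: token counts per vocabulary entry."""
    row = [0] * len(vocab)
    for tok in tokenize(code):
        if tok in vocab:  # tokens never seen in training are ignored
            row[vocab[tok]] += 1
    return row


def train(matrix, labels):
    """Nearest-centroid model: the mean feature vector of each label."""
    model = {}
    for label in set(labels):
        rows = [row for row, row_label in zip(matrix, labels) if row_label == label]
        model[label] = [sum(column) / len(rows) for column in zip(*rows)]
    return model


def predict(model, row):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda label: distance(model[label]))


if __name__ == "__main__":
    snippets = [code for code, _ in SAMPLES]
    labels = [label for _, label in SAMPLES]
    vocab = build_vocabulary(snippets)
    matrix = [featurize(code, vocab) for code in snippets]
    model = train(matrix, labels)
    for candidate in ("strcpy(dst, src);", "fgets(buf, sizeof(buf), stdin);"):
        print(candidate, "->", predict(model, featurize(candidate, vocab)))
```

A bag-of-tokens matrix captures only lexical features. The combination of syntactic and semantic features that the abstract mentions would need richer inputs, such as parse trees or data-flow information, which this sketch does not attempt.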

