Static Vulnerability Analysis Using Intermediate Representations: A Literature Review

European Conference on Cyber Warfare and Security. Pub Date: 2023-06-19. DOI: 10.34190/eccws.22.1.1154

Adam M Spanier, W. Mahoney



Abstract

Static Analysis (SA) in Cybersecurity is a practice aimed at detecting vulnerabilities within the source code of a program. Modern SA applications, though highly sophisticated, lack programming-language-agnostic generalization and instead require a codebase-specific implementation for each programming language. SA as implemented today, though functional, requires significant man hours to develop and maintain, incurs higher costs because each language needs a custom application, and creates inconsistencies in implementation from one SA tool to the next. One source of programming language generalization is found within compilers. During the compilation process, source code is converted into a grammatically consistent Intermediate Representation (IR) (e.g. LLVM-IR) before being converted to an output format. The grammatical consistency provided by the IR theoretically allows the same program written in different languages to be analyzed using the same mechanism. By using the IRs of compiled programming languages as the codebase for SA practices, a single SA tool can encompass multiple programming languages. To begin understanding the possibilities that the combination of SA and IRs may reveal, this research presents the following outcomes: 1) a systematic literature search, 2) a literature review, and 3) a classification of existing work pertaining to SA practices using IRs. The results of the study indicate that generalized Static Analysis using the LLVM IR is already a common practice in all compilers, but that the extended use of the LLVM IR in Cybersecurity SA practices aimed at finding vulnerabilities in source code remains underdeveloped.
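As a rough illustration of the idea that one check written against the IR can serve several source languages, the following minimal sketch scans textual LLVM IR for direct calls to functions on a deny list. It is not taken from the paper: the deny list `RISKY_CALLEES`, the helper `find_risky_calls`, and the hand-written `SAMPLE_IR` fragment are all assumptions made for illustration. A line-based regular expression is only a crude stand-in for the IR-level analyses the review surveys.

```python
import re

# Hypothetical deny list (an assumption; the paper does not prescribe one).
RISKY_CALLEES = {"strcpy", "gets", "sprintf"}

# Matches a direct call in textual LLVM IR, e.g. "call i8* @strcpy(".
CALL_RE = re.compile(r"\bcall\b[^@\n]*@([\w.$]+)\s*\(")


def find_risky_calls(ir_text):
    """Return (line_number, callee) pairs for direct calls to deny-listed functions."""
    findings = []
    for lineno, line in enumerate(ir_text.splitlines(), start=1):
        match = CALL_RE.search(line)
        if match and match.group(1) in RISKY_CALLEES:
            findings.append((lineno, match.group(1)))
    return findings


# Hand-written IR-like fragment for illustration only; it is scanned as text
# and is not meant to be a complete, verifiable LLVM module.
SAMPLE_IR = """\
define i32 @main() {
entry:
  %buf = alloca [16 x i8]
  %p = getelementptr [16 x i8], [16 x i8]* %buf, i32 0, i32 0
  %r = call i8* @strcpy(i8* %p, i8* %src)
  %n = call i32 @puts(i8* %p)
  ret i32 0
}
"""

if __name__ == "__main__":
    for lineno, callee in find_risky_calls(SAMPLE_IR):
        print(f"line {lineno}: call to {callee}")
```

Because the check reads only the IR text, the same function could in principle be applied to IR emitted from any front end that targets LLVM, which is the generalization the abstract describes. A practical tool would more likely operate on the parsed IR rather than on raw lines.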
