A New Approach to Web Application Security: Utilizing GPT Language Models for Source Code Inspection

IF 2.8 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Future Internet Pub Date : 2023-09-28 DOI:10.3390/fi15100326

Zoltán Szabó, Vilmos Bilicki

{"title":"A New Approach to Web Application Security: Utilizing GPT Language Models for Source Code Inspection","authors":"Zoltán Szabó, Vilmos Bilicki","doi":"10.3390/fi15100326","DOIUrl":null,"url":null,"abstract":"Due to the proliferation of large language models (LLMs) and their widespread use in applications such as ChatGPT, there has been a significant increase in interest in AI over the past year. Multiple researchers have raised the question: how will AI be applied and in what areas? Programming, including the generation, interpretation, analysis, and documentation of static program code based on promptsis one of the most promising fields. With the GPT API, we have explored a new aspect of this: static analysis of the source code of front-end applications at the endpoints of the data path. Our focus was the detection of the CWE-653 vulnerability—inadequately isolated sensitive code segments that could lead to unauthorized access or data leakage. This type of vulnerability detection consists of the detection of code segments dealing with sensitive data and the categorization of the isolation and protection levels of those segments that were previously not feasible without human intervention. However, we believed that the interpretive capabilities of GPT models could be explored to create a set of prompts to detect these cases on a file-by-file basis for the applications under study, and the efficiency of the method could pave the way for additional analysis tasks that were previously unavailable for automation. In the introduction to our paper, we characterize in detail the problem space of vulnerability and weakness detection, the challenges of the domain, and the advances that have been achieved in similarly complex areas using GPT or other LLMs. Then, we present our methodology, which includes our classification of sensitive data and protection levels. This is followed by the process of preprocessing, analyzing, and evaluating static code. This was achieved through a series of GPT prompts containing parts of static source code, utilizing few-shot examples and chain-of-thought techniques that detected sensitive code segments and mapped the complex code base into manageable JSON structures.Finally, we present our findings and evaluation of the open source project analysis, comparing the results of the GPT-based pipelines with manual evaluations, highlighting that the field yields a high research value. The results show a vulnerability detection rate for this particular type of model of 88.76%, among others.","PeriodicalId":37982,"journal":{"name":"Future Internet","volume":"58 1","pages":"0"},"PeriodicalIF":2.8000,"publicationDate":"2023-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Internet","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/fi15100326","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 1

Abstract

Due to the proliferation of large language models (LLMs) and their widespread use in applications such as ChatGPT, there has been a significant increase in interest in AI over the past year. Multiple researchers have raised the question: how will AI be applied and in what areas? Programming, including the generation, interpretation, analysis, and documentation of static program code based on promptsis one of the most promising fields. With the GPT API, we have explored a new aspect of this: static analysis of the source code of front-end applications at the endpoints of the data path. Our focus was the detection of the CWE-653 vulnerability—inadequately isolated sensitive code segments that could lead to unauthorized access or data leakage. This type of vulnerability detection consists of the detection of code segments dealing with sensitive data and the categorization of the isolation and protection levels of those segments that were previously not feasible without human intervention. However, we believed that the interpretive capabilities of GPT models could be explored to create a set of prompts to detect these cases on a file-by-file basis for the applications under study, and the efficiency of the method could pave the way for additional analysis tasks that were previously unavailable for automation. In the introduction to our paper, we characterize in detail the problem space of vulnerability and weakness detection, the challenges of the domain, and the advances that have been achieved in similarly complex areas using GPT or other LLMs. Then, we present our methodology, which includes our classification of sensitive data and protection levels. This is followed by the process of preprocessing, analyzing, and evaluating static code. This was achieved through a series of GPT prompts containing parts of static source code, utilizing few-shot examples and chain-of-thought techniques that detected sensitive code segments and mapped the complex code base into manageable JSON structures.Finally, we present our findings and evaluation of the open source project analysis, comparing the results of the GPT-based pipelines with manual evaluations, highlighting that the field yields a high research value. The results show a vulnerability detection rate for this particular type of model of 88.76%, among others.

查看原文本刊更多论文

Web应用程序安全的新方法:利用GPT语言模型进行源代码检查

由于大型语言模型(llm)的激增及其在ChatGPT等应用程序中的广泛使用，在过去的一年中，人们对人工智能的兴趣显著增加。许多研究人员提出了这样一个问题:人工智能将如何应用?应用在哪些领域?编程，包括基于提示的静态程序代码的生成、解释、分析和文档，是最有前途的领域之一。使用GPT API，我们探索了这方面的一个新方面:在数据路径的端点处对前端应用程序的源代码进行静态分析。我们的重点是检测CWE-653漏洞——隔离不充分的敏感代码段，可能导致未经授权的访问或数据泄露。这种类型的漏洞检测包括检测处理敏感数据的代码段，并对这些段的隔离和保护级别进行分类，这些段以前如果没有人为干预是不可行的。然而，我们相信可以探索GPT模型的解释能力，为正在研究的应用程序创建一组提示，以逐个文件地检测这些案例，并且该方法的效率可以为以前无法自动化的附加分析任务铺平道路。在我们论文的引言中，我们详细描述了漏洞和弱点检测的问题空间，领域的挑战，以及使用GPT或其他llm在类似复杂领域取得的进展。然后，我们介绍了我们的方法，其中包括我们对敏感数据和保护级别的分类。接下来是预处理、分析和评估静态代码的过程。这是通过一系列包含静态源代码部分的GPT提示实现的，使用少量示例和思维链技术来检测敏感代码段并将复杂的代码库映射到可管理的JSON结构中。最后，我们展示了我们的发现和对开源项目分析的评估，将基于gbp的管道的结果与人工评估的结果进行了比较，强调了该领域具有很高的研究价值。结果显示，该特定类型模型的漏洞检测率为88.76%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Future Internet Computer Science-Computer Networks and Communications

CiteScore

7.10

自引率

5.90%

发文量

303

审稿时长

11 weeks

期刊介绍： Future Internet is a scholarly open access journal which provides an advanced forum for science and research concerned with evolution of Internet technologies and related smart systems for “Net-Living” development. The general reference subject is therefore the evolution towards the future internet ecosystem, which is feeding a continuous, intensive, artificial transformation of the lived environment, for a widespread and significant improvement of well-being in all spheres of human life (private, public, professional). Included topics are: • advanced communications network infrastructures • evolution of internet basic services • internet of things • netted peripheral sensors • industrial internet • centralized and distributed data centers • embedded computing • cloud computing • software defined network functions and network virtualization • cloud-let and fog-computing • big data, open data and analytical tools • cyber-physical systems • network and distributed operating systems • web services • semantic structures and related software tools • artificial and augmented intelligence • augmented reality • system interoperability and flexible service composition • smart mission-critical system architectures • smart terminals and applications • pro-sumer tools for application design and development • cyber security compliance • privacy compliance • reliability compliance • dependability compliance • accountability compliance • trust compliance • technical quality of basic services.