Large Language Models-Aided Program Debloating

IF 5.6 | Zone 1, Computer Science | Q1, COMPUTER SCIENCE, SOFTWARE ENGINEERING

IEEE Transactions on Software Engineering | Pub Date: 2025-08-01 | DOI: 10.1109/TSE.2025.3594673

Bo Lin;Shangwen Wang;Yihao Qin;Liqian Chen;Xiaoguang Mao

Vol. 51, No. 9, pp. 2651-2670 | Journal Article | https://ieeexplore.ieee.org/document/11106926/

Citations: 0

Abstract


Large Language Models-Aided Program Debloating

As software grows in complexity to accommodate diverse features and platforms, software bloating has emerged as a significant challenge, adversely affecting performance and security. However, existing approaches inadequately address the dual objectives of debloating: maintaining functionality by preserving essential features and enhancing security by reducing security issues. Specifically, current software debloating techniques often rely on input-based analysis, using user inputs as proxies for the specifications of desired features. However, these approaches frequently overfit provided inputs, leading to functionality loss and potential security vulnerabilities. To address these limitations, we propose LEADER, a program debloating framework enhanced by Large Language Models (LLMs), which leverages their semantic understanding, generative capabilities, and decision-making strengths. LEADER mainly consists of two modules: (1) a documentation-guided test augmentation module designed to preserve functionality, which leverages LLMs to comprehend program documentation and generates sufficient tests to cover the desired features comprehensively, and (2) a multi-advisor-aided program debloating module that employs a neuro-symbolic pipeline to ensure that the security of the software can be perceived during debloating. This module combines debloating and security advisors for analysis and employs an LLM as a decision-maker to eliminate undesired code securely. Extensive evaluations on widely used benchmarks demonstrate the efficacy of LEADER. It achieves a 95.5% test case pass rate and reduces program size by 42.5%. Notably, it reduces the introduction of vulnerabilities during debloating by 79.1% and decreases pre-existing vulnerabilities by 16.5% more than CovA. These results demonstrate that LEADER surpasses the state-of-the-art tool CovA in functionality and security. These results underscore the potential of LEADER to set a new standard in program debloating by effectively balancing functionality and security.
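
The multi-advisor idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Candidate` type, the two advisor rules, and the `rule_based_decider` function are all hypothetical names invented here, and the decider is a plain rule standing in for the LLM decision-maker. It only shows the general shape of combining a debloating advisor and a security advisor before deleting code.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """A code unit considered for removal (hypothetical type for illustration)."""
    name: str
    covered_by_tests: bool   # is it exercised by the (augmented) test suite?
    is_security_check: bool  # does it guard against misuse, e.g. a bounds check?


def debloating_advisor(c: Candidate) -> str:
    # Assumed rule: code never reached by the tests looks removable.
    return "keep" if c.covered_by_tests else "remove"


def security_advisor(c: Candidate) -> str:
    # Assumed rule: deleting a security check could introduce a vulnerability.
    return "risky" if c.is_security_check else "safe"


def rule_based_decider(debloat_advice: str, security_advice: str) -> str:
    # Stand-in for the LLM decision-maker; a real system would query a model here.
    if debloat_advice == "remove" and security_advice == "safe":
        return "remove"
    return "keep"


def debloat(
    candidates: Iterable[Candidate],
    decide: Callable[[str, str], str] = rule_based_decider,
) -> List[str]:
    """Return the names of the code units that survive debloating."""
    kept = []
    for c in candidates:
        if decide(debloating_advisor(c), security_advisor(c)) == "keep":
            kept.append(c.name)
    return kept


program = [
    Candidate("parse_args", covered_by_tests=True, is_security_check=False),
    Candidate("legacy_export", covered_by_tests=False, is_security_check=False),
    Candidate("check_bounds", covered_by_tests=False, is_security_check=True),
]
print(debloat(program))  # ['parse_args', 'check_bounds']
```

In this toy run, `check_bounds` is kept even though no test covers it, because the security advisor flags its removal as risky. That is the kind of overfitting to inputs that a purely coverage-based approach would miss.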


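Source journal

IEEE Transactions on Software Engineering (Engineering and Technology - Engineering: Electrical and Electronic)

CiteScore: 9.70

Self-citation rate: 10.80%

Articles published: 724

Review time: 6 months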

About the journal: IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:

a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.